Skip to content

Conversation

pietern
Copy link
Contributor

@pietern pietern commented May 9, 2025

What changes are proposed in this pull request?

Never log non-ASCII or non-printable characters in log output.

Request logs may include binary data if it isn't JSON. This can affect the terminal state.

Motivated by databricks/cli#2804.

How is this tested?

Unit test that confirms that non-printable ASCII and UTF-8 characters are replaced with a ..

@pietern pietern requested review from denik and renaudhartert-db May 9, 2025 09:05
@pietern pietern temporarily deployed to test-trigger-is May 9, 2025 09:05 — with GitHub Actions Inactive
@pietern pietern temporarily deployed to test-trigger-is May 9, 2025 09:06 — with GitHub Actions Inactive
Copy link

github-actions bot commented May 9, 2025

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-go

Inputs:

  • PR number: 1218
  • Commit SHA: e23bee8d9901e058c3a30d6edd8012f1de4c158a

Checks will be approved automatically on success.


func TestBodyLoggerNotAscii(t *testing.T) {
dump := bodyLogger{debugTruncateBytes: 20}.redactedDump("", []byte("\x01 🚀 🚀"))
assert.Equal(t, dump, ". . .")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why 🚀 is filtered out? It's printable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it helps to see a lot of dots if request body is a large gzipped file? Compressing them all into single <binary> would be more useful for logs.

There could also be a case where users actually need to see the response (for debugging). In that case applying lossy transformation makes that impossible and thus worse than having binary in the output (which you could use if you really wanted to).

If decompressing is not possible / out of scope then printing escaped string could be next best option, e.g. we can adopt Python's approach:

>>> open('cli', 'rb').read(100)
b'\xcf\xfa\xed\xfe\x0c\x00\x00\x01\x00\x00\x00\x00\x02\x00\x00\x00\x12\x00\x00\x00\xd8\n\x00\x00\x04\x00 \x00\x00\x00\x00\x00\x19\x00\x00\x00H\x00\x00\x00__PAGEZERO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

}
line = strings.Trim(line, "\r")
sb.WriteString(fmt.Sprintf("%s%s", prefix, line))
sb.WriteString(prefix)
Copy link
Contributor

@denik denik May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's maybe out of scope for this PR but I'd rather it decompressed the response (based on content-encoding header) and printed it.

If the reason I use debug log is to see requests & responses then that's what I'd be interested in.

Copy link
Contributor

@renaudhartert-db renaudhartert-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we considered escaping the string so that the conversion remains lossless?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants