Ship botlog URI with query params intact#9
Merged
Conversation
- Add ParsedLine.RawURI (path + query string) alongside RawPath so observers see the full hit URL; populated in all three parse paths (text/$request, text/$uri, JSON request_uri) - Switch botlog Observer to prefer RawURI with fallback chain RawURI -> RawPath -> URI; gatesrv-side HasQueryParams now works on bot-log events instead of always returning false - Bump DefaultBatchInterval 5s -> 30s; quiet hosts were flushing ~50-event payloads every tick, larger batches compress better - Cover the new field across both parsers including percent-encoded query strings, plus the observer fallback chain
- RawPath was only read by botlog Observer as the middle of a three-level fallback (RawURI -> RawPath -> URI). All three parser paths populate RawPath and RawURI together, so RawPath was never the only field set — middle fallback was unreachable - Remove rawPathFromRequest and rawPathFromJSON; this eliminates a duplicate strings.SplitN per parsed line on the hot path - Collapse observer fallback to RawURI -> URI; drop the synthetic TestObserver_FallsBackToRawPathWhenRawURIEmpty that exercised the unreachable state - Rename TestParsedLine_RawPathUnnormalized to *_RawURIUnnormalized and drop wantRawPath assertions; recordedLine.rawPath gone too
- Add Config.URITruncate (default 2048) symmetrical to UATruncate; IE's legacy 2083-char URL limit covers >99% of legitimate traffic - Observer truncates Event.URI before BuildEvent so pathological bot probes (4-8KB base64/SQLi payloads in query strings) cannot bloat WAL writes or the gatesrv POST body - Pre-fix RawPath stripped queries kept URIs naturally short; with RawURI carrying the query verbatim we need an explicit cap
- BatchInterval 5s -> 30s - Add URITruncate row (default 2048)
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a bot-logs UI bug:
Event.URIshipped to gatesrv had its query string stripped, soHasQueryParamson the gatesrv side always returned false.Three commits, each independent:
1.
4f0704eShip botlog URI with query params intactParsedLine.RawURI(path + query string) — populated in all three parse paths: text/$request, text/$uri, JSONrequest_uriObservernow ships the full URI instead of the path-only formDefaultBatchInterval5s → 30s — quiet hosts were flushing ~50-event payloads every tick; larger batches compress better2.
50cb7c7Drop redundant ParsedLine.RawPathCleanup found by
/simplifyreview:RawPathandRawURIwere always populated together — the middle of theRawURI → RawPath → URIfallback was never reachableRawPathfield plus the helpersrawPathFromRequest/rawPathFromJSON, eliminating a duplicatestrings.SplitNper parsed line on the hot pathRawURI → URI3.
8ca89d7Cap botlog Event.URI at URITruncateDefense-in-depth for the now-full-query URIs:
Config.URITruncate(default 2048) symmetrical toUATruncateRawPathstripped queries kept URIs naturally short; withRawURIwe need an explicit capTest plan
make fmt lint test— 0 issues, all greenTestObserver_UsesRawURIOverURI— Event.URI carries query stringTestObserver_FallsBackToURIWhenRawURIEmpty— legacy$urifallbackTestObserver_TruncatesURI— 5KB URI capped at 2048TestParsedLine_RawURIUnnormalizedcovering both parsers and percent-encoded chars (%20,%2F,%3D,%26)?...in URI and gatesrv UI groups with/without query params correctlyOut of scope (deliberate)
stripQuerywas an implicit privacy guard; now they ship to gatesrv. Decision is to handle redaction on the gatesrv side or accept the risk (trusted infra). Worth a follow-up conversation.