add request ID injection to context to enable tracking requests across services #6895

Open
wants to merge 1 commit into
base: master
from

erlan-z
Contributor

@erlan-z erlan-z commented Jul 20, 2025

What this PR does:
This PR adds support for injecting a request ID into the request context to enable tracking requests across services.

  • If the configured request_id header is present and non-empty, its value is used as the request ID.
  • Otherwise, a new UUID is generated. For ruler queries, a request ID is always generated.
  • The request ID is stored in the context, accessible via requestmeta.RequestIdFromContext(r.Context()).
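
The rules above can be sketched as a small HTTP middleware. This is only an illustration under stated assumptions, not the code in this PR: `resolveRequestID`, `withRequestID`, `requestIDFromContext`, and the context key are hypothetical names (the PR itself exposes `requestmeta.RequestIdFromContext`), the header name is taken from the test logs further down, and the UUID is built from `crypto/rand` to keep the sketch free of external dependencies.

```go
package main

import (
	"context"
	"crypto/rand"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requestIDKey is an unexported key type so other packages cannot collide with it.
type requestIDKey struct{}

// newUUID returns a random (version 4) UUID string built from crypto/rand.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// resolveRequestID applies the rule from the description: a non-empty header
// value is used as-is, otherwise a new UUID is generated.
func resolveRequestID(headerValue string) (string, error) {
	if headerValue != "" {
		return headerValue, nil
	}
	return newUUID()
}

// requestIDFromContext returns the stored ID, or "" when none was set.
func requestIDFromContext(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey{}).(string)
	return id
}

// withRequestID (hypothetical) stores the resolved ID in the request context.
func withRequestID(headerName string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, err := resolveRequestID(r.Header.Get(headerName))
		if err != nil {
			http.Error(w, "cannot generate request ID", http.StatusInternalServerError)
			return
		}
		ctx := context.WithValue(r.Context(), requestIDKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func main() {
	// The header name is an assumption based on the log output in this PR.
	const header = "X-Cortex-Request-Id"
	h := withRequestID(header, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, requestIDFromContext(r.Context()))
	}))

	req := httptest.NewRequest(http.MethodGet, "/api/v1/query", nil)
	req.Header.Set(header, "request-id-sharded4")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	fmt.Println(rec.Body.String())
}
```

A ruler-style caller that has no incoming HTTP request would call the generation path directly, which corresponds to passing an empty header value to `resolveRequestID` in this sketch.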

This ID is propagated through the service call chain and enables future tracking of downstream operations tied to a single originating request.

To enable request ID propagation, we followed the same mechanism previously used for targetHeaders propagation. As part of this change, we refactored the existing propagation logic to avoid duplication and to make it extensible for future request metadata.

For example, we may later want to add a request-origin field to the context to differentiate between ruler-initiated and ad-hoc queries at the storage layer. With the new shared request metadata structure, such additions can be made with minimal changes to the surrounding code.
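
As a rough sketch of why a shared metadata structure keeps such additions cheap, the example below stores a string map in the context and copies a fixed list of fields to and from HTTP headers. All names here are hypothetical (`contextWithMetadata`, `injectIntoHeaders`, `extractFromHeaders`, the `x-request-origin` field), and the actual propagation format used by Cortex may differ; the point is only that adding a field means adding one entry to `propagatedFields`.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

type metadataKey struct{}

// Hypothetical field names; the real header names and format may differ.
const (
	requestIDField     = "x-cortex-request-id"
	requestOriginField = "x-request-origin" // illustrative future field
)

// propagatedFields lists every field that crosses a service boundary.
var propagatedFields = []string{requestIDField, requestOriginField}

// contextWithMetadata stores a copy of md so later caller mutations do not leak in.
func contextWithMetadata(ctx context.Context, md map[string]string) context.Context {
	copied := make(map[string]string, len(md))
	for k, v := range md {
		copied[k] = v
	}
	return context.WithValue(ctx, metadataKey{}, copied)
}

// metadataFromContext returns the stored map, or nil when none was set.
func metadataFromContext(ctx context.Context) map[string]string {
	md, _ := ctx.Value(metadataKey{}).(map[string]string)
	return md
}

// injectIntoHeaders copies every known field from the context onto outgoing headers.
func injectIntoHeaders(ctx context.Context, h http.Header) {
	md := metadataFromContext(ctx)
	for _, field := range propagatedFields {
		if v, ok := md[field]; ok {
			h.Set(field, v)
		}
	}
}

// extractFromHeaders rebuilds the metadata on the receiving side.
func extractFromHeaders(ctx context.Context, h http.Header) context.Context {
	md := make(map[string]string)
	for _, field := range propagatedFields {
		if v := h.Get(field); v != "" {
			md[field] = v
		}
	}
	return contextWithMetadata(ctx, md)
}

// roundTrip simulates one hop between two services and returns what the receiver sees.
func roundTrip(md map[string]string) map[string]string {
	sender := contextWithMetadata(context.Background(), md)
	headers := http.Header{}
	injectIntoHeaders(sender, headers)
	receiver := extractFromHeaders(context.Background(), headers)
	return metadataFromContext(receiver)
}

func main() {
	received := roundTrip(map[string]string{
		requestIDField:     "request-id-sharded4",
		requestOriginField: "ruler",
		"not-propagated":   "dropped",
	})
	fmt.Println(received[requestIDField], received[requestOriginField], len(received))
}
```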

Note: While there is no active usage of the request ID yet, the immediate goal is to support tracing storage-related resource usage, particularly identifying heavy or costly queries that stem from a single logical request.

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Logs from testing: split queries got the same request ID

  • query-frontend logs:
ts=2025-07-24T23:14:11.965255Z caller=handler.go:314 level=error header1=val1 header2=val2 x-cortex-request-id=request-id-sharded4 org_id=fake msg="query request received in QFE handler" requestId=request-id-sharded4
  • querier logs:
ts=2025-07-24T23:14:11.971059Z caller=scheduler_processor.go:161 level=error header1=val1 header2=val2 x-cortex-request-id=request-id-sharded4 org_id=fake msg="query request received in Querier" requestId=request-id-sharded4
ts=2025-07-24T23:14:11.971067Z caller=scheduler_processor.go:161 level=error header1=val1 header2=val2 x-cortex-request-id=request-id-sharded4 org_id=fake msg="query request received in Querier" requestId=request-id-sharded4
ts=2025-07-24T23:14:11.97107Z caller=scheduler_processor.go:161 level=error header1=val1 header2=val2 x-cortex-request-id=request-id-sharded4 org_id=fake msg="query request received in Querier" requestId=request-id-sharded4
  • ingester logs:
ts=2025-07-24T23:14:11.980543Z caller=ingester.go:2208 level=error msg="query request received in Ingester" requestId=request-id-sharded4
ts=2025-07-24T23:14:11.980543Z caller=ingester.go:2208 level=error msg="query request received in Ingester" requestId=request-id-sharded4
ts=2025-07-24T23:14:11.980543Z caller=ingester.go:2208 level=error msg="query request received in Ingester" requestId=request-id-sharded4
  • store-gateway:
ts=2025-07-24T23:14:11.985171Z caller=gateway.go:414 level=error msg="query request received in SG" requestId=request-id-sharded4
ts=2025-07-24T23:14:11.985168Z caller=gateway.go:414 level=error msg="query request received in SG" requestId=request-id-sharded4
ts=2025-07-24T23:14:11.985173Z caller=gateway.go:414 level=error msg="query request received in SG" requestId=request-id-sharded4
  • ruler logs:
ts=2025-07-24T23:28:39.885103Z caller=compat.go:193 level=error msg="query request id in ruler request" requestId=1d2f6e12-37b5-40fd-8a75-1021ff7ed114
  • ingester logs:
ts=2025-07-24T23:28:39.887861Z caller=ingester.go:2208 level=error msg="query request received in Ingester" requestId=1d2f6e12-37b5-40fd-8a75-1021ff7ed114

@dosubot dosubot bot added the type/feature label Jul 20, 2025
@erlan-z erlan-z force-pushed the add-request-id branch 3 times, most recently from d57c37c to c50634c Compare July 21, 2025 15:42
Contributor

@justinjung04 justinjung04 left a comment

Nice!

require.Equal(t, providedID, requestID, "Request ID from header should be used")
}

func TestExistingRequestIdIsPreserved(t *testing.T) {
Contributor

What's the goal here? Do you imagine a downstream component changing the header?

Contributor Author

No specific case, just a precaution to avoid changing the context if the value already exists. This shouldn't normally happen, but in case it does, it's safer not to override it.
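
The precaution described in this reply can be sketched as a guard that only writes to the context when no ID is present. This is a hypothetical illustration (`ensureRequestID`, `applyTwice`, and the key type are not taken from the PR), meant to show the behavior that a test like `TestExistingRequestIdIsPreserved` would check.

```go
package main

import (
	"context"
	"fmt"
)

type requestIDKey struct{}

// ensureRequestID leaves ctx untouched when it already carries a non-empty ID,
// and otherwise stores the value produced by generate.
func ensureRequestID(ctx context.Context, generate func() string) context.Context {
	if existing, ok := ctx.Value(requestIDKey{}).(string); ok && existing != "" {
		return ctx
	}
	return context.WithValue(ctx, requestIDKey{}, generate())
}

// applyTwice runs the guard twice with different generators and returns the
// ID that ends up in the context.
func applyTwice(firstID, secondID string) string {
	ctx := ensureRequestID(context.Background(), func() string { return firstID })
	ctx = ensureRequestID(ctx, func() string { return secondID })
	id, _ := ctx.Value(requestIDKey{}).(string)
	return id
}

func main() {
	// The ID set by the first call is kept; the second generator is never used.
	fmt.Println(applyTwice("upstream-id", "downstream-id"))
}
```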

@friedrichg friedrichg changed the title add request ID injection to context to enable tracking reqeusts acros… add request ID injection to context to enable tracking requests across services Jul 21, 2025
Contributor

@justinjung04 justinjung04 left a comment

Thx for the logs from testing!

@erlan-z erlan-z force-pushed the add-request-id branch 2 times, most recently from c6ee576 to 47a0307 Compare July 22, 2025 01:45
Contributor

@danielblando danielblando left a comment

The code lgtm, but it feels like we are rewriting code from the log headers. The log headers are specific to logging, but one of their key parts was transferring ctx between services, which seems to be duplicated here. I think we can reuse at least that logic.

What if we have a new config for the requestId header, but use the same middleware, adding the new header as a targetHeader?
We can also refactor that util_log context function into a generic util.
Would that also work? It seems we would avoid some of these new changes.

CHANGELOG.md Outdated
@@ -56,6 +56,7 @@
* [ENHANCEMENT] Distributor: Add native histograms max sample size bytes limit validation. #6834
* [ENHANCEMENT] Querier: Support caching parquet labels file in parquet queryable. #6835
* [ENHANCEMENT] Querier: Support query limits in parquet queryable. #6870
* [ENHANCEMENT] API: add request ID injection to context to enable tracking reqeusts across downstream services. #6895
Contributor

nit: s/reqeusts/requests

@erlan-z erlan-z force-pushed the add-request-id branch 2 times, most recently from d07df8e to 2de8010 Compare July 24, 2025 22:58
…s downstream services

Signed-off-by: Erlan Zholdubai uulu <[email protected]>
@erlan-z
Contributor Author

erlan-z commented Jul 24, 2025

The code lgtm, but it feels like we are rewriting code from the log headers. The log headers are specific to logging, but one of their key parts was transferring ctx between services, which seems to be duplicated here. I think we can reuse at least that logic.

What if we have a new config for the requestId header, but use the same middleware, adding the new header as a targetHeader? We can also refactor that util_log context function into a generic util. Would that also work? It seems we would avoid some of these new changes.

Thanks for the feedback. I changed it to use the same logic for transferring context between services as with targetHeaders. I also refactored the util_log context function to keep it log-specific, and moved the context propagation part into a separate util for request metadata. This keeps the concerns clearly separated and makes it easier to extend in the future.

@erlan-z erlan-z requested a review from justinjung04 July 24, 2025 23:53
3 participants