jaeger_mcp: trace tool invocations via MCP middleware (otelhttp for transport)#8160

Open
SoumyaRaikwar wants to merge 15 commits into jaegertracing:main from SoumyaRaikwar:feat/jaegermcp-observability

Contributor

@SoumyaRaikwar SoumyaRaikwar commented Mar 11, 2026

Which problem is this PR solving?

  • Makes MCP tool usage visible via traces at the MCP tool boundary.
  • Aligns jaeger_mcp observability with maintainer guidance: standard otelhttp handles transport-level observability on /mcp, while MCP-specific semantics remain in a single MCP middleware.

Description of the changes

  • Kept /mcp transport observability on standard otelhttp.NewHandler(...) in server.go.
  • Merged logging + tracing into a single MCP middleware in middleware.go (no per-tool decorators).
  • Tool spans are created only in middleware on tools/call:
    • span name: mcp.tool.<toolName>
    • attributes: mcp.tool.name, mcp.status
  • Normalized tool statuses to:
    • ok
    • invalid_argument
    • not_found
    • error
  • Recorded errors and set span status on failure paths.
  • Kept structured logging at the tool boundary for success/failure visibility.
  • No custom transport metrics were added.

How was this change tested?

  • make fmt

  • make lint

  • make test

  • Manual runtime verification with real MCP calls:

    • initialize
    • tools/call get_services
    • tools/call get_span_names
    • tools/call search_traces
    • failure path: tools/call get_trace_topology with invalid trace id
  • Verified in Jaeger UI that traces include:

    • transport span: jaeger_mcp
    • tool spans: mcp.tool.get_services, mcp.tool.get_span_names, mcp.tool.search_traces, mcp.tool.get_trace_topology
    • failure span status/attributes for invalid input path

Checklist

AI Usage in this PR (choose one)

  • None
  • Light
  • Moderate
  • Heavy

Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
@SoumyaRaikwar SoumyaRaikwar requested a review from a team as a code owner March 11, 2026 01:42
Copilot AI review requested due to automatic review settings March 11, 2026 01:42
Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@SoumyaRaikwar
Contributor Author

@yurishkuro please take a look whenever you have a chance.

github-actions bot commented Mar 11, 2026

CI Summary Report

Metrics Comparison

❌ 36 metric change(s) detected

metrics_snapshot_cassandras_4.x_v004_e2e_auto

metrics_snapshot_cassandras_4.x_v004_e2e_manual

metrics_snapshot_cassandras_5.x_v004_e2e_auto

metrics_snapshot_cassandras_5.x_v004_e2e_manual

metrics_snapshot_elasticsearch_9.x_e2e
3 removed

  • jaeger_storage_latency_seconds
  • jaeger_storage_requests
  • rpc_server_call_duration_seconds

Code Coverage

✅ Coverage 96.8% (baseline 96.8%)

2026-03-21 07:00:56 UTC

Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
codecov bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 98.51852% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.66%. Comparing base (c3164c3) to head (8a148a8).
⚠️ Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
.../jaeger/internal/extension/jaegermcp/middleware.go | 98.37% | 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8160      +/-   ##
==========================================
+ Coverage   95.63%   95.66%   +0.03%     
==========================================
  Files         319      319              
  Lines       16793    16892      +99     
==========================================
+ Hits        16060    16160     +100     
+ Misses        579      578       -1     
  Partials      154      154              
Flag Coverage Δ
badger_direct 9.05% <ø> (ø)
badger_e2e 1.04% <ø> (ø)
cassandra-4.x-direct-manual 13.25% <ø> (ø)
cassandra-4.x-e2e-auto 1.03% <ø> (ø)
cassandra-4.x-e2e-manual 1.03% <ø> (ø)
cassandra-5.x-direct-manual 13.25% <ø> (ø)
cassandra-5.x-e2e-auto 1.03% <ø> (ø)
cassandra-5.x-e2e-manual 1.03% <ø> (ø)
clickhouse 1.16% <ø> (ø)
elasticsearch-6.x-direct 16.83% <ø> (ø)
elasticsearch-7.x-direct 16.86% <ø> (ø)
elasticsearch-8.x-direct 17.01% <ø> (ø)
elasticsearch-8.x-e2e 1.04% <ø> (-0.05%) ⬇️
elasticsearch-9.x-e2e 1.09% <ø> (+0.04%) ⬆️
grpc_direct 7.79% <ø> (ø)
grpc_e2e 1.04% <ø> (ø)
kafka-3.x-v2 1.04% <ø> (ø)
memory_v2 1.04% <ø> (ø)
opensearch-1.x-direct 16.91% <ø> (ø)
opensearch-2.x-direct 16.91% <ø> (ø)
opensearch-2.x-e2e 1.04% <ø> (ø)
opensearch-3.x-e2e 1.04% <ø> (ø)
query 1.04% <ø> (ø)
tailsampling-processor 0.52% <ø> (ø)
unittests 94.36% <98.51%> (+0.03%) ⬆️

@jkowall jkowall requested a review from Copilot March 11, 2026 09:37
@jkowall jkowall added the changelog:experimental Change to an experimental part of the code label Mar 11, 2026
Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

@yurishkuro
Member

> Adds operator-facing visibility into tool usage, latency, failures, and response size/count with consistent status labeling.

All of this can be achieved with standard OTEL instrumentation for HTTP, why do we need any of this custom code?

@github-actions github-actions bot added the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 11, 2026
@SoumyaRaikwar
Contributor Author

SoumyaRaikwar commented Mar 11, 2026

> Adds operator-facing visibility into tool usage, latency, failures, and response size/count with consistent status labeling.
>
> All of this can be achieved with standard OTEL instrumentation for HTTP, why do we need any of this custom code?

@yurishkuro that is correct, and I have updated the PR accordingly.
This PR now relies on standard otelhttp instrumentation for transport-level observability on /mcp (HTTP duration/active requests/transport metrics), and removes overlapping custom transport metrics from the MCP wrapper.
The remaining custom logic is limited to MCP semantics that HTTP alone cannot distinguish on a single /mcp endpoint:

  • normalized tool status (ok|invalid_argument|not_found|error)
  • tool-level result_count (response_items metric)
  • compact structured logs at tool boundary (safe fields only)

I also added low-cardinality otelhttp labeler attributes:

  • mcp.tool_name
  • mcp.status

@github-actions github-actions bot removed the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 11, 2026
…l semantics

Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
@SoumyaRaikwar SoumyaRaikwar changed the title from "jaegermcp: add centralized MCP tool observability (metrics + structured logs)" to "jaeger_mcp: use otelhttp for transport observability and keep minimal tool semantics" Mar 11, 2026
@SoumyaRaikwar
Contributor Author

I also updated the PR title and description to reflect these changes.

Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Copilot AI review requested due to automatic review settings March 11, 2026 18:22

@yurishkuro
Member

I prefer not to collect metrics, but collect traces that capture tool usage. If someone wants metrics they can then transform traces to metrics using OTEL processors.

If you refactor to collect traces please include screenshots of how they would look in Jaeger.

@github-actions github-actions bot added the waiting-for-author PR is waiting for author to respond to maintainer's comments label Mar 19, 2026
@SoumyaRaikwar SoumyaRaikwar changed the title from "jaeger_mcp: use otelhttp for transport observability and keep minimal tool semantics" to "jaeger_mcp: use per-tool tracing for MCP observability (with otelhttp transport instrumentation)" Mar 20, 2026
…aegermcp-observability

# Conflicts:
#	cmd/jaeger/internal/extension/jaegermcp/server.go
Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Copilot AI review requested due to automatic review settings March 20, 2026 07:01

Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>
Signed-off-by: Soumya Raikwar <164396577+SoumyaRaikwar@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 21, 2026 03:40
Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>

}
}

func instrumentTool[In, Out any](
Member

Why do we need to use a decorator when the MCP SDK already supports mcp.Middleware, which we already use for logging?

Contributor Author

I will switch this from per-tool decorator wrapping to MCP middleware so that we use one consistent mechanism with the existing logging middleware. The middleware will create spans for tools/call, extract params.name as mcp.tool.name, and set a normalized mcp.status. I will remove the instrumentTool decorator path from registration. Does this sound good?

Member

@yurishkuro yurishkuro Mar 21, 2026

Yes, and move the tracing decorator and logging decorator into the same new middleware.go file (I already posted this earlier: #8160 (comment)).

Contributor Author

@SoumyaRaikwar SoumyaRaikwar Mar 21, 2026

Switched from per-tool decorators to MCP middleware and merged logging + tracing into a single middleware.go, as requested. Tool spans are now created only in the middleware on tools/call, with mcp.tool.name and mcp.status attributes, and failures set span status/error. No custom transport metrics. I have verified live traces locally: you can see mcp.tool.* spans under the jaeger_mcp HTTP span. Screenshot attached.
Screenshot from 2026-03-21 11-06-20

Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Copilot AI review requested due to automatic review settings March 21, 2026 06:10

@SoumyaRaikwar SoumyaRaikwar changed the title from "jaeger_mcp: use per-tool tracing for MCP observability (with otelhttp transport instrumentation)" to "jaeger_mcp: trace tool invocations via MCP middleware (otelhttp for transport)" Mar 21, 2026
Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Labels

changelog:experimental (Change to an experimental part of the code), enhancement

4 participants