feat(telemetry): expose OTel metrics via Prometheus /v1/metrics endpoint #6034
Draft
cdoern wants to merge 1 commit into
OGX previously exported metrics only through OTLP push to an OTel Collector. This adds an optional Prometheus scrape endpoint so scrape-based monitoring systems can collect the existing metrics.

When OGX_PROMETHEUS_ENABLED is set, setup_telemetry() attaches a PrometheusMetricReader to the MeterProvider alongside the existing OTLP reader, and the Inspect API serves all metrics at /v1/metrics in Prometheus exposition format. The endpoint opts out of authentication via PUBLIC_ROUTE_KEY, returns 404 when disabled, and is excluded from RequestMetricsMiddleware. The OTLP push path continues to work independently, so both export paths can run at once.

Adds unit tests covering the Prometheus format and endpoint behavior, and a server-mode integration test that scrapes the live endpoint.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
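The reader selection described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the set of accepted truthy strings, the helper names `prometheus_enabled`, `selected_readers`, and `setup_telemetry_sketch`, and the choice of the OTLP/HTTP exporter are all assumptions. The OpenTelemetry imports are deferred so the pure helpers run without the SDK installed.

```python
import os
from typing import List, Mapping, Optional

# Assumption: the PR only says the flag must be "truthy"; this set is hypothetical.
_TRUTHY = {"1", "true", "yes", "on"}


def prometheus_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when OGX_PROMETHEUS_ENABLED holds a truthy value."""
    env = os.environ if env is None else env
    return env.get("OGX_PROMETHEUS_ENABLED", "").strip().lower() in _TRUTHY


def selected_readers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Name the metric readers that would be attached for this environment."""
    env = os.environ if env is None else env
    names = []
    if env.get("OTEL_EXPORTER_OTLP_ENDPOINT"):
        names.append("otlp")  # existing push path
    if prometheus_enabled(env):
        names.append("prometheus")  # new scrape path
    return names


def setup_telemetry_sketch(env: Optional[Mapping[str, str]] = None) -> None:
    """Attach every selected reader to one global MeterProvider."""
    # Imported lazily so the helpers above work without OpenTelemetry installed.
    from opentelemetry import metrics
    from opentelemetry.exporter.otlp.proto.http.metric_exporter import (
        OTLPMetricExporter,  # assumption: the project may use the gRPC exporter instead
    )
    from opentelemetry.exporter.prometheus import PrometheusMetricReader
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

    factories = {
        "otlp": lambda: PeriodicExportingMetricReader(OTLPMetricExporter()),
        "prometheus": PrometheusMetricReader,
    }
    readers = [factories[name]() for name in selected_readers(env)]
    metrics.set_meter_provider(MeterProvider(metric_readers=readers))
```

Because both readers hang off the same provider, an instrument recorded once is visible to the OTLP push and to the Prometheus scrape without extra wiring.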
838dca0 to e451304
rhdedgar approved these changes Jun 5, 2026
rhdedgar (Contributor) left a comment:
Nice, this covers all the criteria from RHAIENG-5156. +1
This pull request has merge conflicts that must be resolved before it can be merged. @cdoern please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
What
Adds an optional Prometheus scrape endpoint at /v1/metrics that exposes the existing OTel metrics in Prometheus exposition format, alongside the current OTLP push path. This unblocks scrape-based monitoring systems that need a Prometheus-compatible endpoint rather than OTLP push.

Resolves RHAIENG-5156.
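The endpoint's observable behavior (200 with the Prometheus content type when enabled, 404 when disabled, and no self-counting in the request metrics) can be sketched without any web framework. This is an illustration only: the real route lives on the FastAPI Inspect router and serves `prometheus_client.generate_latest()`. The names `metrics_response` and `should_count_request` are hypothetical, the content type string is copied from the manual run reported in this PR, and the other entries of `_EXCLUDED_PATHS` are not shown in the source, so they are omitted.

```python
from typing import Callable, Dict, Tuple

# Value taken from the manual run in this PR; real code would more likely
# use prometheus_client.CONTENT_TYPE_LATEST.
PROMETHEUS_CONTENT_TYPE = "text/plain; version=1.0.0; charset=utf-8"

# Only the entry added by this PR is shown; other excluded paths are unknown here.
_EXCLUDED_PATHS = frozenset({"/v1/metrics"})


def metrics_response(
    enabled: bool, render: Callable[[], bytes]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET on the metrics path."""
    if not enabled:
        # Feature flag unset: the endpoint behaves as if it did not exist.
        return 404, {}, b""
    return 200, {"Content-Type": PROMETHEUS_CONTENT_TYPE}, render()


def should_count_request(path: str) -> bool:
    """Mirror of the middleware guard: scrapes must not inflate request counters."""
    return path not in _EXCLUDED_PATHS
```

In the real route, `render` would correspond to `prometheus_client.generate_latest`, which serializes the default registry to exposition-format bytes.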
How
- New dependency: opentelemetry-exporter-prometheus (pulls in prometheus-client).
- setup_telemetry() (src/ogx/telemetry/__init__.py): now builds a list of metric readers: the existing OTLP PeriodicExportingMetricReader when OTEL_EXPORTER_OTLP_ENDPOINT is set, plus a PrometheusMetricReader when OGX_PROMETHEUS_ENABLED is truthy. Both attach to a single global MeterProvider, so the two export paths run independently.
- src/ogx_api/inspect_api/fastapi_routes.py: GET /v1/metrics is declared on the Inspect API router alongside /v1/health and /v1/version. It serves prometheus_client.generate_latest() with the Prometheus content type, opts out of auth via PUBLIC_ROUTE_KEY, and returns 404 when the feature is disabled.
- /v1/metrics is added to _EXCLUDED_PATHS in RequestMetricsMiddleware so scrapes don't inflate request counters. Auth is handled automatically by PUBLIC_ROUTE_KEY (no manual bypass needed, since the route is a registered API route).

Acceptance criteria
- Metrics served at /v1/metrics in Prometheus exposition format
- Feature gated by OGX_PROMETHEUS_ENABLED

Test plan
- Unit tests (tests/unit/telemetry/test_prometheus_metrics.py, 17 cases): env-flag parsing, Prometheus-format exposition with labels/values, the route via TestClient (200 + text/plain when enabled, 404 when disabled), PUBLIC_ROUTE_KEY presence, and the _EXCLUDED_PATHS guard.
- Integration tests (tests/integration/inspect/test_metrics_endpoint.py, server mode): scrape the live /v1/metrics over raw HTTP, assert Prometheus format and ogx_requests_total, no-auth access, and that the metrics route is not self-counted. scripts/integration-tests.sh sets OGX_PROMETHEUS_ENABLED for native server-mode runs; the tests skip otherwise.
- Manual run against a live server confirmed 200 + Content-Type: text/plain; version=1.0.0; charset=utf-8, ogx_requests_total / ogx_request_duration_seconds present, scrapes excluded from counters, and 404 when the flag is unset.

🤖 Generated with Claude Code
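The scrape checks listed in the test plan could look roughly like the sketch below. It is not the PR's test code: `parse_exposition` and `check_scrape` are hypothetical helpers, the sample body is made up for illustration (including the `route` and `status` label names), and the parser handles only simple sample lines without timestamps.

```python
import re
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# Matches label pairs such as route="/v1/models", allowing escaped characters.
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text: str) -> List[Sample]:
    """Parse simple exposition-format lines into (name, labels, value)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        # Assumes "series value" with no trailing timestamp.
        series, _, value = line.rpartition(" ")
        name, _, label_part = series.partition("{")
        samples.append((name, dict(_LABEL_RE.findall(label_part)), float(value)))
    return samples


def check_scrape(status: int, content_type: str, body: str,
                 metrics_path: str = "/v1/metrics") -> Dict[str, bool]:
    """Evaluate the conditions the integration test is described as asserting."""
    samples = parse_exposition(body)
    names = {name for name, _, _ in samples}
    return {
        "status_ok": status == 200,
        "text_plain": content_type.startswith("text/plain"),
        "has_requests_total": "ogx_requests_total" in names,
        # No series may carry the metrics path as a label value.
        "not_self_counted": all(
            metrics_path not in labels.values() for _, labels, _ in samples
        ),
    }


# Illustrative body only; label names and values are not taken from the PR.
SAMPLE_BODY = """\
# HELP ogx_requests_total Total requests (illustrative)
# TYPE ogx_requests_total counter
ogx_requests_total{route="/v1/models",status="200"} 3.0
ogx_request_duration_seconds_count{route="/v1/models"} 3.0
"""
```

A real integration test would fetch the status, content type, and body from the live server and then assert that every value in the returned dictionary is true.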