feat(monitor): Go SDK dashboard generator — ADR-007 Step 2a#8216
abhay1999 wants to merge 2 commits into jaegertracing:main
Pull request overview
Adds a standalone Go-based generator for the Jaeger Grafana dashboard (ADR-007 Step 2a), switching the emitted panels to native Grafana timeseries panels and checking the generated v2 dashboard JSON into the repo for review and side-by-side comparison.
Changes:
- Introduce `monitoring/jaeger-mixin/generate/` as a standalone Go module using `grafana-foundation-sdk/go` to generate the v2 dashboard JSON.
- Add `dashboard-for-grafana-v2.json` (generated) and mount it in the monitoring docker-compose setup alongside the existing dashboard.
- Exclude the generator module directory from Codecov coverage reporting.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| monitoring/jaeger-mixin/generate/main.go | Implements the Go generator that builds the v2 dashboard (rows/panels/PromQL). |
| monitoring/jaeger-mixin/generate/go.mod | Adds a standalone Go module for the generator with grafana foundation SDK dependency. |
| monitoring/jaeger-mixin/generate/go.sum | Records dependency checksums for the generator module. |
| monitoring/jaeger-mixin/dashboard-for-grafana-v2.json | Adds the generated Grafana v2 dashboard JSON for review/provisioning. |
| docker-compose/monitor/docker-compose.yml | Mounts the v2 dashboard into Grafana provisioning for side-by-side comparison. |
| .codecov.yml | Excludes the generator directory from coverage reporting. |
        Editable().
        Refresh("30s").
        Time("now-1h", "now").
        Timezone(common.TimeZoneBrowser).

The v2 generator doesn't configure any templating variables or annotations. The existing Jaeger dashboard includes a Prometheus datasource selector via templating, which helps users with multiple datasources (and matches other dashboards in this repo that emit templating.list / annotations.list). Consider adding templating/annotations configuration so the generated JSON conforms to the usual dashboard schema and retains the datasource selector.
Suggested change:

    -   Timezone(common.TimeZoneBrowser).
    +   Timezone(common.TimeZoneBrowser).
    +   // Configure templating so the dashboard includes a Prometheus datasource
    +   // selector, matching the original Jaeger dashboard and other mixin dashboards.
    +   Templating(
    +       dashboard.NewTemplatingBuilder().
    +           WithDatasourceTemplate(
    +               "datasource", // variable name
    +               "Prometheus", // label shown in the UI
    +               "prometheus", // query to list Prometheus datasources
    +           ),
    +   ).
    +   // Configure default annotations so the dashboard emits annotations.list.
    +   Annotations(
    +       dashboard.NewAnnotationsBuilder().
    +           WithBuiltInAnnotationsAndAlerts(),
    +   ).
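
For reference, the datasource selector requested above corresponds to a single entry in the dashboard's `templating.list`. The sketch below uses only the Go standard library to show that JSON shape. The struct and function names are hypothetical, the field set is deliberately minimal, and it does not use the Foundation SDK, whose builder API may differ from the suggested change.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// templateVar is a hypothetical, minimal model of one templating.list entry.
// Real Grafana variables carry more fields (current, options, hide, ...).
type templateVar struct {
	Name  string `json:"name"`
	Label string `json:"label"`
	Type  string `json:"type"`
	Query string `json:"query"`
}

// templating mirrors the top-level "templating" object of a dashboard.
type templating struct {
	List []templateVar `json:"list"`
}

// datasourceTemplating returns a templating section holding one variable
// that lets the user pick among Prometheus datasources.
func datasourceTemplating() templating {
	return templating{List: []templateVar{{
		Name:  "datasource",
		Label: "Prometheus",
		Type:  "datasource",
		Query: "prometheus",
	}}}
}

func main() {
	out, err := json.MarshalIndent(datasourceTemplating(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Comparing this output with the `templating` section of the existing dashboard JSON is one way to confirm that the generated file keeps the selector.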
    {
      "type": "timeseries",
      "targets": [
        {
          "expr": "sum(rate(otelcol_receiver_refused_spans_total[1m])) or vector(0)",
          "legendFormat": "error"
        },
        {
          "expr": "sum(rate(otelcol_receiver_accepted_spans_total[1m]))",
          "legendFormat": "success"
        }
      ],
      "title": "Span Ingest Rate",
      "transparent": false,

The timeseries panels in this dashboard JSON do not include an id field at all. Other provisioned dashboards in this repo include an id for every panel, and Grafana relies on these IDs for panel-level operations (links, repeats, edits). Please ensure each panel gets a unique id (ideally stable across regenerations) and regenerate the output.
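
A check along these lines could catch the problem before the generated file is committed. This is a minimal sketch using only the standard library; it assumes panels and rows sit in a flat top-level `panels` array, and the type and function names are hypothetical rather than part of the generator.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// panel models only the fields this check needs.
// ID is a pointer so that a missing "id" can be told apart from "id": 0.
type panel struct {
	ID    *int   `json:"id"`
	Title string `json:"title"`
	Type  string `json:"type"`
}

type dashboardDoc struct {
	Panels []panel `json:"panels"`
}

// checkPanelIDs reports panels that have no id and ids that are used more than once.
func checkPanelIDs(raw []byte) ([]string, error) {
	var doc dashboardDoc
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	var problems []string
	seen := map[int]string{} // id -> title of the first panel that used it
	for _, p := range doc.Panels {
		if p.ID == nil {
			problems = append(problems, fmt.Sprintf("%s %q has no id", p.Type, p.Title))
			continue
		}
		if prev, dup := seen[*p.ID]; dup {
			problems = append(problems, fmt.Sprintf("id %d used by %q and %q", *p.ID, prev, p.Title))
			continue
		}
		seen[*p.ID] = p.Title
	}
	return problems, nil
}

func main() {
	// Illustrative input only: two rows sharing id 0 and one panel without an id.
	raw := []byte(`{"panels":[
		{"id":0,"type":"row","title":"Collector - Ingestion"},
		{"type":"timeseries","title":"Span Ingest Rate"},
		{"id":0,"type":"row","title":"Collector - Export"}]}`)
	problems, err := checkPanelIDs(raw)
	if err != nil {
		panic(err)
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```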
    func buildDashboard() (dashboard.Dashboard, error) {
        builder := dashboard.NewDashboardBuilder("Jaeger (v2)").
            Uid("jaeger-v2").
            Tags([]string{"jaeger"}).
            Editable().
            Refresh("30s").
            Time("now-1h", "now").
            Timezone(common.TimeZoneBrowser).

            // ── Row 1: Collector - Ingestion ───────────────────────────────────────
            WithRow(dashboard.NewRowBuilder("Collector - Ingestion")).
            WithPanel(spanIngestRatePanel()).
            WithPanel(spansRefusedPctPanel()).

            // ── Row 2: Collector - Export ──────────────────────────────────────────
            WithRow(dashboard.NewRowBuilder("Collector - Export")).
            WithPanel(spanExportRatePanel()).
            WithPanel(exportSuccessRatePanel()).

            // ── Row 3: Storage ─────────────────────────────────────────────────────
            WithRow(dashboard.NewRowBuilder("Storage")).
            WithPanel(storageRequestRatePanel()).
            WithPanel(storageLatencyP99Panel()).

            // ── Row 4: Query ───────────────────────────────────────────────────────
            WithRow(dashboard.NewRowBuilder("Query")).
            WithPanel(queryRequestRatePanel()).
            WithPanel(queryLatencyP99Panel()).

            // ── Row 5: System ──────────────────────────────────────────────────────
            WithRow(dashboard.NewRowBuilder("System")).
            WithPanel(cpuUsagePanel()).
            WithPanel(memoryRSSPanel())

        return builder.Build()

The generator currently relies on SDK defaults for panel IDs, which results in dashboard-for-grafana-v2.json having duplicate row IDs and missing panel IDs. Please set explicit, unique IDs for every row and panel (preferably deterministic so diffs stay stable across regenerations) before marshaling the dashboard.
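
One way to keep IDs deterministic is to derive them from the position of each row and panel in a fixed layout list. The sketch below illustrates the idea with the standard library only. The `assignIDs` helper and the `row:` key prefix are hypothetical; the panel keys reuse the helper function names from the generator.

```go
package main

import "fmt"

// assignIDs maps each layout key to a stable 1-based ID taken from its position.
// The same layout order always yields the same IDs, so regenerated JSON diffs stay small.
func assignIDs(layout []string) (map[string]uint32, error) {
	ids := make(map[string]uint32, len(layout))
	for i, key := range layout {
		if _, dup := ids[key]; dup {
			return nil, fmt.Errorf("duplicate layout key %q", key)
		}
		ids[key] = uint32(i + 1)
	}
	return ids, nil
}

func main() {
	// First two rows of the dashboard, in display order.
	layout := []string{
		"row:Collector - Ingestion",
		"spanIngestRatePanel",
		"spansRefusedPctPanel",
		"row:Collector - Export",
		"spanExportRatePanel",
		"exportSuccessRatePanel",
	}
	ids, err := assignIDs(layout)
	if err != nil {
		panic(err)
	}
	for _, key := range layout {
		fmt.Printf("%-28s id=%d\n", key, ids[key])
	}
}
```

The trade-off of a positional scheme is that inserting a panel in the middle shifts every later ID; a hand-maintained table of IDs avoids that at the cost of more bookkeeping.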
Add monitoring/jaeger-mixin/generate/, a standalone Go module that produces dashboard-for-grafana-v2.json using grafana-foundation-sdk/go. All 10 panels are native timeseries (React-based), replacing the deprecated Angular graph panels emitted by grafana-builder/Jsonnet.

Mount the v2 dashboard alongside the existing one in the SPM docker-compose stack for side-by-side comparison before the Jsonnet cutover (Step 2b).

- monitoring/jaeger-mixin/generate/main.go: dashboard definition in Go
- monitoring/jaeger-mixin/generate/go.mod: standalone module (grafana-foundation-sdk/go v0.0.12)
- monitoring/jaeger-mixin/dashboard-for-grafana-v2.json: generated output
- docker-compose/monitor/docker-compose.yml: add second volume mount for v2 dashboard
- .codecov.yml: exclude generate/ (build tool, not production code)

Relates to: jaegertracing#5833

Signed-off-by: abhay1999 <abhaychaurasiya19@gmail.com>
Force-pushed from 4373e6e to ba58120.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff (main vs. #8216):

| Metric | main | #8216 |
|---|---|---|
| Coverage | 95.61% | 95.61% |
| Files | 319 | 319 |
| Lines | 16793 | 16793 |
| Hits | 16056 | 16056 |
| Misses | 582 | 582 |
| Partials | 155 | 155 |

Please address all comments.

Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
        return timeseries.NewPanelBuilder().
            Id(id).
            Title(title).
            Span(12).
            Height(8).

The shared panel helpers don’t set a lower y-axis bound or null-as-zero handling. The v1 dashboard sets min: 0 and nullPointMode: "null as zero" on all panels; without similar settings, the v2 visuals can diverge (negative axes, gaps instead of zeros). Consider adding Min(0) (or the SDK equivalent) and configuring null handling in these helper builders so all panels inherit it.
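
To make the "null as zero" behavior concrete, the sketch below shows what that setting amounts to for a single series: missing samples are drawn as 0 instead of leaving a gap. It is a standalone illustration using only the standard library, the `nullAsZero` name is hypothetical, and it says nothing about which Foundation SDK option provides the equivalent.

```go
package main

import "fmt"

// nullAsZero returns a copy of samples in which missing points (nil) become 0,
// mirroring the effect of the v1 setting nullPointMode: "null as zero".
func nullAsZero(samples []*float64) []float64 {
	out := make([]float64, len(samples))
	for i, s := range samples {
		if s != nil {
			out[i] = *s
		}
		// nil samples keep the zero value already stored in out[i].
	}
	return out
}

func main() {
	val := func(v float64) *float64 { return &v }
	// A rate series with one scrape gap in the middle.
	series := []*float64{val(12.5), nil, val(9)}
	fmt.Println(nullAsZero(series))
}
```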
    }

    func spansRefusedPctPanel() *timeseries.PanelBuilder {
        return stackedPanel(3, "% Spans Refused").

spansRefusedPctPanel uses stackedPanel(...), enabling stacking for a percent-of-total visualization. Stacking percentages across series can exceed 100% and is usually misleading; it also contradicts the stackedPanel doc comment (“Use for rate/count panels”). Consider switching this to timeseriesPanel(...) (no stacking) and, if desired, keep the unit/max settings.
Suggested change:

    -   return stackedPanel(3, "% Spans Refused").
    +   return timeseriesPanel(3, "% Spans Refused").
    }

    func exportSuccessRatePanel() *timeseries.PanelBuilder {
        return stackedPanel(6, "Export Success Rate %").

exportSuccessRatePanel uses stackedPanel(...), which enables stacking for a percentage metric. For success-rate percentages, stacking multiple exporters can produce values >100% and make the panel hard to interpret. Consider using timeseriesPanel(...) here (no stacking) and keep the percent unit/max.
Suggested change:

    -   return stackedPanel(6, "Export Success Rate %").
    +   return timeseriesPanel(6, "Export Success Rate %").
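
The stacking concern raised for the two percentage panels can be checked with simple arithmetic: a stacked chart draws each series on top of the previous one, so the top edge is the sum of all series at that instant. The sketch below uses made-up success rates for two hypothetical exporters; `stackedTop` is an illustrative helper, not generator code.

```go
package main

import "fmt"

// stackedTop returns the height of the top edge of a stacked chart at one
// point in time, which is the sum of every series value at that point.
func stackedTop(values []float64) float64 {
	total := 0.0
	for _, v := range values {
		total += v
	}
	return total
}

func main() {
	// Hypothetical success rates (in percent) for two exporters at the same instant.
	rates := []float64{99.5, 98.0}
	top := stackedTop(rates)
	fmt.Printf("stacked top edge: %.1f%%\n", top)
	if top > 100 {
		fmt.Println("stacking pushes the chart past 100% even though each exporter is below it")
	}
}
```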
        Editable().
        Refresh("30s").
        Time("now-1h", "now").
        Timezone(common.TimeZoneBrowser).
        // Prometheus datasource selector — matches the original Jaeger dashboard

This generator changes dashboard-wide defaults compared to the existing dashboard: refresh is set to 30s (v1 uses 10s) and timezone is set to browser (v1 uses utc). If the goal is a like-for-like translation for side-by-side comparison, consider matching the existing refresh interval/timezone (or document why these defaults intentionally changed).
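
A small comparison of the two JSON files can surface this kind of drift automatically. The sketch below is a standard-library illustration that compares selected top-level keys of two dashboard documents. The `diffDefaults` helper is hypothetical, and the inline JSON stands in for the real v1 and v2 files.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// diffDefaults compares the given top-level keys of two dashboard JSON
// documents and returns one line per key whose values differ.
func diffDefaults(v1, v2 []byte, keys []string) ([]string, error) {
	var a, b map[string]any
	if err := json.Unmarshal(v1, &a); err != nil {
		return nil, fmt.Errorf("v1: %w", err)
	}
	if err := json.Unmarshal(v2, &b); err != nil {
		return nil, fmt.Errorf("v2: %w", err)
	}
	var diffs []string
	for _, k := range keys {
		// Compare printed forms so that strings, booleans, and missing keys all work.
		if fmt.Sprint(a[k]) != fmt.Sprint(b[k]) {
			diffs = append(diffs, fmt.Sprintf("%s: v1=%v v2=%v", k, a[k], b[k]))
		}
	}
	return diffs, nil
}

func main() {
	// Values taken from the review comment: v1 uses 10s/utc, v2 uses 30s/browser.
	v1 := []byte(`{"refresh":"10s","timezone":"utc"}`)
	v2 := []byte(`{"refresh":"30s","timezone":"browser"}`)
	diffs, err := diffDefaults(v1, v2, []string{"refresh", "timezone"})
	if err != nil {
		panic(err)
	}
	for _, d := range diffs {
		fmt.Println(d)
	}
}
```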
    func promTarget(expr, legend string) *prometheus.DataqueryBuilder {
        return prometheus.NewDataqueryBuilder().
            Expr(expr).
            LegendFormat(legend)

The dashboard defines a datasource templating variable, but the Prometheus targets/panels produced by this generator don’t appear to bind to it (the generated v2 JSON has no datasource field on panels/targets, unlike v1 which sets "datasource": "$datasource"). This likely means changing the Data Source variable in Grafana won’t affect these panels. Consider setting the datasource reference on each panel (or on each Prometheus query target, depending on the Foundation SDK API) to use the ${datasource} variable.
Suggested change:

    -   LegendFormat(legend)
    +   LegendFormat(legend).
    +   Datasource("${datasource}")
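
Whichever SDK call ends up setting the reference, the result can be verified on the generated JSON. The sketch below, standard library only, lists non-row panels whose `datasource` field does not mention the template variable. It assumes a flat top-level `panels` array and accepts both the legacy string form (`"$datasource"`) and the object form with a `uid`; the `unboundPanels` helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// panelRef keeps the datasource field raw because it may be a string or an object.
type panelRef struct {
	Title      string          `json:"title"`
	Type       string          `json:"type"`
	Datasource json.RawMessage `json:"datasource"`
}

// unboundPanels returns the titles of non-row panels whose datasource field
// is absent or does not reference the named template variable.
func unboundPanels(raw []byte, varName string) ([]string, error) {
	var doc struct {
		Panels []panelRef `json:"panels"`
	}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	var unbound []string
	for _, p := range doc.Panels {
		if p.Type == "row" {
			continue // rows do not query a datasource
		}
		ds := string(p.Datasource)
		if !strings.Contains(ds, "$"+varName) && !strings.Contains(ds, "${"+varName+"}") {
			unbound = append(unbound, p.Title)
		}
	}
	return unbound, nil
}

func main() {
	// Illustrative input: one bound panel, one panel with no datasource field.
	raw := []byte(`{"panels":[
		{"type":"row","title":"Collector - Ingestion"},
		{"type":"timeseries","title":"Span Ingest Rate",
		 "datasource":{"type":"prometheus","uid":"${datasource}"}},
		{"type":"timeseries","title":"% Spans Refused"}]}`)
	unbound, err := unboundPanels(raw, "datasource")
	if err != nil {
		panic(err)
	}
	fmt.Println("panels not bound to the variable:", unbound)
}
```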
- Add Prometheus datasource template variable so the dashboard exposes a datasource selector matching the original Jaeger dashboard
- Assign unique stable IDs (1-15) to all rows and panels; previously rows had id=0 and timeseries panels had no id field
- Fix stacking: P99 latency panels (Storage, Query) and single-metric panels (CPU Usage, Memory RSS) no longer use stacking mode, because stacking percentile or single-series data produces misleading visualisations
- Regenerate dashboard-for-grafana-v2.json from updated generator

Relates to: jaegertracing#5833

Signed-off-by: abhay1999 <abhaychaurasiya19@gmail.com>
Force-pushed from b5f43d1 to cf0b273.
@yurishkuro DCO failure is fixed (missing

Please keep addressing/responding to bot comments.
What this PR does

Implements ADR-007 Step 2a: introduces a standalone Go generator (`monitoring/jaeger-mixin/generate/`) that uses `grafana-foundation-sdk/go` to produce `dashboard-for-grafana-v2.json`.

All 10 panels are now native `timeseries` (React-based), replacing the deprecated Angular `graph` panels emitted by the existing `grafana-builder`/Jsonnet toolchain.

Changes

| File | Description |
|---|---|
| `monitoring/jaeger-mixin/generate/main.go` | Dashboard definition in Go |
| `monitoring/jaeger-mixin/generate/go.mod` | Standalone module (`grafana-foundation-sdk/go v0.0.12`) |
| `monitoring/jaeger-mixin/dashboard-for-grafana-v2.json` | Generated output |
| `docker-compose/monitor/docker-compose.yml` | Add second volume mount for v2 dashboard |
| `.codecov.yml` | Exclude `generate/` (build tool, not production code) |

Panels translated (Jsonnet → Go SDK)

How to regenerate

Test plan

Both dashboards mounted side-by-side via `docker-compose/monitor/`:

    docker compose -f docker-compose/monitor/docker-compose.yml up
    # open http://localhost:3000

Screenshots from live validation against microsim traffic below.

Old dashboard (Jsonnet / Angular `graph` panels):



New dashboard (Go SDK / native `timeseries` panels):



Same queries, same data; zero Angular panels.

Relates to: #5833
ADR: docs/adr/007-grafana-dashboards-modernization.md