fix: Initialize metrics on startup to fix restart gap; ProbeTx can be disabled #6
Pull request overview
Ensures flashblocks stream metrics are present immediately after startup/restart by initializing metrics for all configured streams and adding a stream connectivity gauge.
Changes:
- Initialize flashblocks stream metrics to `0` for all configured streams during startup.
- Add `flashblocks_stream_up` gauge and record `1`/`0` on successful connect / failures.
- Standardize metric labels across flashblocks stream metrics (e.g., `stream_type`, `network_id`).
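The connect/failure recording described in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: `upGauge`, `streamKey`, `connectAndRecord`, and `errDialFailed` are hypothetical names, and the gauge is an in-memory stand-in rather than a real metrics instrument.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// streamKey is a hypothetical label set for one stream.
type streamKey struct {
	Stream     string
	StreamType string // "private" or "public"
}

// upGauge is an in-memory stand-in for the flashblocks_stream_up gauge.
type upGauge struct {
	mu     sync.Mutex
	values map[streamKey]int64
}

func newUpGauge() *upGauge {
	return &upGauge{values: make(map[streamKey]int64)}
}

// Record overwrites the current value for the given label set.
func (g *upGauge) Record(k streamKey, v int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.values[k] = v
}

// Value returns the last recorded value and whether the series exists.
func (g *upGauge) Value(k streamKey) (int64, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	v, ok := g.values[k]
	return v, ok
}

// errDialFailed simulates a failed connection attempt (assumption for the demo).
var errDialFailed = errors.New("dial failed")

// connectAndRecord calls dial and records 1 on success or 0 on failure.
func connectAndRecord(g *upGauge, k streamKey, dial func() error) error {
	if err := dial(); err != nil {
		g.Record(k, 0)
		return err
	}
	g.Record(k, 1)
	return nil
}

func main() {
	g := newUpGauge()
	k := streamKey{Stream: "example", StreamType: "private"}

	_ = connectAndRecord(g, k, func() error { return nil })
	v, _ := g.Value(k)
	fmt.Println("after successful connect:", v)

	_ = connectAndRecord(g, k, func() error { return errDialFailed })
	v, _ = g.Value(k)
	fmt.Println("after failed connect:", v)
}
```

Note that a series only exists here once something has been recorded for it, which is the gap that startup initialization is meant to close.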
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| server/l2/flashblocks_monitor.go | Initializes per-stream metrics on startup; adds stream_type labeling and records flashblocks_stream_up on connect/failure paths. |
| metrics/metrics.go | Registers the new flashblocks_stream_up gauge instrument. |
| metrics/exports.go | Exports the new gauge and wires its setup into metric initialization. |
      processingContext, cancel := context.WithCancel(processingContext)
      fm.stop = cancel

    + fm.initializeMetricsFlashblocksStreams(ctx)
    +
      for stream, url := range fm.cfg.privateStreams {
    -     fm.readStream(ctx, stream, url, fm.flashblocksPrivate)
    +     fm.readStream(ctx, stream, url, "private", fm.flashblocksPrivate)
      }
      for stream, url := range fm.cfg.publicStreams {
    -     fm.readStream(ctx, stream, url, fm.flashblocksPublic)
    +     fm.readStream(ctx, stream, url, "public", fm.flashblocksPublic)
This code sets fm.stop to cancel processingContext, but initializeMetricsFlashblocksStreams and the stream goroutines are started with ctx instead of processingContext. If fm.stop() is called, processingContext will be canceled but the streams may keep running (and continue recording metrics) because they’re not observing the canceled context. Pass processingContext to initializeMetricsFlashblocksStreams(...) and to readStream(...) so shutdown behaves as intended.
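To illustrate the point of this comment, here is a minimal sketch of the intended shutdown behavior, using only the standard library. The `monitor` type, `start`, `stopAndWait`, and `demoShutdown` are hypothetical stand-ins and not the project's actual code; the only thing taken from the comment is the idea that workers must receive the context that `stop` cancels.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// monitor is a hypothetical stand-in for the flashblocks monitor.
type monitor struct {
	stop context.CancelFunc
	wg   sync.WaitGroup
}

// start derives a cancellable context and hands that derived context
// (not the parent) to every worker, so that stop() reaches all of them.
func (m *monitor) start(parent context.Context, streams []string) {
	processingContext, cancel := context.WithCancel(parent)
	m.stop = cancel
	for _, s := range streams {
		m.wg.Add(1)
		go func(name string) {
			defer m.wg.Done()
			readStream(processingContext, name)
		}(s)
	}
}

// readStream simulates a stream reader that runs until its context is cancelled.
func readStream(ctx context.Context, name string) {
	<-ctx.Done()
}

// stopAndWait cancels the workers and reports whether they all exited
// before the timeout elapsed.
func (m *monitor) stopAndWait(timeout time.Duration) bool {
	m.stop()
	done := make(chan struct{})
	go func() {
		m.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// demoShutdown starts two workers and stops them again.
func demoShutdown() bool {
	m := &monitor{}
	m.start(context.Background(), []string{"private-a", "public-a"})
	return m.stopAndWait(time.Second)
}

func main() {
	fmt.Println("all workers stopped:", demoShutdown())
}
```

If `readStream` were instead given `parent`, cancelling the derived context would have no effect on the workers and `stopAndWait` would time out, which is the failure mode the comment describes.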
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Overview
- Add `Enabled` bool to `ProbeTx` config: probes now require both `enabled: true` and a private key to activate
Fixes
- Add `stream_type` label to `flashblocks_receive_failure_count`
Why metric changes
After service restarts, metrics disappear until their first events occur, creating gaps in observability and preventing alerts from firing despite `chain-monitor` being healthy. This PR fixes that by initializing all synchronous metrics to 0 at startup for all known label combinations.
New metric
- Add `FlashblocksStreamUp` gauge to track stream connectivity (1=up, 0=down)
Metric zero-initialization at startup
- `FlashblocksMonitor.initializeMetricsFlashblocksStreams()`: initializes per-stream metrics for all configured private/public streams:
  - `flashblocks_stream_up`
  - `flashblocks_receive_success_count`
  - `flashblocks_receive_failure_count`
- `BlockInspector.initializeMetrics()`: initializes block inspector metrics:
  - `blocks_seen_count`, `blocks_landed_count`, `blocks_missed_count`, `block_missed` (always)
  - `flashblocks_landed_count`, `flashblocks_missed_count` (when flashblock number contract configured)
  - `flashtestations_landed_count`, `flashtestations_missed_count`, `workload_added_to_policy_error_count` (when builder policy contract configured)
  - `registered_flashtestations_error_count` (when flashtestation registry contract configured)
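The zero-initialization idea for the per-stream metrics can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `registry`, `seriesKey`, `initializeStreamMetrics`, `demoInit`, the stream names, and the URLs are all hypothetical, and the in-memory registry treats the gauge and the counters the same way for simplicity.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// seriesKey identifies one metric series (metric name plus hypothetical labels).
type seriesKey struct {
	Metric     string
	Stream     string
	StreamType string
}

// registry is an in-memory stand-in for a metrics backend.
type registry struct {
	mu     sync.Mutex
	values map[seriesKey]int64
}

func newRegistry() *registry {
	return &registry{values: make(map[seriesKey]int64)}
}

// Add creates the series if needed and adds delta to it.
// Add(k, 0) therefore makes a series visible without changing its value.
func (r *registry) Add(k seriesKey, delta int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.values[k] += delta
}

// Lines renders every series in sorted order.
func (r *registry) Lines() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make([]string, 0, len(r.values))
	for k, v := range r.values {
		out = append(out, fmt.Sprintf("%s{stream=%q,stream_type=%q} %d",
			k.Metric, k.Stream, k.StreamType, v))
	}
	sort.Strings(out)
	return out
}

// streamMetrics lists the per-stream metric names from the PR description.
var streamMetrics = []string{
	"flashblocks_stream_up",
	"flashblocks_receive_success_count",
	"flashblocks_receive_failure_count",
}

// initializeStreamMetrics touches every metric for every configured stream
// so that each label combination exists at 0 right after startup.
func initializeStreamMetrics(r *registry, private, public map[string]string) {
	byType := map[string]map[string]string{"private": private, "public": public}
	for streamType, streams := range byType {
		for stream := range streams {
			for _, metric := range streamMetrics {
				r.Add(seriesKey{Metric: metric, Stream: stream, StreamType: streamType}, 0)
			}
		}
	}
}

// demoInit builds a registry for one private and one public stream.
func demoInit() *registry {
	r := newRegistry()
	initializeStreamMetrics(r,
		map[string]string{"builder-a": "wss://private.example"},
		map[string]string{"builder-b": "wss://public.example"},
	)
	return r
}

func main() {
	for _, line := range demoInit().Lines() {
		fmt.Println(line)
	}
}
```

Because initialization adds 0 rather than overwriting, calling it again later would not reset counters that have already advanced.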