core, eth, miner: add metrics to track block creation and write pipeline latency#2090
Conversation
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (70.45%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## v2.6.2-candidate #2090 +/- ##
===================================================
Coverage ? 49.89%
===================================================
Files ? 877
Lines ? 151448
Branches ? 0
===================================================
Hits ? 75565
Misses ? 70813
Partials ? 5070
🚀 New features to boost your workflow:
|
manav2401
left a comment
There was a problem hiding this comment.
LGTM except a small nit and a lint issue.
…s and fixed lint and addressed comments
|
@claude - review this PR |
|
Claude finished @pratikspatil024's task in 1m 21s —— View job PR #2090 ReviewSummary: This PR adds timing metrics to track the critical path between block sealing and broadcasting, which will help diagnose the 10+ second delays observed during stress testing that caused span rotations. Overall Assessment: ✅ LGTMThis is a well-structured PR that adds observability to an important part of the block production pipeline. The metrics are placed at the right points in the code and follow existing patterns in the codebase. Detailed Review1. Metrics Definition (core/blockchain.go:130-139) The new metrics are well-documented with clear comments explaining what each tracks:
Using 2. Witness Timing in writeBlockWithState (core/blockchain.go:2268-2295) The timing instrumentation correctly wraps:
One minor note: The 3. WitnessCollection Timing (core/state/statedb.go:1334-1378, 1443-1450) The timing correctly captures both:
The accumulation pattern ( 4. NewMinedBlockEvent Enhancement (core/events.go:33-37) Adding 5. Seal-to-Broadcast Latency (eth/handler.go:70-74, 794-802) The
6. WriteBlockAndSetHead Timer (miner/worker.go:108-111, 972-974) The 7. Tests Comprehensive unit tests added for each new metric:
Minor Suggestions (Non-blocking)
Codecov NoteThe 1 missing line in Verdict: The PR is ready to merge. The metrics are well-placed, properly documented, and the tests provide good coverage. This will provide valuable observability into block production latency. |
| witnessBytes := witBuf.Bytes() | ||
|
|
||
| writeStart := time.Now() | ||
| log.Debug("Writing witness", "block", block.NumberU64(), "hash", block.Hash(), "header", statedb.Witness().Header()) |
There was a problem hiding this comment.
One idea is that we can add warning logs if the time of any operation is greater than a threshold, e.g. 100ms.
There was a problem hiding this comment.
Done here. Added for witness encode, witness write, batch flush and state commit
|



Description
During stress testing, we observed span rotations caused by 10+ second delays between block sealing and broadcasting. The delay occurs in
WriteBlockAndSetHead(called inresultLoop), which includes witness encoding, batch disk writes, and state commits, but none of these operations had metrics.This PR adds timing metrics across the critical path in
writeBlockWithStateso we can immediately identify the bottleneck next time:chain/witness/encode- witness RLP encodingchain/witness/dbwrite- witness batch insertionchain/witness/collection- trie node collection into witness during IntermediateRootchain/batch/write- blockBatch.Write() flush to disk (DB compaction stalls surface here)chain/state/commit- CommitWithUpdate (pathdb diff layer flushes surface here)worker/writeBlockAndSetHead- total WriteBlockAndSetHead durationeth/seal2broadcast- latency from write completion to broadcast startChanges
Breaking changes
Please complete this section if any breaking changes have been made, otherwise delete it
Nodes audience
In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)
Checklist
Cross repository changes
Testing
Manual tests
Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it
Additional comments
Please post additional comments in this section if you have them, otherwise delete it