core/vm, core, miner: fix interrupt propagation to nested EVM calls #2092
Open
…truct field The block-building interrupt flag was threaded as a function parameter through Call(), Run(), ApplyTransaction(), ApplyMessage(), and execute(). This made it easy to pass nil at nested call sites, which is exactly what happened — DelegateCall, StaticCall, CallCode, Create, and even the CALL opcode all passed nil, silently disabling per-opcode interrupt checks for nested execution. Instead, store the interrupt once on the EVM struct via SetInterrupt() and read it in Run(). This removes the parameter from the entire call chain (29 files) and ensures the interrupt is checked at every opcode regardless of call depth.
Codecov Report
❌ Your patch check has failed because the patch coverage (68.62%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2092     +/-   ##
===========================================
+ Coverage    50.59%   50.63%   +0.03%
===========================================
  Files          875      875
  Lines       151820   151824       +4
===========================================
+ Hits         76815    76872      +57
+ Misses       69929    69879      -50
+ Partials      5076     5073       -3
... and 19 files with indirect coverage changes



Description
Problem
The block-building interrupt flag (interruptBlockBuilding) is not propagated to nested EVM calls. When the interpreter executes opcodes that make nested calls (CALL, DELEGATECALL, STATICCALL, CALLCODE, CREATE), they all pass nil for the interrupt parameter to evm.Run(). This causes Run() to create a dummy new(atomic.Bool) that never fires, making nested contract execution completely uninterruptible.
Since virtually all real transactions involve at least one nested call (proxy patterns, DEX routers, multi-hop swaps), the per-opcode interrupt check is effectively bypassed for the bulk of EVM execution time.
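The failure mode can be sketched in a few lines of standalone Go. This is a simplified illustration, not the actual interpreter code: runFrame, demo, and the step budget (a stand-in for gas) are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// runFrame is a hypothetical stand-in for one interpreter frame. It mirrors
// the pre-fix behaviour described above: a nil interrupt is replaced by a
// fresh flag that nothing else holds a reference to, so it can never fire.
func runFrame(interrupt *atomic.Bool, steps int) int {
	if interrupt == nil {
		interrupt = new(atomic.Bool) // dummy flag, always false
	}
	executed := 0
	for i := 0; i < steps; i++ {
		if interrupt.Load() { // per-opcode interrupt check
			break
		}
		executed++
	}
	return executed
}

// demo runs the same nested frame twice after the timer has "fired": once
// with nil (what the nested call sites passed) and once with the real flag.
func demo() (withNil, withFlag int) {
	var timerFired atomic.Bool
	timerFired.Store(true) // the block-building timer has already fired

	const budget = 1000 // stand-in for the gas available to the nested call
	return runFrame(nil, budget), runFrame(&timerFired, budget)
}

func main() {
	withNil, withFlag := demo()
	fmt.Println("steps executed when the nested frame gets nil:", withNil)
	fmt.Println("steps executed when it gets the real flag:", withFlag)
}
```

With nil, the frame runs through its whole budget even though the real flag is already set; with the real flag it stops before executing a single step.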
Real-world impact
Block 83,527,864 on Polygon PoS mainnet took 5.654s to build instead of ~2s. The interrupt timer fired on schedule (~1.5s after block building started), but a transaction executing a complex nested call (proxy/DEX pattern) ran for an additional ~4.1 seconds because the interrupt couldn't reach the nested execution. The block was announced 3.7 seconds late.
Root cause
The interrupt flag flows through this chain for the top-level call:
But when opcodes make nested calls, they all pass nil.
When interrupt is nil, interpreter.go creates a dummy new(atomic.Bool) that is always false, so the nested execution never checks the real interrupt flag.
Test: proving the bug exists
TestInterruptNestedCall in core/vm/runtime/runtime_test.go demonstrates the issue with two sub-tests:
STATICCALL. On pre-fix code, the interrupt is ignored inside the nested call.
Pre-fix results (commit 029c58680)
To reproduce the bug, check out the test commit before the fix and run:
    git checkout 029c58680
    go test -run TestInterruptNestedCall -v ./core/vm/runtime/
The nested call took 1.12 seconds: it ran until gas exhaustion, completely ignoring the interrupt set at 50ms.
Post-fix results (commit c01a363ec)
    git checkout c01a363ec
    go test -run TestInterruptNestedCall -v ./core/vm/runtime/
The nested call now stops in ~51ms, a 22x improvement, matching the top-level interrupt latency.
Fix
Move interrupt from a function parameter to a field on the EVM struct. This ensures all call depths (top-level, nested, and deeply nested) share the same interrupt flag.
The interrupt parameter is removed from Call(), Run(), and the entire ApplyTransaction chain (29 files, removing parameter threading through ApplyMessage, ApplyMessageNoFeeBurnOrTip, execute(), etc.). The miner sets the interrupt once via evm.SetInterrupt(&w.interruptBlockBuilding) in makeEnv.
Performance validation
An earlier optimization replaced context.Done() with atomic.Bool for the per-opcode interrupt check, yielding a 25x throughput improvement (10.34 ns/op → 0.41 ns/op). We validated that our refactor does not regress this optimization.
Why there should be zero impact
Both pre-fix and post-fix produce identical hot-path code in the interpreter loop:
The only difference is one extra struct field read at function entry (evm.interrupt), which happens once per Run() call and is negligible compared to millions of opcode iterations per transaction.
BenchmarkSimpleLoop comparison (100M gas, 6 runs, 5s benchtime)
Interpretation: The critical loop-100M benchmark (pure opcode loop, most sensitive to per-opcode overhead) shows +0.63%, well within system noise. Most benchmarks fall in the 0.2–1.8% range. The loop2-100M outlier at +7.5% is attributed to thermal throttling: benchmarks were run sequentially (~4 minutes each), and the consistent slight increase across all benchmarks (never a decrease) is a classic sign of CPU thermal effects from back-to-back benchmark suites on Apple Silicon. The per-opcode interrupt check code is literally identical in both versions: the local *atomic.Bool variable and .Load() call produce the same machine code.
Changes
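The core of the change (storing the interrupt on the EVM and reading it once per Run() into a local) can be sketched as below. This is a simplified model under stated assumptions, not the real code: the EVM struct is pared down to the interrupt field, Run takes hypothetical depth and steps arguments, onStep is a hook invented for this example so the interrupt can fire deterministically, and the nil fallback is an assumption.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// EVM is a pared-down, hypothetical stand-in for the real struct.
type EVM struct {
	interrupt *atomic.Bool
	onStep    func() // example-only hook, called after every simulated opcode
}

// SetInterrupt stores the flag once; every frame run on this EVM reads it.
func (evm *EVM) SetInterrupt(interrupt *atomic.Bool) { evm.interrupt = interrupt }

// Run simulates one interpreter frame of up to steps opcodes. Its first
// opcode makes a nested call while depth > 0. There is no interrupt
// parameter, so no call site can pass nil by mistake.
func (evm *EVM) Run(depth, steps int) int {
	interrupt := evm.interrupt // single field read per frame
	if interrupt == nil {
		interrupt = new(atomic.Bool) // assumed fallback when no interrupt is set
	}
	executed := 0
	for i := 0; i < steps; i++ {
		if interrupt.Load() { // same per-opcode check at every depth
			return executed
		}
		executed++
		if evm.onStep != nil {
			evm.onStep()
		}
		if i == 0 && depth > 0 {
			executed += evm.Run(depth-1, steps) // nested frame shares the flag
		}
	}
	return executed
}

// demo fires the interrupt after the fifth opcode overall, which happens
// inside the innermost of three frames, and returns the total executed.
func demo() int {
	var flag atomic.Bool
	total := 0
	evm := &EVM{}
	evm.SetInterrupt(&flag)
	evm.onStep = func() {
		total++
		if total == 5 {
			flag.Store(true)
		}
	}
	return evm.Run(2, 100)
}

func main() {
	fmt.Println("opcodes executed before all frames stopped:", demo())
}
```

In this model the interrupt fires while the innermost frame is running, and all three frames stop at their next check, so only five simulated opcodes execute in total instead of the 300 the frames would otherwise run.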
Breaking changes
No breaking changes