Hi, we’re running the OP Stack with reth as the execution client (dockerized), and we consistently hit the following op-node error:

```
Dropping payload from payload queue because the payload queue is too large
```
This happens simultaneously on two machines (one full SSD, one SSD+HDD), so it does not appear to be disk-type specific.
L1 RPC is shared with other nodes and confirmed healthy.
Setup
- OP Stack
- Execution: reth (docker)
- op-node and reth in separate containers on same host
- Metrics exposed from reth on host port 7301
Observed behavior
During the incident:
- CPU is fully saturated
- op-node continuously drops payloads
- Execution (reth) falls behind derivation
- Both machines enter this state at roughly the same time (similar chain height)
reth metrics show heavy static files activity:

```
reth_static_files_jar_provider_calls_total{segment="receipts",operation="append"} 16397259
reth_static_files_jar_provider_calls_total{segment="transactions",operation="append"} 16397259
reth_static_files_jar_provider_calls_total{segment="headers",operation="append"} 76684
reth_static_files_jar_provider_calls_total{segment="transactions",operation="commit-writer"} 22106
reth_static_files_jar_provider_calls_total{segment="receipts",operation="commit-writer"} 22108
reth_static_files_jar_provider_calls_total{segment="headers",operation="commit-writer"} 22098
reth_consensus_engine_beacon_pipeline_runs 4
```
This strongly suggests reth is in a static files ingestion / consolidation phase (receipts / tx / headers + commit-writer flush).
During this phase, execution ingest throughput drops significantly, while op-node continues deriving payloads, eventually overflowing the payload queue and dropping payloads.
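As a quick sanity check on the numbers above, the append-to-commit ratio per segment can be computed directly from the metrics dump. This is a minimal sketch: the sample lines are copied from the incident, and in practice you would fetch the text from the reth metrics endpoint (port 7301 in our setup) instead.

```python
# Sketch: compute appends-per-commit for one static-files segment from a
# Prometheus text dump. Sample values are copied from the incident above;
# in practice, fetch http://localhost:7301/metrics and parse the same way.
import re

SAMPLE = """\
reth_static_files_jar_provider_calls_total{segment="receipts",operation="append"} 16397259
reth_static_files_jar_provider_calls_total{segment="receipts",operation="commit-writer"} 22108
"""

def appends_per_commit(metrics_text: str, segment: str) -> float:
    """Ratio of append calls to commit-writer flushes for one segment."""
    pattern = re.compile(
        r'segment="%s",operation="(append|commit-writer)"\} (\d+)' % segment
    )
    counts = {op: int(value) for op, value in pattern.findall(metrics_text)}
    return counts["append"] / counts["commit-writer"]

print(f'receipts: {appends_per_commit(SAMPLE, "receipts"):.0f} appends per commit')
# prints: receipts: 742 appends per commit
```

A ratio of roughly 740 appends per commit-writer flush is consistent with bulk ingestion into static files rather than steady per-block writes.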
Conclusion
This appears to be an interaction between:
- reth static files consolidation / pipeline stages
- OP Stack’s continuous payload derivation
When reth enters static files stages, it temporarily cannot keep up with real-time execution payloads, but op-node has no feedback mechanism and continues pushing, leading to queue overflow.
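The failure mode above can be illustrated with a toy bounded-queue model: a producer (derivation) that enqueues unconditionally and a consumer (execution) that stalls for a stretch of ticks. Queue capacity, rates, and the stall window are made up for illustration; this is not op-node code.

```python
# Illustrative model only: a bounded payload queue where the producer
# never slows down and the consumer stalls mid-run, so payloads drop.
from collections import deque

QUEUE_CAP = 10
queue: deque = deque()
dropped = 0

for tick in range(100):
    # Derivation pushes one payload per tick, with no backpressure.
    if len(queue) >= QUEUE_CAP:
        dropped += 1  # "Dropping payload from payload queue ..."
    else:
        queue.append(tick)
    # Execution consumes two payloads per tick, except during a
    # simulated static-files phase (ticks 40-79) when it consumes none.
    if not 40 <= tick < 80:
        for _ in range(2):
            if queue:
                queue.popleft()

print(f"dropped {dropped} payloads")
```

As soon as the consumer stall outlasts the queue capacity, every further derived payload is dropped until the consumer catches back up, which matches the behavior we observe.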
This reproduces even on full SSD systems, so it seems inherent to the pipeline behavior rather than raw disk speed.
Questions
- Is this expected behavior when static files are enabled?
- What is the recommended way to mitigate this (e.g., resource tuning, configuration changes, or backpressure between op-node and the execution client)?