perf(engine): return sorted data from compute_trie_input #19340

yongkangc · 2025-10-28T04:15:25Z

Closes #19249

Eliminates sorting overhead per block by returning TrieInputSorted instead of unsorted TrieInput from compute_trie_input.

Previously, MultiProofConfig::from_input() would call drain_into_sorted() on both nodes and state every block, performing expensive sorting operations.

This PR introduces a TrieInputSorted type that holds sorted data from the start.

Now compute_trie_input sorts once at the end before returning, and MultiProofConfig::from_input becomes a simple Arc wrapper.

This eliminates 2-5ms of sorting overhead per block by returning TrieInputSorted instead of unsorted TrieInput from compute_trie_input. Previously, MultiProofConfig::from_input would call drain_into_sorted() on both nodes and state, performing expensive sorting operations every block. Now compute_trie_input sorts once at the end and returns sorted data, making MultiProofConfig::from_input a simple Arc wrapper.

Changes:
- Add TrieInputSorted type with sorted TrieUpdates and HashedPostState
- Add clear() methods to TrieUpdatesSorted and HashedPostStateSorted
- Update compute_trie_input to return (TrieInputSorted, BlockNumber)
- Update MultiProofConfig::from_input to accept TrieInputSorted
- Update BasicEngineValidator to store Option<TrieInputSorted>

The implementation uses a "build unsorted, sort once" strategy: unsorted HashMap-based structures are used during building for fast extend operations, then sorted once before returning. This eliminates redundant sorting while maintaining performance.

Resolves: #19249
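To illustrate the "build unsorted, sort once" strategy described in this commit message, here is a minimal sketch. The types `UnsortedInput` and `SortedInput`, the function `compute_input`, and the `(u64, u64)` entries are hypothetical stand-ins chosen for brevity; they are not reth's `TrieInput`, `TrieInputSorted`, or `compute_trie_input`.

```rust
use std::collections::HashMap;

/// Hypothetical HashMap-backed builder (a stand-in, not reth's `TrieInput`).
#[derive(Default, Debug)]
pub struct UnsortedInput {
    pub entries: HashMap<u64, u64>,
}

/// Hypothetical sorted result (a stand-in, not reth's `TrieInputSorted`).
#[derive(Default, Debug, PartialEq, Eq)]
pub struct SortedInput {
    pub entries: Vec<(u64, u64)>,
}

impl UnsortedInput {
    /// Cheap while building: a later value overwrites an earlier one for the same key.
    pub fn extend(&mut self, block: impl IntoIterator<Item = (u64, u64)>) {
        self.entries.extend(block);
    }

    /// Sorts exactly once, after every block has been merged.
    pub fn into_sorted(self) -> SortedInput {
        let mut entries: Vec<_> = self.entries.into_iter().collect();
        entries.sort_unstable_by_key(|(key, _)| *key);
        SortedInput { entries }
    }
}

/// Merges all blocks into the unsorted builder, then sorts once before returning.
pub fn compute_input(blocks: &[Vec<(u64, u64)>]) -> SortedInput {
    let mut input = UnsortedInput::default();
    for block in blocks {
        input.extend(block.iter().copied());
    }
    input.into_sorted()
}
```

With this shape, callers receive data that is already sorted, so a consumer only needs to wrap it (for example in an `Arc`) rather than sort it again.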

Updated the handling of trie input in the compute_trie_input function to improve performance and memory efficiency. The changes include:
- Replaced the use of Option<TrieInputSorted> with Option<TrieInput> to allow for better reuse of allocated capacity.
- Introduced a new method, drain_into_sorted, in TrieInput to convert it into TrieInputSorted while retaining HashMap capacity for subsequent operations.
- Adjusted the logic in compute_trie_input to utilize the new method, reducing unnecessary allocations and improving performance during block validations.

These modifications streamline the trie input processing, enhancing overall efficiency in the engine's validation workflow.
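The capacity-retention idea behind drain_into_sorted can be sketched as follows. This is an assumption-laden illustration, not the PR's code: `ReusableInput` and its `(u64, u64)` entries are hypothetical. It relies only on the documented behavior of `HashMap::drain`, which empties the map while keeping its allocated memory for reuse.

```rust
use std::collections::HashMap;

/// Hypothetical reusable buffer (a stand-in, not reth's `TrieInput`).
#[derive(Default, Debug)]
pub struct ReusableInput {
    pub entries: HashMap<u64, u64>,
}

impl ReusableInput {
    /// Moves all entries into a freshly allocated, sorted `Vec`.
    /// `HashMap::drain` leaves the map empty but keeps its allocation,
    /// so the next cycle can refill it without growing from scratch.
    pub fn drain_into_sorted(&mut self) -> Vec<(u64, u64)> {
        let mut sorted: Vec<_> = self.entries.drain().collect();
        sorted.sort_unstable_by_key(|(key, _)| *key);
        sorted
    }
}
```

The trade-off is that the sorted output still needs its own `Vec` allocation on every call; only the `HashMap` side is reused.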

yongkangc · 2025-10-29T05:54:00Z

crates/trie/common/src/input.rs

+/// This type holds sorted versions of trie data structures, which eliminates the need
+/// for expensive sorting operations during multiproof generation.
+#[derive(Default, Debug, Clone)]
+pub struct TrieInputSorted {


we should write some test for this

yongkangc · 2025-10-29T05:54:24Z

crates/trie/common/src/input.rs

+    ///
+    /// The sorted output allocates new `Vec` space, but the original `HashMap` capacity is
+    /// retained for the next cycle.
+    pub fn drain_into_sorted(&mut self) -> TrieInputSorted {


do we really need this?

imo it's not needed

yongkangc · 2025-10-29T05:55:02Z

crates/engine/tree/src/tree/payload_validator.rs

            StateRootStrategy::StateRootTask => {
                // get allocated trie input if it exists
-                let allocated_trie_input = self.trie_input.take();
+                let allocated = self.trie_input.take();


lets not change the name for this

yongkangc · 2025-10-29T05:56:27Z

crates/engine/tree/src/tree/payload_validator.rs

        allocated_trie_input: Option<TrieInput>,
-    ) -> ProviderResult<(TrieInput, BlockNumber)> {
-        // get allocated trie input or use a default trie input
+    ) -> ProviderResult<(TrieInputSorted, TrieInput, BlockNumber)> {


why do we still need TrieInput? cant we remove that?

Definitely, yes, allocated_trie_input should be a TrieInputSorted

yongkangc · 2025-10-29T05:57:37Z

crates/trie/common/src/input.rs

+    /// Append state to the input by reference and extend the prefix sets.
+    pub fn append_ref(&mut self, state: &HashedPostState) {
+        self.prefix_sets.extend(state.construct_prefix_sets());
+        let sorted_state = state.clone().into_sorted();


so basically we are applying sorting here, would that be a concern for performance regression?

if we're not using this method in this PR then I don't think we should add it. This logic made sense on TrieInput when we were not able to revert trie data, but now it's not necessary I don't think

Replaced the existing HashedPostState and TrieUpdates with their sorted counterparts, HashedPostStateSorted and TrieUpdatesSorted, in the ExecutedBlock struct. This change enhances the efficiency of state handling by ensuring that the trie updates and hashed state are maintained in a sorted order, improving performance during block execution and validation.

The previous approach:
- Converts every sorted block back into hash maps (cloning all keys/values once per block) because extend_with_blocks works on the unsorted representation.
- After all that, drain_into_sorted() iterates those hash maps, builds sorted Vecs, and drains the allocations: more cloning and shuffling before we return to the same sorted layout we could have maintained from the start.

So the new loop cuts out the conversion overhead and reduces allocations; the old code was strictly more work for the same end result.
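As a rough illustration of staying in the sorted layout, the sketch below merges a newer sorted block into an existing sorted list in one linear pass, with the newer entry winning on equal keys. The function `extend_sorted` and the `(u64, u64)` entries are hypothetical; the PR's actual loop over `TrieUpdatesSorted` and `HashedPostStateSorted` may look different.

```rust
use std::cmp::Ordering;

/// Merges `newer` into `base`. Both must be sorted by key with unique keys.
/// On equal keys the entry from `newer` replaces the one in `base`.
pub fn extend_sorted(base: &mut Vec<(u64, u64)>, newer: &[(u64, u64)]) {
    let old = std::mem::take(base);
    let mut merged = Vec::with_capacity(old.len() + newer.len());
    let mut left = old.into_iter().peekable();
    let mut right = newer.iter().copied().peekable();

    loop {
        // Decide which side supplies the next entry.
        let order = match (left.peek(), right.peek()) {
            (Some(l), Some(r)) => l.0.cmp(&r.0),
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (None, None) => break,
        };
        match order {
            Ordering::Less => merged.extend(left.next()),
            Ordering::Greater => merged.extend(right.next()),
            Ordering::Equal => {
                // Same key on both sides: drop the older entry, keep the newer one.
                left.next();
                merged.extend(right.next());
            }
        }
    }
    *base = merged;
}
```

No hash map is built and no sort is needed afterwards, which is the overhead the comparison above attributes to the previous approach.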

mediocregopher · 2025-10-29T14:05:40Z

crates/engine/tree/src/tree/payload_validator.rs

                // forms of the state/trie fields.
-                let (trie_input, multiproof_config) = MultiProofConfig::from_input(trie_input);
-                self.trie_input.replace(trie_input);
+                let (mut cleared_sorted_input, multiproof_config) =


The conversion to MultiProofConfig is no longer really necessary; you can use the trie_input fields directly when constructing the OverlayStateProviderFactory

mediocregopher · 2025-10-29T14:07:10Z

crates/engine/tree/src/tree/payload_validator.rs

        allocated_trie_input: Option<TrieInput>,
-    ) -> ProviderResult<(TrieInput, BlockNumber)> {
-        // get allocated trie input or use a default trie input
+    ) -> ProviderResult<(TrieInputSorted, TrieInput, BlockNumber)> {


Definitely, yes, allocated_trie_input should be a TrieInputSorted

mediocregopher · 2025-10-29T14:43:32Z

crates/trie/common/src/input.rs

+    ///
+    /// The sorted output allocates new `Vec` space, but the original `HashMap` capacity is
+    /// retained for the next cycle.
+    pub fn drain_into_sorted(&mut self) -> TrieInputSorted {


imo it's not needed

mediocregopher · 2025-10-29T14:44:46Z

crates/trie/common/src/input.rs

+    /// Append state to the input by reference and extend the prefix sets.
+    pub fn append_ref(&mut self, state: &HashedPostState) {
+        self.prefix_sets.extend(state.construct_prefix_sets());
+        let sorted_state = state.clone().into_sorted();


if we're not using this method in this PR then I don't think we should add it. This logic made sense on TrieInput when we were not able to revert trie data, but now it's not necessary I don't think

yongkangc added the C-perf (a change motivated by improving speed, memory usage or disk footprint) and A-engine (related to the engine implementation) labels Oct 28, 2025

yongkangc requested a review from Rjected as a code owner October 28, 2025 04:15

github-project-automation bot added this to Reth Tracker Oct 28, 2025

yongkangc requested review from fgimenez, mattsse, mediocregopher and shekhirin as code owners October 28, 2025 04:15

github-project-automation bot moved this to Backlog in Reth Tracker Oct 28, 2025

yongkangc marked this pull request as draft October 28, 2025 04:21

yongkangc self-assigned this Oct 28, 2025

yongkangc moved this from Backlog to In Progress in Reth Tracker Oct 28, 2025

yongkangc added 3 commits October 28, 2025 15:41

update call site

3fd4996

fix fmt

9b83edd

yongkangc added the A-trie (related to Merkle Patricia Trie implementation) label Oct 29, 2025

yongkangc commented Oct 29, 2025

yongkangc added 3 commits October 29, 2025 15:21

added sorted for test

4249c66

mediocregopher reviewed Oct 29, 2025


3 participants
