Skip to content

Conversation

eserilev
Copy link
Member

@eserilev eserilev commented Aug 20, 2025

Issue Addressed

#7603

Proposed Changes

Custody backfill sync service

Similar in many ways to the current backfill service. There may be ways to unify the two services. The difficulty there is that the current backfill service tightly couples blocks and their associated blobs/data columns. Any attempts to unify the two services should be left to a separate PR in my opinion.

SyncNeworkContext

SyncNetworkContext manages custody sync data columns by range requests separetly from other sync RPC requests. I think this is a nice separation considering that custody backfill is its own service.

Data column import logic

The import logic verifies KZG committments and that the data columns block root matches the block root in the nodes store before importing columns

New channel to send messages to SyncManager

Now external services can communicate with the SyncManager. In this PR this channel is used to trigger a custody sync. Alternatively we may be able to use the existing mpsc channel that the SyncNetworkContext uses to communicate with the SyncManager. I will spend some time reviewing this.

TODOs

  • Test with tracing changes from Implement tracing spans for data columm RPC requests and responses #7831
  • Make sure were verifying KZG before importing columns to the store
  • Set custody sync status to pending if cgc count changes during foward/backfill sync. Resume custody sync when its ready
  • Restart custody sync if a node shuts down during custody sync
  • Custody backfill metrics
  • Fix a bug where custody sync has issues running again after a succesful custody sync
  • Devnet-5 testing
  • Non-happy path testing
  • Update ValidatorRegistrations.epoch_validator_custody_requirements while custody syncing, or once its completed
  • Resolve TODO comments in the PR
  • Ensure custody sync can be started automatically on CGC change (currently this is disabled)
  • Review copy-pasta inline comments
  • Remove manual custody sync backfill trigerring mechanism
  • Fix notifier speedo logic

Addtional notes

This needs to be throughouly tested before being included in 8.0.0-rc.0.

@eserilev eserilev added the work-in-progress PR is a work-in-progress label Aug 20, 2025
@eserilev eserilev requested a review from jxs as a code owner August 20, 2025 05:01
@eserilev eserilev added syncing das Data Availability Sampling fulu Required for the upcoming Fulu hard fork v8.0.0-rc.0 Q3 2025 release for Fusaka on Holesky labels Aug 20, 2025
@eserilev
Copy link
Member Author

eserilev commented Aug 21, 2025

I've tested one of the non-happy paths
2025-08-21_12-21

ug 21 19:13:37.088 WARN  Some data columns are missing from the batch  epoch: Epoch(69), missing_slots: [Slot(2214), Slot(2213), Slot(2208), Slot(2211), Slot(2209), Slot(2216), Slot(2215), Slot(2212), Slot(2210), Slot(2217)] 
Aug 21 19:13:37.088 WARN  Custody backfill batch processing error       error: MissingDataColumns { missing_slots_and_data_columns: {Slot(2214): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2213): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2208): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2211): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2209): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2216): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2215): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2212): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2210): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}, Slot(2217): {82, 32, 61, 74, 46, 29, 124, 41, 86, 59, 88, 108, 9, 115, 11, 13, 31, 77, 89, 42, 7, 58, 76, 34, 107, 0, 113, 15, 114, 30, 17, 60, 111, 20, 109, 48, 6, 14, 62, 5, 112, 18, 93, 106, 95, 16, 37, 40, 84, 127, 126, 57, 51, 123, 70, 38, 80, 2, 96, 92, 55, 102, 23, 44, 67, 54, 68, 66, 110, 56, 81, 53, 69, 72, 100, 52, 91, 118, 122, 28, 25, 63, 39, 117, 26, 65, 87, 125, 45, 3, 97, 90, 64, 35, 104, 83, 49, 1, 98, 85, 120, 105, 75, 73, 19, 43, 24, 8, 27, 121, 21, 79, 99, 50, 47, 10, 4, 22, 119, 71, 116, 103, 101, 94, 33, 78, 36, 12}} } 

I'm guessing that since the slot is orphaned the import should actually succeed here (nothing would get imported). So i need to fix the error handling here a bit

EDIT: Ive fixed this using forward_block_iter

@eserilev
Copy link
Member Author

eserilev commented Aug 26, 2025

In order to reduce edge cases, we should wait until the cgc epoch reaches finality before begining custody backfill sync

If a node needs to custody backfilll during periods of nonfinality they can simply restart the node and checkpoint sync.

@eserilev eserilev added ready-for-review The code is ready for review and removed work-in-progress PR is a work-in-progress labels Aug 27, 2025
Copy link

mergify bot commented Aug 27, 2025

Some required checks have failed. Could you please take a look @eserilev? 🙏

@mergify mergify bot added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Aug 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
das Data Availability Sampling fulu Required for the upcoming Fulu hard fork syncing v8.0.0-rc.0 Q3 2025 release for Fusaka on Holesky waiting-on-author The reviewer has suggested changes and awaits thier implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant