
Candidate descriptor v3 cumulus changes#10742

Draft
eskimor wants to merge 139 commits into master from rk-cumulus-v3-integration

Conversation

@eskimor
Member

@eskimor eskimor commented Jan 7, 2026

Building on top of #10472

Re-submissions are not yet handled - that is not a goal of this PR. The goal so far was to be able to have e2e tests for #10472. Verifying that we have everything to support re-submissions is necessary though - especially in case we forgot something needed from the relay chain.

Full re-submissions are more involved and I would leave them for a separate PR, so we can get this one merged quickly.

eskimor and others added 30 commits December 17, 2025 16:00
For simplicity, reasoning and efficiency.
+ Refactor/cleanup + some docs.
parents

Remove per-parachain tracking of allowed relay parents in backing
implicit view. Allowed relay parents are now computed per relay parent
(globally) based on scheduling lookahead, rather than per parachain.

Changes:
- Remove `collating_for` parameter and para_id tracking from View
- Update all callers to remove para_id arguments
- Refactor tests with helper functions to reduce ~150 lines of
  duplication
- Add comprehensive documentation to tests explaining expected behavior
- Clarify `paths_via_relay_parent` returns full paths from oldest block
  to leaf

This simplification aligns with the reality that all parachains share
the same allowed relay parent windows at a given relay chain block.
Signed-off-by: Iulian Barbu <iulian.barbu@parity.io>
@iulianbarbu
Contributor

/cmd fmt

github-actions bot and others added 10 commits March 29, 2026 17:01
Signed-off-by: Iulian Barbu <iulian.barbu@parity.io>
@iulianbarbu
Contributor

/cmd fmt

github-actions bot and others added 12 commits March 30, 2026 15:22
Signed-off-by: Iulian Barbu <iulian.barbu@parity.io>
Contributor

@alindima alindima left a comment


Finally managed to root-cause all bugs and get the tests running well with older relay parents.

Here is a hacky implementation of the fixes: a681cae

Besides the review comments I left, here are the most critical issues that were preventing us from using older relay parents:

During the parent search, several things need to change:

  1. we use an ancestry_lookback value equal to the scheduling lookahead. This is not sufficiently large if v3 is enabled. With v3 enabled, we allow building on top of relay parents that are at most max_relay_parent_session_age sessions old. A good fix for this would be to disregard the result of build_relay_parent_ancestry and instead verify, during find_deepest_valid_parent, that the session indices of the candidates' relay parents are not older than max_relay_parent_session_age. We probably also want to limit this ancestry by the configured RelayParentOffset.
  2. add a new parameter: best_hash, the current relay chain best hash that we're picking as a scheduling parent. This new value needs to be used inside find_parent_for_building when calling fetch_included_from_relay_chain and persisted_validation_data. Without this change, we attempt to create parachain forks on already-finalized blocks (which is caught and prevented by the substrate backend: "Potential long-range attack: block not in finalized chain.").
  3. This one was very nasty to discover: find_deepest_valid_parent is currently causing a very large lag in backing candidates after a session change. With v3 candidates, the relay parent can differ from the scheduling parent. Moreover, the relay parent can be from a previous session, but the scheduling parent cannot. Since we don't have re-submissions, we need to stop the parent search on a branch if the scheduling parent of a candidate we're visiting is from a session different from the current one (that of the best relay chain block). This is tricky: at this point we only know the parachain block, not the candidate, and the parachain block does not contain any information about the scheduling parent of the candidate it was included in. We need to temporarily persist the scheduling parent of yet-unincluded candidates on the collator side. Luckily, the candidate receipt is part of the BlockAnnounceData, so we can access this data right when receiving the block (because of pre-async-backing compatibility). I don't see this being propagated to the block import process though, so we may need to either keep track of this state somewhere in the collator or modify some substrate node interfaces to also pass the announcement data during block import.
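A minimal sketch of the branch-cutting rule from point 3, assuming the collator persists the scheduling-parent session of unincluded blocks (the `ParaBlock` struct and `find_deepest_valid_parent` shape here are illustrative stand-ins, not the actual polkadot-sdk types):

```rust
struct ParaBlock {
    number: u32,
    // Session of the candidate's scheduling parent. Assumption: tracked
    // collator-side for yet-unincluded blocks, e.g. from BlockAnnounceData.
    scheduling_parent_session: u32,
    children: Vec<ParaBlock>,
}

/// Depth-first search for the deepest para block we may build on. A branch is
/// cut as soon as a visited candidate's scheduling parent belongs to a session
/// other than the current one, since re-submissions are not handled yet.
fn find_deepest_valid_parent(root: &ParaBlock, current_session: u32) -> u32 {
    let mut best = root.number;
    for child in &root.children {
        if child.scheduling_parent_session != current_session {
            continue; // stop this branch: it would need a re-submission
        }
        best = best.max(find_deepest_valid_parent(child, current_session));
    }
    best
}

fn main() {
    // Chain: 10 -> 11 (session 5) -> 12 (scheduling parent in session 4).
    let tree = ParaBlock {
        number: 10,
        scheduling_parent_session: 5,
        children: vec![ParaBlock {
            number: 11,
            scheduling_parent_session: 5,
            children: vec![ParaBlock {
                number: 12,
                scheduling_parent_session: 4, // pre-session-change candidate
                children: vec![],
            }],
        }],
    };
    // Block 12 is skipped because its scheduling parent is from session 4.
    assert_eq!(find_deepest_valid_parent(&tree, 5), 11);
    println!("deepest valid parent: 11");
}
```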

.await
else {
continue;
};
Contributor


We should also check if v3 is enabled on the relay chain network

let best_hash = para_client.info().best_hash;
let v3_enabled =
para_client.runtime_api().scheduling_v3_enabled(best_hash).unwrap_or(false);
slot_timer.set_v3_enabled(v3_enabled);
Contributor


the relay parent offset needs to be validated against the relay chain constraints:

The relay chain has a new configuration item accessible via a runtime API: max_relay_parent_session_age (available if the v3 node feature is enabled on the relay chain).
By default it will be 0, meaning that the relay chain will only allow relay parents from the current session.

If it is larger than 0, we need to start taking relay parents from previous sessions into account as well (how many sessions depends on the value returned by the runtime API)
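The window this implies can be sketched as follows (the function names are assumptions for illustration, not the actual runtime API surface):

```rust
/// Oldest acceptable relay-parent session. With `max_session_age == 0` (the
/// default), only relay parents from the current session are allowed.
fn oldest_allowed_relay_parent_session(current_session: u32, max_session_age: u32) -> u32 {
    current_session.saturating_sub(max_session_age)
}

/// Collator-side sanity check for a candidate's relay parent: a relay parent
/// reaching back further than the relay chain permits must be rejected.
fn relay_parent_session_ok(
    relay_parent_session: u32,
    current_session: u32,
    max_session_age: u32,
) -> bool {
    relay_parent_session >= oldest_allowed_relay_parent_session(current_session, max_session_age)
}

fn main() {
    // Default (age 0): only the current session is acceptable.
    assert!(relay_parent_session_ok(10, 10, 0));
    assert!(!relay_parent_session_ok(9, 10, 0));
    // With max_relay_parent_session_age = 2, sessions 8..=10 are acceptable.
    assert!(relay_parent_session_ok(8, 10, 2));
    assert!(!relay_parent_session_ok(7, 10, 2));
    println!("session-age checks passed");
}
```

The configured RelayParentOffset would then be validated (or clamped) against this window before being used for block building.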

client: Arc<Client>,
time_offset: Duration,
relay_slot_duration: Duration,
v3_enabled: bool,
Contributor


can't we instead just make the offset 0 if v3 is enabled?

relay_parent_num = %relay_parent_header.number(),
relay_parent_offset,
claim_queue_offset,
v3_enabled,
Contributor


commenting here because I cannot on the untouched code:

relay_proof_request is going to be ignored by the runtime if v3 is enabled, so we can skip supplying it in that case

@@ -550,7 +612,9 @@ where
.await?
Contributor


you need to stop the search if you reach the genesis block (otherwise you will get an error).
you also need to stop the search if you reach a block where this parachain did not even exist on the relay chain. Otherwise, the candidates will have PVDs (PersistedValidationData) that do not exist on the relay chain and you'll build useless candidates
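The two stop conditions can be sketched like this (the helper shapes are hypothetical; in practice the registration check would be a runtime API query at each ancestor):

```rust
struct RelayBlockInfo {
    number: u32,
    para_registered: bool, // would come from a runtime API query in practice
}

/// Walk backwards through `ancestry` (newest first) and collect usable relay
/// parents, stopping at genesis or once the para did not exist on the relay
/// chain (older blocks would yield PVDs that do not exist on chain).
fn usable_relay_parents(ancestry: &[RelayBlockInfo]) -> Vec<u32> {
    let mut out = Vec::new();
    for block in ancestry {
        if !block.para_registered {
            break; // para did not exist here: candidates would be useless
        }
        out.push(block.number);
        if block.number == 0 {
            break; // genesis: no further ancestors to query
        }
    }
    out
}

fn main() {
    let ancestry = vec![
        RelayBlockInfo { number: 5, para_registered: true },
        RelayBlockInfo { number: 4, para_registered: true },
        RelayBlockInfo { number: 3, para_registered: false }, // onboarded at 4
        RelayBlockInfo { number: 2, para_registered: false },
    ];
    assert_eq!(usable_relay_parents(&ancestry), vec![5, 4]);
    println!("stopped before para onboarding, as expected");
}
```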

Contributor


however, if you don't supply RelayParentOffset blocks in the validation function, you will fail block validation. In that case you need to wait until enough relay chain blocks have been built that you have a sufficiently old relay parent.

This is also a zombienet test worth adding (later, once older relay parents work is all merged): a newly-onboarded v3 chain, which is not added at genesis time

Comment on lines +102 to +103
assert_validator_backed_candidates(relay_node, 30).await?;
for i in 3..=5 {
Contributor


why don't we test for all of the validators?


#[rstest]
#[case::zero_offset("async-backing-v3")]
#[case::with_rpo("async-backing-v3-rpo")]
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rpo is very cryptic. I assume it stands for relay parent offset. Also, it's best practice to add some comments on what the test is doing (like you did for the other test)

});

// Experimental collator protocol validators.
(6..10).fold(r, |acc, i| {
Contributor


the number of validators is excessive, considering that you only have 3 cores and 2 validators per core.


if sc_consensus_babe::contains_epoch_change::<RelayBlock>(&relay_header) {
tracing::debug!(target: LOG_TARGET, ?relay_best_block, relay_best_block_number = relay_header.number(), "Relay parent is in previous session.");
tracing::debug!(target: LOG_TARGET, ?relay_best_block, relay_best_block_number = relay_header.number(), "RC tip is a session change block, skipping.");
Contributor


this needs to be skipped if v3 is enabled. We don't care if the relay chain best block is the last one in a session. We can build legitimate candidates on it because these candidates will have relay parents from the previous session
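A minimal sketch of the suggested change (the function name is illustrative): the session-change guard should only apply when scheduling v3 is disabled, because v3 candidates may legitimately carry relay parents from the previous session.

```rust
fn should_skip_relay_tip(is_session_change_block: bool, v3_enabled: bool) -> bool {
    // With v3, a session-change tip is fine: candidates built on it will have
    // relay parents from the previous session, which the runtime accepts.
    is_session_change_block && !v3_enabled
}

fn main() {
    assert!(should_skip_relay_tip(true, false)); // pre-v3 behaviour: skip
    assert!(!should_skip_relay_tip(true, true)); // v3: build anyway
    assert!(!should_skip_relay_tip(false, false)); // not a boundary: build
    println!("guard behaves as discussed");
}
```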

Contributor

@iulianbarbu iulianbarbu Apr 1, 2026


Sure. IIUC, submitting legitimate candidates shouldn't care whether we're close to a session boundary, at the risk of forks. The assumption is that the next authors will soon be able to pick up those blocks and re-submit, in case the seconded/backed off-chain signals don't arrive in time (once the req/resp prospective-parachains protocol can be used), or in case we don't see an inclusion event within the unincluded segment's relay chain blocks as they are imported (this way we avoid forking, by not needing to rebuild that same block).

const RELAY_PARENT_OFFSET: u32 = 0;

#[cfg(any(feature = "async-backing-v3", feature = "elastic-scaling-v3"))]
const SCHEDULING_V3_ENABLED: bool = true;
Contributor


UNINCLUDED_SEGMENT_CAPACITY needs to be a function of RELAY_PARENT_OFFSET as well, if scheduling v3 is enabled
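A sketch of the suggested relationship (the concrete formula and constants here are assumptions for illustration, not the final definition): under v3 the unincluded segment must additionally absorb the para blocks produced while the relay parent lags the scheduling parent by RELAY_PARENT_OFFSET relay blocks.

```rust
const BASE_UNINCLUDED_SEGMENT_CAPACITY: u32 = 3;
const RELAY_PARENT_OFFSET: u32 = 2;
const BLOCK_PROCESSING_VELOCITY: u32 = 1; // para blocks per relay block

const fn unincluded_segment_capacity(v3_enabled: bool) -> u32 {
    if v3_enabled {
        // Extra headroom for the blocks built while the relay parent trails
        // the scheduling parent by RELAY_PARENT_OFFSET relay blocks.
        BASE_UNINCLUDED_SEGMENT_CAPACITY + RELAY_PARENT_OFFSET * BLOCK_PROCESSING_VELOCITY
    } else {
        BASE_UNINCLUDED_SEGMENT_CAPACITY
    }
}

fn main() {
    assert_eq!(unincluded_segment_capacity(false), 3);
    assert_eq!(unincluded_segment_capacity(true), 5);
    println!("capacity grows with the offset under v3");
}
```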

@alindima
Contributor

alindima commented Apr 1, 2026

btw, if you want to test with a large relay parent offset, you need the following fixes on the relay chain: https://github.com/paritytech/polkadot-sdk/commits/alindima/prospective-parachains-older-rps/ (still in draft but working)

Will open a proper PR with the relay chain fixes soon

@iulianbarbu
Contributor

iulianbarbu commented Apr 1, 2026

Leaving here some thoughts to discuss tomorrow:

With v3 enabled, we allow building on top of relay parents that are at most max_relay_parent_session_age sessions old. A good fix for this would be to disregard the result of build_relay_parent_ancestry and instead verify, during find_deepest_valid_parent, that the session indices of the candidates' relay parents are not older than max_relay_parent_session_age. We probably also want to limit this ancestry by the configured RelayParentOffset.

Which ancestry do you think we need to limit by RelayParentOffset? If we drop build_relay_parent_ancestry, find_deepest_valid_parent will attempt a DFS starting from the last included/pending block. The slot-based collator cannot build unincluded chains longer than a certain value, and forks shouldn't increase the width of the graph to a worrying size (we don't have a protection against this right now either). I'd explore the children list for each descendant and update the parent (we'll build against) until the graph is exhausted, with the condition that any children list added should be for a parent whose associated relay parent's session is >= scheduling parent's session - max_relay_parent_session_age (more precisely, the scheduling parent's session for the candidate representing the para block should be >= scheduling parent's session - max_relay_parent_session_age). In practice this shouldn't hit the relay parent offset or scheduling lookahead limits, and if those are hit before we exhaust the graph, I'd say something is misbehaving.

add a new parameter: best_hash, the current relay chain best hash that we're picking as a scheduling parent. This new value needs to be used inside find_parent_for_building when calling fetch_included_from_relay_chain and persisted_validation_data. Without this change, we attempt to create parachain forks on already-finalized blocks (which is caught and prevented by the substrate backend: "Potential long-range attack: block not in finalized chain.").

Can you detail this a bit more with an example, e.g. how you hit it? Especially how a fork on already-finalized blocks could happen when querying the PVD at the relay parent. I mean, yes, if the relay parent is quite old, pointing to a PVD with para heads that have already been finalized, issues can arise, but I'd expect something really weird to have happened to end up in this situation. Para heads getting finalized should also mean the para head advances, the collator sees that via relay chain consensus, and the block-building para best hash should reflect it when building the next block. Re-submission should not happen for a finalized relay parent either - so we'd need to keep a view over announced blocks versus what becomes visible as included later on, when importing relay chain blocks.

With v3 candidates, the relay parent can differ from the scheduling parent. Moreover, the relay parent can be from a previous session, but the scheduling parent cannot. Since we don't have re-submissions, we need to stop the parent search on a branch if the scheduling parent of a candidate we're visiting is from a session different from the current one (that of the best relay chain block). This is tricky: at this point we only know the parachain block, not the candidate, and the parachain block does not contain any information about the scheduling parent of the candidate it was included in. We need to temporarily persist the scheduling parent of yet-unincluded candidates on the collator side. Luckily, the candidate receipt is part of the BlockAnnounceData, so we can access this data right when receiving the block (because of pre-async-backing compatibility). I don't see this being propagated to the block import process though, so we may need to either keep track of this state somewhere in the collator or modify some substrate node interfaces to also pass the announcement data during block import.

Ugh, this sounds nasty. Maybe there is another (req/resp-based) way to find out the scheduling parent session of the parents we're traversing via DFS while building against a certain scheduling parent. We might be able to use the fact that, for a para parent, querying the para client at it gives us the relay parent offset. If we add this to the relay parent number of the para parent (I think we can find its hash in the para block digest, then its header with the relay client, and then the number), we'll find out the scheduling parent number, then be able to fetch its header with the relay client as well, and in the end query the session from the relay client based on header.hash().
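The arithmetic at the core of that lookup can be sketched as follows (names are hypothetical; the surrounding header and session queries are left out):

```rust
/// The scheduling parent's relay chain block number is the para parent's
/// relay parent number plus the relay parent offset the para client reports
/// at that block. Its header (and thus session) can then be fetched from the
/// relay client.
fn scheduling_parent_number(relay_parent_number: u32, relay_parent_offset: u32) -> u32 {
    relay_parent_number + relay_parent_offset
}

fn main() {
    // Para parent built on relay block 100 with an offset of 2: its candidate
    // was scheduled against relay block 102.
    assert_eq!(scheduling_parent_number(100, 2), 102);
    println!("scheduling parent: 102");
}
```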


Labels

A5-run-CI Run CI on draft PR T20-low-latency for issues and PRs related to Low Latency work


9 participants