@lexnv lexnv commented Sep 24, 2025

This PR increases the timeout constants of zombienet-substrate-0003-block-building-warp-sync from 60 seconds to 90 seconds.

During an unrelated PR change, the CI must have been under heavy load:

2025-09-22T14:44:51.407Z zombie::network-node [bob] Current value: 56687 for metric block height, keep trying...

	 Error:  
		 [bob] Timeout(60), "getting desired metric value 56687 within 60 secs".

┌──────────────────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 9/22/2025, 2:44:51 PM        │ ❌ bob: reports block height is greater than 56687 within 60 seconds (60001ms)                     │
└──────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────┘
2025-09-22T14:44:51.745Z zombie::network-node returning for: block_height{status="best"} from ns: substrate

Offhand, the test appears likely to have passed had it been given another 6 seconds before timing out. To err on the safe side, the timeout is bumped by 30s.
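To make the failure mode concrete, here is a minimal sketch of a "metric is greater than X within N seconds" check. It is not zombienet's implementation; `wait_for_metric`, `FakeNode`, and the 66-second readiness time are hypothetical names and numbers chosen only to mirror the log above.

```python
import time
from typing import Callable


def wait_for_metric(
    read_metric: Callable[[], float],
    target: float,
    timeout_secs: float,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until read_metric() exceeds target.

    Returns the elapsed seconds on success and raises TimeoutError once
    timeout_secs have passed without the metric exceeding the target.
    """
    start = clock()
    while True:
        value = read_metric()
        elapsed = clock() - start
        if value > target:
            return elapsed
        if elapsed >= timeout_secs:
            raise TimeoutError(
                f"getting desired metric value {target} within {timeout_secs} secs "
                f"(last value: {value})"
            )
        sleep(poll_interval)


class FakeNode:
    """Hypothetical stand-in for a slow node, driven by a simulated clock."""

    def __init__(self, ready_after_secs: float, target: int) -> None:
        self.now = 0.0
        self.ready_after_secs = ready_after_secs
        self.target = target

    def clock(self) -> float:
        return self.now

    def sleep(self, secs: float) -> None:
        # Advance simulated time instead of actually sleeping.
        self.now += secs

    def block_height(self) -> int:
        # The height passes the target only after the node has caught up.
        return self.target + 1 if self.now >= self.ready_after_secs else self.target
```

With a `FakeNode(ready_after_secs=66, target=56687)`, a 60-second timeout raises `TimeoutError`, while a 90-second timeout returns after 66 simulated seconds. That is the situation the bump is meant to cover, assuming the node really was only a few seconds short.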

It is not entirely clear why these errors started to manifest (probably CI changes), since we had to bump some other Rust tests as well:

cc @paritytech/networking

@lexnv lexnv requested review from lrubasze and a team September 24, 2025 10:44
@lexnv lexnv self-assigned this Sep 24, 2025
@lexnv lexnv added R0-no-crate-publish-required The change does not require any crates to be re-published. T10-tests This PR/Issue is related to tests. labels Sep 24, 2025
skunert commented Sep 24, 2025

> the CI must have been under heavy load

Hmm, but we moved to native runners, right? Are they still so load-dependent?

@pepoviola

This should already be fixed in master (by #9417). I checked, and the bootstrap of the node sometimes takes more than 60 secs (worth noting that this test is in v1 and we should move it to the sdk). This issue is mostly related to bootstrapping the nodes at the same time and the heavy IOPS during the bootstrap process.

@lrubasze

> This should already be fixed in master (by #9417). I checked, and the bootstrap of the node sometimes takes more than 60 secs (worth noting that this test is in v1 and we should move it to the sdk). This issue is mostly related to bootstrapping the nodes at the same time and the heavy IOPS during the bootstrap process.

If this is the case and the above does not help, then we could:

  • change runner from "default" to "large"
  • eventually switch to zombienet-sdk (v2), which includes (since v0.3.13) some nice optimizations to distribute bootstrapping over time (Create a graph for bootstrap network zombienet-sdk#371);
    in that case the default runner should be enough
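
As a rough illustration of what "distributing bootstrapping over time" could look like, the sketch below groups nodes into start-up waves from a dependency graph, so that not every node hits the disk at once. This is only an assumption about the general idea, not a description of how zombienet-sdk#371 implements it; `bootstrap_waves` and every node name other than `bob` are hypothetical.

```python
from typing import Dict, List


def bootstrap_waves(deps: Dict[str, List[str]]) -> List[List[str]]:
    """Group nodes into waves using a level-by-level topological sort.

    deps maps each node name to the nodes that must be running before it
    starts (for example, its bootnodes). Nodes in the same wave have no
    dependency on each other and can be spawned together.
    """
    for node, required in deps.items():
        for dep in required:
            if dep not in deps:
                raise ValueError(f"{node} depends on unknown node {dep}")

    # Count unmet dependencies and record the reverse edges.
    indegree = {node: len(required) for node, required in deps.items()}
    dependents: Dict[str, List[str]] = {node: [] for node in deps}
    for node, required in deps.items():
        for dep in required:
            dependents[dep].append(node)

    waves: List[List[str]] = []
    ready = sorted(node for node, count in indegree.items() if count == 0)
    started = 0
    while ready:
        waves.append(ready)
        started += len(ready)
        next_ready: List[str] = []
        for node in ready:
            for child in dependents[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = sorted(next_ready)

    if started != len(deps):
        raise ValueError("dependency cycle detected")
    return waves
```

For a hypothetical network where `bob` and `charlie` both use `alice` as a bootnode and `dave` syncs from both, this yields three waves: `alice` first, then `bob` and `charlie`, then `dave`. A runner would start each wave only after the previous one reports ready, which spreads the I/O-heavy bootstrap phase over time.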

@pepoviola

Yes, I think we can move the substrate jobs to the sdk. I will track the effort and make it a priority in our team.
Thx!
