
Conversation

Frando
Member

@Frando Frando commented May 5, 2025

Description

Based on #53 and #64

This adds a large integration test that bootstraps a swarm with many nodes and lets each node broadcast some messages.

The test runs with 100 nodes and 2 messages per node.

On my machine I ran it a couple of times with 1000 nodes in release mode. It completes fine, but takes a while in debug mode, so the test is limited to 100 nodes for now to avoid overwhelming CI.
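To give a rough sense of how the workload grows with the node count, here is a minimal sketch of the arithmetic. It assumes that every node is expected to receive every message broadcast by every other node; that assumption and both function names are illustrative and are not taken from the test itself.

```rust
/// Messages a single node should observe, assuming it receives every
/// message broadcast by each of the other nodes (an assumption of this sketch).
fn expected_per_node(nodes: u64, msgs_per_node: u64) -> u64 {
    nodes.saturating_sub(1) * msgs_per_node
}

/// Total deliveries across the whole swarm under the same assumption.
fn expected_deliveries(nodes: u64, msgs_per_node: u64) -> u64 {
    nodes * expected_per_node(nodes, msgs_per_node)
}
```

Under that assumption, 100 nodes with 2 messages each come to 19,800 deliveries, while 1000 nodes come to 1,998,000, which would be consistent with the larger run being much slower in debug mode.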

Breaking Changes

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@Frando Frando changed the base branch from proto-fixes to Frando/feat-net-shutdown May 5, 2025 13:43

github-actions bot commented May 5, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh-gossip/pr/65/docs/iroh_gossip/

Last updated: 2025-05-20T12:06:51Z


github-actions bot commented May 5, 2025

Simulation report
GossipAll-n100-r5 with 4 seeds
        RMR    LDH  missed  duration 
 mean  1.01  10.00    0.00     715ms 
  max  1.02  11.00    0.00     788ms 
  min  1.00   9.00    0.00     490ms 

GossipAll-n20-r30 with 4 seeds
        RMR   LDH  missed  duration 
 mean  1.38  5.03    0.00     394ms 
  max  1.46  7.00    0.00     443ms 
  min  1.12  3.00    0.00     232ms 

GossipMulti-n100-r30 with 4 seeds
        RMR    LDH  missed  duration 
 mean  0.34  10.17    0.00     298ms 
  max  1.50  14.00    0.00     409ms 
  min  0.07   5.00    0.00     147ms 

GossipMulti-n1000-r30 with 4 seeds
        RMR    LDH  missed  duration 
 mean  0.44  20.43    0.00     583ms 
  max  2.06  26.00    0.00     718ms 
  min  0.04  12.00    0.00     319ms 

GossipMulti-n20-r30 with 4 seeds
        RMR   LDH  missed  duration 
 mean  0.27  3.97    0.00     146ms 
  max  1.61  6.00    0.00     264ms 
  min  0.06  2.00    0.00      83ms 

GossipSingle-n100-r30 with 4 seeds
        RMR   LDH  missed  duration 
 mean  0.08  5.00    0.00     147ms 
  max  1.50  5.00    0.00     147ms 
  min  0.01  5.00    0.00     147ms 

GossipSingle-n1000-r30 with 4 seeds
        RMR    LDH  missed  duration 
 mean  0.07  12.00    0.00     319ms 
  max  2.06  12.00    0.00     319ms 
  min  0.00  12.00    0.00     319ms 

GossipSingle-n20-r30 with 4 seeds
        RMR   LDH  missed  duration 
 mean  0.14  3.00    0.00      83ms 
  max  1.44  3.00    0.00      83ms 
  min  0.06  3.00    0.00      83ms 

comparing GossipAll-n100-r5
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipAll-n20-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipMulti-n100-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipMulti-n1000-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipMulti-n20-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipSingle-n100-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipSingle-n1000-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 
comparing GossipSingle-n20-r30
               RMR          LDH       missed     duration 
 mean       +0.00%       +0.00%       +0.00%       +0.00% 
 max        +0.00%       +0.00%       +0.00%       +0.00% 
 min        +0.00%       +0.00%       +0.00%       +0.00% 

Last updated: 2025-05-20T12:07:07Z

@n0bot n0bot bot added this to iroh May 5, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh May 5, 2025
@Frando Frando force-pushed the Frando/feat-net-shutdown branch from d956b94 to 549bec9 Compare May 15, 2025 09:28
@Frando Frando force-pushed the Frando/test-net-big branch from a893707 to 130d5dc Compare May 15, 2025 09:28
@Frando Frando force-pushed the Frando/feat-net-shutdown branch from 549bec9 to b1b5ee7 Compare May 15, 2025 09:44
@Frando Frando force-pushed the Frando/test-net-big branch from 130d5dc to 9fc90bd Compare May 15, 2025 09:44
@Frando Frando force-pushed the Frando/feat-net-shutdown branch from b1b5ee7 to b4abbd4 Compare May 15, 2025 09:47
@Frando Frando force-pushed the Frando/test-net-big branch from 9fc90bd to 9d0a324 Compare May 15, 2025 09:47
@Frando Frando marked this pull request as ready for review May 15, 2025 09:54
@Frando Frando force-pushed the Frando/test-net-big branch from 9d0a324 to cfefa65 Compare May 15, 2025 09:56
Contributor

@divagant-martian divagant-martian left a comment


minor comments otherwise lgtm :) is the ci timeout expected tho?

}

#[tokio::test(flavor = "multi_thread")]
async fn gossip_net_big() -> TestResult {
Contributor


This is a large enough test to merit some docs of its own. Could you first describe the setup (topology, number of nodes, churn, loss, etc.; you know the deal), list the env parameters and what they mean in the context of the test, and then state what the test asserts does or does not happen?

.map(|x| x.parse().unwrap())
.unwrap_or(10000);
let timeout = Duration::from_millis(timeout_ms);
info!("recv timeout: {timeout:?}");
Contributor


Is this supposed to have logs? I don't see anything initializing the logger, and I ran this with RUST_LOG=debug and --nocapture too and got nothing.

Member Author


I re-added the logging. I guess I removed it by accident somewhere in between.
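For readers following the diff context in this thread, which parses a receive timeout in milliseconds with a fallback of 10000, here is a minimal sketch of that pattern using only the standard library. It assumes the value comes from an environment variable. The variable name `GOSSIP_RECV_TIMEOUT_MS` and both helper functions are hypothetical and not taken from the PR. Unlike the `unwrap()` in the quoted snippet, this variant falls back to the default on an unparsable value instead of panicking.

```rust
use std::env;
use std::time::Duration;

/// Interprets an optional raw string as a millisecond count.
/// Falls back to `default_ms` when the value is missing or not a valid u64.
fn parse_timeout(raw: Option<&str>, default_ms: u64) -> Duration {
    let ms = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(default_ms);
    Duration::from_millis(ms)
}

/// Reads the timeout from the environment.
/// `GOSSIP_RECV_TIMEOUT_MS` is a hypothetical name chosen for this sketch.
fn recv_timeout_from_env() -> Duration {
    let raw = env::var("GOSSIP_RECV_TIMEOUT_MS").ok();
    parse_timeout(raw.as_deref(), 10_000)
}
```

Keeping the parsing separate from the environment lookup makes the fallback behavior easy to check without setting process-wide variables in a test.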

@Frando Frando force-pushed the Frando/test-net-big branch from cfefa65 to a80bf1b Compare May 19, 2025 12:13
@Frando Frando force-pushed the Frando/feat-net-shutdown branch from b4abbd4 to 1c521b9 Compare May 19, 2025 12:13
@Frando Frando changed the base branch from Frando/feat-net-shutdown to main May 19, 2025 12:26
Frando added a commit that referenced this pull request May 19, 2025
This adds a large integration test that bootstraps a swarm with many nodes
and lets each node broadcast some messages.

The test runs with 100 nodes and 2 messages per node.

On my machine I ran it a couple of times with 1000 nodes in release mode.
It completes fine, but takes a while in debug mode,
thus limited to 100 nodes for now to not overwhelm CI.
@Frando Frando force-pushed the Frando/test-net-big branch from a80bf1b to 51218d1 Compare May 19, 2025 12:28
@Frando Frando force-pushed the Frando/test-net-big branch from 51218d1 to d1ec336 Compare May 20, 2025 12:04
Member Author

Frando commented May 20, 2025

is the ci timeout expected tho?

It isn't. I will need to dig into why it fails on cross and times out on Windows.

Labels
None yet
Projects
Status: 🏗 In progress

2 participants