sim-rs: Add timestamp quantization and TX batching to sequential engine #824

Open
sandtreader wants to merge 2 commits into prc/limit-tx-backlog from prc/faster-sim-faster
@sandtreader
Contributor

Summary

The sequential DES engine was processing events almost entirely single-threaded during TX-only periods (near 0% CPU on a 32-core machine), only bursting to ~6 cores after EB production. Two root causes:

  1. No timestamp quantization — the sequential engine never applied timestamp-resolution-ms. Every TX arrival and network delivery had a unique timestamp, so the BSP batching processed ~360M events one at a time, never reaching parallel-threshold for rayon.

  2. One TX per event: TxGeneratorCore::generate() produces one TX per call, each scheduled at a unique time. Even with quantization, TX inter-arrival times may span multiple quanta.
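
To illustrate why unique timestamps defeat the parallel path, here is a minimal sketch of a BSP-style step that pops every event sharing the earliest timestamp and then compares the batch size against the threshold. The names `pop_batch`, `choose_dispatch` and `Dispatch` are hypothetical, events are reduced to `(timestamp, id)` pairs, and the `>=` comparison is an assumption; the actual logic in sequential.rs may differ.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// How a batch would be processed (illustrative only).
#[derive(Debug, PartialEq)]
enum Dispatch {
    Sequential,
    Parallel,
}

/// Pop every event that shares the earliest timestamp in the queue.
/// Events are `(timestamp, id)` pairs in a min-heap built with `Reverse`.
fn pop_batch(queue: &mut BinaryHeap<Reverse<(u64, u32)>>) -> Vec<(u64, u32)> {
    let mut batch = Vec::new();
    let Some(Reverse((first_ts, _))) = queue.peek().copied() else {
        return batch; // empty queue: nothing to process
    };
    while let Some(Reverse((ts, _))) = queue.peek().copied() {
        if ts != first_ts {
            break; // next event belongs to a later step
        }
        let Reverse(event) = queue.pop().expect("peeked element exists");
        batch.push(event);
    }
    batch
}

/// Assumption: a batch goes parallel once it reaches the threshold.
fn choose_dispatch(batch_len: usize, parallel_threshold: usize) -> Dispatch {
    if batch_len >= parallel_threshold {
        Dispatch::Parallel
    } else {
        Dispatch::Sequential
    }
}
```

If every event carries a distinct timestamp, `pop_batch` always returns a single event, so no threshold greater than 1 is ever reached.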

Changes

Timestamp quantization in the sequential engine (sim-core/src/sim/sequential.rs):

  • All event timestamps pushed to the priority queue are now quantized via Timestamp::with_resolution() — network deliveries, CPU task completions, timed events, and cross-shard messages.
  • This causes events with nearby timestamps to land in the same time bucket, creating larger batches that trigger rayon parallelism.
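
The bucketing effect described above can be sketched as follows. `quantize` is a hypothetical stand-in for `Timestamp::with_resolution()`, whose real signature and rounding direction are not shown in this PR; rounding down is an assumption here, and `bucket_sizes` exists only for illustration.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

/// Round a timestamp down to a multiple of `resolution`.
/// Hypothetical stand-in for `Timestamp::with_resolution()`.
fn quantize(ts: Duration, resolution: Duration) -> Duration {
    let res = resolution.as_nanos();
    if res == 0 {
        return ts; // zero resolution: leave the timestamp untouched
    }
    let nanos = ts.as_nanos() / res * res;
    Duration::from_nanos(nanos as u64)
}

/// Count how many events land in each time bucket after quantization.
fn bucket_sizes(timestamps: &[Duration], resolution: Duration) -> BTreeMap<Duration, usize> {
    let mut buckets = BTreeMap::new();
    for &ts in timestamps {
        *buckets.entry(quantize(ts, resolution)).or_insert(0) += 1;
    }
    buckets
}
```

With a 1 ns resolution (the 0.000001 ms default), five events at 100, 350, 900, 1200 and 1999 µs stay in five separate buckets; at 1 ms they collapse into two buckets of three and two events.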

TX batch generation (new tx-batch-window-ms config option):

  • When set, the TxGeneration handler generates all TXs whose next scheduled time falls within the configured window, instead of one per event.
  • Independent of timestamp-resolution-ms — works by comparing against an absolute window rather than quantized timestamps.
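
A minimal sketch of the batching idea, under stated assumptions: `TxGen` is a hypothetical generator with a fixed inter-arrival gap (the real `TxGeneratorCore` is not shown in this PR), times are in microseconds, and the window is assumed to start at the handler's event time.

```rust
/// Hypothetical TX generator with a fixed inter-arrival gap (microseconds).
struct TxGen {
    next_time_us: u64,
    gap_us: u64,
    next_id: u64,
}

impl TxGen {
    /// Produce one TX as `(id, scheduled_time_us)` and advance the schedule.
    fn generate(&mut self) -> (u64, u64) {
        let tx = (self.next_id, self.next_time_us);
        self.next_id += 1;
        self.next_time_us += self.gap_us;
        tx
    }

    /// Produce the due TX plus every TX scheduled before `now_us + window_us`.
    /// With `window_us == None` this yields one TX per event, as before.
    fn generate_batch(&mut self, now_us: u64, window_us: Option<u64>) -> Vec<(u64, u64)> {
        let mut batch = vec![self.generate()];
        if let Some(window) = window_us {
            let end = now_us + window;
            while self.next_time_us < end {
                batch.push(self.generate());
            }
        }
        batch
    }
}
```

For example, with a 3 ms gap and a 10 ms window, a single handler invocation at time 0 yields the TXs scheduled at 0, 3, 6 and 9 ms, and the next event is due at 12 ms.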

turbo.yaml tuning:

  • timestamp-resolution-ms: 1.0 (was effectively unused at 0.000001)
  • parallel-threshold: 4 (was 10)
  • tx-batch-window-ms: 10.0 (new)

Performance impact

Tested on a 32-core machine with a 1500-slot CIP simulation:

  • Before: ~90 min wall clock, 0.2% CPU during TX-only periods
  • After: ~12 min wall clock, 70% sustained CPU utilization
  • ~7.5x speedup

Backward compatibility

All three optimizations are opt-in via turbo.yaml. Without it, defaults match previous behavior exactly:

  • timestamp-resolution-ms: 0.000001 (no effective quantization)
  • parallel-threshold: 10 (unchanged)
  • tx-batch-window-ms: null (disabled)
  • Default engine is actor, not sequential

Test plan

  • All 41 existing tests pass
  • Clippy clean
  • Verified default parameters (without turbo.yaml) match previous behavior
  • Benchmarked with turbo.yaml on 1500-slot CIP simulation

🤖 Generated with Claude Code

The sequential engine was never quantizing timestamps, so each event had a
unique timestamp and rayon parallelism never kicked in during TX-only periods.

Two independent optimizations, each configurable separately:
- Timestamp quantization: now applied to all events in the sequential engine,
  controlled by existing timestamp-resolution-ms (turbo.yaml sets 1.0ms)
- TX batch generation: new tx-batch-window-ms config option batches all TX
  generation events within a time window into one timestamp (turbo.yaml sets
  10ms). Independent of timestamp resolution.

Also lowers parallel-threshold from 10 to 4 in turbo.yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sandtreader changed the title from "Add timestamp quantization and TX batching to sequential engine" to "sim-rs: Add timestamp quantization and TX batching to sequential engine" on Mar 19, 2026
@sandtreader requested a review from ch1bo on March 19, 2026 13:37