Perf/ibd speedups #277

Merged
blondfrogs merged 6 commits into master from perf/ibd-speedups
Apr 22, 2026
Conversation

@blondfrogs blondfrogs commented Apr 21, 2026

Summary

Six focused changes to reduce IBD time and clean up the BIP 152 compact block relay path. Validated against a localhost sync run: full-cache flush stalls eliminated and no behavior regressions observed.

Commits

1. Silence per-block LogPrintf calls on sync hot path

Four unconditional LogPrintf calls were firing for every PON header and every block connect/disconnect during IBD. Converted to LogPrint with categories so they only fire under -debug=pon or -debug=fluxnode.

  • src/pon/pon.cpp — PON difficulty/adjustment traces (2× per PON header) → LogPrint("pon", ...)
  • src/fluxnode/fluxnode.cpp — FluxnodeCache::LogDebugData gains an early return gated on LogAcceptCategory("fluxnode"), so the per-block string concatenation over mapStartTxTracker / mapStartTxDOSTracker is skipped when fluxnode debugging is off

2. Adopt Bitcoin Core's dbcache split to favor the in-memory UTXO set

Fluxd used to allocate 3/4 of -dbcache to the block tree DB whenever -insightexplorer was enabled, leaving only ~76 MiB for the in-memory UTXO cache at the default -dbcache=450. During IBD this caused frequent full-cache flushes (multi-second stalls every ~4k blocks).

Switched to Bitcoin Core's allocation:

  • Block tree DB gets 1/8 of the total, capped at 2 MiB without txindex or 1024 MiB with txindex / insightexplorer
  • Coin DB cache is capped at 8 MiB
  • Remainder goes to the in-memory UTXO set

Result at -dbcache=450 with all indexes on: UTXO cache grows from ~76 MiB → ~386 MiB. At -dbcache=1000 it reaches 867 MiB; at -dbcache=2000 it's ~1.7 GiB. Verified zero flush stalls in a 10k-block IBD window vs one 8-second stall every ~4k blocks previously.

3. Implement BIP 152 low-bandwidth mode (peer-initiated compact blocks)

#266 shipped high-bandwidth compact block relay (unsolicited cmpctblock messages to up to three peers). This PR adds low-bandwidth mode, which lets a peer request a compact block on demand via getdata, saving bandwidth for edge peers (light clients, metered connections) that haven't opted into high-bandwidth announcements.

  • Adds MSG_CMPCT_BLOCK as a new inv type (value 11, not BIP 152's canonical 4 because type 4 is "spork" on the Flux network)
  • ProcessGetData responds to MSG_CMPCT_BLOCK requests with a cmpctblock message when the block is within depth
  • IsFluxnodeType() tightened to an explicit [4, 10] range so the new type isn't misrouted

Old fluxd peers that don't know type 11 treat it as unknown and silently ignore — no misbehavior, safe to deploy without coordination.

4. Early-return CheckForExpiredStartTx / CheckForUndoExpiredStartTx when trackers are empty

Both functions are called on every ConnectBlock / DisconnectBlock. For most of chain history the relevant trackers are empty, so the iterate-and-conditionally-copy loops do no work but still pay for the cs_main-adjacent lock acquisition. The emptiness check runs under the lock, so it is safe against concurrent modifications.

5. Cap debug.log size and enforce it at runtime

Previously ShrinkDebugFile() only ran at startup and only when -debug was not set (-shrinkdebugfile default was !fDebug). With debug=1 the log could grow without bound.

  • -maxdebugfilesize upper cap is now 10 GiB when -debug is enabled and 2 GiB otherwise. The 500 MB default is unchanged.
  • -shrinkdebugfile default flipped to true so the cap is enforced even when debug logging is on.
  • ShrinkDebugFile() made safe to call at runtime via temp-file + atomic rename + fReopenDebugLog.
  • Scheduled every 5 minutes via the existing CScheduler.

6. Harden BIP 152 compact block handling

Two fixes identified by an audit against Bitcoin Core's implementation:

a. Clean up per-peer compact-block in-flight state on disconnect. FinalizeNode previously iterated mapPartialBlocks but did nothing (a TODO comment acknowledged the gap). Peer-keyed cleanup is actually available via listCompactBlocksInFlight. Walk it, drop entries for the disconnecting peer, and drop the corresponding mapPartialBlocks / mapPartialBlocksTime entries only when no other peer is still working on the same block. Prevents a slow memory leak when peers churn mid-reconstruction.

b. Unify the compact-block depth constant as MAX_CMPCTBLOCK_DEPTH, shared by the cmpctblock reception check and the new low-bandwidth MSG_CMPCT_BLOCK getdata response. Value is 100, sized to roughly Bitcoin Core's ~50-minute wall-clock window at fluxd's 30-second PON block spacing (Bitcoin uses 5 at 10-minute blocks). The prior reception value of 10 covered only ~5 minutes on PON — too narrow for typical mempool turnover.

Config / operator impact

  • No config changes required.
  • Users keeping the default -dbcache=450 will see their in-memory UTXO cache jump from ~76 MiB → ~386 MiB on restart.
  • BIP 152 changes are backwards-compatible — old peers silently ignore the new inv type.

Test plan

  • Full build under GCC 13.3 — clean
  • Localhost IBD against a known-good peer: height 141k → 272k over several minutes, no crashes, no flush stalls, cache behavior matches expected split
  • Mainnet dry-run on a test node for longer-term stability
  • Stress test: high peer churn to validate the in-flight cleanup path

Related

  • Builds on #244, "Write to disk on next block after tip disconnected" (the write-after-disconnect flush), which is already on master.
  • Extends #266, "Compact Headers & Compact Blocks - BIP 152 Implementation" (high-bandwidth mode), with the low-bandwidth and hardening follow-ups.
  • Noted follow-ups not in this PR:
    • Pre-validation HB announcement timing (bigger restructure of announce vs ConnectBlock ordering)
    • Client-side MSG_CMPCT_BLOCK request path when peers advertise LB mode
    • Periodic pblocktree compaction (separate change — the block index LevelDB bloats to ~14 GiB without manual compaction during heavy IBD)

Commit messages

Commit 1: Silence per-block LogPrintf calls on sync hot path

Four unconditional LogPrintf calls were firing on every PON header and
every block connect/disconnect during IBD, dominating debug.log volume
and measurably slowing sync. Convert them to LogPrint with a category
so they only fire under -debug=pon or -debug=fluxnode.

- src/pon/pon.cpp: PON difficulty adjustment traces (2x per PON header)
- src/fluxnode/fluxnode.cpp: FluxnodeCache::LogDebugData, plus an
  early-return so the per-block string concatenation over
  mapStartTxTracker / mapStartTxDOSTracker is skipped when the fluxnode
  category isn't active.

Commit 2: Adopt Bitcoin Core's dbcache split to favor the in-memory UTXO set

Fluxd used to allocate 3/4 of -dbcache to the block tree DB whenever
-insightexplorer was enabled, which left only ~76 MiB of in-memory UTXO
cache at the default -dbcache=450. During IBD this caused frequent
full-cache flushes (multi-second stalls every few thousand blocks).

Switch to Bitcoin Core's allocation: block tree DB gets 1/8 of the
total, capped at 2 MiB without txindex or 1024 MiB with txindex /
insightexplorer. Coin DB cache is capped at 8 MiB. The remainder goes
to the in-memory UTXO set.

At -dbcache=450 with all indexes on, the UTXO cache grows from ~76 MiB
to ~386 MiB; at -dbcache=2000 it reaches ~1.7 GiB, which eliminates
most flush stalls during IBD.

Commit 3: Implement BIP 152 low-bandwidth mode (peer-initiated compact blocks)

PR #266 shipped high-bandwidth compact block relay (unsolicited
cmpctblock messages to up to three peers). Low-bandwidth mode lets a
peer request a compact block on demand via getdata, saving bandwidth
for edge peers (light clients, metered connections) that haven't opted
into high-bandwidth announcements.

Adds MSG_CMPCT_BLOCK as a new inv type, handled by ProcessGetData.
When a block is within MAX_CMPCTBLOCK_DEPTH (5 blocks) of the tip, the
server responds with a cmpctblock message; older blocks fall back to
sending a full block because the requester is unlikely to have the
mempool state needed to reconstruct it.

Type number 11 is used instead of the BIP 152 canonical value of 4
because type 4 is already allocated to "spork" on the Flux network.
IsFluxnodeType() is tightened to an explicit [4, 10] range so the new
type isn't misrouted.

Commit 4: Early-return CheckForExpiredStartTx / CheckForUndoExpiredStartTx when trackers are empty

CheckForExpiredStartTx and CheckForUndoExpiredStartTx are called on
every ConnectBlock/DisconnectBlock. For most of chain history the
relevant trackers (mapStartTxTracker, mapStartTxDOSTracker) are empty,
so the iterate-and-conditionally-copy loops do no work but still pay
for the cs_main-adjacent lock acquisition and the subsequent log
lines. Skip the body entirely when there is nothing to do.

The emptiness check runs under the lock, so it is safe against
concurrent modifications.

Commit 5: Cap debug.log size and enforce it at runtime

Previously ShrinkDebugFile() only ran at startup and only when -debug
was not set (the -shrinkdebugfile default was !fDebug). With debug=1
the log could grow without bound.

Changes:
- -maxdebugfilesize upper cap is now 10 GiB when -debug is enabled
  (so a long debug session has room to breathe) and 2 GiB otherwise.
  The 500 MB default is unchanged.
- Default -shrinkdebugfile to true so the cap is enforced even when
  debug logging is on.
- Make ShrinkDebugFile() safe to call at runtime by writing the kept
  tail to a temp file and atomically renaming; concurrent LogPrintStr
  writes land on the unlinked inode and fReopenDebugLog is set so the
  next log write reopens on the new file.
- Schedule ShrinkDebugFile() every 5 minutes so a long-running node
  with heavy debug output doesn't fill the disk between restarts.

Commit 6: Harden BIP 152 compact block handling

Two fixes identified by an audit of fluxd's BIP 152 implementation
against Bitcoin Core:

1. Clean up per-peer compact-block in-flight state on disconnect.
   FinalizeNode previously iterated mapPartialBlocks but did nothing
   (a TODO comment acknowledged the gap). Peer-keyed cleanup is
   actually available via listCompactBlocksInFlight, which stores
   {NodeId, block hash} markers. Walk it, drop entries for the
   disconnecting peer, and drop the corresponding mapPartialBlocks /
   mapPartialBlocksTime entries only when no other peer is still
   working on the same block. Prevents a slow memory leak when peers
   churn mid-reconstruction.

2. Unify the compact block depth constant as MAX_CMPCTBLOCK_DEPTH,
   shared by the cmpctblock reception check and the new low-bandwidth
   MSG_CMPCT_BLOCK getdata response. Value is 100, sized to roughly
   Bitcoin Core's ~50-minute wall-clock window at fluxd's 30-second
   PON block spacing (Bitcoin uses 5 at 10-minute blocks). The prior
   reception value of 10 covered only ~5 minutes on PON, which is too
   narrow for typical mempool turnover.
@blondfrogs blondfrogs merged commit 279c37f into master Apr 22, 2026
4 checks passed
