refactor(db): typed read/write pools with retry-on-busy #666
split each sqlite database into a read pool (multi-connection) and a dedicated single-connection write pool. funnels in-process writes through a queue rather than fighting for the writer lock, leaving WAL-mode reads to run concurrently.

centralise the sqlite connection builder in microsandbox-db so host and runtime apply identical PRAGMAs (WAL, busy_timeout, foreign_keys, synchronous=normal) from one place. this drops the SQLITE_PRAGMAS string constant that was previously executed via raw SQL.

add with_retry_transaction: write paths retry on SQLITE_BUSY (5) and SQLITE_BUSY_SNAPSHOT (517) with exponential backoff. busy detection matches sea_orm::DbErr structurally rather than string-searching the display output.

expose busy_timeout via the global config.json only. the runtime uses microsandbox_db::pool::DEFAULT_BUSY_TIMEOUT_SECS since it is not user-configurable.
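the retry-with-backoff shape described above can be sketched in a std-only form. this is illustrative only: the real helper works with `sea_orm::DbErr` and `sqlx::Error`, while `DbError`, its `code` field, and the `max_attempts` signature here are stand-ins invented for the sketch.

```rust
use std::{thread, time::Duration};

// stand-in for the structural error the real code matches on
// (sea_orm::DbErr -> sqlx::Error -> sqlite result code).
#[derive(Debug)]
struct DbError {
    code: Option<u32>,
}

// sqlite result codes the pr retries on.
const SQLITE_BUSY: u32 = 5;
const SQLITE_BUSY_SNAPSHOT: u32 = 517;

// structural busy detection: no string-searching of Display output.
fn is_busy(err: &DbError) -> bool {
    matches!(err.code, Some(SQLITE_BUSY) | Some(SQLITE_BUSY_SNAPSHOT))
}

// retry `op` with exponential backoff while it fails with a busy code;
// any other error (or success) is returned immediately.
fn with_retry_transaction<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, DbError>,
) -> Result<T, DbError> {
    let mut delay = Duration::from_millis(10);
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            Err(e) if is_busy(&e) && attempt < max_attempts => {
                thread::sleep(delay);
                delay *= 2; // exponential backoff between attempts
            }
            other => return other,
        }
    }
}
```

the key design point is that only the two busy codes are retried; constraint violations and other hard errors surface on the first attempt.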
stacks on #666.

## summary

we already split each database into a read pool and a write pool, but both were just `sea_orm::DatabaseConnection`. nothing in the type system stopped a write from sneaking onto the read pool, and the retry-on-busy logic had to look up the right pool dynamically. this pr makes the read/write split a property of the type.

`DbReadConnection` and `DbWriteConnection` are newtypes over the underlying connection. both implement `ConnectionTrait` so existing query builders work unchanged, but write transactions are only available on `DbWriteConnection`. every host call site now declares its intent in its signature: read helpers take `&DbReadConnection`, write helpers take `&DbWriteConnection`, and the few that do both take `&DbPools`.

with the typed handles in place, the dynamic write-pool lookup and the per-call max-connections override are gone, and two writes that were managing their own transactions pick up retry-on-busy for free.

a 200-concurrent-boot benchmark shows a ~18% throughput improvement over the previous design (21.3 sandboxes/sec vs 18.2), with median wall time falling from ~6992 ms to ~5950 ms. zero `SQLITE_BUSY` errors observed in either run.

## test plan

- [x] cargo test -p microsandbox --lib
- [x] cargo check --workspace --all-targets
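the newtype split can be sketched without the sea_orm machinery. a minimal sketch, assuming stand-in types: `DatabaseConnection` here is a placeholder for `sea_orm::DatabaseConnection`, and `count_sandboxes` / `record_boot` are hypothetical call sites invented to show the signature convention.

```rust
// placeholder for sea_orm::DatabaseConnection; only the newtype shape
// and the signatures are the point here.
struct DatabaseConnection;

// multi-connection pool: read-only queries under WAL.
struct DbReadConnection(DatabaseConnection);
// single-connection pool: the only handle that can open write transactions.
struct DbWriteConnection(DatabaseConnection);

// bundle passed to the few helpers that both read and write.
struct DbPools {
    read: DbReadConnection,
    write: DbWriteConnection,
}

impl DbWriteConnection {
    // write transactions exist only on the write handle, so starting one
    // on the read pool is a compile error rather than a runtime surprise.
    fn transaction<T>(&self, f: impl FnOnce(&DatabaseConnection) -> T) -> T {
        f(&self.0)
    }
}

// hypothetical call sites: each declares its intent in its signature.
fn count_sandboxes(_db: &DbReadConnection) -> usize {
    0 // read-only query would go here
}
fn record_boot(db: &DbWriteConnection) -> bool {
    db.transaction(|_conn| true) // multi-statement atomic write
}
```

because the read handle simply has no `transaction` method, the "write sneaking onto the read pool" failure class is ruled out at compile time rather than by convention.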
addendum: linux benchmark

ran the same benchmark on a fresh gcp instance: 200 concurrent boots × 3 runs, alpine, 64 mib, fresh db on each branch:
main loses 117–143 sandboxes per run to sqlite contention. wall time is similar on both branches (~46s) because at this concurrency the wall clock is bounded by per-vm boot time, not by db throughput. the fix removes that failure mode entirely.
override `execute` / `execute_unprepared` / `query_one` / `query_all` on `DbWriteConnection` to wrap each call in `retry_on_busy`. every `entity::insert(...).exec(db_write)` site across the workspace now retries on SQLITE_BUSY / SQLITE_BUSY_SNAPSHOT transparently. the dual-pool design already eliminated intra-process contention, but inter-process contention is fundamental at high concurrency and was previously unhandled at the trait level.

drop the name parameter from `retry_on_busy` and `DbWriteConnection::transaction`. callers that want operation context in retry warnings can wrap in a tracing span. updated the 7 existing `transaction()` callers in sandbox/mod.rs and image/mod.rs accordingly.

tighten doc comments on `transaction()` to explain when to use it (multi-statement atomic writes) versus auto-commit (single-statement writes that now retry automatically via the trait).

validated at 50-conc on a 4-vcpu host (12× oversubscription): success rate improved from 32/50 (18 SQLITE_BUSY errors) to 50/50 (0 errors).
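the trait-level wrapping can be sketched with a toy trait. assumptions to note: `Execute` is a stand-in for the slice of `sea_orm::ConnectionTrait` being overridden, errors are bare sqlite codes rather than `DbErr`, and the loop here is unbounded where the real helper would cap attempts.

```rust
use std::{thread, time::Duration};

// sqlite's two "busy" result codes, per the pr description.
const SQLITE_BUSY: u32 = 5;
const SQLITE_BUSY_SNAPSHOT: u32 = 517;

// stand-in for the overridden ConnectionTrait methods; the real impl
// forwards execute / execute_unprepared / query_one / query_all the
// same way.
trait Execute {
    fn execute(&self, sql: &str) -> Result<u64, u32>; // Err carries a sqlite code
}

// retry only on the busy codes, with exponential backoff; all other
// outcomes are returned to the caller unchanged.
fn retry_on_busy<T>(mut op: impl FnMut() -> Result<T, u32>) -> Result<T, u32> {
    let mut delay = Duration::from_millis(10);
    loop {
        match op() {
            Err(SQLITE_BUSY | SQLITE_BUSY_SNAPSHOT) => {
                thread::sleep(delay);
                delay *= 2; // exponential backoff between attempts
            }
            other => return other,
        }
    }
}

struct DbWriteConnection<C: Execute> {
    inner: C,
}

impl<C: Execute> Execute for DbWriteConnection<C> {
    fn execute(&self, sql: &str) -> Result<u64, u32> {
        // every statement issued through the write handle retries
        // transparently; call sites don't change.
        retry_on_busy(|| self.inner.execute(sql))
    }
}
```

because the wrapper implements the same trait as the raw connection, existing `.exec(db_write)` call sites pick up the retry behaviour without edits.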
the sdk's resolve-binary.js prefers the platform-pkg-bundled msb (node_modules/@superradcompany/microsandbox-linux-x64-gnu/bin/msb) over ~/.microsandbox/bin/msb. in ci that bundled msb is the last published release (e.g. 0.4.3), while the freshly-built msb installed at ~/.microsandbox/bin/msb is from the pr branch. when the sdk passes a flag the older msb doesn't recognise, the spawn fails with clap exit 2 (unix_wait_status(512)) and an empty startup line, causing the smoke suite to fail before exercising anything.

set MSB_PATH explicitly so the sdk uses the just-built binary. confirmed locally: the smoke suite went from 1 failed / 4 skipped to 5/5 passed. the integration test job is unaffected: it uses the host crate's resolve_msb_path(), which prefers ~/.microsandbox/bin/msb over PATH.
resolved conflicts:

- .github/workflows/check.yml: kept both env vars from each side. main's MSB_HOME isolates per-job ~/.microsandbox state across ci runs; our MSB_PATH forces the sdk to use the freshly-built msb instead of the bundled platform-pkg one.
- crates/microsandbox/lib/db/mod.rs: took main's microsandbox_utils::resolve_home() helper instead of the local dirs::home_dir + base join. resolve_home() respects MSB_HOME, which is the point of pr #669.
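the MSB_HOME precedence that resolve_home() implements can be sketched as a pure function. this is a guess at the shape, not the actual helper: the real `microsandbox_utils::resolve_home()` reads the environment itself and may differ in fallbacks; here the inputs are passed in so the precedence is visible and testable.

```rust
use std::path::PathBuf;

// hypothetical sketch of MSB_HOME-aware home resolution: an explicit
// MSB_HOME wins outright; otherwise fall back to ~/.microsandbox under
// the user's home directory.
fn resolve_home(msb_home: Option<&str>, user_home: &str) -> PathBuf {
    match msb_home {
        // MSB_HOME, when set, isolates state per job/run (the point of pr #669)
        Some(dir) => PathBuf::from(dir),
        // default location
        None => PathBuf::from(user_home).join(".microsandbox"),
    }
}
```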
tl;dr: msb's "slowness" is nested-kvm's tax, not msb's.

benchmarked on google cloud:
at 200 concurrent boots, nested kvm cliffs catastrophically: success collapses from 128/200 in run 1 to 0/200 by run 3. bare metal: 200/200 every run in ~1.5 s. perf showed kernel time on nested kvm dominated by ept mapping + tlb flushes; every vmexit pays a ~5–10× tax that disappears on metal.

next: a guide for nested-virt tuning. early testing suggests the available knobs (numa balancing, transparent hugepages, scheduler) only buy ~10%.
summary
sqlite is single-writer, so when several callers in the same process try to write at once they fight for the writer lock and surface `SQLITE_BUSY` errors. this pr serialises in-process writes, makes the read/write split a property of the type system, and centralises busy handling so that contention turns into a deterministic queue rather than a flurry of retries.

each database now opens two pools wrapped as distinct types:

- `DbReadConnection` (multi-connection, concurrent reads under wal)
- `DbWriteConnection` (single connection, retry-on-busy built into its `transaction` method)

both implement `ConnectionTrait`, so existing query builders work unchanged. function signatures across the codebase declare intent: read helpers take `&DbReadConnection`, write helpers take `&DbWriteConnection`, and the few that do both take `&DbPools`.

cross-process busy contention (e.g. host vs in-vm runtime) is absorbed by an exponential-backoff retry helper. busy detection matches the `sqlx::Error` structure rather than searching the display string.

the sqlite connection builder also moves into the `microsandbox-db` crate so the host cli and the runtime apply identical pragmas (wal, busy_timeout, foreign_keys, synchronous=normal) from one place. the only user-facing tuning knob is `database.busy_timeout_secs` in ~/.microsandbox/config.json.

benchmark
200 concurrent boots × 3 runs, alpine, 64 mib, fresh db on each branch. the 73 vm-layer failures per run on both sides are a macos hypervisor.framework resource ceiling (memory pressure at 200 × 64 mib) and not affected by this pr, so the table normalises against the 127 boots that cleared the vm ceiling.
main loses 49–114 sandboxes per run to `SQLITE_BUSY` contention; the branch eliminates those failures entirely.
results are also far more deterministic on the branch: every run hits exactly 127 successful boots, while main swings between 25 and 99.
test plan