Merged
4 changes: 3 additions & 1 deletion AGENTS.md
@@ -1275,6 +1275,8 @@ exited 0, so pure-UBSan defects passed CI green (#142); OCaml `diff_report` per-
(#144). Perf (measured back-to-back A/B): `try_emplace` for baseline price levels (~+5%, #138) and
an order-index hash `max_load_factor` cap at 0.25 (~+18.6%, #145), flamegraph regenerated
(#135/#139/#146). Determinism preserved (byte-identical fixtures; OCaml differential pass).
`make check`/`make asan` 270/270 (the latter now a real UBSan gate). After `v0.2.2`, the
`make check`/`make asan` 272/272 (the latter now a real UBSan gate). `v0.2.2` then also folded in a
documentation overhaul, a `PERFORMANCE.md` v0.1.0-to-v0.2.2 evidence report, and a bug/style/mermaid
sweep (#147-#150). After `v0.2.2`, the
highest-value remaining work is non-code and gated on #94 (external review) and #90 (full cache-PMU
evidence on a PMU-capable microarchitecture).
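The two order-book perf changes named above (`try_emplace` for price levels, and an order-index `max_load_factor` cap at 0.25) can be illustrated with a minimal sketch. The types `Price`, `OrderId`, `PriceLevel`, and `Book` are hypothetical stand-ins; the project's real order-book code is not shown in this diff, so this is only an instance of the two standard-library techniques, not the project's implementation.

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <unordered_map>

// Hypothetical types for illustration only.
using Price = std::int64_t;
using OrderId = std::uint64_t;

struct PriceLevel {
    std::deque<OrderId> queue;  // FIFO of resting order ids at this price
};

struct Book {
    std::map<Price, PriceLevel> levels;         // ordered price levels
    std::unordered_map<OrderId, Price> index;   // order id -> resting price

    Book() {
        // Ask the hash table to keep at most ~0.25 elements per bucket,
        // trading memory for shorter bucket chains on lookup.
        index.max_load_factor(0.25f);
    }

    void add(OrderId id, Price px) {
        // try_emplace does one lookup and constructs a PriceLevel only
        // when the price is not already present.
        auto [it, inserted] = levels.try_emplace(px);
        (void)inserted;
        it->second.queue.push_back(id);
        index.try_emplace(id, px);
    }
};
```

A lower load factor makes the table rehash earlier and use more buckets; whether that is a net win depends on the workload, which is why the source reports it as a measured A/B result rather than a general rule.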
45 changes: 36 additions & 9 deletions CHANGELOG.md
@@ -7,15 +7,17 @@ All notable changes to this project. The format is loosely based on

_Nothing yet._

## [0.2.2] - 2026-06-24

A security/robustness **hardening** wave plus two measured order-book **performance** wins, driven by
a multi-round adversarial bug hunt (converged 5→2→1→0 confirmed bugs) and flamegraph-guided
optimization. Same honesty bar: a deterministic C++20 exchange simulator and cross-language
differential-testing harness, **not** a production exchange, no real-market connectivity, no latency
or profitability claims, not formal verification. Determinism preserved throughout (fixtures
byte-identical across g++/clang++ and vs the committed copies; the OCaml differential passes).
`make check`/`make asan` 270/270.
## [0.2.2] - 2026-06-25

A security/robustness **hardening** wave plus two measured order-book **performance** wins (driven by
a multi-round adversarial bug hunt, converged 5→2→1→0 confirmed bugs, and flamegraph-guided
optimization), then a full **documentation overhaul**: a reproducible performance-evidence report, a
rebuilt README, mermaid diagrams across the docs, and a repo-wide style sweep. Same honesty bar: a
deterministic C++20 exchange simulator and cross-language differential-testing harness, **not** a
production exchange, no real-market connectivity, no latency or profitability claims, not formal
verification. Determinism preserved throughout (fixtures byte-identical across g++/clang++ and vs the
committed copies; the OCaml differential passes). `make check`/`make asan` 272/272 (the latter a real
UBSan abort gate).

### Fixed

@@ -56,6 +58,31 @@ byte-identical across g++/clang++ and vs the committed copies; the OCaml differe
- **Flamegraph regenerated (#135, #139, #146)** against the new code, now a dense (~20k-sample),
fully-symbolized frame-pointer profile with zero `[unknown]` frames.

### Documentation and evidence

- **Performance-evidence report (#148, #150).** New `PERFORMANCE.md` profiles the matching-engine hot
path with Linux `perf` and flamegraphs on ARM64 (Apple M2), comparing the **v0.1.0 first release to
v0.2.2** (the same `qsl-perfeval` harness ported into a `v0.1.0` worktree, measured on the same
host): allocations/order 4.094 → 2.670 (-35%), cycles/order 310.7 → 289.5 (-6.8%), branch-miss rate
2.01% → 1.68%, latency unchanged. Cache-miss rate is reported unavailable, never estimated (Apple
Silicon PMU; #90). A dedicated `qsl-perfeval` target (plus a `qsl-perfeval-allocs` counting build)
makes every number reproducible. The before/after flamegraphs render fully symbolized with zero
`[unknown]` frames; the unresolvable boundary frames were identified (an fp glibc-malloc-boundary
artifact and the vDSO `clock_gettime` leaf) and folded into their resolved caller, not hidden.
- **Documentation overhaul and README rebuild (#147, #149).** Every doc, artifact, and provenance
header was refreshed to the v0.2.2 state, and the README was rebuilt to lead with the performance
numbers and the matching-engine flamegraph. Mermaid diagrams were added across the docs (matching
rules, binary protocol, persistence, concurrency model, memory ordering, gateway accept loop, OCaml
differential), and every em dash and en dash was removed repo-wide.
- **Honesty corrections, made in the open.** Two measurement errors caught by self-review and code
review were fixed and documented rather than buried: the allocation counter had missed
over-aligned allocations (so the cumulative reduction is -35%, not the -73% an earlier draft
claimed), and a thermal-warmup p99 artifact was corrected to "latency distribution unchanged".
- **Previously-unaddressed review findings (#150).** Acted on CodeRabbit comments left open on earlier
PRs: a non-finite `strtod` guard in the bench profile timer, `ENOPROTOOPT`/`EOPNOTSUPP` added to the
threaded accept-retry set to match the epoll path, and the perfeval harness's resting-order
tracking, percentile index, and argument validation.
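A non-finite `strtod` guard of the kind mentioned in the last bullet might look like the sketch below. The helper name `parse_finite_double` and its exact rejection rules are assumptions for illustration; the project's bench profile timer code is not shown in this diff.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdlib>
#include <optional>

// Hypothetical helper: parse a whole string as a finite double, or return
// std::nullopt. Rejects empty input, trailing junk, out-of-range values,
// and the "inf"/"nan" spellings that strtod otherwise accepts.
std::optional<double> parse_finite_double(const char* text) {
    if (text == nullptr || *text == '\0') return std::nullopt;
    errno = 0;
    char* end = nullptr;
    const double value = std::strtod(text, &end);
    if (end == text || *end != '\0') return std::nullopt;  // nothing parsed, or trailing characters
    if (errno == ERANGE) return std::nullopt;              // overflow or underflow
    if (!std::isfinite(value)) return std::nullopt;        // infinity or NaN
    return value;
}
```

Without the `std::isfinite` check, an input such as `nan` parses successfully and can silently poison any later arithmetic or comparison on the timer value.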

## [0.2.1] - 2026-06-21

Two backlog items, reprioritized by the maintainer and delivered, plus a resume-anchor and
4 changes: 3 additions & 1 deletion CLAUDE.md
@@ -1219,6 +1219,8 @@ exited 0, so pure-UBSan defects passed CI green (#142); OCaml `diff_report` per-
(#144). Perf (measured back-to-back A/B): `try_emplace` for baseline price levels (~+5%, #138) and
an order-index hash `max_load_factor` cap at 0.25 (~+18.6%, #145), flamegraph regenerated
(#135/#139/#146). Determinism preserved (byte-identical fixtures; OCaml differential pass).
`make check`/`make asan` 270/270 (the latter now a real UBSan gate). After `v0.2.2`, the
`make check`/`make asan` 272/272 (the latter now a real UBSan gate). `v0.2.2` then also folded in a
documentation overhaul, a `PERFORMANCE.md` v0.1.0-to-v0.2.2 evidence report, and a bug/style/mermaid
sweep (#147-#150). After `v0.2.2`, the
highest-value remaining work is non-code and gated on #94 (external review) and #90 (full cache-PMU
evidence on a PMU-capable microarchitecture).
21 changes: 12 additions & 9 deletions HANDOFF.md
@@ -43,8 +43,10 @@ connection cap, UDP send-error tracking, transient-accept survival, and threaded
handling (#137, #140, #143); CLI arg validation (#141); a **real UBSan abort gate**, `-fno-sanitize-recover=undefined`, since UBSan previously ran in recover mode and exited 0 (#142);
OCaml `diff_report` robustness (#144). Perf (measured A/B): `try_emplace` for baseline price levels
(~+5%, #138) and an order-index hash load-factor cap (~+18.6%, #145), with the flamegraph regenerated
(#135/#139/#146). `make check`/`make asan` 270/270 (the latter now under the real UBSan gate). The
next action is to finish this `v0.2.2` doc/artifact overhaul and cut the tag.
(#135/#139/#146), then a documentation overhaul, a `PERFORMANCE.md` v0.1.0-to-v0.2.2 evidence report,
and a bug/style/mermaid sweep (#147-#150). `make check`/`make asan` 272/272 (the latter now under the
real UBSan gate). `v0.2.2` is tagged; the next high-value work is non-code (#94 external review, #90
cache-PMU evidence).
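Transient-accept survival, as listed above, usually comes down to classifying `accept()` errors as retryable or fatal. The sketch below follows the Linux accept(2) man page, which names the already-pending network errors an application should treat as retryable for TCP/IP. The helper name `is_transient_accept_error` is hypothetical, and including `EINTR` and `ECONNABORTED` is an assumption; the project's actual retry set is not shown in this diff.

```cpp
#include <algorithm>
#include <cerrno>
#include <iterator>

// Hypothetical helper: true when an accept() failure should be retried
// rather than treated as fatal for the listening socket.
inline bool is_transient_accept_error(int err) {
    static const int kRetryable[] = {
        EINTR,         // assumption: interrupted by a signal, retry
        ECONNABORTED,  // assumption: peer gave up before accept returned
        // Pending network errors listed in Linux accept(2) for TCP/IP:
        ENETDOWN, EPROTO, ENOPROTOOPT, EHOSTUNREACH, EOPNOTSUPP, ENETUNREACH,
#ifdef EHOSTDOWN
        EHOSTDOWN,
#endif
#ifdef ENONET
        ENONET,
#endif
    };
    return std::find(std::begin(kRetryable), std::end(kRetryable), err) !=
           std::end(kRetryable);
}
```

Descriptor exhaustion (`EMFILE`, `ENFILE`) is deliberately left out of this set: retrying immediately would spin, so it generally needs its own back-off or shedding path.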

Background. Linux perf evidence (merged, now bare-metal partial PMU):

@@ -87,13 +89,14 @@ Current state:

- latest synced main baseline: `ded6e80` (PR #127, v0.2.0); the `v0.2.1` baseline is the release-PR
merge commit, after PRs #129/#130/#131
- current active branch, if active: `docs/post-v0.2.1-overhaul` (v0.2.2 prep + doc/artifact sweep)
- current active status: `v0.2.1` is the latest tag; a post-v0.2.1 hardening + perf wave (#135 through #146)
is merged to `main` and unreleased, being cut as `v0.2.2` (decoder enum rejection, network/CLI
hardening, a real UBSan abort gate, OCaml diff_report robustness, and two measured order-book perf
wins, `try_emplace` ~+5% and an index load-factor cap ~+18.6%). `make check` 270/270 and
`make asan` 270/270 (the latter now under the real UBSan gate) on the bare-metal Apple M2 Fedora
Asahi host; every touched file passes the CI CodeScene Code Health gate
- current active branch, if active: none (`main` is at the `v0.2.2` release)
- current active status: **`v0.2.2` is the latest tag**: the post-v0.2.1 hardening + perf wave (#135
through #146: decoder enum rejection, network/CLI hardening, a real UBSan abort gate, OCaml diff_report
robustness, and two measured order-book perf wins, `try_emplace` ~+5% and an index load-factor cap
~+18.6%), plus a documentation overhaul, a `PERFORMANCE.md` v0.1.0-to-v0.2.2 evidence report, and a
bug/style/mermaid sweep (#147-#150). `make check` 272/272 and `make asan` 272/272 (the latter now
under the real UBSan gate) on the bare-metal Apple M2 Fedora Asahi host; every touched file passes
the CI CodeScene Code Health gate
- release tag: `v0.2.1` (Latest, tagged on the release-PR merge commit), after `v0.2.0` and `v0.1.0`;
`v0.2.2` prepared on this branch, not yet tagged
- open follow-up issue: #90, narrowed to the full cache-counter PMU set; the bare-metal Apple host
60 changes: 31 additions & 29 deletions PROGRESS.md
@@ -20,24 +20,25 @@ Do not rely on prior chat memory.

## Current state

- **Active milestone:** none, `v0.2.1` is the latest tag, but a post-v0.2.1 hardening + perf wave
(12 PRs, #135 through #146) has merged to `main` and is **unreleased**; it is being cut as **`v0.2.2`**
- **Status:** ☑ `v0.2.1` published on top of `v0.2.0`; ☐ `v0.2.2` in preparation, security/robustness
hardening (decoder enum-domain rejection, network/CLI hardening, a real UBSan abort gate, OCaml
diff_report robustness) plus two measured order-book perf wins
- **Active branch:** `docs/post-v0.2.1-overhaul` (the v0.2.2 prep + full doc/artifact staleness sweep)
- **Active milestone:** none, **`v0.2.2` is the latest tag**. It bundled the post-v0.2.1
hardening + perf wave (#135 through #146) plus a full documentation overhaul (#147, #149), a reproducible
performance-evidence report (#148), and a bug/style sweep with mermaid diagrams (#150)
- **Status:** ☑ `v0.2.2` published on top of `v0.2.1`: security/robustness hardening (decoder
enum-domain rejection, network/CLI hardening, a real UBSan abort gate, OCaml diff_report
robustness), two measured order-book perf wins, `PERFORMANCE.md` (v0.1.0 to v0.2.2 evidence),
README rebuild, repo-wide em-dash purge, and mermaid diagrams across the docs
- **Active branch:** none (`main` is at the `v0.2.2` release)
- **Last completed milestone:** M49. NIC offload and low-latency networking study (PR #124,
d8c16b2). Releases since: `v0.2.0` (PR #127, ded6e80) and `v0.2.1` (FIX adapter #131, flamegraph
#134, anchor sweep #129). Post-v0.2.1 unreleased work on `main`: #135 through #146 (see Last action)
- **Last completed docs sync:** this v0.2.2-prep overhaul, every `.md`/`.txt` audited against
current `main`; resume/release anchors, README, CHANGELOG, and all stale `results/*.txt`
d8c16b2). Releases since: `v0.2.0` (PR #127, ded6e80), `v0.2.1` (#131/#134/#129), and `v0.2.2`
(#135 through #150)
- **Last completed docs sync:** the v0.2.2 release sweep, every `.md`/`.txt` audited against
current `main`, em/en dashes removed repo-wide, mermaid diagrams added, and all `results/*.txt`
provenance digests brought current to HEAD
- **Release:** `v0.1.0` (tag on 9857e1a), `v0.2.0` (tag on ded6e80), and `v0.2.1` (tag on the
release-PR merge, marked Latest) published as GitHub-only releases; `v0.2.2` prepared here, not yet
tagged; no packages published
- **`make check` passing:** yes, `make check` 270/270 and `make asan` 270/270 (the latter now under
- **Release:** `v0.1.0` (tag on 9857e1a), `v0.2.0` (tag on ded6e80), `v0.2.1`, and `v0.2.2`
published as GitHub-only releases; no packages published
- **`make check` passing:** yes, `make check` 272/272 and `make asan` 272/272 (the latter now under
the **real** UBSan abort gate from #142) on the bare-metal Apple M2 (aarch64) Fedora Asahi host on
2026-06-24
2026-06-25
- **Last action:** post-v0.2.1 hardening + perf wave merged to `main` as 12 scoped PRs (#135 through #146),
driven by a multi-round adversarial bug hunt (converged 5→2→1→0 confirmed) and flamegraph-guided
optimization. Security/robustness: reject out-of-domain enum bytes in the replay/protocol decoders
@@ -49,12 +50,12 @@ Do not rely on prior chat memory.
abort the batch (#144). Perf (measured A/B): baseline price levels use `try_emplace` (~+5%, #138)
and the order-index hash caps its load factor at 0.25 (~+18.6%, #145); flamegraph regenerated
(#135, #139, #146). Determinism preserved throughout (byte-identical fixtures, OCaml differential
pass). `make check`/`make asan` 270/270.
- **Next action:** finish the `v0.2.2` overhaul (this branch): regenerate the remaining stale
`results/*.txt` artifacts, then cut the `v0.2.2` tag/release. After that, the highest-value
remaining work is non-code and gated: issue #94 (independent external review, needs a human
reviewer) and issue #90 (full cache-counter PMU evidence, needs a PMU microarchitecture that
exposes cache events, e.g. x86_64).
pass). The wave then gained a documentation overhaul (#147, #149), a `PERFORMANCE.md` evidence
report (#148), and a bug/style/mermaid sweep (#150), all tagged as `v0.2.2`. `make check`/`make
asan` 272/272.
- **Next action:** none, `v0.2.2` is released. The highest-value remaining work is non-code and
gated: issue #94 (independent external review, needs a human reviewer) and issue #90 (full
cache-counter PMU evidence, needs a PMU microarchitecture that exposes cache events, e.g. x86_64).
- **Blockers:** issue #90 is now a *cache-counter* PMU gap, not a host-access gap, this bare-metal
Apple M2 exposes real `cycles`/`instructions`/`branches`/`branch-misses` but its PMU does not
implement `cache-references`/`cache-misses`; closing it needs a PMU microarchitecture that exposes
@@ -865,14 +866,15 @@ Quant Systems Lab. Linux Systems + Exchange Infrastructure Simulator

## Next action remains

`v0.2.1` is the latest tag, on top of `v0.2.0` (PR #127 ded6e80) and `v0.1.0`. A post-v0.2.1
hardening + perf wave (#135 through #146) is squash-merged to `main` and **unreleased**, being cut as
`v0.2.2`: out-of-domain enum rejection in the decoders (#136); network hardening: EINTR retry,
accept fairness, connection cap, UDP send-error tracking, transient-accept survival, and fd-exhaustion
handling (#137, #140, #143); CLI arg validation (#141); a real UBSan abort gate (#142); OCaml
`diff_report` robustness (#144); and two measured order-book perf wins, `try_emplace` (~+5%, #138)
and the order-index load-factor cap (~+18.6%, #145), with the flamegraph regenerated (#135/#139/#146).
`make check`/`make asan` 270/270. The committed perf artifacts remain **partial hardware PMU
**`v0.2.2` is the latest tag**, on top of `v0.2.1`, `v0.2.0` (PR #127 ded6e80), and `v0.1.0`. It
bundled the post-v0.2.1 hardening + perf wave (#135 through #146): out-of-domain enum rejection in the
decoders (#136); network hardening: EINTR retry, accept fairness, connection cap, UDP send-error
tracking, transient-accept survival, and fd-exhaustion handling (#137, #140, #143); CLI arg
validation (#141); a real UBSan abort gate (#142); OCaml `diff_report` robustness (#144); and two
measured order-book perf wins, `try_emplace` (~+5%, #138) and the order-index load-factor cap
(~+18.6%, #145), with the flamegraph regenerated (#135/#139/#146); plus a documentation overhaul, a
`PERFORMANCE.md` v0.1.0-to-v0.2.2 evidence report, and a bug/style/mermaid sweep (#147-#150).
`make check`/`make asan` 272/272. The committed perf artifacts remain **partial hardware PMU
evidence** from this bare-metal Apple M2 (aarch64) Fedora Asahi host, real
cycles/instructions/branches/branch-misses with cache-reference/cache-miss counters unsupported by
the Apple Silicon PMU, not NIC-offload, latency, or full hardware-PMU evidence.
17 changes: 9 additions & 8 deletions docs/release_readiness.md
@@ -16,8 +16,8 @@ after squash-merge.

| Check | Result |
|---|---|
| `make check` | 270/270 tests pass, no warnings (incl. the FIX-adapter, flamegraph-renderer, decoder enum-rejection, and CLI-arg-validation tests) |
| `make asan` (ASan + UBSan) | 270/270, sanitizer-clean; the UBSan gate now **aborts** on the first violation (`-fno-sanitize-recover=undefined`, #142), so pure-UBSan defects no longer pass green, and the tree is clean under it |
| `make check` | 272/272 tests pass, no warnings (incl. the FIX-adapter, flamegraph-renderer, decoder enum-rejection, CLI-arg-validation, and perfeval-harness tests) |
| `make asan` (ASan + UBSan) | 272/272, sanitizer-clean; the UBSan gate now **aborts** on the first violation (`-fno-sanitize-recover=undefined`, #142), so pure-UBSan defects no longer pass green, and the tree is clean under it |
| `make tsan` (ThreadSanitizer) | 20/20 concurrency-labelled tests, race-clean |
| `make check-fixtures` | committed differential fixtures match current C++ output |
| `make check-manifest` | provenance manifest matches the committed fixtures |
@@ -93,9 +93,10 @@ verification.

## Outcome

Release-ready as a portfolio artifact. `v0.2.1` is already tagged (FIX adapter #29, perf flamegraph
issue #32, anchor sweep) on top of `v0.2.0` (Phase III/IV systems work, M24-M49, plus the bare-metal
evidence refresh). The next GitHub-only release is **`v0.2.2`**, bundling the post-v0.2.1
hardening + perf wave merged to `main` (#135 through #146): decoder enum rejection, network/CLI hardening, a
real UBSan abort gate, OCaml diff_report robustness, and the two measured order-book perf wins. It
requires explicit human approval and a squash-merge before tagging.
Release-ready as a portfolio artifact. `v0.2.2` is tagged on top of `v0.2.1` (FIX adapter #29, perf
flamegraph issue #32, anchor sweep) and `v0.2.0` (Phase III/IV systems work, M24-M49, plus the
bare-metal evidence refresh). `v0.2.2` bundled the post-v0.2.1 hardening + perf wave (#135 through #146:
decoder enum rejection, network/CLI hardening, a real UBSan abort gate, OCaml diff_report robustness,
and the two measured order-book perf wins) plus a full documentation overhaul (#147, #149), a
reproducible performance-evidence report comparing v0.1.0 to v0.2.2 (#148), and a bug/style sweep
with mermaid diagrams (#150). Each release is a GitHub-only tag with explicit human approval.