Skip to content

fix(ocaml): diff_report must not abort the batch on one malformed fixture#144

Merged
div0rce merged 1 commit into
mainfrom
fix/ocaml-diff-report-robustness
Jun 25, 2026
Merged

fix(ocaml): diff_report must not abort the batch on one malformed fixture#144
div0rce merged 1 commit into
mainfrom
fix/ocaml-diff-report-robustness

Conversation

@div0rce

@div0rce div0rce commented Jun 25, 2026

Copy link
Copy Markdown
Owner

Summary

The round-3 bug hunt found the diff_report bin (ocaml/bin/diff_report.ml) folds over the fixture batch calling Diff_report.bundle_if_divergent with no exception handler — unlike its sibling bins verify_replay.ml and replay_snapshot.ml, which both catch Parse_error/Sys_error and exit cleanly.

bundle_if_divergent calls Stream_parser.parse_full_file (raises Stream_parser.Parse_error on a malformed line, a missing meta line, or an int token exceeding OCaml's 63-bit max_int — which a valid C++ int64 price/id can) and open_in (raises Sys_error). So one malformed/unreadable fixture raises out of List.fold_left and aborts the whole batch, silently losing the divergence bundles for every later fixture.

CI runs diff_report on the failure path over a glob batch with || true, so the crash is swallowed and the differential-failure-bundle artifact comes back empty — precisely when a developer needs the bundle for the genuinely-divergent fixture.

Fix

Guard each fixture with try/with catching Stream_parser.Parse_error and Sys_error: print a per-fixture cannot compare message and treat it as a failure (diverged=true) so the missing comparison forces a non-zero exit instead of disappearing. Mirrors the explicit handling already in the two sibling bins.

Verification

  • A malformed fixture now prints cannot compare … and exits 1 (no Fatal error crash); a batch with a malformed fixture first no longer aborts (the later fixtures are still processed).
  • Added a test asserting a malformed fixture raises the catchable Stream_parser.Parse_error the bin now handles.
  • dune runtest passes.

🤖 Generated with Claude Code

…ture

Round-3 bug hunt found the diff_report bin (ocaml/bin/diff_report.ml) folds over
the fixture batch calling Diff_report.bundle_if_divergent with NO exception
handler, unlike its sibling bins verify_replay.ml and replay_snapshot.ml which
both catch Parse_error/Sys_error and exit cleanly. bundle_if_divergent calls
Stream_parser.parse_full_file (raises Stream_parser.Parse_error on a malformed
line, a missing meta line, or an int token exceeding OCaml's 63-bit max_int —
which a valid C++ int64 price/id can) and open_in (raises Sys_error). So one
malformed/unreadable fixture raises out of List.fold_left and aborts the whole
batch, silently losing the divergence bundles for every later fixture. CI runs
diff_report on the failure path over a glob batch with `|| true`, so the crash is
swallowed and the differential-failure-bundle artifact comes back empty —
precisely when a developer needs the bundle for the genuinely-divergent fixture.

Fix: guard each fixture with try/with catching Stream_parser.Parse_error and
Sys_error; print a per-fixture "cannot compare" message and treat it as a failure
(diverged=true) so the missing comparison forces a non-zero exit instead of
disappearing. Added a test asserting a malformed fixture raises the catchable
Stream_parser.Parse_error the bin now handles.

Verified: a malformed fixture now prints "cannot compare ..." and exits 1 (no
"Fatal error" crash); a batch with a malformed fixture first no longer aborts.
dune runtest passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@div0rce, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 8 minutes and 13 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bc3b9982-8172-4b12-aa20-f9c257977604

📥 Commits

Reviewing files that changed from the base of the PR and between 2494df5 and c3cce55.

📒 Files selected for processing (2)
  • ocaml/bin/diff_report.ml
  • ocaml/test/test_diff_report.ml
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/ocaml-diff-report-robustness

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No application code in the PR — skipped Code Health checks.

See analysis details in CodeScene

Quality Gate Profile: Pay Down Tech Debt
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.

@div0rce div0rce merged commit 70757b9 into main Jun 25, 2026
8 checks passed
@div0rce div0rce deleted the fix/ocaml-diff-report-robustness branch June 25, 2026 01:37
div0rce added a commit that referenced this pull request Jun 25, 2026
…ct (#147)

* docs: overhaul all stale docs for the post-v0.2.1 (v0.2.2) state

Full staleness audit of every prose doc against current main. The anchors were
frozen at v0.2.1 / 263 tests / "no active milestone" while 12 PRs (#135-#146)
had merged and are unreleased.

- Resume anchors (PROGRESS.md, HANDOFF.md): Current state / Last action /
  Next action / test count (263->270) brought current; the two duplicate frozen
  anchors and the stale macOS benchmark numbers fixed; a dated v0.2.2 log entry.
- CLAUDE.md + AGENTS.md: the post-M35 roadmap-memory section now records the
  post-v0.2.1 hardening + perf wave (identical edit in both).
- CHANGELOG.md: new [0.2.2] section (decoder enum rejection #136, network/CLI
  hardening #137/#140/#141/#143, real UBSan abort gate #142, ocaml diff_report
  #144, try_emplace ~+5% #138, index load-factor ~+18.6% #145). CMakeLists
  version 0.2.1 -> 0.2.2.
- README: benchmark/flamegraph/limitations sections reflect the engine wins
  (measured on the profile workload, not the micro-bench table) and the gateway
  hardening; release_readiness 270/270 + UBSan gate + v0.2.2 scope.
- Networking docs (socket_gateway, socket_hardening): connection cap, EINTR
  retry, transient-accept survival, fd-exhaustion handling, UDP send-error
  counting. replay_and_recovery: decode_command now rejects out-of-domain enums.
  binary_protocol/differential_testing/fix_protocol/SECURITY/recruiting_notes/
  CONTRIBUTING: smaller accuracy updates. results/README: add the three socket
  artifacts.

make check 270/270. Stale results/*.txt provenance digests regenerated
separately. pool_backed_storage.md table follows its artifact regeneration.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* results: regenerate stale provenance artifacts for the v0.2.2 source

The post-v0.2.1 source changes (#135-#146) left 14 results/*.txt with stale
Source digests (the authoritative staleness signal per the provenance rules,
not commit-hash drift). Regenerated via their make targets so each declares
Dirty inputs: no against current HEAD: differential, pool_backed_storage,
allocator_experiment, recovery_benchmarks, false_sharing_study, perf_stat_linux
(partial PMU, QSL_PERF_ALLOW_PARTIAL), perf_report_linux, numa_affinity_study
(linux-constrained), socket_load_summary, socket_profile_loopback,
socket_stress_summary, dpdk_environment, nic_offload_environment.

docs/pool_backed_storage.md: refreshed the median table, digest reference, and
qualitative ordering from the regenerated artifact (contiguous fastest on four of
five workloads; intrusive leads dense). The baseline rows now include the
try_emplace (#138) and index load-factor (#145) wins.

Honesty notes: these were measured on a thermally-warmed M2 from a long session,
so absolute values run higher than a cool-host snapshot — the relative orderings
and provenance digests are the load-bearing content, and every artifact carries
a hardware/build-dependence caveat. latest.txt is regenerated separately on a
cooled host to keep its headline micro-benchmark numbers representative.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* results: regenerate latest.txt (cool host) and sync README/PROGRESS numbers

latest.txt regenerated on a thermally-recovered host so the headline micro-bench
numbers are representative (protocol canary 16.1 ns/op), with a fresh Source
digest (Dirty inputs: no) matching current source. The README benchmark table
and the PROGRESS measured-results section are aligned to it: order_book ~90,
protocol ~16, gateway ~102, matching ~91, replay ~101 ns. The matching/replay
rows are slightly faster than the prior committed run (the v0.2.2 engine wins
showing on the resting-order paths); the order_book micro-bench is unchanged
(near-empty index, so it does not exercise the load-factor win).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* results: re-regenerate env-check artifacts against the final v0.2.2 docs

dpdk_environment.txt and nic_offload_environment.txt include README.md (and
CLAUDE.md/AGENTS.md/results/README.md) in their digest scope, so the README
number-sync re-staled them. Regenerated against the final committed docs;
Dirty inputs: no.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant