Skip to content

Equipment BC balance-audit followups: evolver carry-forward + 3 follow-ups#39

Merged
xmap merged 5 commits into
mainfrom
equipment-bc-audit-followups
Jun 5, 2026
Merged

Equipment BC balance-audit followups: evolver carry-forward + 3 follow-ups#39
xmap merged 5 commits into
mainfrom
equipment-bc-audit-followups

Conversation

@xmap
Copy link
Copy Markdown
Owner

@xmap xmap commented Jun 5, 2026

Summary

Five surgical follow-ups to the Equipment-BC balance audit run on 2026-06-05. Each commit is independently reviewable; the bottom commit (the evolver fix) is also self-co-authored from earlier today and survived the rebase onto slice F (PR #38) with one trivial 3-way merge in the evolver's two newly-corrected arms.

  • fa30748cc Asset evolver carries commissioned_at + decommissioned_at through the AssetOwnerRemoved and AssetAttachedToFixture arms. Without this, any commissioned Asset losing an owner or attaching to a Fixture silently nulls the PIDINST Property 11 lifecycle dates on next replay.
  • 4303513a5 Adds two complementary layers of coverage on top of the surgical regression tests in fa30748cc: a parametrized matrix over all 17 non-writer arms, and an AST-based architecture fitness that walks the evolver and asserts every non-writer arm passes both lifecycle-date fields through from prior. The fitness catches the bug class on any future evolver arm even without a matching behavior test.
  • 2b23b978b Moves aggregates/_assembly_content_hash.py into aggregates/assembly/_content_hash.py. The shared-helper layer is reserved for genuinely cross-aggregate VOs (_drawing.py, _placement.py with 6-8 consumers each). The Assembly content-hash module has 2 consumers, both inside Assembly's own features; the misplaced location was actively misleading for the future Fixture-tier content-hash sibling tracked in project_fixture_configuration_hash_followup.
  • 7f5994a94 Factors out find_missing_families_per_id(event_store, ids) in family/read.py. Both Assembly handlers (define_assembly, version_assembly) now call it and drop ~6 lines + the asyncio.gather boilerplate apiece. Deliberately scoped narrower than the audit recommended: the Model handlers' analogous lookup stays inline because of a 15-test monkeypatch coupling, and the Asset handlers' load_model calls aren't actually duplicative (one uses the loaded model for a subset gate, the other only for an existence check). Commit body explains the deferrals.
  • d5795ed0e Adds structured error-level logging at the application-handler boundary when PidinstRecordInvariantError fires. The audit's original prescription was a custom HTTP handler; that would have failed test_get_asset_pidinst_status_map_is_complete.py which explicitly forbids registering one (Lock 11 of the persistent-id design memo). The serializer can't log either (L22 purity, enforced by test_pidinst_serializer_purity.py). The query handler is the right boundary: it wraps the to_pidinst_record(view) call in a try/except, logs asset_id + principal_id + correlation_id + reason via _log.exception(...), then re-raises so FastAPI's locked default 500 still fires. New unit test pins the log emission using structlog.testing.capture_logs().

Audit recs not included

Rec #4 (Asset state.py decomposition into sibling files) was deferred per the trigger discipline: file is at 1,218 lines, audit's trigger was 1,500; in-flight work nearby; revisit when the trigger fires.

Test plan

  • test_asset_evolver_lifecycle_dates_carry_forward.py (AST fitness) green
  • test_asset_lifecycle_dates_evolver.py (35 parametrized + 2 writer tests) green
  • test_assembly_content_hash.py + test_no_assembly_asset_level_literal.py green after rename
  • test_define_assembly_handler / test_version_assembly_handler integration suites green after helper extraction
  • test_get_asset_pidinst_handler.py::test_handler_logs_and_reraises_on_pidinst_record_invariant_error green
  • All 8 directly-touched test modules: 240 passed in 6s locally
  • CI confirms full suite + architecture fitnesses pass on the rebased branch

🤖 Generated with Claude Code

xmap and others added 5 commits June 5, 2026 09:23
…d_at through AssetOwnerRemoved + AssetAttachedToFixture

The Asset evolver's Critical Invariant docstring requires every
transition arm to preserve commissioned_at and decommissioned_at from
prior state. Two arms violated the invariant by omitting both fields
from their Asset(...) construction. Replaying a stream that included
either event silently wiped both timestamps to None, corrupting PIDINST
Property 11 lifecycle dates on any Asset whose owners changed or that
was attached to a Fixture after commissioning.

Same silent-data-loss shape as the Sub-Stage B.6 bug previously caught
on the fixture_id field. That earlier fix added fixture_id to the
preserve-fields list but did not catch the parallel commissioned_at /
decommissioned_at omissions on these two arms, which were folded in
later via the PIDINST read closure work.

Surfaced by the PseudoAxis Stage-1 design's implementation-readiness
gate review (round 2) as an out-of-scope finding while staging the
partition_rule threading work.

Two targeted regression tests pin the carry-forward:
- test_evolve_asset_owner_removed_preserves_lifecycle_timestamps
- test_evolve_asset_attached_to_fixture_preserves_lifecycle_timestamps

Both assert that prior commissioned_at / decommissioned_at survive the
respective evolver transitions. Mirror the existing
test_evolve_<transition>_preserves_<field> pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…matrix + AST fitness

Follow-up to 493aad0bb (`fix(equipment): Asset evolver carries
commissioned_at + decommissioned_at through AssetOwnerRemoved +
AssetAttachedToFixture`). That surgical fix pinned the two
originally-broken arms with arm-specific regression tests. This
commit adds two complementary layers of coverage so the bug class
cannot re-emerge on a future evolver arm.

Layer 1: full carry-forward matrix in
test_asset_lifecycle_dates_evolver.py. Mirrors the existing per-field
parametrize convention (test_evolve_mutation_preserves_drawing,
..._model_id, ..._alternate_identifiers) and extends it to both
PIDINST lifecycle dates across all 17 non-writer arms, explicitly
including the owner_added / owner_removed / fixture_attached /
fixture_detached arms whose absence from the older parametrize lists
is what allowed the original bug to ship undetected. Writer-arm
behavior (AssetRegistered as genesis writer; AssetDecommissioned as
terminal writer) is pinned in two dedicated tests.

Layer 2: AST-based architecture fitness in
test_asset_evolver_lifecycle_dates_carry_forward.py. Parses
evolver.py, walks every `case <EventName>(...):` arm in `evolve`,
and asserts the inner `Asset(...)` call contains both
`commissioned_at=prior.commissioned_at` and
`decommissioned_at=prior.decommissioned_at` unless the arm is one
of the documented writers. The structural check catches arms that
no behavior test was written for; the behavior matrix catches
regressions in arms that exist today. Sanity-checked by reverting
the original evolver patch and confirming both layers report the
violations.

Surfaced by the Equipment-BC balance audit (post-ship review of
slice E.1 PIDINST work) as the highest-priority recommendation:
protects PR #34 from corrupting on any Asset that has owners
removed or is attached to a fixture.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… aggregate dir

Files at the aggregates/ level (one directory above the per-aggregate
subdirs) signal "shared by multiple aggregates." _drawing.py and
_placement.py earn that placement with consumers across Mount, Frame,
and several siblings. _assembly_content_hash.py did not: its only
consumers are define_assembly/decider.py, version_assembly/decider.py,
and its own test. Living at the shared-helper level was misleading
and would have nudged future designers (notably the Fixture-tier
content-hash follow-up tracked in project_fixture_configuration_hash_followup)
toward extending this module rather than building a sibling inside
aggregates/fixture/.

Move:
  apps/api/src/cora/equipment/aggregates/_assembly_content_hash.py
    -> apps/api/src/cora/equipment/aggregates/assembly/_content_hash.py

Touches the two import sites (define_assembly + version_assembly
deciders), the test file's import, and the hardcoded path entry in
test_no_assembly_asset_level_literal.py. The PAYLOAD_TYPE constants
and the function names are unchanged, so no event-shape or hash drift.
The source docstring is rewritten to explain the single-aggregate
scope and to flag the future Fixture-tier sibling expectation
explicitly.

Surfaced by the Equipment-BC balance audit as the lowest-effort
honest-architecture fix: zero fold-cost, zero cross-BC ripple, and
removes a load-bearing-looking signal that wasn't load-bearing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…issing_families_per_id

define_assembly and version_assembly each opened the family stream
fan-out inline: build the referenced FamilyId set, await
asyncio.gather over load_family for each id, zip the results, and
collect the missing ids into a frozenset before threading into
context. The two implementations were near-identical (10-ish lines
each, two formatting variants on the same logic). Extracting the
lookup into a named helper in family/read.py shrinks each handler
by ~6 lines, removes the asyncio import + asyncio.gather doctring
mention from both, and earns the rule of three the moment a third
per-id caller appears.

Scope deliberately narrower than the Equipment-BC balance audit
recommendation:

  - Model handlers (define_model, add_model_family) keep their
    inline `list_all_family_ids` + set-difference. They are
    monkeypatched at the handler-module level (`handler.list_all_family_ids`)
    in 15 test files; routing the lookup through a helper would
    invalidate every one of those patches and force a 15-file stub-
    shape rewrite for a 2-line per-handler win. Revisit when those
    tests get migrated to patch at the read-module location.

  - Asset handlers (register_asset, add_asset_family) keep their
    inline `load_model` calls. register_asset uses the loaded model
    only for an existence check with elaborate INFO logging on the
    not-found path; add_asset_family loads the model AND reads
    `model.declared_family_ids` for a subset gate. The two share a
    surface shape (`if model is None: raise ModelNotFoundError(id)`)
    but their downstream use differs enough that a single helper
    would either hide the asymmetry or carry an awkward parameter
    surface.

The unused sibling helper `find_missing_families_in_registry`
(bulk-SQL strategy for Model-side callers) was authored, then
deleted before commit per the project's defer-until-trigger
discipline: speculative helpers without a current caller are noise.
When the Model-side test surface rework lands, that helper can ship
alongside it in one commit.

Surfaced by the Equipment-BC balance audit (Rec #3). The audit's
original prescription targeted 6 handlers and proposed two
strategy-variant helpers in family/read.py + matching helpers in
model/read.py. Reading the actual code revealed the Model side has
a test-shape coupling the audit did not see and the Asset side
isn't a duplication at all.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ndler before re-raising

PidinstRecordInvariantError is intentionally NOT wired to a custom
HTTP handler per Lock 11 of project_asset_persistent_id_design:
it is a server-bug backstop, not a routable client-facing error,
and FastAPI's default 500 is the locked outcome (enforced by
tests/architecture/test_get_asset_pidinst_status_map_is_complete.py
which both pins the 4 routable PIDINST errors AND asserts this one
is NOT registered).

The locked policy left a real observability gap: FastAPI's default
500 returns the plain-text body `Internal Server Error` (not the
BC-uniform `{"detail": str(exc)}` shape every other Equipment
error uses) and prints the stack trace to stderr via traceback,
NOT through the project's structlog `_log` channel. Sentry /
Loki / structured monitoring sees nothing actionable: no
asset_id, no correlation_id, no principal_id, no JSON fields.

The naive fix (wrap construction inside `to_pidinst_record`) was
rejected: that module is governed by L22 / L28 purity
(tests/architecture/test_pidinst_serializer_purity.py) which
forbids I/O accumulation including logging. The serializer must
stay a pure transform.

The right placement is the application boundary: the
`get_asset_pidinst` handler wraps the `to_pidinst_record(view)`
call in a try/except for `PidinstRecordInvariantError`, emits a
structured `_log.exception("get_asset_pidinst.pidinst_record_invariant", ...)`
with asset_id + principal_id + correlation_id + invariant reason,
then re-raises. FastAPI's default 500 still fires, the body shape
is unchanged, the locked policy is preserved, and operators get
the structured trail they need to diagnose a recurrence.

The handler-level placement also matches the existing convention
for error-path logging in non-pure modules
(infrastructure/idempotency_pruner.py:97,
infrastructure/auth/bearer_auth_middleware.py:277,
infrastructure/projection/worker.py:209).

Three docstrings updated to point at the new arrangement:
- _pidinst_serializer.to_pidinst_record notes that observability
  lives in the application handler and explains why (L22).
- get_asset_pidinst/handler.py updates the 500-line in the
  error-routing table to cite L11 explicitly and to point at the
  in-handler log.
- get_asset_pidinst/route.py mirrors the handler docstring's
  500-line for the route-layer reader.

New unit test test_handler_logs_and_reraises_on_pidinst_record_invariant_error
monkeypatches to_pidinst_record to raise the invariant error, runs
the handler inside structlog.testing.capture_logs(), and asserts
both the re-raise AND the structured-log emission (event name,
asset_id, principal_id, correlation_id, reason, log_level=error).

Surfaced by the Equipment-BC balance audit (Rec #5). The audit's
original prescription was to register a custom HTTP handler
mapping PidinstRecordInvariantError to a structured 500; that
recommendation would have failed the existing fitness test the
audit did not see. The deeper investigation isolated the genuine
observability gap and a placement that respects every locked
constraint (Lock 11, L22, both fitness tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@xmap xmap merged commit d8829db into main Jun 5, 2026
3 checks passed
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 5, 2026

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  apps/api/src/cora/equipment
  _pidinst_serializer.py
  apps/api/src/cora/equipment/aggregates/assembly
  _content_hash.py
  apps/api/src/cora/equipment/aggregates/asset
  evolver.py
  apps/api/src/cora/equipment/aggregates/family
  __init__.py
  read.py 203
  apps/api/src/cora/equipment/features/define_assembly
  decider.py
  handler.py
  apps/api/src/cora/equipment/features/get_asset_pidinst
  handler.py
  apps/api/src/cora/equipment/features/version_assembly
  decider.py
  handler.py
Project Total  

This report was generated by python-coverage-comment-action

@xmap xmap deleted the equipment-bc-audit-followups branch June 5, 2026 07:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant