Unit 7 (hallucination): split criterion 2 (MEDIUM, preserve-by-default)#166

Merged
LuminLynx merged 1 commit into main from claude/unit-07-rubric-split on May 21, 2026


@LuminLynx
Owner

Summary

Splits criterion 2 of the Unit 7 (hallucination) rubric: a faithful, preserve-by-default split and the sixth unit of the MEDIUM batch. Gate-exempt per the rule merged in PR #165.

Rubric (3 → 4): c1 unchanged · c2 = name the failure mode · c3 (NEW) = explain the mechanism · c4 = regime distinction (was c3).

Decomposition

  • Every old-c2=T pair → c2=T and c3=T.
  • p007 is the lone c2=T / c3=F differential — names the mitigation-only failure without the volume-compound mechanism (per its authored label).
  • p009, p011 name no failure mode → c2=F, c3=F.
  • c4 (regime) carries old-c3 unchanged. No realignments. p007/p011 labels updated for the 4-criterion shape.
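
A minimal sketch of the decomposition rule above, for illustration only. The function name `split_criteria`, the dict shape, and the sample label values are assumptions, not the repository's actual code or data. The p007-style differential is modeled as an explicit per-pair override, and the old values used for it here are placeholders.

```python
from typing import Dict, Optional

Labels = Dict[str, bool]


def split_criteria(old: Labels, override: Optional[Labels] = None) -> Labels:
    """Map old 3-criterion labels onto the 4-criterion shape (hypothetical helper).

    Default rule: new c2 and new c3 both inherit old c2, and new c4
    carries old c3 unchanged. An override supplies per-pair exceptions.
    """
    new = {
        "c1": old["c1"],  # unchanged
        "c2": old["c2"],  # name the failure mode
        "c3": old["c2"],  # explain the mechanism (new criterion)
        "c4": old["c3"],  # regime distinction (was c3)
    }
    if override:
        new.update(override)
    return new


# Illustrative inputs, not real regression-set labels.
full = split_criteria({"c1": True, "c2": True, "c3": True})
differential = split_criteria(
    {"c1": True, "c2": True, "c3": True},
    override={"c3": False},  # names the failure mode, no mechanism
)
no_failure_mode = split_criteria({"c1": True, "c2": False, "c3": False})
```

Under this rule a pair with old c2=F lands at c2=F, c3=F without any override, which matches how p009 and p011 are described.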

Post-split distribution (21 pairs)

8 × 4-of-4 · 3 × 3-of-4 (p006, p007-differential, p014) · 1 × 2-of-4 (p010) · 2 × 1-of-4 (p009, p011) · 5 on-topic-all-missed · 2 off-topic.
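
As a quick arithmetic check, the buckets above can be tallied against the 21-pair total. The dict keys are just labels chosen for this sketch, and the counts are copied from the line above.

```python
# Counts copied from the post-split distribution; keys are illustrative labels.
distribution = {
    "4-of-4": 8,
    "3-of-4": 3,  # p006, p007 (differential), p014
    "2-of-4": 1,  # p010
    "1-of-4": 2,  # p009, p011
    "on-topic-all-missed": 5,
    "off-topic": 2,
}

total = sum(distribution.values())
assert total == 21, f"expected 21 pairs, got {total}"
```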

Notes

  • Known-bad p018 (emoji + structured-markdown ERROR, confirmed across Units 6+7) unaffected by the split.
  • Gate-exempt: faithful split, zero realignments, no locked-value meaning changed.

Local validation

  • lint_unit_markdown / ingest_units --check — clean
  • run_regression_set --check — 21 pairs valid
  • pytest — 20/20

Ready to merge (no gate run required).


Generated by Claude Code

…-default)

Per docs/RUBRIC_AUDIT.md (MEDIUM): old c2 bundled 'names a concrete
failure mode' with 'explains the mechanism.' Splits into
name-the-failure-mode c2 and a new c3 (explain the mechanism);
renumbers regime distinction to position 4. Rubric grows 3 -> 4.

Preserve-by-default: faithful decomposition of locked Opus values
(old-c2=T -> c2=T,c3=T). p007 is the lone c2=T/c3=F differential
(names the mitigation-only failure without the volume-compound
mechanism, per its label); p009/p011 name no failure mode -> c2=F,
c3=F. No realignments. c4 (regime) carries old-c3 unchanged. Updated
p007/p011 labels for the 4-criterion shape.

Gate-exempt (faithful split, zero realignments per
docs/REGRESSION_GATE.md). Known-bad p018 ERROR unaffected. Local
lint, schema check, ingest-check, pytest all pass.
@LuminLynx LuminLynx merged commit 7b9ecda into main May 21, 2026
2 checks passed