Skip to content

refactor: foundation leaf hygiene Sprint 4 — free-threading polish#240

Merged
lbliii merged 5 commits intomainfrom
lbliii/leaf-hygiene-s4
Apr 20, 2026
Merged

refactor: foundation leaf hygiene Sprint 4 — free-threading polish#240
lbliii merged 5 commits intomainfrom
lbliii/leaf-hygiene-s4

Conversation

@lbliii
Copy link
Copy Markdown
Owner

@lbliii lbliii commented Apr 20, 2026

Summary

Sprint 4 of plan/foundation-leaf-hygiene.md — three small, behavior-preserving changes targeting Python 3.14t free-threading hazards plus a perf micro-opt that lost its bundling vehicle when Sprint 3 was skipped.

Depends on #239. No file overlap with #239 — should rebase cleanly either order.

What changed

  1. async_compat._get_uvloop — replace the _uvloop_checked / _uvloop_module global state with @functools.cache. Eliminates the check-then-act race that surfaces under free-threading (two threads could both enter the import branch and double-pay the ~200ms import uvloop) and removes mutable module globals.

  2. ThreadLocalCache._cache_key — extract a small private helper so the _cache_{name}_{key or 'default'} f-string lives in one place instead of being re-built in get, clear, and clear_all. Closes the audit's "builds cache_key 3×" finding.

  3. truncate_words — switch from text.split() to text.split(maxsplit=word_count). The old call walked every whitespace token in the input even when callers (notably generate_excerpt) only wanted the first N. With maxsplit, parsing stops after N+1 parts. Output is identical; saves an unbounded scan on long markdown bodies. This is the "fused traversal" the plan called for in generate_excerpt — done at the helper level so the callsite stays a clean composition.

What didn't change

  • generate_excerpt itself stays a 2-line composition of strip_html + truncate_words. The fusion happens inside truncate_words so all callers benefit and there's no duplicated logic.

Test plan

  • targeted unit tests pass (148 — text, text_properties, thread_local)
  • full suite green (11911 passed; 9 failures all confirmed pre-existing on parent commit or flaky-under-parallel via stash + re-run; isolated re-run of the 7 ambiguous ones all passed)
  • lint-imports: 2 contracts kept, 0 broken
  • ty: 2204 diagnostics (matches Sprint 2 floor — no regression)

🤖 Generated with Claude Code

lbliii and others added 5 commits April 20, 2026 14:02
Five-sprint plan to act on findings from a four-lens layered review of
14 leaf files in bengal/utils/primitives/ and bengal/utils/concurrency/.

Sprint 0 — three design questions on paper (DotDict miss-returns-empty
contract, eager from_dict wrap, text.py split). Sprint 1 — six-item
quick-win PR (~20 LOC). Sprints 2-3 gated on Sprint 0 decisions.
Sprint 4 — free-threading polish.

Each sprint ships independently. Acceptance criteria are grep/test
commands. Risk register has mitigations pointing to specific tasks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint 1 of plan/foundation-leaf-hygiene.md. Four tier-3 cleanups in
foundation leaf utilities; no behavioral changes.

- dates.parse_date: hoist 8-format list to module-level
  _DEFAULT_DATE_FORMATS tuple. Avoids per-call tuple allocation.
- text.truncate_words: dedupe double join on the truncated word
  list — compute `before_suffix` once, branch on the cleaned form.
- text.slugify_id: hoist `import unicodedata` to module top. Removes
  per-call import lookup (slugify_id is hot in URL generation).
- retry: drop dead `last_error` tracking and the unreachable
  post-loop blocks in both sync and async retry_with_backoff. Loop
  invariant: range(retries+1) yields >=1 iteration, body always
  returns or raises. Replaced with single AssertionError sentinel.

Notes on filtered findings from the audit:

- S1.4 (dropped): "skip backoff sleep on final attempt" — false
  positive. Verified the sleep is already inside `if attempt <
  retries:`, so the final iteration already short-circuits to raise
  without sleeping.
- S1.5 (deferred): types.unwrap_optional could rebuild a Union via
  `Union[non_none_args]` (documented tuple-subscript form) instead
  of `reduce(lambda a, b: a | b, …)`. Ruff UP007 rejects the
  Union[…] form even at runtime; not worth a `# noqa` for a
  cosmetic micro-opt. Left as-is.

Net: −15 LOC across 3 files. Tests pass. ty floor unchanged at 2207.
import-linter contracts unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…iene

Three design questions blocked Sprint 2 / Sprint 3. All resolved on
2026-04-20 via parallel evidence gathering.

- Q1 (DotDict miss returns ""): Option A — intentional. The "" return
  is load-bearing for Jinja2 falsy-chain patterns documented in
  themes/default/templates/SAFE_PATTERNS.md and exercised by
  track-helpers.html. Tests in test_dotdict.py encode it as the
  contract. Bugs layered on top (docstring drift line 89, three
  hasattr() no-ops in menu.py / validators/tracks.py) move to
  Sprint 2 as targeted follow-ups — the contract stands.

- Q2 (from_dict eager wrap): Option C — keep eager. Lazy wrap-on-access
  in __getattribute__ only covers top-level dict values; it does NOT
  recurse into list elements. Templates iterate dict-in-list (e.g.
  track_nav.html), and tests assert isinstance(data.users[0], DotDict)
  after from_dict. Going shallow would silently break templates.
  Sprint 2 will add a one-line comment referencing this decision.

- Q3 (split text.py): Option A — keep monolithic. Importer count per
  proposed move is small: humanize group has 0–2 direct importers
  each, format_path_for_display has 2. Bengal's no-shim rule means
  every importer migrates manually. Recent active edits on text.py
  (Sprint 1 + 2026-04-02 taxonomy consolidation) raise rebase risk.
  Sprint 2 adds docstring section headers instead.

Sprint 2 reshaped to absorb the Q1/Q2 follow-ups + Q3 docstring
headers. Sprint 3 marked skipped. Sprint 4 picks up the
generate_excerpt fusion that lost its bundling vehicle.

Risk register and success metrics updated to reflect resolved risks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…latent menu bug fix

Sprint 2 of plan/foundation-leaf-hygiene.md. Seven items:

1. DotDict: extract shared `_wrap_dict_value` helper used by both
   `__getattribute__` and `__getitem__`. Eliminates the duplicated
   "isinstance check + cache lookup + wrap" pattern.

2. DotDict: fix docstring drift in `__getattribute__` (claimed it
   returned None on miss; actually returns ""). Document the
   `hasattr(d, key)` always-True side effect inline so callers reach
   for `key in d` instead.

3. DotDict: add explanatory comment to `from_dict` covering why eager
   recursion (and the dict-in-list wrap) is load-bearing — `__getattribute__`
   only lazily wraps top-level dict *values*, not dicts nested inside
   lists. Templates that iterate `site.data.tracks` rely on this.

4. menu.py + validators/tracks.py: migrate `hasattr(site.data, "tracks")`
   to `"tracks" in site.data`. DotDict's `""`-on-miss makes hasattr
   always True, so the old guard silently passed for missing keys.

5. menu.py: expand both `isinstance(data, dict)` checks to
   `isinstance(data, dict | DotDict)`. This is the BEHAVIOR-CHANGING
   fix in the sprint: `site.data[key]` returns a DotDict (eager-wrapped
   via `from_dict`), so the old plain-dict check was always False in
   production — the data-driven dropdown feature was effectively dead
   code there. Old MagicMock-based tests passed only because they
   returned literal dicts.

6. lru_cache.get_or_set: collapse two return-on-hit branches into a
   single locked block. Same semantics, less duplication.

7. text.py: keep monolithic per Sprint 0 Q3 decision, but add a
   "Function groups" section to the module docstring listing the
   six concerns (slugify, HTML, truncation, excerpt, whitespace,
   humanize, path display) so readers can navigate.

Tests: update `test_tracks.py` and `test_menu_dev_dropdown_autodoc.py`
fixtures to use real `DotDict` instead of `MagicMock`. The MagicMock
default `__contains__` returns False, which masked the menu isinstance
bug (the old test was green because the lookup branch never executed).
Aligning fixtures with production runtime types both surfaced the bug
and prevents future drift.

Verification:
- targeted unit tests: 153 passed (utils + dotdict)
- menu + tracks suites: 183 passed
- full suite: 11924 passed (4 pre-existing failures unrelated, verified
  on parent commit via stash)
- lint-imports: 2 contracts kept, 0 broken
- ty: 2204 diagnostics (improved from 2207 floor)

Depends on #238 (Sprint 0 design doc). Both branches touch
`text.py` docstring — rebase if #238 lands first.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…bundled micro-opts

Sprint 4 of plan/foundation-leaf-hygiene.md. Three items:

1. async_compat.py: replace the manual `_uvloop_checked` /
   `_uvloop_module` global state with `@functools.cache` on
   `_get_uvloop`. Eliminates the check-then-act race that surfaces
   under Python 3.14t free-threading (two threads could both enter
   the import branch and double-pay the ~200ms uvloop import) and
   removes the global mutable state entirely.

2. thread_local.py: extract a `_cache_key` helper so the
   `_cache_{name}_{key or 'default'}` f-string lives in one place
   instead of being re-built in `get`, `clear`, and `clear_all`.
   `clear_all` keeps its own prefix binding since it slices, not
   matches, the key.

3. text.py: switch `truncate_words` from `text.split()` to
   `text.split(maxsplit=word_count)`. The old call parsed every
   whitespace-separated token in the input even when the caller
   only wanted the first N. With `maxsplit`, we stop after N+1
   parts (first N words + remainder), which is the actual fix for
   the "double-traversal of long markdown bodies" the audit flagged
   in `generate_excerpt`. Output is identical — the first N words
   are the same with or without `maxsplit`, and the
   `len(parts) <= word_count` short-circuit still recognizes the
   "no truncation needed" case correctly.

Verification:
- targeted unit tests: 148 passed (text + thread_local)
- full suite: 11911 passed (9 failures, all confirmed pre-existing
  or flaky-under-parallel — verified by stashing and re-running on
  parent commit)
- lint-imports: 2 contracts kept, 0 broken
- ty: 2204 diagnostics (matches Sprint 2 floor)

Depends on #239.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@lbliii lbliii merged commit 874d6d6 into main Apr 20, 2026
22 checks passed
@lbliii lbliii deleted the lbliii/leaf-hygiene-s4 branch April 20, 2026 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant