refactor: foundation leaf hygiene Sprint 4 — free-threading polish#240
Merged
refactor: foundation leaf hygiene Sprint 4 — free-threading polish#240
Conversation
Five-sprint plan to act on findings from a four-lens layered review of 14 leaf files in bengal/utils/primitives/ and bengal/utils/concurrency/. Sprint 0 — three design questions on paper (DotDict miss-returns-empty contract, eager from_dict wrap, text.py split). Sprint 1 — six-item quick-win PR (~20 LOC). Sprints 2-3 gated on Sprint 0 decisions. Sprint 4 — free-threading polish. Each sprint ships independently. Acceptance criteria are grep/test commands. Risk register has mitigations pointing to specific tasks. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint 1 of plan/foundation-leaf-hygiene.md. Four tier-3 cleanups in foundation leaf utilities; no behavioral changes. - dates.parse_date: hoist 8-format list to module-level _DEFAULT_DATE_FORMATS tuple. Avoids per-call tuple allocation. - text.truncate_words: dedupe double join on the truncated word list — compute `before_suffix` once, branch on the cleaned form. - text.slugify_id: hoist `import unicodedata` to module top. Removes per-call import lookup (slugify_id is hot in URL generation). - retry: drop dead `last_error` tracking and the unreachable post-loop blocks in both sync and async retry_with_backoff. Loop invariant: range(retries+1) yields >=1 iteration, body always returns or raises. Replaced with single AssertionError sentinel. Notes on filtered findings from the audit: - S1.4 (dropped): "skip backoff sleep on final attempt" — false positive. Verified the sleep is already inside `if attempt < retries:`, so the final iteration already short-circuits to raise without sleeping. - S1.5 (deferred): types.unwrap_optional could rebuild a Union via `Union[non_none_args]` (documented tuple-subscript form) instead of `reduce(lambda a, b: a | b, …)`. Ruff UP007 rejects the Union[…] form even at runtime; not worth a `# noqa` for a cosmetic micro-opt. Left as-is. Net: −15 LOC across 3 files. Tests pass. ty floor unchanged at 2207. import-linter contracts unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…iene Three design questions blocked Sprint 2 / Sprint 3. All resolved on 2026-04-20 via parallel evidence gathering. - Q1 (DotDict miss returns ""): Option A — intentional. The "" return is load-bearing for Jinja2 falsy-chain patterns documented in themes/default/templates/SAFE_PATTERNS.md and exercised by track-helpers.html. Tests in test_dotdict.py encode it as the contract. Bugs layered on top (docstring drift line 89, three hasattr() no-ops in menu.py / validators/tracks.py) move to Sprint 2 as targeted follow-ups — the contract stands. - Q2 (from_dict eager wrap): Option C — keep eager. Lazy wrap-on-access in __getattribute__ only covers top-level dict values; it does NOT recurse into list elements. Templates iterate dict-in-list (e.g. track_nav.html), and tests assert isinstance(data.users[0], DotDict) after from_dict. Going shallow would silently break templates. Sprint 2 will add a one-line comment referencing this decision. - Q3 (split text.py): Option A — keep monolithic. Importer count per proposed move is small: humanize group has 0–2 direct importers each, format_path_for_display has 2. Bengal's no-shim rule means every importer migrates manually. Recent active edits on text.py (Sprint 1 + 2026-04-02 taxonomy consolidation) raise rebase risk. Sprint 2 adds docstring section headers instead. Sprint 2 reshaped to absorb the Q1/Q2 follow-ups + Q3 docstring headers. Sprint 3 marked skipped. Sprint 4 picks up the generate_excerpt fusion that lost its bundling vehicle. Risk register and success metrics updated to reflect resolved risks. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…latent menu bug fix Sprint 2 of plan/foundation-leaf-hygiene.md. Seven items: 1. DotDict: extract shared `_wrap_dict_value` helper used by both `__getattribute__` and `__getitem__`. Eliminates the duplicated "isinstance check + cache lookup + wrap" pattern. 2. DotDict: fix docstring drift in `__getattribute__` (claimed it returned None on miss; actually returns ""). Document the `hasattr(d, key)` always-True side effect inline so callers reach for `key in d` instead. 3. DotDict: add explanatory comment to `from_dict` covering why eager recursion (and the dict-in-list wrap) is load-bearing — `__getattribute__` only lazily wraps top-level dict *values*, not dicts nested inside lists. Templates that iterate `site.data.tracks` rely on this. 4. menu.py + validators/tracks.py: migrate `hasattr(site.data, "tracks")` to `"tracks" in site.data`. DotDict's `""`-on-miss makes hasattr always True, so the old guard silently passed for missing keys. 5. menu.py: expand both `isinstance(data, dict)` checks to `isinstance(data, dict | DotDict)`. This is the BEHAVIOR-CHANGING fix in the sprint: `site.data[key]` returns a DotDict (eager-wrapped via `from_dict`), so the old plain-dict check was always False in production — the data-driven dropdown feature was effectively dead code there. Old MagicMock-based tests passed only because they returned literal dicts. 6. lru_cache.get_or_set: collapse two return-on-hit branches into a single locked block. Same semantics, less duplication. 7. text.py: keep monolithic per Sprint 0 Q3 decision, but add a "Function groups" section to the module docstring listing the six concerns (slugify, HTML, truncation, excerpt, whitespace, humanize, path display) so readers can navigate. Tests: update `test_tracks.py` and `test_menu_dev_dropdown_autodoc.py` fixtures to use real `DotDict` instead of `MagicMock`. The MagicMock default `__contains__` returns False, which masked the menu isinstance bug (the old test was green because the lookup branch never executed). Aligning fixtures with production runtime types both surfaced the bug and prevents future drift. Verification: - targeted unit tests: 153 passed (utils + dotdict) - menu + tracks suites: 183 passed - full suite: 11924 passed (4 pre-existing failures unrelated, verified on parent commit via stash) - lint-imports: 2 contracts kept, 0 broken - ty: 2204 diagnostics (improved from 2207 floor) Depends on #238 (Sprint 0 design doc). Both branches touch `text.py` docstring — rebase if #238 lands first. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…bundled micro-opts
Sprint 4 of plan/foundation-leaf-hygiene.md. Three items:
1. async_compat.py: replace the manual `_uvloop_checked` /
`_uvloop_module` global state with `@functools.cache` on
`_get_uvloop`. Eliminates the check-then-act race that surfaces
under Python 3.14t free-threading (two threads could both enter
the import branch and double-pay the ~200ms uvloop import) and
removes the global mutable state entirely.
2. thread_local.py: extract a `_cache_key` helper so the
`_cache_{name}_{key or 'default'}` f-string lives in one place
instead of being re-built in `get`, `clear`, and `clear_all`.
`clear_all` keeps its own prefix binding since it slices, not
matches, the key.
3. text.py: switch `truncate_words` from `text.split()` to
`text.split(maxsplit=word_count)`. The old call parsed every
whitespace-separated token in the input even when the caller
only wanted the first N. With `maxsplit`, we stop after N+1
parts (first N words + remainder), which is the actual fix for
the "double-traversal of long markdown bodies" the audit flagged
in `generate_excerpt`. Output is identical — the first N words
are the same with or without `maxsplit`, and the
`len(parts) <= word_count` short-circuit still recognizes the
"no truncation needed" case correctly.
Verification:
- targeted unit tests: 148 passed (text + thread_local)
- full suite: 11911 passed (9 failures, all confirmed pre-existing
or flaky-under-parallel — verified by stashing and re-running on
parent commit)
- lint-imports: 2 contracts kept, 0 broken
- ty: 2204 diagnostics (matches Sprint 2 floor)
Depends on #239.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This was referenced Apr 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Sprint 4 of
plan/foundation-leaf-hygiene.md— three small, behavior-preserving changes targeting Python 3.14t free-threading hazards plus a perf micro-opt that lost its bundling vehicle when Sprint 3 was skipped.Depends on #239. No file overlap with #239 — should rebase cleanly either order.
What changed
async_compat._get_uvloop— replace the_uvloop_checked/_uvloop_moduleglobal state with@functools.cache. Eliminates the check-then-act race that surfaces under free-threading (two threads could both enter the import branch and double-pay the ~200msimport uvloop) and removes mutable module globals.ThreadLocalCache._cache_key— extract a small private helper so the_cache_{name}_{key or 'default'}f-string lives in one place instead of being re-built inget,clear, andclear_all. Closes the audit's "builds cache_key 3×" finding.truncate_words— switch fromtext.split()totext.split(maxsplit=word_count). The old call walked every whitespace token in the input even when callers (notablygenerate_excerpt) only wanted the first N. Withmaxsplit, parsing stops after N+1 parts. Output is identical; saves an unbounded scan on long markdown bodies. This is the "fused traversal" the plan called for ingenerate_excerpt— done at the helper level so the callsite stays a clean composition.What didn't change
generate_excerptitself stays a 2-line composition ofstrip_html+truncate_words. The fusion happens insidetruncate_wordsso all callers benefit and there's no duplicated logic.Test plan
lint-imports: 2 contracts kept, 0 brokenty: 2204 diagnostics (matches Sprint 2 floor — no regression)🤖 Generated with Claude Code