Conversation
#717) PR #716 had to patch two version boundaries by hand because the housekeeping script's regex only matched the bolded-header patterns: - docs/archive/CHANGELOG-HISTORIC.md prose header "Older changelog entries for MCP Memory Service (vX.Y.Z and earlier)." - README.md "Full version history" footer's markdown-link text "[Older versions (vX.Y.Z and earlier)](docs/archive/CHANGELOG-HISTORIC.md)" Added two sibling helpers — update_archive_header_boundary() and update_readme_footer_boundary() — that target each format specifically, and wired them into run() at the points where the archive header and README are already being rewritten. Both target the newest-archived version, which is the correct boundary for a reader following the link. Tests: 7/7 in tests/maintenance/test_changelog_housekeeping.py cover: boundary rewriting for each format, idempotence, no-op on unrelated prose, regression check that the existing update_header_range() continues to match bolded headers. Closes #717 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces two new helper functions, update_archive_header_boundary and update_readme_footer_boundary, to the changelog housekeeping script to automate the updating of version boundaries in the archive documentation and README footer. It also includes a comprehensive suite of unit tests to verify these changes. The review feedback suggests improving the robustness of the regex patterns in both new functions by making the version boundary optional, which allows the script to correctly insert the version information even if it was previously missing or formatted differently.
| pattern = re.compile( | ||
| r"(Older changelog entries for MCP Memory Service )\(v[\d.]+ and earlier\)" | ||
| ) | ||
| return pattern.sub( | ||
| rf"\1(v{newest_archived} and earlier)", | ||
| archive_header, | ||
| ) |
There was a problem hiding this comment.
The regex currently requires the version boundary (vX.Y.Z and earlier) to be present for a match to occur. Making this part optional and using a more flexible whitespace match ensures the helper can correctly insert the boundary even if it was previously missing or formatted differently in the archive header.
| pattern = re.compile( | |
| r"(Older changelog entries for MCP Memory Service )\(v[\d.]+ and earlier\)" | |
| ) | |
| return pattern.sub( | |
| rf"\1(v{newest_archived} and earlier)", | |
| archive_header, | |
| ) | |
| pattern = re.compile( | |
| r"(Older changelog entries for MCP Memory Service)\s*(?:\(v[\d.]+ and earlier\))?" | |
| ) | |
| return pattern.sub( | |
| rf"\1 (v{newest_archived} and earlier)", | |
| archive_header, | |
| ) |
| pattern = re.compile( | ||
| r"(\[Older versions )\(v[\d.]+ and earlier\)(\]\([^)]+\))" | ||
| ) | ||
| return pattern.sub( | ||
| rf"\1(v{newest_archived} and earlier)\2", | ||
| text, | ||
| ) |
There was a problem hiding this comment.
The current regex requires the version boundary to already exist in the link text. However, the trim_readme_previous_releases function (at line 244) uses a fallback footer template that lacks this boundary. If the script ever has to recreate a missing footer, this helper will fail to update it with the correct version. Making the version part of the regex optional ensures the boundary is added correctly in all cases.
| pattern = re.compile( | |
| r"(\[Older versions )\(v[\d.]+ and earlier\)(\]\([^)]+\))" | |
| ) | |
| return pattern.sub( | |
| rf"\1(v{newest_archived} and earlier)\2", | |
| text, | |
| ) | |
| pattern = re.compile( | |
| r"(\[Older versions)\s*(?:\(v[\d.]+ and earlier\))?(\]\([^)]+\))" | |
| ) | |
| return pattern.sub( | |
| rf"\1 (v{newest_archived} and earlier)\2", | |
| text, | |
| ) |
xkonjin
left a comment
There was a problem hiding this comment.
Code Review
General
Tidy maintenance fix. The regex helpers are well-scoped, the tests are thorough, and the idempotency coverage is appreciated.
Bugs
- Regex precision on
update_archive_header_boundary: The pattern hardcodesMCP Memory Service. If the project name is ever rebranded or used as a template, this will silently no-op. Given it's a repo-specific maintenance script, that's acceptable, but consider adding a quick assertion in the test suite that at least one substitution occurs when running against the real file. - Regex greediness on
update_readme_footer_boundary: The pattern[Older versions )(v[\d.]+ and earlier)(\]([^)]+))is safe here because the link text is tightly bounded, but be aware that[^)]+in the URL group will break if the URL ever contains a closing paren (e.g., query strings with)). Unlikely in this repo, so not a blocker.
Testing
- Real-file validation: The dry-run verification is good, but consider a CI step that runs
python scripts/maintenance/changelog_housekeeping.py --dry-runon PRs touching the script so regressions are caught before merge. - 7/7 tests passing: Coverage looks solid. The no-op cases (unrelated prose, idempotency) are the most important guardrails for a script that mutates documentation.
Minor
- The docstrings and comments clearly explain why each helper exists. Keep that standard.
Overall: LGTM. Low risk, high maintenance-value fix.
Gemini review on PR #718 pointed out that the boundary regexes would fail if the existing text lacked a "(vX.Y.Z and earlier)" suffix — relevant because trim_readme_previous_releases() recreates a fallback footer without that suffix when the section would otherwise vanish. Made the version-range group optional in both regexes and moved the leading space into the replacement so the helpers now *insert* the boundary when it's missing instead of silently no-op'ing. Added two regression tests for the insert-when-missing path. 9/9 passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Iter 2 (ea16745): Gemini medium findings fixed. Both boundary regexes now make the Two regression tests added for the insert-when-missing path. 9/9 tests passing. /gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces two helper functions, update_archive_header_boundary and update_readme_footer_boundary, to automate version boundary updates in the historic changelog and README footer, supported by a new unit test suite. The review feedback highlights that the regex patterns used for version matching are too restrictive, failing on pre-release versions and potentially causing string duplication; suggestions were provided to use more inclusive patterns.
| # Boundary suffix is optional so the helper can also insert it when missing | ||
| # (e.g. a recreated fallback header without the version range). | ||
| pattern = re.compile( | ||
| r"(Older changelog entries for MCP Memory Service)(?:\s*\(v[\d.]+ and earlier\))?" |
There was a problem hiding this comment.
The regex pattern for matching the version boundary is somewhat restrictive and potentially buggy. Using [\d.]+ will fail to match versions with pre-release identifiers (e.g., v1.0.0-beta). More importantly, because the version group is optional and lacks a trailing anchor, if the existing version string doesn't match the pattern exactly, the regex will match only the prefix and append the new version, resulting in a duplicated version string (e.g., ... (v10.36.3 and earlier) (v1.0.0-beta and earlier)).
| r"(Older changelog entries for MCP Memory Service)(?:\s*\(v[\d.]+ and earlier\))?" | |
| r"(Older changelog entries for MCP Memory Service)(?:\s*\(v[^)]+ and earlier\))?" |
| # fallback footer that trim_readme_previous_releases recreates when the | ||
| # "Previous Releases" section ends up bare. | ||
| pattern = re.compile( | ||
| r"(\[Older versions)(?:\s*\(v[\d.]+ and earlier\))?(\]\([^)]+\))" |
There was a problem hiding this comment.
Similar to the archive header helper, using [\d.]+ to match the version string is restrictive and will fail for versions containing non-numeric characters (like -rc1 or -beta). While the trailing anchor ()( [^)]+)) prevents the "double version" bug here, it would still cause the update to fail (no-op) if the version format deviates from simple digits and dots.
| r"(\[Older versions)(?:\s*\(v[\d.]+ and earlier\))?(\]\([^)]+\))" | |
| r"(\[Older versions)(?:\s*\(v[^)]+ and earlier\))?(\]\([^)]+\))" |
…on bug Gemini iter 2 on PR #718 caught two defensive-coding gaps: 1. Version group used `[\d.]+`, rejecting pre-release / build-metadata identifiers like 1.0.0-beta, 1.0.0-rc.1, 1.0.0+build.5. Broadened to `[\w.+-]+` so the helpers don't no-op when users follow the full SemVer spec. 2. The archive-header helper had no trailing anchor for the optional version group. If a pre-release version slipped past the narrow `[\d.]+` class, the prefix would still match while the boundary group didn't — producing a duplicate "(vX and earlier) (vY and earlier)." chain. Added the required trailing period as the anchor: the boundary is always followed by `.` in the archive header, so this pins the match to a complete boundary (or forces no-op). The README helper already has a strong anchor (the markdown-link URL part), so only the version class needed broadening there. Tests (12/12): added three regression cases — pre-release rewrite for each helper, and a "no double-version on unparseable boundary" check. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Iter 3 (2fbe946) — final iteration per plan. Both Gemini medium findings addressed:
3 new regression tests: pre-release rewrite for each helper + "no double-version on unparseable boundary" guard. 12/12 passing. |
Summary
Closes #717. PR #716 had to patch two version-boundary strings by hand because `scripts/maintenance/changelog_housekeeping.py` only matched bolded-header patterns. This PR teaches the script to update both remaining formats automatically.
Fixes
Two new helpers alongside the existing `update_header_range()`:
```
Older changelog entries for MCP Memory Service (vX.Y.Z and earlier).
```
```markdown
... | Older versions (vX.Y.Z and earlier) | ...
```
Both wired into `run()` at the points where the archive header and README are already being rewritten. Both use `newest_archived` (the oldest version not kept in CHANGELOG.md) — that's the correct boundary for someone following the "Older versions" link.
Tests
`tests/maintenance/test_changelog_housekeeping.py` — 7/7 passing:
Verification
`python scripts/maintenance/changelog_housekeeping.py --dry-run` against current main: correctly reports "No housekeeping needed" (90 lines / 7 entries, below thresholds). Next rotation won't need manual Gemini follow-up for these footers.
Changes
🤖 Generated with Claude Code