
feat: gstack integration + repo health audit #35

Merged
jeremylongshore merged 3 commits into main from feat/gstack-integration
Mar 18, 2026

Conversation


@jeremylongshore commented Mar 18, 2026

Summary

  • Phase 0 (Repo Health): Fix vitest version mismatch (root ^3.0.8 → ^4.0.0), refresh CLAUDE.md (v1.0.9 → v2.0.0; add new commands, structure, and gstack docs)
  • Phases 1-4 (gstack Integration): Template build system (gen-skill-docs.ts), 5 Claude Code skills (review-docs, review-architecture, ship, document-release, qa-docs), quality test infrastructure (170 tests), LLM-judge eval helper, review checklist, ACKNOWLEDGMENTS.md

Verification

  • make verify → 22 templates ✅
  • npx tsx scripts/gen-skill-docs.ts --dry-run → exits 0 ✅
  • npx vitest run test/skill-validation.test.ts → 37 tests pass ✅
  • npx vitest run test/doc-quality-eval.test.ts → 133 tests pass ✅
  • npm audit → 0 vulnerabilities ✅

Test plan

  • CI passes (skill validation + template freshness)
  • make gen-skills regenerates cleanly
  • Skills invocable via Claude Code (/review-docs, /ship, etc.)
  • LLM-judge tests pass with API key (npm run test:evals)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added a comprehensive skill system for doc workflows (release sync, QA audits, architecture & review skills), plus guided project scaffolding with 22 templates and a docs-generation utility.
  • Tests

    • New validation and quality test suites, including optional LLM-based judging and helper utilities for doc scoring.
  • Chores

    • CI updated to validate skills and template freshness; new build/test targets and scripts added.
  • Documentation

    • ACKNOWLEDGMENTS, README updates, checklists, and workflow guidance added.

Phase 0 (Repo Health):
- Fix vitest version mismatch: root ^3.0.8 → ^4.0.0 to match packages
- Refresh CLAUDE.md: version v1.0.9 → v2.0.0, add new commands, structure, gstack section

Phase 1 (Template Build System):
- Add gen-skill-docs.ts: .md.tmpl → .md pipeline with {{PLACEHOLDER}} resolution
- Add shared blocks: preamble.md, template-list.md
- Convert new-project.md to template-based generation
- Add Makefile targets: gen-skills, skill-check
- Add CI steps: skill validation, template freshness check
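
The .md.tmpl → .md pipeline described above can be sketched roughly as follows. This is a minimal illustration; the function names, error handling, and the DATE token format are assumptions, not the actual gen-skill-docs.ts implementation:

```typescript
import * as fs from "node:fs";

// Resolve {{PLACEHOLDER}} tokens against shared blocks such as
// commands/shared/preamble.md and commands/shared/template-list.md.
// {{DATE}} is resolved to today's date; any other unknown token is an error.
function resolveTemplate(tmpl: string, blocks: Record<string, string>): string {
  return tmpl.replace(/\{\{([A-Z_]+)\}\}/g, (match, token: string) => {
    if (token === "DATE") return new Date().toISOString().slice(0, 10);
    const block = blocks[token];
    if (block === undefined) {
      throw new Error(`Unresolved placeholder: ${match}`);
    }
    return block.trim();
  });
}

// In --dry-run mode, compare regenerated output against the checked-in .md
// and report staleness (mirroring the template freshness CI check).
function isStale(tmplPath: string, blocks: Record<string, string>): boolean {
  const generated = resolveTemplate(fs.readFileSync(tmplPath, "utf8"), blocks);
  const outPath = tmplPath.replace(/\.md\.tmpl$/, ".md");
  return !fs.existsSync(outPath) || fs.readFileSync(outPath, "utf8") !== generated;
}
```

A dry run would call isStale for each .md.tmpl and exit non-zero when any generated file is out of date, which is what lets CI catch uncommitted regenerations.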

Phase 2 (Slash Command Skills):
- Add 5 Claude Code skills: review-docs, review-architecture, ship, document-release, qa-docs
- Each with SKILL.md (YAML frontmatter) and evals/evals.json
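
As an illustration, a SKILL.md might open with frontmatter like the following. The field values are assumptions based on typical Claude Code skill files, not copied from this PR:

```
---
name: review-docs
description: Review generated documentation against the quality checklist and report findings by severity.
---

# review-docs

## Prerequisites
...
```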

Phase 3 (Quality Infrastructure):
- Add vitest config and test suites: skill-validation (37 tests), doc-quality-eval (133 tests)
- Add LLM-judge helper (Anthropic SDK, optional paid tier)
- Add review/doc-checklist.md rubric
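
The retry and JSON-parsing pieces of an LLM-judge helper might look like this. Names here are illustrative only; the real test/helpers/llm-judge.ts wraps the Anthropic SDK and may differ:

```typescript
// Retry an async call (e.g. an Anthropic API request) with exponential
// backoff, rethrowing the last error once attempts are exhausted.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Back off before the next attempt: 500ms, 1s, 2s, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastErr;
}

// Models often wrap JSON in prose or code fences; pull out the first
// {...} span and parse it rather than assuming a bare JSON response.
function extractJson(text: string): unknown {
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) throw new Error("No JSON object found in model output");
  return JSON.parse(match[0]);
}
```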

Phase 4 (Attribution):
- Add ACKNOWLEDGMENTS.md crediting gstack (MIT → Apache 2.0)
- Add acknowledgments section to README.md

All 170 tests passing. Zero npm audit vulnerabilities.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@greptile-apps Bot left a comment


Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@coderabbitai

coderabbitai Bot commented Mar 18, 2026

Warning

Rate limit exceeded

@jeremylongshore has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 47 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d765eb8b-91e5-47c9-869a-b1c178122b51

📥 Commits

Reviewing files that changed from the base of the PR and between e545e6d and 0e0acf4.

📒 Files selected for processing (1)
  • test/helpers/llm-judge.ts
📝 Walkthrough

Walkthrough

Adds a skill-driven documentation system: six new skill docs and their evals, template generation tooling and script, 22 templates with shared blocks, validation tests and LLM-judge helpers, Make/npm/vitest CI hooks, README/CLAUDE updates, and acknowledgments for adapted gstack patterns.

Changes

Cohort / File(s) Summary
Skill definitions
.claude/skills/document-release/SKILL.md, .claude/skills/qa-docs/SKILL.md, .claude/skills/review-architecture/SKILL.md, .claude/skills/review-docs/SKILL.md, .claude/skills/ship/SKILL.md
Adds six skill documentation files describing workflows, prerequisites, step-by-step procedures, outputs, edge cases, and error handling for doc release, QA, architecture review, docs review, and shipping.
Skill eval configs
.claude/skills/.../evals/evals.json
Adds evals JSON for each skill (post-ship sync, full audits, negative tests, etc.) to validate prompts and assertions across scenarios.
Template files & shared blocks
commands/new-project.md.tmpl, commands/new-project.md, commands/shared/preamble.md, commands/shared/template-list.md
Introduces a new project template generator, preamble for base-branch detection, and a shared template list enumerating 22 templates grouped by scope (MVP/Standard/Comprehensive).
Template generation script
05-Scripts/gen-skill-docs.ts
New TypeScript script to resolve .md.tmpl templates into .md, with token resolvers, dry-run stale detection, verbose logging, and exit-code behavior.
Tests & LLM helpers
test/skill-validation.test.ts, test/doc-quality-eval.test.ts, test/helpers/llm-judge.ts
Adds skill validation and template freshness tests, static template checks (expects 22 templates), optional Anthropic LLM-judge tests, and LLM helper module with retry/JSON parsing helpers.
Review/checklist references
review/doc-checklist.md, .claude/skills/review-docs/references/doc-checklist.md
Adds document-quality checklist and reporting rubric used by review skills (two-pass process, scoring, severity classifications).
Build, scripts, CI
Makefile, package.json, .github/workflows/ci.yml, vitest.config.ts
Adds Make targets (gen-skills, skill-check), npm scripts (gen:skill-docs, skill:check, test:skills, test:docs, test:evals), dev deps (@anthropic-ai/sdk, tsx, vitest), two CI steps (validate-skills, check-template-freshness), and vitest server dep inline tweak.
Project docs & acknowledgments
CLAUDE.md, README.md, ACKNOWLEDGMENTS.md
Updates CLAUDE.md with template build system, commands, and gstack integration notes; adds README acknowledgments and a new ACKNOWLEDGMENTS.md crediting adapted gstack patterns.
Command content updates
commands/new-project.md
Adds regenerated documentation for new-project command, including preamble, base-branch logic, template availability, and error handling to match new template flow.
Docs checklist reference
review/doc-checklist.md
Adds a document quality review checklist with severity matrix and scoring rubric used across skills and tests.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hopped through templates, tokens in paw,

Resolved the preamble, fixed a branch flaw,
Skills and evals now line up in rows,
Tests hum softly where the gen-script goes,
A little rabbit cheers: docs bloom and grow! 🌱

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 68.75%, below the required 80.00% threshold. | Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check | ❓ Inconclusive | The PR description provides a detailed summary of changes across phases, verification steps with pass/fail status, and a test plan, but does not follow the required template structure with explicit Enterprise Checklist items. | Align the description with the template: add the Enterprise Checklist section with explicit checkboxes, include screenshots or logs of verification, and add a Notes for Reviewers section.
✅ Passed checks (1 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The PR title clearly summarizes the main change (adding gstack integration and establishing repo health audit infrastructure), which aligns with the substantial additions across skills, templates, and testing.


@qodo-code-review

Review Summary by Qodo

Integrate gstack patterns with template build system and 5 Claude Code skills

✨ Enhancement 🧪 Tests


Walkthroughs

Description
• Integrate gstack patterns: template build system, 5 Claude Code skills, quality infrastructure
• Add gen-skill-docs.ts pipeline: resolve {{PLACEHOLDER}} tokens from shared blocks
• Implement 5 skills: review-docs, review-architecture, ship, document-release, qa-docs
• Add 170 tests: skill validation (37), doc quality eval (133), LLM-judge helpers
• Update CLAUDE.md to v2.0.0, add gstack acknowledgments, refresh CI/Makefile
Diagram
```mermaid
flowchart LR
  A["Template Sources<br/>.md.tmpl files"] -->|"gen-skill-docs.ts<br/>resolve placeholders"| B["Generated Docs<br/>.md files"]
  C["Shared Blocks<br/>preamble, template-list"] -->|"{{PLACEHOLDER}}<br/>resolution"| B
  B -->|"skill validation<br/>37 tests"| D["Skill Validation"]
  B -->|"doc quality eval<br/>133 tests"| E["Quality Tests"]
  F["5 Claude Skills<br/>review-docs, ship, etc"] -->|"evals.json<br/>test cases"| E
  D -->|"CI checks"| G["GitHub Actions"]
  E -->|"LLM-judge<br/>optional"| G
```


File Changes

1. 05-Scripts/gen-skill-docs.ts ✨ Enhancement +512/-0

Template build system with placeholder resolution

05-Scripts/gen-skill-docs.ts


2. test/skill-validation.test.ts 🧪 Tests +104/-0

Skill file structural validation tests

test/skill-validation.test.ts


3. test/doc-quality-eval.test.ts 🧪 Tests +92/-0

Template quality evaluation with LLM-judge

test/doc-quality-eval.test.ts


4. test/helpers/llm-judge.ts 🧪 Tests +126/-0

LLM-as-judge scoring helpers for doc evaluation

test/helpers/llm-judge.ts


5. vitest.config.ts ⚙️ Configuration changes +9/-0

Vitest configuration for test suite

vitest.config.ts


6. .claude/skills/review-docs/SKILL.md ✨ Enhancement +183/-0

Document quality review skill with 8-category rubric

.claude/skills/review-docs/SKILL.md


7. .claude/skills/review-docs/evals/evals.json 🧪 Tests +5/-0

Evaluation test cases for review-docs skill

.claude/skills/review-docs/evals/evals.json


8. .claude/skills/review-docs/references/doc-checklist.md 📝 Documentation +122/-0

Document quality review checklist rubric

.claude/skills/review-docs/references/doc-checklist.md


9. .claude/skills/review-architecture/SKILL.md ✨ Enhancement +172/-0

Architecture review skill with 15 cognitive patterns

.claude/skills/review-architecture/SKILL.md


10. .claude/skills/review-architecture/evals/evals.json 🧪 Tests +5/-0

Evaluation test cases for review-architecture skill

.claude/skills/review-architecture/evals/evals.json


11. .claude/skills/ship/SKILL.md ✨ Enhancement +206/-0

Release workflow skill: validate, bump, changelog, PR

.claude/skills/ship/SKILL.md


12. .claude/skills/ship/evals/evals.json 🧪 Tests +5/-0

Evaluation test cases for ship skill

.claude/skills/ship/evals/evals.json


13. .claude/skills/document-release/SKILL.md ✨ Enhancement +177/-0

Post-ship documentation sync skill

.claude/skills/document-release/SKILL.md


14. .claude/skills/document-release/evals/evals.json 🧪 Tests +5/-0

Evaluation test cases for document-release skill

.claude/skills/document-release/evals/evals.json


15. .claude/skills/qa-docs/SKILL.md ✨ Enhancement +174/-0

Report-only documentation quality audit skill

.claude/skills/qa-docs/SKILL.md


16. .claude/skills/qa-docs/evals/evals.json 🧪 Tests +5/-0

Evaluation test cases for qa-docs skill

.claude/skills/qa-docs/evals/evals.json


17. commands/new-project.md.tmpl ✨ Enhancement +72/-0

Template source for new-project command

commands/new-project.md.tmpl


18. commands/new-project.md ✨ Enhancement +60/-0

Generated new-project command documentation

commands/new-project.md


19. commands/shared/preamble.md ✨ Enhancement +23/-0

Shared template block for branch detection

commands/shared/preamble.md


20. commands/shared/template-list.md ✨ Enhancement +31/-0

Shared template block listing all 22 templates

commands/shared/template-list.md


21. ACKNOWLEDGMENTS.md 📝 Documentation +31/-0

Attribution for gstack-adapted patterns and code

ACKNOWLEDGMENTS.md


22. CLAUDE.md 📝 Documentation +43/-2

Update to v2.0.0, add gstack section and new commands

CLAUDE.md


23. Makefile ⚙️ Configuration changes +7/-1

Add gen-skills and skill-check targets

Makefile


24. README.md 📝 Documentation +6/-0

Add acknowledgments section for gstack

README.md


25. package.json Dependencies +10/-2

Add test scripts, dependencies, version bump to 2.0.0

package.json


26. review/doc-checklist.md 📝 Documentation +122/-0

Document quality review checklist rubric

review/doc-checklist.md


27. .github/workflows/ci.yml ⚙️ Configuration changes +6/-0

Add skill validation and template freshness checks

.github/workflows/ci.yml



@qodo-code-review

qodo-code-review Bot commented Mar 18, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (1) 📎 Requirement gaps (0)



Action required

1. review/doc-checklist.md outside 01-Docs/ 📘 Rule violation ✓ Correctness
Description
A new documentation file was added under review/ instead of 01-Docs/, and its filename does not
follow the required NNN-abv-description.ext convention. This violates the repo documentation
storage and naming requirements.
Code

review/doc-checklist.md[R1-5]

+# Document Quality Review Checklist
+
+## Instructions
+
+Review generated documentation for the issues listed below. Be specific — cite the document and section. Skip anything that's fine. Only flag real problems.
Evidence
PR Compliance ID 2 requires new documentation to be stored under 01-Docs/ and named using
NNN-abv-description.ext. This PR adds review/doc-checklist.md, which is outside 01-Docs/ and
lacks the required numeric prefix and naming structure.

CLAUDE.md
review/doc-checklist.md[1-5]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A new documentation file was added outside `01-Docs/` and does not follow the required `NNN-abv-description.ext` naming convention.
## Issue Context
Compliance requires all newly added documentation to be stored in `01-Docs/` and named per `.directory-standards.md`.
## Fix Focus Areas
- review/doc-checklist.md[1-122]
- .claude/skills/review-docs/SKILL.md[35-38]
- CLAUDE.md[66-66]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Broken script path refs 🐞 Bug ✓ Correctness
Description
CI and developer entrypoints invoke scripts under a non-existent scripts/ path (e.g.,
scripts/gen-skill-docs.ts, ./scripts/verify-templates.sh), but the repository contains these
under 05-Scripts/, causing CI failures and broken Make/npm commands.
Code

package.json[R14-20]

+    "test:skills": "npx vitest run test/skill-validation.test.ts",
+    "test:docs": "npx vitest run test/doc-quality-eval.test.ts",
+    "test:evals": "ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY} npx vitest run test/doc-quality-eval.test.ts",
 "clean": "turbo run clean",
-    "verify": "./scripts/verify-templates.sh"
+    "verify": "./scripts/verify-templates.sh",
+    "gen:skill-docs": "npx tsx scripts/gen-skill-docs.ts",
+    "skill:check": "npx tsx scripts/gen-skill-docs.ts --dry-run"
Evidence
package.json, Makefile, and CI call scripts/..., but the generator and verify scripts are
actually located under 05-Scripts/, so npm run skill:check (and other commands) will fail to
locate the referenced files.

package.json[14-20]
Makefile[3-23]
.github/workflows/ci.yml[35-40]
05-Scripts/gen-skill-docs.ts[18-26]
05-Scripts/verify-templates.sh[1-12]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
CI and local commands reference `scripts/...` paths that do not exist in the repo; the actual scripts live under `05-Scripts/`. This breaks `npm run skill:check`, `npm run gen:skill-docs`, `make verify`, and related workflows.
## Issue Context
The repo currently contains `05-Scripts/gen-skill-docs.ts` and `05-Scripts/verify-templates.sh`, while CI/Makefile/package scripts invoke `scripts/gen-skill-docs.ts` and `./scripts/verify-templates.sh`.
## Fix Focus Areas
- package.json[14-20]
- Makefile[3-23]
- .github/workflows/ci.yml[35-40]
- 05-Scripts/gen-skill-docs.ts[18-26]




@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the project's capabilities by integrating advanced engineering workflow patterns adapted from gstack. The changes introduce a robust build system for managing documentation templates, alongside a suite of new AI-powered skills designed to automate and improve documentation quality, review processes, and release management. These additions aim to streamline development workflows and ensure higher standards for generated documentation, while also addressing minor repository health updates.

Highlights

  • gstack Integration: Integrated key engineering workflow patterns from gstack, including a template build system, five new Claude Code skills for documentation review and release, and LLM-judge evaluation infrastructure.
  • New Claude Code Skills: Added five new Claude Code skills: document-release for post-ship documentation sync, qa-docs for report-only documentation quality audits, review-architecture for architecture documentation review, review-docs for general documentation quality review, and ship for automating the release workflow.
  • Documentation Build System: Introduced a new template build system (gen-skill-docs.ts) that processes .md.tmpl files, resolves placeholders from shared blocks, and generates .md files for skills and commands.
  • Enhanced Testing Infrastructure: Implemented new testing suites for skill validation, static documentation quality checks, and LLM-judge evaluations, improving the robustness of the documentation generation process.
  • Repository Health Audit: Updated the Vitest dependency to version 4.0.0 and refreshed CLAUDE.md to reflect new commands, directory structure, and the gstack integration details.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/ci.yml


@gemini-code-assist Bot left a comment


Code Review

This is an impressive and substantial pull request that integrates gstack patterns, adding a new skill templating system, several powerful new skills, and a robust testing infrastructure. The changes significantly enhance the repository's capabilities. My review has identified a few critical issues that need to be addressed. Also, please note there's a file path inconsistency: the new build script is added as 05-Scripts/gen-skill-docs.ts, but all references in Makefile, package.json, and documentation point to scripts/gen-skill-docs.ts. This should be corrected for consistency.

Comment thread commands/new-project.md.tmpl Outdated
Comment on lines +23 to +24
- Ensure repo path:
cd ~/ai-dev

critical

The command cd ~/ai-dev is hardcoded. This assumes a specific directory structure on the user's machine which is unlikely to be correct for all users, and will cause the script to fail. Since scripts like this are typically run from the project root, this directory change is likely unnecessary and unsafe. Please remove this hardcoded path.

Comment thread test/helpers/llm-judge.ts Outdated
const client = new Anthropic();

const makeRequest = () => client.messages.create({
model: 'claude-sonnet-4-6',

critical

The model name claude-sonnet-4-6 is not a valid Anthropic model identifier. This will cause all LLM-judge API calls to fail. Please replace it with a valid model name. Given the recent model releases, claude-3.5-sonnet-20240620 would be a good choice.

Suggested change
model: 'claude-sonnet-4-6',
model: 'claude-3.5-sonnet-20240620',

@@ -0,0 +1,122 @@
# Document Quality Review Checklist

medium

This file appears to be a duplicate of review/doc-checklist.md, which is also added in this pull request. Maintaining duplicate files can lead to inconsistencies and increases the maintenance burden, as any changes would need to be manually synchronized. It would be better to establish a single source of truth for this checklist. The review-docs skill already mentions both paths as possible locations, so you could remove one and update the skill's logic to only use the single canonical path.

Comment thread review/doc-checklist.md
Comment on lines +1 to +5
# Document Quality Review Checklist

## Instructions

Review generated documentation for the issues listed below. Be specific — cite the document and section. Skip anything that's fine. Only flag real problems.

Action required

1. review/doc-checklist.md outside 01-docs/ 📘 Rule violation ✓ Correctness

A new documentation file was added under review/ instead of 01-Docs/, and its filename does not
follow the required NNN-abv-description.ext convention. This violates the repo documentation
storage and naming requirements.
Agent Prompt
## Issue description
A new documentation file was added outside `01-Docs/` and does not follow the required `NNN-abv-description.ext` naming convention.

## Issue Context
Compliance requires all newly added documentation to be stored in `01-Docs/` and named per `.directory-standards.md`.

## Fix Focus Areas
- review/doc-checklist.md[1-122]
- .claude/skills/review-docs/SKILL.md[35-38]
- CLAUDE.md[66-66]


Comment thread package.json
Comment on lines +14 to +20
"test:skills": "npx vitest run test/skill-validation.test.ts",
"test:docs": "npx vitest run test/doc-quality-eval.test.ts",
"test:evals": "ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY} npx vitest run test/doc-quality-eval.test.ts",
"clean": "turbo run clean",
"verify": "./scripts/verify-templates.sh"
"verify": "./scripts/verify-templates.sh",
"gen:skill-docs": "npx tsx scripts/gen-skill-docs.ts",
"skill:check": "npx tsx scripts/gen-skill-docs.ts --dry-run"

Action required

2. Broken script path refs 🐞 Bug ✓ Correctness

CI and developer entrypoints invoke scripts under a non-existent scripts/ path (e.g.,
scripts/gen-skill-docs.ts, ./scripts/verify-templates.sh), but the repository contains these
under 05-Scripts/, causing CI failures and broken Make/npm commands.
Agent Prompt
## Issue description
CI and local commands reference `scripts/...` paths that do not exist in the repo; the actual scripts live under `05-Scripts/`. This breaks `npm run skill:check`, `npm run gen:skill-docs`, `make verify`, and related workflows.

## Issue Context
The repo currently contains `05-Scripts/gen-skill-docs.ts` and `05-Scripts/verify-templates.sh`, while CI/Makefile/package scripts invoke `scripts/gen-skill-docs.ts` and `./scripts/verify-templates.sh`.

## Fix Focus Areas
- package.json[14-20]
- Makefile[3-23]
- .github/workflows/ci.yml[35-40]
- 05-Scripts/gen-skill-docs.ts[18-26]



@coderabbitai Bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (11)
review/doc-checklist.md (1)

1-122: Consider single-sourcing this checklist to prevent drift.

This file and .claude/skills/review-docs/references/doc-checklist.md appear duplicated. Keeping one canonical source and generating/copying the second in CI would reduce divergence risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@review/doc-checklist.md` around lines 1 - 122, The checklist is duplicated
(the repo checklist and a second copy in the review-docs references folder);
deduplicate by choosing one canonical doc (keep the primary doc_checklist file),
delete the duplicate copy, and add a CI step to auto-generate or sync the
secondary copy from the canonical source; update references to point to the
canonical checklist and add a short README note in the canonical file indicating
it is the single source of truth.
05-Scripts/gen-skill-docs.ts (1)

193-223: Consolidate base-branch detection docs to one source of truth.

BASE_BRANCH_DETECT_CONTENT diverges from commands/shared/preamble.md (different detection order and fallbacks), which can produce conflicting operator guidance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@05-Scripts/gen-skill-docs.ts` around lines 193 - 223,
BASE_BRANCH_DETECT_CONTENT in gen-skill-docs.ts diverges from the canonical
base-branch detection in the shared preamble, causing conflicting guidance;
replace the hard-coded array with a single source of truth by importing or
loading the canonical base-branch detection snippet from the shared preamble
(the project's shared preamble base-branch block) and ensure the sequence
matches: PR base, remote HEAD (origin HEAD), then default candidates (main,
master, develop, trunk); update BASE_BRANCH_DETECT_CONTENT to reference that
shared snippet so both docs use the same logic.
.claude/skills/qa-docs/SKILL.md (1)

95-129: Consider adding language identifiers to output code blocks.

The output example blocks at lines 95, 136, and 141 lack language specifiers. Adding text would satisfy MD040 lint warnings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/qa-docs/SKILL.md around lines 95 - 129, The Markdown examples
in the "Documentation Quality Audit" section use fenced code blocks (``` ...
```) without language identifiers (e.g., the blocks showing the audit template
around the "Documentation Quality Audit" header and the examples at the same
sections referenced), which triggers MD040; edit those fenced code blocks (the
triple-backtick blocks shown in the diff) to include a language tag such as
```text (or a more specific language if appropriate) so each code fence begins
with ```text to satisfy the linter.
.claude/skills/review-architecture/SKILL.md (1)

95-129: Consider adding language identifiers to fenced code blocks.

The static analysis flagged these output example blocks (lines 95, 136, 141) as missing language specifiers. Adding text or plaintext would satisfy the linter and improve syntax highlighting consistency.

📝 Suggested fix
-```
+```text
 Architecture Review: N findings

Apply similarly to the other output blocks at lines 136 and 141.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/review-architecture/SKILL.md around lines 95 - 129, The
fenced output example blocks (the ones showing "Architecture Review: N findings"
and the other two examples) are missing language identifiers; update each
triple-backtick fence to include a language such as `text` or `plaintext` (e.g.,
change ``` to ```text) so the linter is satisfied and syntax highlighting is
consistent—look for the blocks containing "Architecture Review: N findings" and
the other similar output examples and add the language specifier to their
opening fences.
test/doc-quality-eval.test.ts (2)

29-34: Consider handling the case where no templates exist.

If the professional-templates/core directory doesn't exist, getTemplates() returns an empty array, and the test 'has 22 templates' will fail with a less informative message. Consider adding a more descriptive check.

💡 Suggested improvement
 describe('Template static validation (free tier)', () => {
   const templates = getTemplates();
 
+  it('templates directory exists', () => {
+    expect(fs.existsSync(TEMPLATES_DIR)).toBe(true);
+  });
+
   it('has 22 templates', () => {
     expect(templates.length).toBe(22);
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/doc-quality-eval.test.ts` around lines 29 - 34, The test assumes
getTemplates() returns 22 templates but gives an unhelpful failure if the
templates dir is missing; update the 'has 22 templates' test to first assert
templates is non-empty and, if empty, fail with a clear message (e.g., "No
templates found — ensure professional-templates/core exists"), then check the
exact count; reference getTemplates() and the templates variable in the test so
the failure distinguishes "no templates found" from "unexpected template count".

56-60: Verify the placeholder regex pattern.

The regex /\{\{(?!DATE\}\})[A-Z_]+\}\}/g uses a negative lookahead, but it may not correctly match all cases. For example, {{DATES}} would be flagged since it starts with "DATE" but isn't exactly {{DATE}}.

The current regex checks that the placeholder doesn't start with DATE}}, which may be the intended behavior. However, if the intent is to exclude only the exact {{DATE}} placeholder, consider this alternative:

🔧 Alternative regex if exact match is intended
-      it('has no unresolved non-DATE placeholders', () => {
-        // {{DATE}} is expected, but other {{PLACEHOLDER}} tokens should not be present
-        const otherPlaceholders = tmpl.content.match(/\{\{(?!DATE\}\})[A-Z_]+\}\}/g);
+      it('has no unresolved non-DATE placeholders', () => {
+        // {{DATE}} is expected, but other {{PLACEHOLDER}} tokens should not be present
+        const allPlaceholders = tmpl.content.match(/\{\{[A-Z_]+\}\}/g) || [];
+        const otherPlaceholders = allPlaceholders.filter(p => p !== '{{DATE}}');
+        expect(otherPlaceholders.length === 0 ? null : otherPlaceholders).toBeNull();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/doc-quality-eval.test.ts` around lines 56 - 60, The placeholder-check
regex in the "has no unresolved non-DATE placeholders" test (the
otherPlaceholders variable from tmpl.content.match) can misclassify tokens like
"{{DATES}}"; instead, capture all placeholder tokens first and then explicitly
exclude the exact "{{DATE}}" token (e.g., match all {{...}} placeholders via
tmpl.content.match and filter out the literal "{{DATE}}") before asserting none
remain; update the test logic around otherPlaceholders to perform that two-step
match-and-filter so only the exact DATE token is allowed.
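The two-step match-and-filter the prompt describes can be sketched as a small helper. This is illustrative only — `findUnresolvedPlaceholders` is a hypothetical name, not a function in the repo:

```typescript
// Hypothetical helper: collect all {{PLACEHOLDER}} tokens, then exclude
// only the exact {{DATE}} token, so e.g. {{DATES}} is still flagged.
function findUnresolvedPlaceholders(content: string): string[] {
  const all = content.match(/\{\{[A-Z_]+\}\}/g) ?? [];
  return all.filter((p) => p !== '{{DATE}}');
}
```

With this shape, `{{DATE}}` passes while both `{{DATES}}` and any other unresolved token are reported.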
.claude/skills/review-docs/SKILL.md (1)

98-121: Consider adding language identifiers to output code blocks.

The output example blocks at lines 98, 130, 135, 151, and 156 lack language specifiers. Adding `text` would satisfy MD040 lint warnings and maintain consistency across skill documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/review-docs/SKILL.md around lines 98 - 121, Update the
markdown code fences that show output example blocks (the triple-backtick blocks
in the "Document Quality Review" summary and the nearby example sections such as
the AUTO-FIXED / NEEDS INPUT examples) to include a language identifier of text
(e.g., ```text) so they satisfy the MD040 lint rule; scan for the unlabeled
triple-backtick blocks around the Document Quality Review header and replace
them with labeled fences to maintain consistency across the skill docs.
.claude/skills/ship/SKILL.md (1)

151-164: Consider adding language identifiers to output code blocks.

Similar to other SKILL files, the output example blocks at lines 151, 171, and 176 lack language specifiers. Adding `text` would satisfy MD040 lint warnings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/ship/SKILL.md around lines 151 - 164, Add the Markdown
language identifier "text" to the fenced code blocks that show the SHIP COMPLETE
output examples (the blocks containing "SHIP COMPLETE", "Version: X.Y.Z", etc.)
so they become triple-backtick fences like ```text to satisfy MD040; update each
example block noted in the comment (the ones around the SHIP COMPLETE output
examples) to include this identifier.
.claude/skills/document-release/SKILL.md (1)

129-137: Consider adding language identifiers to output code blocks.

The output example blocks at lines 129, 144, and 149 lack language specifiers. Adding `text` would satisfy MD040 lint warnings and maintain consistency with other skill documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/document-release/SKILL.md around lines 129 - 137, The fenced
output example code blocks shown (the triple-backtick blocks listing
"Documentation health:" entries) are missing a language identifier which
triggers MD040; update those fences to use a language tag (use `text`) for the
examples at the three output example blocks so they read ```text ... ``` to
satisfy the linter and match existing docs formatting.
test/skill-validation.test.ts (1)

83-88: Absolute-path detection is narrow for Windows paths

Line 87 only flags C:\ and common Unix home prefixes. Consider broadening to any drive-letter path and UNC paths to avoid false negatives in cross-platform repos.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/skill-validation.test.ts` around lines 83 - 88, The test "has no
absolute paths (uses ${CLAUDE_SKILL_DIR}/ instead)" currently only checks for
/home/, /Users/, and C:\ which misses other Windows drive-letter and UNC paths;
update the negative match on body (variable names: content, body, skillPath) to
use a broader regex that detects any Windows drive-letter paths (e.g. X:\) and
UNC paths (\\server\share) as well as general absolute Unix paths so the
expect(body).not.toMatch(...) fails for those cases too. Ensure the replacement
that strips attribution comments remains (body = content.replace(...)) and only
change the regex used in the expect assertion.
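A broader pattern along the lines the prompt asks for might look like this. It is a sketch, not the repo's actual regex, and the exact character classes may need tuning against real skill content:

```typescript
// Sketch: detect Unix home paths, any Windows drive-letter path (X:\),
// and UNC paths (\\server\share) in a single negative-match regex.
const ABS_PATH_RE = /(\/home\/|\/Users\/|[A-Za-z]:\\|\\\\[^\s\\]+\\)/;
```

In the test this would replace the narrower pattern inside `expect(body).not.toMatch(...)`.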
test/helpers/llm-judge.ts (1)

42-42: Consider making judge model configurable for future test flexibility

Line 42 hard-codes claude-sonnet-4-6, which is currently a supported and stable Anthropic model identifier. However, making the model configurable via process.env.ANTHROPIC_MODEL (with claude-sonnet-4-6 as the default) would allow test suites to easily adapt if model aliases change in the future or to test against different model versions across environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/helpers/llm-judge.ts` at line 42, Change the hard-coded model string to
read from an environment variable so tests can be configured: replace the
literal 'claude-sonnet-4-6' used for the model in test/helpers/llm-judge.ts with
process.env.ANTHROPIC_MODEL ?? 'claude-sonnet-4-6' (or equivalent) so the judge
model becomes configurable while preserving the current default; update any
related test setup or exports that reference the model property to use this
value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/review-docs/references/doc-checklist.md:
- Around line 13-22: Two fenced code blocks in the markdown are missing language
identifiers (triggering markdownlint MD040); add an explicit language (e.g.,
```text) to each fenced block that begins with "Document Quality Review: N
issues (X critical, Y informational)" and the block that starts with "CRITICAL
(highest severity):          INFORMATIONAL (lower severity):" so the fences read
```text and close with ``` to satisfy linting and preserve content.

In `@05-Scripts/gen-skill-docs.ts`:
- Around line 399-402: After calling resolveTemplate (the calls that assign {
resolved, count, unresolved } and set
result.placeholdersResolved/result.unresolved), add a fail-fast check: if
unresolved is non-empty (or unresolved.length > 0 / count < expected), throw an
Error (or otherwise abort the run) that includes the unresolved tokens and
context (e.g., template/source name and placeholdersResolved count) so the
process cannot succeed with unresolved placeholders; apply the same check to the
other resolveTemplate call earlier in the file (the second occurrence that sets
placeholdersResolved/unresolved).

In `@ACKNOWLEDGMENTS.md`:
- Around line 22-25: The fenced code block in ACKNOWLEDGMENTS.md is missing a
language identifier which triggers markdownlint MD040; update the opening
triple-backtick for that snippet (the block containing the two HTML comments
about gstack and MIT license) to include an appropriate language label such as
"html" (i.e., change ``` to ```html) so the block is explicitly marked and the
lint warning is resolved.

In `@commands/new-project.md.tmpl`:
- Around line 27-31: The case block handling scope (case "$scope" in ...
comprehensive) TEMPLATES="$(ls professional-templates | sort)" ;; esac) is
fragile: replace the ls-based selection with a deterministic filter that only
picks template files with the expected numeric prefix (e.g., files matching the
2-digit_prefix + "_" + name + ".md" pattern) and sort them, and add a default *)
branch to the case to handle invalid $scope by emitting a clear error and
exiting non-zero; update the TEMPLATES assignment for the comprehensive branch
(and any code that consumes TEMPLATES) to use the filtered, sorted list of
basenames rather than raw ls output.
- Line 24: Replace the hard-coded directory command `cd ~/ai-dev` in the
template with a configurable placeholder or relative path so it works on other
machines; locate the `cd ~/ai-dev` line in commands/new-project.md.tmpl and
change it to use a template variable or environment variable such as `cd
{{project_dir}}` or `cd "$PROJECT_DIR"` (or instruct the user to run `cd
<your-project-dir>`), and update any surrounding example text to mention how to
set that variable so the step is not tied to a single local setup.

In `@commands/shared/template-list.md`:
- Line 17: The template `13_risk_register.md` is labeled "Quality Assurance" in
the template list but should match the README taxonomy; update the category for
`13_risk_register.md` in the commands/shared/template-list.md entry from
"Quality Assurance" to "Development Workflow" (the same taxonomy used in
README.md for Risk Register) so naming is consistent across docs.

In `@review/doc-checklist.md`:
- Around line 13-22: Add a language identifier to the unlabeled fenced code
blocks containing the "Document Quality Review: N issues (X critical, Y
informational)" header and the block that starts with "CRITICAL (highest
severity):          INFORMATIONAL (lower severity):" by changing the opening
triple-backticks to use ```text so markdownlint MD040 is satisfied; update both
occurrences (the block around "Document Quality Review..." and the block around
"CRITICAL (highest severity)...") to begin with ```text and leave the rest of
the content unchanged.

In `@test/helpers/llm-judge.ts`:
- Around line 68-97: The judge() function currently injects the full document
into the prompt; make its input bounds consistent with docSuiteJudge() by
truncating the content to the same 500-character limit before building the
prompt. Update judge() (the function that calls callJudge<JudgeScore>) to
pre-process content (e.g., replace content with content.slice(0, 500) or the
exact truncation helper used in docSuiteJudge()) so the prompt receives the
shortened text while preserving the existing audience default and JSON response
format.
- Around line 59-63: The parsing logic in the function that reads the Anthropic
response currently assumes response.content[0] is text (see use of
response.content[0], text and jsonMatch) which can be wrong for non-text blocks;
update the code to search response.content for the first block with type ===
'text' (or otherwise find the first textual content), throw a clear error if no
text block is found, then run the JSON extraction against that text (and keep
the existing jsonMatch/JSON.parse behavior); ensure the thrown error message
names the missing text block and references response.content so diagnostics
point to the real root cause.
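The "first text block" fix can be sketched against a minimal stand-in type; the real Anthropic SDK content-block types are richer than what is modeled here:

```typescript
// Minimal stand-in for the SDK's response.content block shape.
type ContentBlock = { type: string; text?: string };

// Return the first text block's content, or fail with a diagnostic that
// names the block types actually present in response.content.
function firstTextBlock(content: ContentBlock[]): string {
  const block = content.find((b) => b.type === 'text' && typeof b.text === 'string');
  if (!block || block.text === undefined) {
    throw new Error(
      `No text block found in response.content (got types: ${content.map((b) => b.type).join(', ')})`
    );
  }
  return block.text;
}
```

The existing `jsonMatch`/`JSON.parse` logic would then run on the returned string.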

In `@test/skill-validation.test.ts`:
- Around line 52-56: The test "has valid YAML frontmatter" only checks for the
delimiter via frontmatterMatch from content.match and thus allows malformed
YAML; update the test to parse the captured frontmatter string
(frontmatterMatch[1]) with a YAML parser (e.g., js-yaml safeLoad/safeLoadAll or
YAML.parse) and assert that parsing does not throw and yields an object, or
alternatively rename the test to indicate it only checks for delimiters; modify
the test body around the variables content and frontmatterMatch accordingly so
it validates real YAML syntax rather than only delimiter presence.
- Around line 93-104: The test currently generates cases from getTemplateFiles()
but will silently pass if that returns an empty array; add a guard assertion
immediately after const templates = getTemplateFiles() (e.g.,
expect(templates.length).toBeGreaterThan(0) or toEqual) so the "Template
freshness" suite fails loudly when no templates are discovered; keep the rest of
the loop using tmplPath/outputPath/displayPath unchanged to locate the template
tests.

In `@vitest.config.ts`:
- Around line 1-8: Vitest fails with ERR_REQUIRE_ESM due to the ESM-only std-env
dependency; update the exported defineConfig test block in vitest.config.ts to
add a server.deps.inline entry that inlines 'std-env' (i.e., add server: { deps:
{ inline: ['std-env'] } } inside the test config) so Vitest bundles the ESM
dependency instead of requiring it; keep the existing globals, environment, and
include settings unchanged.

---

Nitpick comments:
In @.claude/skills/document-release/SKILL.md:
- Around line 129-137: The fenced output example code blocks shown (the
triple-backtick blocks listing "Documentation health:" entries) are missing a
language identifier which triggers MD040; update those fences to use a language
tag (use `text`) for the examples at the three output example blocks so they
read ```text ... ``` to satisfy the linter and match existing docs formatting.

In @.claude/skills/qa-docs/SKILL.md:
- Around line 95-129: The Markdown examples in the "Documentation Quality Audit"
section use fenced code blocks (``` ... ```) without language identifiers (e.g.,
the blocks showing the audit template around the "Documentation Quality Audit"
header and the examples at the same sections referenced), which triggers MD040;
edit those fenced code blocks (the triple-backtick blocks shown in the diff) to
include a language tag such as ```text (or a more specific language if
appropriate) so each code fence begins with ```text to satisfy the linter.

In @.claude/skills/review-architecture/SKILL.md:
- Around line 95-129: The fenced output example blocks (the ones showing
"Architecture Review: N findings" and the other two examples) are missing
language identifiers; update each triple-backtick fence to include a language
such as `text` or `plaintext` (e.g., change ``` to ```text) so the linter is
satisfied and syntax highlighting is consistent—look for the blocks containing
"Architecture Review: N findings" and the other similar output examples and add
the language specifier to their opening fences.

In @.claude/skills/review-docs/SKILL.md:
- Around line 98-121: Update the markdown code fences that show output example
blocks (the triple-backtick blocks in the "Document Quality Review" summary and
the nearby example sections such as the AUTO-FIXED / NEEDS INPUT examples) to
include a language identifier of text (e.g., ```text) so they satisfy the MD040
lint rule; scan for the unlabeled triple-backtick blocks around the Document
Quality Review header and replace them with labeled fences to maintain
consistency across the skill docs.

In @.claude/skills/ship/SKILL.md:
- Around line 151-164: Add the Markdown language identifier "text" to the fenced
code blocks that show the SHIP COMPLETE output examples (the blocks containing
"SHIP COMPLETE", "Version: X.Y.Z", etc.) so they become triple-backtick fences
like ```text to satisfy MD040; update each example block noted in the comment
(the ones around the SHIP COMPLETE output examples) to include this identifier.

In `@05-Scripts/gen-skill-docs.ts`:
- Around line 193-223: BASE_BRANCH_DETECT_CONTENT in gen-skill-docs.ts diverges
from the canonical base-branch detection in the shared preamble, causing
conflicting guidance; replace the hard-coded array with a single source of truth
by importing or loading the canonical base-branch detection snippet from the
shared preamble (the project's shared preamble base-branch block) and ensure the
sequence matches: PR base, remote HEAD (origin HEAD), then default candidates
(main, master, develop, trunk); update BASE_BRANCH_DETECT_CONTENT to reference
that shared snippet so both docs use the same logic.

In `@review/doc-checklist.md`:
- Around line 1-122: The checklist is duplicated (the repo checklist and a
second copy in the review-docs references folder); deduplicate by choosing one
canonical doc (keep the primary doc_checklist file), delete the duplicate copy,
and add a CI step to auto-generate or sync the secondary copy from the canonical
source; update references to point to the canonical checklist and add a short
README note in the canonical file indicating it is the single source of truth.

In `@test/doc-quality-eval.test.ts`:
- Around line 29-34: The test assumes getTemplates() returns 22 templates but
gives an unhelpful failure if the templates dir is missing; update the 'has 22
templates' test to first assert templates is non-empty and, if empty, fail with
a clear message (e.g., "No templates found — ensure professional-templates/core
exists"), then check the exact count; reference getTemplates() and the templates
variable in the test so the failure distinguishes "no templates found" from
"unexpected template count".
- Around line 56-60: The placeholder-check regex in the "has no unresolved
non-DATE placeholders" test (the otherPlaceholders variable from
tmpl.content.match) can misclassify tokens like "{{DATES}}"; instead, capture
all placeholder tokens first and then explicitly exclude the exact "{{DATE}}"
token (e.g., match all {{...}} placeholders via tmpl.content.match and filter
out the literal "{{DATE}}") before asserting none remain; update the test logic
around otherPlaceholders to perform that two-step match-and-filter so only the
exact DATE token is allowed.

In `@test/helpers/llm-judge.ts`:
- Line 42: Change the hard-coded model string to read from an environment
variable so tests can be configured: replace the literal 'claude-sonnet-4-6'
used for the model in test/helpers/llm-judge.ts with process.env.ANTHROPIC_MODEL
?? 'claude-sonnet-4-6' (or equivalent) so the judge model becomes configurable
while preserving the current default; update any related test setup or exports
that reference the model property to use this value.

In `@test/skill-validation.test.ts`:
- Around line 83-88: The test "has no absolute paths (uses ${CLAUDE_SKILL_DIR}/
instead)" currently only checks for /home/, /Users/, and C:\ which misses other
Windows drive-letter and UNC paths; update the negative match on body (variable
names: content, body, skillPath) to use a broader regex that detects any Windows
drive-letter paths (e.g. X:\) and UNC paths (\\server\share) as well as general
absolute Unix paths so the expect(body).not.toMatch(...) fails for those cases
too. Ensure the replacement that strips attribution comments remains (body =
content.replace(...)) and only change the regex used in the expect assertion.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f885bc4b-d315-479a-bdef-4a2819c4e841

📥 Commits

Reviewing files that changed from the base of the PR and between 4537d46 and ed08eda.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (27)
  • .claude/skills/document-release/SKILL.md
  • .claude/skills/document-release/evals/evals.json
  • .claude/skills/qa-docs/SKILL.md
  • .claude/skills/qa-docs/evals/evals.json
  • .claude/skills/review-architecture/SKILL.md
  • .claude/skills/review-architecture/evals/evals.json
  • .claude/skills/review-docs/SKILL.md
  • .claude/skills/review-docs/evals/evals.json
  • .claude/skills/review-docs/references/doc-checklist.md
  • .claude/skills/ship/SKILL.md
  • .claude/skills/ship/evals/evals.json
  • .github/workflows/ci.yml
  • 05-Scripts/gen-skill-docs.ts
  • ACKNOWLEDGMENTS.md
  • CLAUDE.md
  • Makefile
  • README.md
  • commands/new-project.md
  • commands/new-project.md.tmpl
  • commands/shared/preamble.md
  • commands/shared/template-list.md
  • package.json
  • review/doc-checklist.md
  • test/doc-quality-eval.test.ts
  • test/helpers/llm-judge.ts
  • test/skill-validation.test.ts
  • vitest.config.ts

Comment on lines +13 to +22
```
Document Quality Review: N issues (X critical, Y informational)

**AUTO-FIXED:**
- [doc:section] Problem → fix applied

**NEEDS INPUT:**
- [doc:section] Problem description
Recommended fix: suggested fix
```
⚠️ Potential issue | 🟡 Minor

Specify fence languages for both code blocks.

Line 13 and Line 86 start fenced blocks without language identifiers (markdownlint MD040).

💡 Suggested fix
-```
+```text
 Document Quality Review: N issues (X critical, Y informational)
@@

@@
-```
+```text
 CRITICAL (highest severity):          INFORMATIONAL (lower severity):
@@

Also applies to: 86-92

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 13-13: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/review-docs/references/doc-checklist.md around lines 13 - 22,
Two fenced code blocks in the markdown are missing language identifiers
(triggering markdownlint MD040); add an explicit language (e.g., ```text) to
each fenced block that begins with "Document Quality Review: N issues (X
critical, Y informational)" and the block that starts with "CRITICAL (highest
severity):          INFORMATIONAL (lower severity):" so the fences read ```text
and close with ``` to satisfy linting and preserve content.

Comment on lines +399 to +402
const { resolved, count, unresolved } = resolveTemplate(source, verbose);
result.placeholdersResolved = count;
result.unresolved = unresolved;

⚠️ Potential issue | 🟠 Major

Fail fast when unresolved placeholders remain.

Currently unresolved tokens are left in output and the run can still succeed, which lets broken generated docs pass.

Suggested fix
   const { resolved, count, unresolved } = resolveTemplate(source, verbose);
   result.placeholdersResolved = count;
   result.unresolved = unresolved;
+
+  if (unresolved.length > 0) {
+    result.error = new Error(`Unresolved placeholders: ${unresolved.join(', ')}`);
+    return result;
+  }

Also applies to: 349-350

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@05-Scripts/gen-skill-docs.ts` around lines 399 - 402, After calling
resolveTemplate (the calls that assign { resolved, count, unresolved } and set
result.placeholdersResolved/result.unresolved), add a fail-fast check: if
unresolved is non-empty (or unresolved.length > 0 / count < expected), throw an
Error (or otherwise abort the run) that includes the unresolved tokens and
context (e.g., template/source name and placeholdersResolved count) so the
process cannot succeed with unresolved placeholders; apply the same check to the
other resolveTemplate call earlier in the file (the second occurrence that sets
placeholdersResolved/unresolved).
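The fail-fast guard could be factored as a small helper applied after each `resolveTemplate` call. The function name and error wording below are illustrative, not from the repo:

```typescript
// Hypothetical guard: abort generation when any placeholders survive
// resolution, naming the template and the offending tokens.
function assertResolved(unresolved: string[], templateName: string): void {
  if (unresolved.length > 0) {
    throw new Error(
      `Template "${templateName}" has unresolved placeholders: ${unresolved.join(', ')}`
    );
  }
}
```

Calling this after both `resolveTemplate` sites ensures a run cannot succeed with broken generated docs.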

Comment on lines +68 to +97
export async function judge(docName: string, content: string, audience: string = 'technical'): Promise<JudgeScore> {
return callJudge<JudgeScore>(`You are evaluating documentation quality for a project documentation generator.

The documentation was generated from a professional template and should be:
1. Clear and well-structured
2. Complete with all sections filled in
3. Actionable with concrete next steps
4. Appropriate for the declared audience (${audience})

Rate the following document "${docName}" on four dimensions (1-5 scale):

- **clarity** (1-5): Is the document well-organized and easy to follow?
- **completeness** (1-5): Are all sections populated? Any placeholder text remaining?
- **actionability** (1-5): Can a reader take concrete action based on this document?
- **audience_fit** (1-5): Is the language and depth appropriate for a ${audience} audience?

Scoring guide:
- 5: Excellent — no issues, ready to use
- 4: Good — minor gaps easily addressed
- 3: Adequate — some rework needed
- 2: Poor — significant gaps
- 1: Unusable — fundamental problems

Respond with ONLY valid JSON:
{"clarity": N, "completeness": N, "actionability": N, "audience_fit": N, "reasoning": "brief explanation"}

Document content:

${content}`);
}
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

What are the current input context limits and best-practice truncation guidance for Anthropic Claude Sonnet models used via the Messages API?

💡 Result:

Current Anthropic Claude Sonnet models available via Messages API (as of 2026-03-18): Claude Sonnet 4.6 (claude-sonnet-4-6), Claude Sonnet 4.5 (claude-sonnet-4-5-20250929), Claude Sonnet 4 (claude-sonnet-4-20250514). All have 1M token context windows (Sonnet 4.5/4 require beta header "context-1m-2025-08-07" beyond 200k tokens; Sonnet 4.6 is native). Max output: 64k tokens. Newer models (Sonnet 3.7+) return validation error on exceedance instead of silent truncation (enable "model-context-window-exceeded-2025-08-26" beta for older). Best-practice truncation guidance: Use /v1/messages/count_tokens API pre-request to estimate input tokens and avoid limits. For long contexts, employ server-side compaction (beta "compact-2026-01-12" for Sonnet 4.6/Opus 4.6) for auto-summarization. Context editing (beta "context-management-2025-06-27") clears old tool results/thinking blocks. Prompt for conciseness (e.g., "Respond in 2 sentences"); set low max_tokens; structure long inputs at prompt top with XML tags; use context awareness in Sonnet 4.6/4.5/Haiku 4.5. Handle "max_tokens" or "model_context_window_exceeded" stop_reasons by continuing or summarizing.

🏁 Script executed:

# Read the llm-judge.ts file to understand the full context
cat -n test/helpers/llm-judge.ts | head -120

Repository: intent-solutions-io/intent-blueprint-docs

Length of output: 5061


Make judge() input bounds consistent with docSuiteJudge()

The judge() function at line 96 passes document content directly into the prompt without truncation, while docSuiteJudge() at line 106 truncates each document to 500 characters. Although Claude Sonnet 4.6 has a 1M token context window, add defensive truncation to match the suite's safety approach and prevent edge cases with unusually large documents.

🔧 Proposed fix
 export async function judge(docName: string, content: string, audience: string = 'technical'): Promise<JudgeScore> {
+  const clippedContent =
+    content.length > 12000
+      ? `${content.slice(0, 12000)}\n\n...[truncated for judge evaluation]`
+      : content;
+
   return callJudge<JudgeScore>(`You are evaluating documentation quality for a project documentation generator.
@@
-Document content:
+Document content (possibly truncated):
 
-${content}`);
+${clippedContent}`);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/helpers/llm-judge.ts` around lines 68 - 97, The judge() function
currently injects the full document into the prompt; make its input bounds
consistent with docSuiteJudge() by truncating the content to the same
500-character limit before building the prompt. Update judge() (the function
that calls callJudge<JudgeScore>) to pre-process content (e.g., replace content
with content.slice(0, 500) or the exact truncation helper used in
docSuiteJudge()) so the prompt receives the shortened text while preserving the
existing audience default and JSON response format.

Comment on lines +52 to +56
it('has valid YAML frontmatter', () => {
const content = fs.readFileSync(skillPath, 'utf8');
const frontmatterMatch = content.match(/^---\n([\s\S]*?)\n---/);
expect(frontmatterMatch).not.toBeNull();
});

⚠️ Potential issue | 🟠 Major

has valid YAML frontmatter currently does not validate YAML syntax

Lines 54-55 only assert delimiter presence; malformed YAML inside the block still passes. Please either parse the frontmatter content as YAML in this test or rename the test to reflect delimiter-only validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/skill-validation.test.ts` around lines 52 - 56, The test "has valid YAML
frontmatter" only checks for the delimiter via frontmatterMatch from
content.match and thus allows malformed YAML; update the test to parse the
captured frontmatter string (frontmatterMatch[1]) with a YAML parser (e.g.,
js-yaml safeLoad/safeLoadAll or YAML.parse) and assert that parsing does not
throw and yields an object, or alternatively rename the test to indicate it only
checks for delimiters; modify the test body around the variables content and
frontmatterMatch accordingly so it validates real YAML syntax rather than only
delimiter presence.
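A self-contained sketch of the strengthened check (in the real test you would parse `frontmatterMatch[1]` with a YAML library such as js-yaml, as the prompt suggests; the flat `key: value` parser below is an illustrative stand-in so the example runs without dependencies):

```typescript
// Hypothetical helper: extract frontmatter and actually parse it,
// returning null for missing delimiters OR malformed content.
function parseFrontmatter(content: string): Record<string, string> | null {
  const match = content.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return null;
  const result: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    // Only flat "key: value" pairs, for illustration; js-yaml handles full YAML.
    const kv = line.match(/^(\w[\w-]*):\s*(.*)$/);
    if (!kv) return null; // malformed line: fail instead of silently passing
    result[kv[1]] = kv[2];
  }
  return result;
}
```

The test would then assert `parseFrontmatter(content)` is non-null, which fails for both missing delimiters and garbage inside them.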

jeremylongshore and others added 2 commits March 18, 2026 01:36
Fixes from Gemini, CodeRabbit, and Qodo review:

- Remove hardcoded `cd ~/ai-dev` → use `git rev-parse --show-toplevel`
- Add `*)` fallback for invalid scope in new-project template
- Use `find` instead of `ls` for comprehensive template listing
- Fail fast on unresolved placeholders in gen-skill-docs.ts
- Use `.find()` for text block in llm-judge.ts response parsing
- Add guard assertion for empty template discovery in tests
- Add `text` language identifier to fenced code blocks (MD040)
- Add `std-env` to vitest server.deps.inline (ESM compat)
- Deduplicate doc-checklist: skill reference → symlink to review/

Dismissed (incorrect):
- Gemini: claude-sonnet-4-6 "not valid" — it IS valid (2026 model)
- Qodo: scripts/ path "broken" — symlink scripts/ → 05-Scripts/ exists
- Qodo: review/ outside 01-Docs/ — intentional, it's a skill rubric

All 171 tests passing (38 skill + 133 doc quality).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
test/skill-validation.test.ts (1)

44-90: Consider reading SKILL.md once per skill to reduce redundant I/O.

Each test within a skill's describe block reads SKILL.md separately (lines 53, 59, 65, 71, 78, 84). While functionally correct, this results in 6 file reads per skill.

♻️ Optional refactor to read once per skill
   for (const skillName of skillDirs) {
     describe(`skill: ${skillName}`, () => {
       const skillPath = path.join(SKILLS_DIR, skillName, 'SKILL.md');
+      let content: string;
+
+      it('has SKILL.md', () => {
+        expect(fs.existsSync(skillPath)).toBe(true);
+        content = fs.readFileSync(skillPath, 'utf8');
+      });

-      it('has SKILL.md', () => {
-        expect(fs.existsSync(skillPath)).toBe(true);
-      });
-
       it('has valid YAML frontmatter', () => {
-        const content = fs.readFileSync(skillPath, 'utf8');
         const frontmatterMatch = content.match(/^---\n([\s\S]*?)\n---/);
         expect(frontmatterMatch).not.toBeNull();
       });
       // ... apply similar changes to other tests
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/skill-validation.test.ts` around lines 44 - 90, Read SKILL.md only once
per skill by moving the fs.readFileSync call into the describe scope (e.g.,
inside the for loop’s describe for each skill) and reuse that single `content`
variable (and derived `frontmatterMatch`, `body`, `lineCount`, `nameMatch` as
needed) across the it tests instead of calling fs.readFileSync multiple times;
you can initialize them with a const at the top of the describe or in a
beforeAll block so tests like the ones referencing `skillPath`, `content`,
`frontmatterMatch`, `nameMatch`, and `body` reuse the pre-read data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@vitest.config.ts`:
- Around line 1-14: The Vitest config is written as ESM but your project is
running in CommonJS, causing the config not to load; resolve by either adding
"type": "module" to package.json to enable ESM project-wide, or rename
vitest.config.ts to vitest.config.mts so Node/Vitest treats it as an explicit
ESM module; after making one of these changes confirm that server.deps.inline:
["std-env"] in vitest.config (or the renamed file) is picked up by Vitest and
tests run successfully.

---

Nitpick comments:
In `@test/skill-validation.test.ts`:
- Around line 44-90: Read SKILL.md only once per skill by moving the
fs.readFileSync call into the describe scope (e.g., inside the for loop’s
describe for each skill) and reuse that single `content` variable (and derived
`frontmatterMatch`, `body`, `lineCount`, `nameMatch` as needed) across the it
tests instead of calling fs.readFileSync multiple times; you can initialize them
with a const at the top of the describe or in a beforeAll block so tests like
the ones referencing `skillPath`, `content`, `frontmatterMatch`, `nameMatch`,
and `body` reuse the pre-read data.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 223397d6-dafa-4dad-8115-9b4f947cf61a

📥 Commits

Reviewing files that changed from the base of the PR and between ed08eda and e545e6d.

📒 Files selected for processing (9)
  • .claude/skills/review-docs/references/doc-checklist.md
  • 05-Scripts/gen-skill-docs.ts
  • ACKNOWLEDGMENTS.md
  • commands/new-project.md
  • commands/new-project.md.tmpl
  • review/doc-checklist.md
  • test/helpers/llm-judge.ts
  • test/skill-validation.test.ts
  • vitest.config.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • ACKNOWLEDGMENTS.md
  • .claude/skills/review-docs/references/doc-checklist.md
  • test/helpers/llm-judge.ts

Comment thread vitest.config.ts
Comment on lines +1 to +14
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    globals: true,
    environment: "node",
    include: ["test/**/*.test.ts"],
    server: {
      deps: {
        inline: ["std-env"],
      },
    },
  },
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check package.json module type and vitest version
echo "=== package.json type field and vitest version ==="
jq -r '{type: .type, vitest: .devDependencies.vitest}' package.json

echo -e "\n=== Check if tsconfig has module settings ==="
cat tsconfig.json 2>/dev/null | jq '.compilerOptions | {module, moduleResolution, esModuleInterop}' || echo "No tsconfig.json found"

echo -e "\n=== Check for other ESM-only deps that might need inlining ==="
rg -l "type.*module" node_modules/*/package.json 2>/dev/null | head -10 || echo "Cannot check node_modules"

Repository: intent-solutions-io/intent-blueprint-docs

Length of output: 544


🌐 Web query:

Vitest 4.0 configuration ESM CJS compatibility server.deps.inline

💡 Result:

In Vitest 4.0, to achieve ESM and CJS compatibility, particularly for handling dependencies that cause issues like `require` of ESM modules, use the `test.server.deps.inline` configuration option. This forces specific modules to be transformed and resolved by Vite's module runner instead of being externalized and run natively by Node.js. Example configuration in `vitest.config.ts`: pass `server: { deps: { inline: ['your-esm-only-package'] } }` (an array of packages/RegExps, or `true` to inline everything) inside `test`, optionally with `fallbackCJS: true` to heuristically find CJS builds for ESM entries.

This is especially useful for:

- Mocking transitive dependencies in external packages.
- Resolving "require of ES module not supported" errors when CJS code tries to require ESM.
- Ensuring consistent behavior between browser (bundled) and test (unbundled) environments.

Official docs warn to use it sparingly, as Vitest auto-handles most cases, but it's a last-resort fix for problematic deps. For performance in large test suites, prefer `test.deps.optimizer.client.include` (pre-bundles deps the way Vite does for the browser), which can be faster than inlining and also aids ESM/CJS interop by bundling. Vitest 4.0 migrated from vite-node to Vite's Module Runner but retained `server.deps` for compatibility (no breaking removal of top-level `deps.inline`). Requires Node.js >= 20 and Vite >= 6.


Configuration structure is correct, but project's module context mismatch prevents config from loading.

The server.deps.inline: ["std-env"] configuration is properly structured for Vitest 4.0. However, the root issue is that vitest.config.ts uses ESM syntax while your project defaults to CommonJS (no "type": "module" in package.json). This creates a module context conflict when Vitest attempts to load the config file.

Fix by either:

  1. Add "type": "module" to package.json, or
  2. Rename vitest.config.ts to vitest.config.mts (explicit ESM marker)

The std-env inline configuration will then work as intended once the config loads successfully.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@vitest.config.ts` around lines 1 - 14, The Vitest config is written as ESM
but your project is running in CommonJS, causing the config not to load; resolve
by either adding "type": "module" to package.json to enable ESM project-wide, or
rename vitest.config.ts to vitest.config.mts so Node/Vitest treats it as an
explicit ESM module; after making one of these changes confirm that
server.deps.inline: ["std-env"] in vitest.config (or the renamed file) is picked
up by Vitest and tests run successfully.

@jeremylongshore jeremylongshore merged commit cda2d93 into main Mar 18, 2026
8 of 9 checks passed
@jeremylongshore jeremylongshore deleted the feat/gstack-integration branch March 18, 2026 08:03