Add main-based PR evaluation harness #13

Draft
yashturkar wants to merge 6 commits into main from feat/pr-eval-harness

@yashturkar

Owner

Summary

  • add a harness for reviewing a target ref from an isolated main-based worktree
  • document the harness in a dedicated runbook and the repo README
  • restore the required docs directory so docs lint passes on main
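
As a rough illustration of the isolated-worktree idea in the summary, the sketch below only builds the `git worktree add --detach` command for a throwaway checkout of a base ref. The function name `worktree_add_command` and its parameters are hypothetical and are not taken from `scripts/eval_pr.py`; how the harness then brings the target ref into that worktree is not described here, so it is left out.

```python
def worktree_add_command(repo: str, worktree_path: str, base_ref: str = "main") -> list:
    """Build (but do not run) the git command for a detached worktree.

    Hypothetical helper for illustration only. `--detach` checks out the
    base ref without creating or moving any branch, so the main checkout
    of the repository is left untouched.
    """
    return [
        "git", "-C", repo,
        "worktree", "add", "--detach",
        worktree_path,
        base_ref,
    ]


if __name__ == "__main__":
    # Print the command instead of executing it; the paths are placeholders.
    print(" ".join(worktree_add_command(".", "/tmp/pr-eval/worktree")))
```

Building the command separately from running it keeps the logic easy to unit test without a real repository.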

Verification

  • python3 -m py_compile scripts/eval_pr.py
  • python3 scripts/eval_pr.py --help
  • kb-server/.venv/bin/python scripts/docs_lint.py
  • python3 scripts/eval_pr.py HEAD --tests-only --tmux-session-name fd-pr-eval-head-2
  • python3 scripts/eval_pr.py HEAD --e2e-only --tmux-session-name fd-pr-eval-e2e-2

@yashturkar yashturkar self-assigned this Mar 17, 2026
@yashturkar

Owner Author

Addressed the Inspector harness blockers:

  • resolve symbolic refs like HEAD to a commit SHA before checkout
  • track harness-owned worktree/tmux resources so pre-existing sessions are left alone and harness-created artifacts are preserved on failure or --keep-temp
  • add focused unit tests and doc updates for the new behavior
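
The first two fixes above could look roughly like the sketch below. It is not the code from this PR: `resolve_commit` and `OwnedResources` are hypothetical names, and the injectable `run` parameter is an assumption made so the logic can be tested without git. The only real CLI behavior relied on is `git rev-parse --verify <ref>^{commit}`, which prints the commit SHA a ref peels to.

```python
import subprocess
from dataclasses import dataclass, field


def resolve_commit(ref: str, repo: str = ".", run=subprocess.run) -> str:
    """Resolve a symbolic ref such as HEAD to a commit SHA before checkout."""
    # rev-parse exits non-zero (raising via check=True) if the ref is not a commit.
    result = run(
        ["git", "-C", repo, "rev-parse", "--verify", f"{ref}^{{commit}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


@dataclass
class OwnedResources:
    """Hypothetical record of what this harness run created itself."""

    worktrees: list = field(default_factory=list)
    tmux_sessions: list = field(default_factory=list)

    def claim_tmux(self, name: str, already_existed: bool) -> None:
        # Pre-existing sessions are never claimed, so they are never cleaned up.
        if not already_existed:
            self.tmux_sessions.append(name)

    def cleanup_targets(self, failed: bool, keep_temp: bool) -> dict:
        # On failure or with --keep-temp, preserve everything for inspection.
        if failed or keep_temp:
            return {"worktrees": [], "tmux_sessions": []}
        return {
            "worktrees": list(self.worktrees),
            "tmux_sessions": list(self.tmux_sessions),
        }
```

Resolving to a SHA up front means a later checkout does not depend on what `HEAD` happens to point to inside a different worktree.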

Verification run in /tmp/flight-deck-pr13:

  • python3 scripts/docs_lint.py
  • python3 -m py_compile scripts/eval_pr.py tests/test_eval_pr.py
  • python3 -m unittest tests/test_eval_pr.py

@yashturkar

Owner Author

Fixed the generated docs check by switching generation dates to UTC in scripts/generate_context_artifacts.py and refreshing docs/generated/api-surface.md plus docs/generated/env-catalog.md.
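
A minimal sketch of why a UTC date stamp helps a generated-docs check: a local-time date can differ between a developer machine and CI around midnight, while a UTC date is the same everywhere. The helper name `generation_date` and the ISO date format are assumptions for illustration; the actual code in `scripts/generate_context_artifacts.py` is not shown in this thread.

```python
from datetime import datetime, timedelta, timezone


def generation_date(now=None) -> str:
    """Return the generation date as an ISO string, always in UTC.

    Hypothetical helper; `now` is injectable so the behavior can be tested.
    """
    current = now if now is not None else datetime.now(timezone.utc)
    return current.astimezone(timezone.utc).date().isoformat()


if __name__ == "__main__":
    # 23:30 on Mar 17 at UTC-5 is already Mar 18 in UTC.
    late_evening = datetime(2026, 3, 17, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
    print(generation_date(late_evening))  # prints 2026-03-18
```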
