Add review-latency metric (PR review time, controlled for size)#1
Merged
Captures the brainstorm outcome: a size-bucketed review-cycle latency metric using opened/ready-for-review → merged window. Pins the bucket boundaries (1 / 2-3 / 4-9 / 10+ changed files), the schema additions, the GitHub fetch impact (timeline + per-PR calls on cold cache), and the dashboard treatment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12-task TDD plan covering: schema migration with ALTER-TABLE-on-legacy support, upsert_pr extension, fixture seed update with size + draft data, m_review_latency in metrics.py, GitHub timeline helper, fetch_prs extension, CLI smoke test, dashboard panel + renderer, sample.json demo data, and README updates. Also corrects the spec's claim about an "existing collapsible Weekly metrics (raw, sortable)" detail table — that section was sketched in the original 2026-04-24 design but never built. Spec now reflects that p90 lives in the JSON output; the chart shows median only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code reviewer caught: the seed comment had 44h, but PR 2's review_window = merged − ready_for_review_at = 34h (not 44h, which would be merged − opened_at, ignoring the COALESCE). Also fixes the same wrong number in the plan's Task 4 expected-values table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
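The corrected window arithmetic, as a minimal sketch. The timestamps below are illustrative stand-ins for the fixture's PR 2 (10h as a draft, 34h in review), not the actual seed values; the second call shows what the wrong 44h number was measuring.

```python
from datetime import datetime, timedelta

def review_window_hours(opened_at, ready_for_review_at, merged_at):
    """merged_at − COALESCE(ready_for_review_at, opened_at), in hours."""
    start = ready_for_review_at if ready_for_review_at is not None else opened_at
    return (merged_at - start).total_seconds() / 3600

# Illustrative PR 2 shape: draft for 10h, then 34h in review.
opened = datetime(2025, 10, 1, 0, 0)
ready = opened + timedelta(hours=10)
merged = ready + timedelta(hours=34)

print(review_window_hours(opened, ready, merged))  # 34.0 (correct)
print(review_window_hours(opened, None, merged))   # 44.0 (merged − opened_at,
                                                   # i.e. ignoring the COALESCE)
```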
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Regression caught after Task 6 ran the full suite — Task 4 added review-latency to the METRICS registry but missed updating the hardcoded set in test_run_report_returns_all_metrics_by_default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer caught two issues in the original test:

- `all(...)` was vacuously true on empty data, hiding the fact that `--weeks 12` from today (April 2026) excluded the October 2025 fixture rows entirely. Bumped to `--weeks 500` so the fixture window is covered, and added an explicit non-empty assertion to surface this class of bug.
- Missing `timeout=15` on `subprocess.run`, inconsistent with the existing `test_report_via_python_m_module` pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
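The fixed assertion pattern can be sketched like this. The real test shells out to the report CLI; here a tiny inline script stands in for that command so the sketch runs standalone, and the helper name `run_cli` is hypothetical.

```python
import json
import subprocess
import sys

def run_cli(args):
    """Run a CLI as a subprocess with the timeout=15 the reviewer asked for,
    mirroring the existing test_report_via_python_m_module pattern."""
    return subprocess.run(args, capture_output=True, text=True, timeout=15)

# Stand-in for the real `python -m ... report --weeks 500` invocation: an
# inline script that emits JSON rows, so the assertion pattern is runnable.
result = run_cli([sys.executable, "-c",
                  "import json; print(json.dumps([{'median_hours': 4.5}]))"])
assert result.returncode == 0
rows = json.loads(result.stdout)

# The non-empty check comes first: all(...) over [] is vacuously True.
assert rows, "expected at least one row in the report window"
assert all(r["median_hours"] >= 0 for r in rows)
print("smoke test pattern ok")
```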
Adds REVIEW_LATENCY_BUCKETS constant, renderReviewLatencyChart function, and wires it into renderForRepo. Panel shows "No data yet" until Task 10 lands sample fixture data.
Tightening pass after the branch-level review:

- Move `BUCKETS` / `_assign_bucket` below `SUCCESS_DEPLOY_STATUSES` so the module-level constants are grouped together before any helpers.
- Drop the dead `hours is None` guard in `m_review_latency`: `opened_at` is NOT NULL in the schema and the SQL filters `merged_at IS NOT NULL`, so the COALESCE-wrapped subtraction can never yield None. The `hours < 0` guard stays (defensive against clock skew).
- Add a parametrized boundary test for `_assign_bucket` so off-by-one regressions get caught (previously only interior bucket values were exercised).
- README data model block now lists the four new `pull_requests` columns.
- README cold-cache cost note now mentions 3 API calls per new PR (commits + pull detail + timeline) for the size + draft fields.
- Spec future-work list gains an entry for promoting the dashboard's per-bucket colors from hardcoded hex to CSS variables (so the panel picks up dark-mode adaptation alongside the other charts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds a new `review-latency` metric that measures `merged_at − COALESCE(ready_for_review_at, opened_at)` per PR, bucketed by `changed_files` (XS=1, S=2-3, M=4-9, L+=10+), reported as median (chart) + p90 (JSON) per repo/week/bucket.

The existing `lead-time` metric (`first_commit_at → merged_at`) conflates dev work with review wait, and a 5-line typo PR is compared against a 1000-line refactor. `review-latency` answers a different question: once a PR is open and ready for review, how long does it take to land, given how big it is?

What shipped
- Schema: new columns on `pull_requests` (`additions`, `deletions`, `changed_files`, `ready_for_review_at`). Idempotent ALTER-TABLE migration handles legacy DBs in place via `PRAGMA table_info`.
- Fetch: per new PR, `/pulls/{n}` for size and `/issues/{n}/timeline` for the `ready_for_review` event. Known PRs still skip everything (cache hit unchanged).
- Metric: `m_review_latency(conn, since)` in `src/dora/metrics.py`, registered in the `METRICS` dict. Bucketing + aggregation in Python (matches `m_lead_time`'s shape).
- Dashboard: new panel (`Review latency`) with a 4-line Chart.js series, one line per bucket. Missing buckets in a week render as gaps, not zeros.

Coverage caveat
Metric coverage ramps forward: legacy DB rows stay `NULL` on the new columns and are excluded from the metric. Only newly-merged PRs contribute after this upgrade. README documents this; a `--rebuild` flag for forced backfill is parked as future work.

API cost
Cold-cache: 3 API calls per new PR (commits + pull detail + timeline) — was 1. Hot-cache (known PRs): unchanged at 0 per-PR calls. README's cold-cache cost note updated accordingly.
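The idempotent ALTER-TABLE migration mentioned under What shipped could look roughly like this. A sketch only: the column type affinities and function name are assumptions, not the shipped code.

```python
import sqlite3

NEW_COLUMNS = {  # column name -> assumed SQLite type affinity
    "additions": "INTEGER",
    "deletions": "INTEGER",
    "changed_files": "INTEGER",
    "ready_for_review_at": "TEXT",
}

def migrate_pull_requests(conn: sqlite3.Connection) -> None:
    """Idempotently add the new columns to a legacy DB in place.

    PRAGMA table_info reports which columns already exist, so running the
    migration twice (or against an already-current schema) is a no-op.
    """
    existing = {row[1] for row in conn.execute("PRAGMA table_info(pull_requests)")}
    for name, affinity in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE pull_requests ADD COLUMN {name} {affinity}")
    conn.commit()

# Demo on an in-memory legacy schema; safe to run twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pull_requests (number INTEGER PRIMARY KEY, merged_at TEXT)")
migrate_pull_requests(conn)
migrate_pull_requests(conn)  # idempotent: second run adds nothing
cols = [row[1] for row in conn.execute("PRAGMA table_info(pull_requests)")]
print(cols)
```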
Spec & plan
Test plan
Known follow-ups (parked)
🤖 Generated with Claude Code