Releases: xbmc/kodiai

v0.30

19 Apr 15:17

v0.30 (2026-04-19)

Truthful Manual Rereview & Slack Webhook Relay.

Added

  • Verified webhook-to-Slack relay support via POST /webhooks/slack/relay/:sourceId, including env-backed SLACK_WEBHOOK_RELAY_SOURCES source config, generic payload normalization, optional filtering, explicit suppression/delivery failure outcomes, a dedicated relay runbook, and a fixture-backed verify:m052 proof command.
  • Operator smoke and rollout guidance for the relay surface, including documented curl flows for accepted, suppressed, and failed-delivery outcomes.
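The accepted, suppressed, and failed-delivery outcomes can be pictured with a small sketch. Everything in it is hypothetical: the `RelaySource` shape, the `eventTypes` filter field, the payload field names, and the `relay` function are illustrative only and do not reflect the actual relay code or the real SLACK_WEBHOOK_RELAY_SOURCES format.

```typescript
// Hypothetical sketch; not the actual kodiai relay implementation.
type RelaySource = {
  id: string;
  // Assumed optional filter: when set, only these event types are relayed.
  eventTypes?: string[];
};

type RelayOutcome =
  | { status: "accepted"; text: string }
  | { status: "suppressed"; reason: string }
  | { status: "failed"; reason: string };

// Normalize a generic payload into one line of text (assumed field names).
function normalize(payload: Record<string, unknown>): string {
  const type = typeof payload.type === "string" ? payload.type : "unknown";
  const title = typeof payload.title === "string" ? payload.title : "(untitled)";
  return `[${type}] ${title}`;
}

// `deliver` stands in for the Slack post and returns true on success.
function relay(
  source: RelaySource,
  payload: Record<string, unknown>,
  deliver: (text: string) => boolean,
): RelayOutcome {
  const type = typeof payload.type === "string" ? payload.type : "unknown";
  if (source.eventTypes && !source.eventTypes.includes(type)) {
    return { status: "suppressed", reason: `filtered event type: ${type}` };
  }
  const text = normalize(payload);
  return deliver(text)
    ? { status: "accepted", text }
    : { status: "failed", reason: "slack delivery failed" };
}
```

Returning suppression and delivery failure as distinct values, rather than folding both into a generic error, is what lets an operator tell "filtered on purpose" apart from "Slack rejected the message".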

Fixed

  • @kodiai review is now the only supported manual rereview trigger; the stale ai-review / aireview team-trigger contract was removed from runtime behavior, config surfaces, docs, examples, and regression tests.
  • Manual rereview observability now treats team-only pull_request.review_requested deliveries as explicit unsupported skip signals instead of implying a supported operator retrigger path.
  • M048 phase-timing evidence handling now marks incomplete correlated phase rows as invalid-phase-payload and preserves the "publication unknown" wording instead of collapsing partial evidence into false-green summaries.

Changed

  • README and deployment/runbook docs now describe Slack webhook relay as service-level runtime configuration rather than .kodiai.yml behavior.
  • Review-request debugging and release-proof docs now point operators at the explicit interactive-review / review.full surfaces for supported manual rereview evidence.

v0.29

15 Apr 17:42

v0.29 (2026-04-15)

Explicit Review Lane Hardening.

Fixed

  • Explicit @kodiai review requests now run on a dedicated interactive-review lane so stale automatic review work on the same installation no longer starves manual review requests.
  • Automatic review diff collection now bounds risky shallow-history recovery and degrades to GitHub PR file-list fallback instead of wedging the review lane on long-running merge-base recovery.
  • Explicit mention-review prompt diff construction now uses bounded PR diff collection instead of the unsafe origin/<base>..HEAD fallback, preventing unrelated upstream files from inflating shallow-clone review prompts.
  • Clean approval reviews are collapsed again: APPROVE review bodies now publish inside <details> wrappers across the shared approval builder, mention prompt contract, MCP comment server, and audit/verifier surfaces.

Changed

  • src/execution/mcp/comment-server.ts now normalizes clean approval bodies into the collapsed contract before publishing, so older visible-body variants do not leak into GitHub reviews.
  • Review-output audit/verifier surfaces now validate the collapsed clean-approval contract (details_wrapper=true) instead of the short-lived visible-body exception.
  • README and release history now reflect the explicit review lane, bounded diff fallback, and collapsed approval-body behavior shipped in this release.
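As a rough illustration of the collapsed clean-approval contract, the sketch below wraps an approval body in a <details> block and leaves already-wrapped bodies alone. The function name `collapseApprovalBody` and the default summary label are assumptions made for this example; the shipped normalization in src/execution/mcp/comment-server.ts may differ.

```typescript
// Hypothetical helper; names and summary text are assumptions for illustration.
function collapseApprovalBody(body: string, summary = "Review details"): string {
  const trimmed = body.trim();
  // Already collapsed: return as-is so the normalization is idempotent.
  if (/^<details>[\s\S]*<\/details>$/.test(trimmed)) {
    return trimmed;
  }
  // Blank lines around the body keep markdown inside <details> rendering on GitHub.
  return `<details>\n<summary>${summary}</summary>\n\n${trimmed}\n\n</details>`;
}
```

Making the helper idempotent means older visible-body variants and already-collapsed bodies can pass through the same code path without double wrapping.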

v0.28 — Explicit Review Publication Recovery

16 Apr 02:30

v0.28 (2026-04-12)

Explicit Review Publication Recovery.

Fixed

  • Explicit @kodiai review requests now run with the full review-class turn budget and tool surface instead of the reduced conversational mention budget, restoring truthful approval publication on large PRs.
  • Clean-database CI runs now bootstrap the KnowledgeStore schema in src/knowledge/store.test.ts, removing the warm-schema false green that masked missing migrations.
  • Deploys now force a fresh ACA revision when the template would otherwise reuse the existing revision name, preventing "successful deploy, no new revision" ambiguity.

Changed

  • deploy.sh now reports the live ACA revision after deploy so operator proof can tie health checks, logs, and publish evidence to the exact running revision.
  • Deployment and review-request runbooks now document the explicit review publication path and its post-deploy proof surfaces.

v0.27 — Contributor Tier Truthfulness

16 Apr 02:30

v0.27 (2026-04-06)

Contributor Tier Truthfulness.

Added

  • Shared percentile tier-calculation helpers in src/contributor/tier-calculator.ts plus scorer-side recalculation hooks used by incremental expertise updates
  • Deterministic proof harnesses verify:m042:s01, verify:m042:s02, and verify:m042:s03 covering persisted-tier advancement, review-surface truthfulness, and cache/fallback hardening
  • Explicit "Author tier:" rendering in Review Details, with full-body regression coverage and required/banned phrase assertions for established and senior contributor guidance
  • Warning surface for invalid cached author tiers so malformed lower-fidelity cache data is observable without blocking reviews

Changed

  • Contributor score updates now recalculate and persist truthful contributor tiers when overall scores advance instead of persisting stale stored tiers
  • Review author-tier resolution now follows explicit precedence: contributor profile → bounded author cache → fallback classifier
  • Prompt and Review Details surfaces now render truthful developing/established/senior guidance from the resolved contributor tier, including the CrystalP-shaped repro path
  • author_cache reuse is now bounded to fallback-taxonomy values only (first-time, regular, core); unsupported cached values are ignored fail-open rather than trusted as richer contributor knowledge
  • Degraded fallback review paths preserve the resolved author tier and include the exact Search API disclosure sentence without contradicting contributor guidance
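The precedence and the bounded cache reuse described above can be sketched roughly as follows. The type names, the `resolveAuthorTier` signature, and the warning text are hypothetical; only the precedence order and the two tier vocabularies come from these notes.

```typescript
// Hypothetical sketch of the documented precedence; not the shipped resolver.
const FALLBACK_TIERS = ["first-time", "regular", "core"] as const;
type FallbackTier = (typeof FALLBACK_TIERS)[number];
type ProfileTier = "developing" | "established" | "senior";

type ResolvedTier = {
  tier: ProfileTier | FallbackTier;
  source: "profile" | "cache" | "fallback";
  warning?: string;
};

function isFallbackTier(value: unknown): value is FallbackTier {
  return typeof value === "string" && (FALLBACK_TIERS as readonly string[]).includes(value);
}

function resolveAuthorTier(
  profileTier: ProfileTier | undefined,
  cachedTier: unknown,
  classify: () => FallbackTier,
): ResolvedTier {
  // 1. The contributor profile always wins.
  if (profileTier) return { tier: profileTier, source: "profile" };
  // 2. The cache is trusted only for fallback-taxonomy values.
  if (isFallbackTier(cachedTier)) return { tier: cachedTier, source: "cache" };
  // 3. Anything else fails open to the classifier, with a warning for bad cache data.
  const resolved: ResolvedTier = { tier: classify(), source: "fallback" };
  if (cachedTier !== undefined) {
    resolved.warning = `ignored invalid cached tier: ${String(cachedTier)}`;
  }
  return resolved;
}
```

Note that a cached value such as "senior" is rejected here even though it is a valid profile tier: the cache is lower-fidelity data and is not allowed to claim richer contributor knowledge.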

v0.26 — Structural Impact Evidence

16 Apr 02:30

v0.26 (2026-04-05)

Structural Impact Evidence.

Added

  • Review-time structural-impact consumer layer combining persisted graph blast-radius data with canonical current-code retrieval through explicit GraphAdapter / CorpusAdapter seams
  • Bounded StructuralImpactPayload contract with callers, impacted files, likely tests, graph coverage stats, canonical unchanged-code evidence, and explicit degradation records
  • Structural Impact subsection in Review Details with hard caps, rendered/truncated counts, and truthful confidence wording
  • ## Structural Impact Evidence prompt section and evidence-backed breaking-change guidance for C++ and Python reviews
  • Handler-level structural-impact cache with stable (repo, baseSha, headSha) keys, 256-entry LRU, and 10-minute TTL
  • Centralized degradation summarizer producing machine-readable truthfulness signals (graph-unavailable, corpus-unavailable, no-structural-evidence, etc.)
  • Deterministic proof harnesses verify:m038:s02 and verify:m038:s03 covering rendering, cache reuse, timeout fail-open, substrate-failure truthfulness, and asymmetric partial-degradation cases
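A cache with the shape described above (stable (repo, baseSha, headSha) keys, 256-entry LRU, 10-minute TTL) could look roughly like this. The class name `ImpactCache` and its methods are assumptions for illustration, not the handler's actual code.

```typescript
// Hypothetical LRU + TTL cache; relies on Map preserving insertion order.
class ImpactCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries = 256, ttlMs = 10 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  static key(repo: string, baseSha: string, headSha: string): string {
    return `${repo}:${baseSha}:${headSha}`;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

Keying on both SHAs means a force-push or a base-branch update produces a new key, so a stale payload is never served for a changed diff.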

Changed

  • Review flow now consumes the bounded structural-impact layer instead of reaching into substrate-native graph types directly
  • Architecture and deployment docs updated to reflect Azure Container App job execution, canonical current-code corpus, and the six-corpus retrieval stack
  • README updated to describe the Structural Impact feature and current retrieval/runtime shape

v0.24 — Hallucination Prevention & Fact Verification

01 Mar 21:12

Hallucination Prevention & Fact Verification

Motivation: PR #27932 — bot fabricated libxkbcommon version numbers as a [CRITICAL] finding.

Added

  • Epistemic boundary system with 3-tier knowledge classification (diff-visible, context-visible, external) in review prompts
  • Cross-surface guardrails applied consistently to PR reviews, @mention responses, and Slack assistant
  • Heuristic claim classifier labeling each finding's claims as diff-grounded, external-knowledge, or inferential
  • Severity demotion capping external-knowledge findings at medium severity (CRITICAL/MAJOR demoted)
  • Output filter rewriting findings to remove external claims or suppressing entirely when no diff-grounded core remains
  • Collapsed <details> block in review summary for transparency on suppressed findings

How It Works

  1. Prompt-level guardrails instruct the LLM to distinguish what it can see in the diff from what it "knows" externally
  2. Post-generation classification scans each finding's claims and labels them by evidence source
  3. Severity demotion prevents external-knowledge claims from being flagged as CRITICAL or MAJOR
  4. Output filtering rewrites or suppresses findings that lack diff-grounded evidence

This pipeline prevents the bot from fabricating version numbers, API behaviors, or other external facts and presenting them as high-severity findings.
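Steps 3 and 4 can be sketched as a pair of small functions. This is a simplified illustration under stated assumptions: the `Finding` shape, the "minor" severity level, and the rule that a single external-knowledge claim is enough to trigger demotion are all assumptions, and the real filter rewrites finding text rather than just dropping claim labels.

```typescript
// Hypothetical sketch of severity demotion and output filtering.
type Severity = "critical" | "major" | "medium" | "minor";
type ClaimLabel = "diff-grounded" | "external-knowledge" | "inferential";
type Finding = { title: string; severity: Severity; claims: ClaimLabel[] };

// Step 3: cap findings that rely on external knowledge at medium severity.
function demoteSeverity(finding: Finding): Finding {
  const external = finding.claims.includes("external-knowledge");
  const tooHigh = finding.severity === "critical" || finding.severity === "major";
  return external && tooHigh ? { ...finding, severity: "medium" } : finding;
}

// Step 4: suppress findings with no diff-grounded core, otherwise strip external claims.
function applyGuardrails(finding: Finding): Finding | null {
  const demoted = demoteSeverity(finding);
  if (!demoted.claims.includes("diff-grounded")) return null;
  return {
    ...demoted,
    claims: demoted.claims.filter((claim) => claim !== "external-knowledge"),
  };
}
```

Under these assumptions, a fabricated version-number finding like the one that motivated this release would carry no diff-grounded claim and would be suppressed instead of published as CRITICAL.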


5 phases (115-119) | 5 plans | ~93,000 lines of TypeScript

Full changelog: CHANGELOG.md | Full milestone history: MILESTONES.md

v0.23 — Interactive Troubleshooting

16 Apr 02:29

v0.23 (2026-03-01)

Interactive Troubleshooting.

Added

  • State-filtered vector search and resolution-focused thread assembler for troubleshooting retrieval from closed issues
  • Troubleshooting agent with LLM synthesis, provenance citations, and keyword-based intent classification
  • Issue outcome capture via issues.closed webhook with resolution classification and delivery-ID dedup
  • Beta-Binomial Bayesian duplicate threshold auto-tuning per repo with sample gate and [50,95] clamping
  • Nightly reaction sync polling thumbs up/down on triage comments as secondary feedback signal for threshold learning
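For readers unfamiliar with Beta-Binomial tuning, here is a minimal sketch of the general idea. Only the sample gate and the [50,95] clamp come from these notes; the uniform prior, the 20-sample gate, the default of 75, and especially the linear mapping from posterior mean to threshold are assumptions chosen for illustration, not the shipped formula.

```typescript
// Hypothetical sketch: posterior mean of a Beta prior updated with binomial outcomes.
type TuningOptions = {
  priorAlpha: number; // assumed prior pseudo-count of confirmed duplicates
  priorBeta: number; // assumed prior pseudo-count of rejected duplicates
  minSamples: number; // assumed sample gate
  defaultThreshold: number; // assumed threshold used below the gate
};

const DEFAULTS: TuningOptions = { priorAlpha: 1, priorBeta: 1, minSamples: 20, defaultThreshold: 75 };

function tuneDuplicateThreshold(
  confirmed: number,
  rejected: number,
  opts: TuningOptions = DEFAULTS,
): number {
  const samples = confirmed + rejected;
  // Sample gate: do not move the threshold on too little evidence.
  if (samples < opts.minSamples) return opts.defaultThreshold;
  // Posterior mean of Beta(alpha + confirmed, beta + rejected).
  const precision = (opts.priorAlpha + confirmed) / (opts.priorAlpha + opts.priorBeta + samples);
  // Illustrative mapping only: lower observed precision pushes the threshold up.
  const raw = Math.round(100 - 50 * precision);
  // Clamp to the documented [50, 95] range.
  return Math.min(95, Math.max(50, raw));
}
```

The clamp guarantees that even a long run of one-sided feedback cannot push a repo's threshold to a point where duplicate detection either never fires or fires on nearly everything.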

v0.22 — Issue Intelligence

16 Apr 02:29

v0.22 (2026-02-27)

Issue Intelligence.

Added

  • Historical issue corpus population via backfill script with Voyage AI embeddings, HNSW-indexed vectors, PR filtering, and cursor-based resume
  • Nightly incremental sync via GitHub Actions cron job for issues and comments updated since last sync
  • High-confidence duplicate detection with top-3 candidate formatting, fail-open design, and comment-only policy (never auto-closes)
  • Auto-triage on issues.opened with config gate (autoTriageOnOpen), four-layer idempotency, and duplicate detection integration
  • PR-issue linking via explicit reference parsing (fixes/closes/relates-to regex) and semantic search fallback, with linked issue context injected into review prompts
  • Issue corpus wired as 5th source in unified cross-corpus RRF retrieval with [issue: #N] Title (status) citations
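Reciprocal Rank Fusion, the merge step named in the last item, scores each document as the sum of 1 / (k + rank) over every ranked list it appears in. The sketch below shows the standard formula with the commonly used k = 60; the function name and the use of plain string IDs are assumptions, and the constant the project actually uses is not stated in these notes.

```typescript
// Standard RRF over several ranked ID lists (best match first in each list).
function reciprocalRankFusion(
  rankedLists: string[][],
  k = 60, // conventional default; the project's value is an assumption here
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (left, right) => right.score - left.score,
  );
}
```

Because RRF uses only ranks, results from corpora with incomparable raw scores (for example BM25 and vector similarity) can be merged without normalizing them first.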

v0.21 — Issue Triage Foundation

16 Apr 02:29

v0.21 (2026-02-27)

Issue Triage Foundation.

Added

  • Issue corpus with PostgreSQL issues and issue_comments tables, HNSW vector indexes, and weighted tsvector GIN indexes
  • github_issue_label MCP tool with label pre-validation, partial application, closed-issue warning, and rate limit retry
  • github_issue_comment MCP tool with raw markdown and structured input, update-by-ID, and max length enforcement
  • Issue template parser extracting YAML frontmatter and section headers from .github/ISSUE_TEMPLATE/ templates
  • Triage validation agent with missing-section guidance, needs-info:{slug} label recommendations, and per-issue cooldown

v0.20 Multi-Model & Active Intelligence

26 Feb 22:54

What's New in v0.20

Over 19 milestones of development, Kodiai has grown from a basic PR auto-review bot (v0.1) into a full-featured code intelligence platform. v0.20 completes the multi-model and active intelligence layer, adding contributor-aware review depth, review pattern clustering, wiki staleness detection, and per-invocation cost tracking.

Highlights Since v0.1

  • Knowledge-Backed Reviews -- 4-corpus hybrid retrieval (code, review comments, wiki, code snippets) with BM25+vector search, Reciprocal Rank Fusion merging, and cross-corpus citations in review output
  • Issue Workflows -- in-thread Q&A with code-aware file pointers, apply:/change: PR creation from issues, write-mode guardrails with secret-scan refusals and permission remediation
  • Slack Integration -- #kodiai channel with thread sessions, read/write modes, high-impact confirmation gating, and answer-first concise responses
  • Multi-LLM Routing -- task-type-based model selection via Vercel AI SDK, per-repo .kodiai.yml overrides, automatic provider fallback, and per-invocation cost tracking to Postgres
  • Contributor Intelligence -- GitHub/Slack identity linking, expertise inference with exponential decay, 4-tier adaptive review depth (strict/balanced/minimal/trusted)
  • Review Quality -- HDBSCAN+UMAP pattern clustering with theme footnotes, draft PR review, dependency bump deep-review with changelog fallback, CI failure recognition with flakiness tracking, risk-weighted file prioritization
  • Infrastructure -- PostgreSQL+pgvector replacing SQLite, graceful SIGTERM shutdown with webhook queue replay, zero-downtime rolling deploys, VoyageAI embeddings (voyage-code-3, 1024 dims)

Version History

See MILESTONES.md for per-version release notes covering all 19 milestones.

Full Changelog

v0.1...v0.20