diff --git a/README.md b/README.md index 795b8d0..5128072 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@ **Dispatch your GitHub issues. Receive pull requests.** -Dispatch is an AI-powered CLI tool that solves GitHub issues in batch — creating branches, implementing fixes, and opening pull requests while you sleep. +Dispatch is an AI-powered CLI tool that solves GitHub issues in batch — creating branches, implementing fixes, and opening pull requests while you sleep. v2 adds smart model routing, a memory system that learns across issues and runs, cost tracking, and support for 4 AI providers. Run it at night. Review PRs in the morning. @@ -44,12 +44,15 @@ dispatch schedule Fetches open issues, classifies them, solves each one with AI, and opens pull requests. ```bash -dispatch run # solve all open issues +dispatch run # solve all open issues (auto-detect provider) dispatch run --dry-run # preview without making changes dispatch run --label bug --label p0 # only solve bugs and P0 issues dispatch run --max-issues 5 # limit to 5 issues dispatch run --draft # create all PRs as drafts -dispatch run --model opus # use a specific model +dispatch run --provider gemini # use a specific AI provider +dispatch run --strategy cost-optimized # optimize for lowest cost +dispatch run --no-memory # disable memory system +dispatch run --resume # resume from last checkpoint dispatch run --base-branch develop # target a different base branch ``` @@ -57,13 +60,16 @@ dispatch run --base-branch develop # target a different base branch 1. Fetches open issues from your GitHub repo 2. Classifies each issue (bug fix, feature, investigation, audit, docs, refactor) 3. Prioritizes by labels (P0 → P1 → P2) and reactions -4. For each issue: +4. Loads memory context (codebase cache, past insights, PR lessons) +5. For each batch of issues: - Creates a branch (`dispatch/issue-42-fix-login-bug`) - - Invokes Claude Code to solve it + - Invokes AI (routed per phase: cheap for classify/score, strong for solve) - Self-assesses confidence (1-10) + - Runs tests to verify changes - Commits, pushes, and opens a PR -5. Low-confidence solutions become draft PRs -6. Generates a morning summary report + - Saves insights for the next batch +6. Low-confidence or test-failing solutions become draft PRs +7. Generates a summary report with cost breakdown ### `dispatch create` @@ -82,11 +88,30 @@ dispatch create "add rate limiting to public endpoints" --no-post ### `dispatch status` -View results from the last run. +View results from the last run, including cost breakdown, memory state, and provider config. ```bash -dispatch status # pretty-printed morning report -dispatch status --json # raw JSON output +dispatch status # pretty-printed morning report +dispatch status --json # raw JSON output +dispatch status --memory # show memory system details +``` + +### `dispatch providers` + +Show detected AI providers and model routing configuration. + +```bash +dispatch providers # show providers, routing, and registered models +``` + +### `dispatch learn` + +Scan Dispatch-created PRs for review feedback and extract lessons. Lessons are stored locally and fed into future solves. + +```bash +dispatch learn # scan PRs and extract lessons +dispatch learn --show # show current lessons without scanning +dispatch learn --max-prs 20 # limit scan to 20 PRs ``` ### `dispatch schedule` @@ -172,14 +197,20 @@ Dispatch reads `.dispatchrc.json` from your repo root: "draftThreshold": 5, "stateDir": ".dispatch", "timeoutPerIssue": 600000, - "concurrency": 3 + "concurrency": 3, + "provider": "auto", + "routingStrategy": "auto", + "enableCodebaseContext": true, + "enableCrossIssue": true } ``` | Option | Description | Default | |--------|-------------|---------| -| `engine` | AI backend (`claude`, `github-models`, `gemini`) | `claude` | -| `model` | Model name (`sonnet`/`opus`/`haiku` for Claude; `openai/gpt-4o` etc. for GitHub Models; `gemini-2.5-pro` etc. for Gemini) | `sonnet` | +| `engine` | AI backend (`claude`, `github-models`, `gemini`) — legacy, prefer `provider` | `claude` | +| `model` | Model name — legacy, prefer `routingStrategy` | `sonnet` | +| `provider` | AI provider (`auto`, `anthropic`, `gemini`, `github-models`, `openai`) | `auto` | +| `routingStrategy` | Model routing (`auto`, `provider-locked`, `pinned`) | `auto` | | `labels` | Only process issues with these labels (empty = all) | `[]` | | `exclude` | Skip issues with these labels | `["wontfix", "blocked", "duplicate"]` | | `maxIssues` | Max issues per run | `10` | @@ -192,6 +223,8 @@ Dispatch reads `.dispatchrc.json` from your repo root: | `stateDir` | Directory for dispatch state/logs | `.dispatch` | | `timeoutPerIssue` | Timeout per issue in milliseconds | `600000` (10 min) | | `concurrency` | Number of issues to process in parallel | `3` | +| `enableCodebaseContext` | Cache and reuse codebase analysis (Tier 1 memory) | `true` | +| `enableCrossIssue` | Share insights across issues in a run (Tier 2 memory) | `true` | | `telemetry` | Enable anonymous usage analytics | `true` | ## Issue Types @@ -318,6 +351,83 @@ Uses Google's [Gemini API](https://ai.google.dev/) via its OpenAI-compatible end **How it works:** Like the GitHub Models engine, the Gemini engine runs its own agentic loop: it calls the Gemini API (via Google's OpenAI-compatible endpoint), executes tool calls locally in the worktree, and repeats until the issue is solved. No additional CLI tools need to be installed. +### OpenAI (Direct API) + +Uses the [OpenAI API](https://platform.openai.com/) directly. Requires an `OPENAI_API_KEY`. + +```json +{ + "provider": "openai" +} +``` + +**Available models:** +- `gpt-4.1` — Most capable (recommended for solving) +- `gpt-4.1-mini` — Faster and cheaper (good for classify/score) +- `o3-mini` — Reasoning model + +**Setup:** +1. Get an API key at [platform.openai.com/api-keys](https://platform.openai.com/api-keys) +2. Set it: `export OPENAI_API_KEY=sk-...` +3. Run `dispatch run --provider openai` + +## Model Routing (v2) + +Instead of using one model for everything, Dispatch v2 picks the **optimal model for each phase** of the pipeline. Classification doesn't need the same model as code generation. + +```bash +dispatch run # auto: haiku for classify, sonnet for solve +dispatch run --provider gemini # provider-locked: flash for classify, pro for solve +dispatch run --engine claude # legacy: sonnet for everything (backward compatible) +``` + +**Routing strategies:** +| Strategy | Flag | Behavior | +|----------|------|----------| +| `auto` (default) | — | Picks cheapest model for classify/score, strongest for solve, across all detected providers | +| `provider-locked` | `--provider gemini` | Uses only models from one provider, still routing cheap/strong per phase | +| `pinned` | `--engine claude` | Uses one model for all phases (v1 behavior) | + +**Cost savings:** Smart routing typically saves ~40% vs using the strongest model for everything. + +Run `dispatch providers` to see your current routing configuration and available models. + +## Memory System (v2) + +Dispatch v2 includes a 3-tier memory system that makes the AI smarter over time. + +### Tier 1 — Codebase Context Cache + +On first run, Dispatch analyzes your repo (file tree, package.json, patterns, conventions) and caches it at `.dispatch/memory/context.json`. This context is injected into every solve prompt, eliminating redundant codebase exploration. + +### Tier 2 — Cross-Issue Learning + +When issue #1 discovers project patterns, that knowledge feeds into issue #5. The pipeline processes issues in batches — insights from batch N are injected into batch N+1. + +### Tier 3 — PR Review Lessons (Local) + +After runs, `dispatch learn` scans your Dispatch-created PRs for review feedback and extracts lessons. These lessons are stored locally with a 30-day decay and injected into future solves at low priority. + +```bash +dispatch learn # scan PRs for feedback +dispatch learn --show # view current lessons +``` + +**Disable memory:** +```bash +dispatch run --no-memory # skip all memory injection +``` + +## Checkpoint & Resume (v2) + +If a run crashes at issue 7/10, you don't have to start over: + +```bash +dispatch run --resume # skips already-processed issues +``` + +Progress is saved to `.dispatch/checkpoint.json` after each issue. The checkpoint is cleared on successful completion. + ## Telemetry Dispatch collects **anonymous usage analytics** to help improve the tool. No personally identifiable information (PII) is collected. @@ -359,30 +469,48 @@ export DISPATCH_NO_TELEMETRY=1 ``` dispatch CLI -├── Commands (run, create, status, stats, init, schedule) -├── GitHub Client (octokit — issues, PRs, labels) +├── Commands (run, create, status, stats, init, schedule, providers, learn) +├── ModelRouter (per-phase model selection, cost tracking) +├── Memory System (codebase context, cross-issue insights, PR lessons) +├── GitHub Client (octokit — issues, PRs, comments, labels) ├── Engine Layer (pluggable AI adapters) -│ └── Claude Adapter (claude CLI --print) -│ └── GitHub Models Adapter (openai SDK + local tool execution) -│ └── Gemini Adapter (openai SDK + Google's OpenAI-compatible endpoint) -├── Orchestrator (pipeline, classifier, scorer) -├── Reporter (morning summary, run history) +│ ├── Claude Adapter (claude CLI --print) +│ ├── GitHub Models Adapter (openai SDK + local tool execution) +│ ├── Gemini Adapter (openai SDK + Google's OpenAI-compatible endpoint) +│ └── OpenAI Adapter (openai SDK + direct API) +├── Orchestrator (batched pipeline, classifier, scorer, checkpoint) +├── Reporter (morning summary, cost breakdown, run history) ├── Telemetry (anonymous analytics, local stats) -└── Utils (config, git, logger) +└── Utils (config, git, logger, worktree, semaphore) ``` The engine adapter pattern makes adding new AI backends trivial — implement the `AIEngine` interface and you're done. ## Roadmap +### Completed (v2 Community) - [x] Claude Code engine (default AI backend) - [x] Gemini engine adapter -- [ ] OpenAI adapter -- [x] GitHub Models engine (use Claude/GPT-4o via GITHUB_TOKEN — zero setup) -- [ ] Slack/Discord/Teams notifications on run completion +- [x] OpenAI engine adapter (direct API) +- [x] GitHub Models engine (use GPT-4.1/Claude via GITHUB_TOKEN — zero setup) +- [x] Smart model routing (ModelRouter — per-phase model selection) +- [x] Codebase context caching (Tier 1 memory) +- [x] Cross-issue learning within runs (Tier 2 memory) +- [x] Learn from PR review feedback (`dispatch learn` — Tier 3 local memory) +- [x] Batched parallel issue solving with insight sharing +- [x] Post-solve test verification +- [x] Cost tracking and per-run cost breakdown +- [x] Checkpoint/resume for crashed runs +- [x] Provider detection and diagnostics (`dispatch providers`) - [x] GitHub Action for scheduled runs - [x] Telemetry and analytics (`dispatch stats`) + +### Planned (Pro/Enterprise) +- [ ] Web dashboard (dispatch.dev) — run history, PR dashboard, analytics +- [ ] Cross-run persistent memory (Tier 3 cloud sync) +- [ ] GitLab and Bitbucket integration +- [ ] Managed AI proxy (no API keys needed) +- [ ] Slack/Discord/Teams notifications +- [ ] Visual workflow builder +- [ ] Team management and shared memory - [ ] Issue decomposition (break large issues into sub-tasks) -- [ ] Learn from PR review feedback -- [ ] Parallel issue solving -- [ ] Web dashboard for run history diff --git a/bin/dispatch.ts b/bin/dispatch.ts index 87843f8..4a3c505 100644 --- a/bin/dispatch.ts +++ b/bin/dispatch.ts @@ -11,6 +11,8 @@ import { registerStatusCommand } from "../src/commands/status.js"; import { registerInitCommand } from "../src/commands/init.js"; import { registerScheduleCommand } from "../src/commands/schedule.js"; import { registerStatsCommand } from "../src/commands/stats.js"; +import { registerProvidersCommand } from "../src/commands/providers.js"; +import { registerLearnCommand } from "../src/commands/learn.js"; const __dirname = dirname(fileURLToPath(import.meta.url)); const pkg = JSON.parse(readFileSync(join(__dirname, "..", "..", "package.json"), "utf-8")); @@ -34,6 +36,8 @@ registerStatusCommand(program); registerInitCommand(program); registerScheduleCommand(program); registerStatsCommand(program); +registerProvidersCommand(program); +registerLearnCommand(program); // Default action (no subcommand) program.action(() => { @@ -53,6 +57,8 @@ program.action(() => { console.log(` ${chalk.yellow("dispatch init")} Initialize config for this repo`); console.log(` ${chalk.yellow("dispatch schedule")} Set up nightly scheduled runs`); console.log(` ${chalk.yellow("dispatch stats")} View historical run statistics`); + console.log(` ${chalk.yellow("dispatch providers")} Show detected AI providers`); + console.log(` ${chalk.yellow("dispatch learn")} Learn from PR review feedback`); console.log(); console.log(chalk.gray(" Run dispatch --help for full options.")); console.log(); diff --git a/claude.md b/claude.md index d7e13ce..2455209 100644 --- a/claude.md +++ b/claude.md @@ -1,7 +1,7 @@ # CLAUDE.md — Dispatch CLI ## Project Overview -Dispatch is an AI-powered CLI tool for solving GitHub issues in batch. It creates branches, implements fixes, and opens PRs using Claude Code. +Dispatch is an AI-powered CLI tool for solving GitHub issues in batch. It creates branches, implements fixes, and opens PRs using multi-provider AI (Claude, Gemini, OpenAI, GitHub Models). v2 adds smart per-phase model routing, a 3-tier memory system, cost tracking, and checkpoint/resume. ## Build and Test Commands - **Build**: `npm run build` @@ -14,16 +14,28 @@ Dispatch is an AI-powered CLI tool for solving GitHub issues in batch. It create - **Module System**: ESM (use `.js` extensions in imports) - **Formatting**: 2-space indentation - **Error Handling**: Use `log.error` from `src/utils/logger.ts` -- **Logging**: Use the `log` utility (info, success, warn, error) +- **Logging**: Use the `log` utility (info, success, warn, error, debug) - **Testing**: Use Node.js native test runner (`node:test`) and `node:assert/strict` - **Git Operations**: Use utilities in `src/utils/git.ts` and `src/utils/worktree.ts` - **Dependency Management**: npm +- **Commit Messages**: Do NOT include Co-Authored-By lines ## Architecture Guide -- `bin/`: CLI entry point -- `src/commands/`: CLI subcommands (run, create, init, status, schedule) -- `src/engine/`: AI adapters -- `src/github/`: GitHub API client (issues, PRs) -- `src/orchestrator/`: Pipeline, classification, and scoring logic -- `src/reporter/`: Run summaries and reports -- `src/utils/`: Shared utilities (config, git, logger, etc.) +- `bin/`: CLI entry point (`dispatch.ts`) +- `src/commands/`: CLI subcommands (run, create, init, status, schedule, stats, providers, learn) +- `src/engine/`: AI engine adapters (claude, gemini, github-models, openai) + agentic loop + tool executor +- `src/router/`: ModelRouter — per-phase model selection, model registry, cost tracking, provider detection +- `src/memory/`: 3-tier memory system — codebase context cache (Tier 1), cross-issue insights (Tier 2), lessons from PR reviews (Tier 3 local) +- `src/github/`: GitHub API client (issues, PRs, comments, labels) +- `src/orchestrator/`: Pipeline (batched), classifier (heuristic + AI), scorer (heuristic adjustment) +- `src/reporter/`: Run summaries with cost breakdowns +- `src/telemetry/`: Anonymous analytics (local stats + optional PostHog) +- `src/utils/`: Shared utilities (config, git, logger, worktree, semaphore) + +## Key v2 Concepts +- **ModelRouter**: Selects optimal model per pipeline phase (cheap for classify/score, strong for solve). Strategies: auto, provider-locked, pinned. +- **Memory Tier 1**: Codebase context cache at `.dispatch/memory/context.json` — project structure, patterns, dependencies. Regenerated when stale. +- **Memory Tier 2**: Cross-issue insights — discoveries from issue N feed into issue N+1. Batched pipeline (not all-parallel). +- **Memory Tier 3 (local)**: Lessons from PR reviews at `.dispatch/memory/lessons.json` — populated by `dispatch learn`. 30-day decay. +- **Cost Tracking**: Every run reports cost breakdown by phase and provider in the summary. +- **Checkpoint/Resume**: Progress saved to `.dispatch/checkpoint.json` after each issue. Use `--resume` to continue. diff --git a/package.json b/package.json index 71a4032..c7aca80 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "dispatch-ai", - "version": "0.1.0-beta.3", + "version": "0.2.0-beta", "description": "Dispatch your GitHub issues. Receive pull requests. AI-powered batch issue solver.", "author": "Mehul Patel ", "license": "Apache-2.0 WITH Commons-Clause", diff --git a/src/commands/learn.ts b/src/commands/learn.ts new file mode 100644 index 0000000..70a9a4f --- /dev/null +++ b/src/commands/learn.ts @@ -0,0 +1,296 @@ +import { Command } from "commander"; +import chalk from "chalk"; +import { readFile, writeFile, mkdir } from "node:fs/promises"; +import { join } from "node:path"; +import { loadConfig } from "../utils/config.js"; +import { getRepoInfo } from "../utils/git.js"; +import { GitHubClient } from "../github/client.js"; +import { log } from "../utils/logger.js"; + +interface Lesson { + source: string; + prNumber: number; + issueNumber: number | null; + lesson: string; + category: "style" | "correctness" | "approach" | "general"; + learnedAt: string; + decayScore: number; +} + +interface LessonsStore { + version: number; + lessons: Lesson[]; + lastScanAt: string; +} + +const LESSONS_FILE = "memory/lessons.json"; +const HALF_LIFE_DAYS = 30; + +export function registerLearnCommand(program: Command) { + program + .command("learn") + .description("Scan Dispatch PRs for review feedback and extract lessons") + .option("--max-prs ", "Max PRs to scan", parseInt) + .option("--show", "Just show current lessons without scanning") + .action(async (options) => { + try { + const cwd = process.cwd(); + const config = await loadConfig(cwd); + const { owner, repo } = await getRepoInfo(cwd); + + const lessonsPath = join(cwd, config.stateDir, LESSONS_FILE); + + // Load existing lessons + let store = await loadLessons(lessonsPath); + + if (options.show) { + printLessons(store); + return; + } + + const github = await GitHubClient.create(owner, repo); + + log.info(`Scanning PRs in ${chalk.bold(`${owner}/${repo}`)} for Dispatch feedback...`); + + // Fetch PRs created by Dispatch + const prs = await github.listDispatchPRs({ + branchPrefix: config.branchPrefix, + state: "all", + maxPRs: options.maxPrs || 50, + }); + + if (prs.length === 0) { + log.info("No Dispatch PRs found. Run `dispatch run` first to create some PRs."); + return; + } + + log.info(`Found ${prs.length} Dispatch PRs to analyze`); + + let newLessonsCount = 0; + + for (const pr of prs) { + // Skip PRs we've already scanned (by checking existing lessons) + if (store.lessons.some((l) => l.prNumber === pr.number)) { + continue; + } + + // Only learn from PRs with feedback + if (pr.state === "open") continue; + + // Extract issue number from branch name + const issueMatch = pr.headBranch.match(/issue-(\d+)/); + const issueNumber = issueMatch ? parseInt(issueMatch[1], 10) : null; + + // Fetch review comments + const reviewComments = await github.fetchPRReviewComments(pr.number); + const issueComments = await github.fetchPRIssueComments(pr.number); + const allComments = [...reviewComments, ...issueComments]; + + // Filter out bot comments and the PR author's own comments + const humanFeedback = allComments.filter((c) => + c.author !== "github-actions[bot]" && + !c.body.includes("Created by Dispatch") && + c.body.trim().length > 10 + ); + + if (humanFeedback.length === 0 && pr.merged) { + // Merged with no feedback = good, but nothing to learn + continue; + } + + if (pr.merged && humanFeedback.length > 0) { + // Merged with review comments — extract lessons + for (const comment of humanFeedback) { + const lesson = extractLesson(comment.body, pr.title); + if (lesson) { + store.lessons.push({ + source: `PR #${pr.number} review by ${comment.author}`, + prNumber: pr.number, + issueNumber, + lesson, + category: categorizeLesson(lesson), + learnedAt: new Date().toISOString(), + decayScore: 1.0, + }); + newLessonsCount++; + } + } + } + + if (!pr.merged && pr.state === "closed") { + // Closed without merge — the approach was wrong + const rejectionComment = humanFeedback[0]; + const reason = rejectionComment + ? extractLesson(rejectionComment.body, pr.title) + : `PR "${pr.title}" was rejected without merge`; + + if (reason) { + store.lessons.push({ + source: `PR #${pr.number} (rejected)`, + prNumber: pr.number, + issueNumber, + lesson: reason, + category: "approach", + learnedAt: new Date().toISOString(), + decayScore: 1.0, + }); + newLessonsCount++; + } + } + } + + // Apply decay to existing lessons + applyDecay(store); + + // Deduplicate similar lessons + deduplicateLessons(store); + + // Save + store.lastScanAt = new Date().toISOString(); + await saveLessons(lessonsPath, store); + + // Print results + console.log(); + if (newLessonsCount > 0) { + log.success(`Learned ${newLessonsCount} new lesson(s) from PR reviews`); + } else { + log.info("No new lessons found from PR reviews"); + } + + printLessons(store); + } catch (err) { + log.error(`Learn failed: ${err instanceof Error ? err.message : err}`); + process.exit(1); + } + }); +} + +function extractLesson(commentBody: string, prTitle: string): string | null { + // Clean up the comment + const cleaned = commentBody + .replace(/```[\s\S]*?```/g, "") // Remove code blocks + .replace(//g, "") // Remove HTML comments + .trim(); + + if (cleaned.length < 15) return null; + if (cleaned.length > 500) return cleaned.substring(0, 500); + + return cleaned; +} + +function categorizeLesson(lesson: string): Lesson["category"] { + const lower = lesson.toLowerCase(); + if (lower.includes("style") || lower.includes("naming") || lower.includes("format")) return "style"; + if (lower.includes("bug") || lower.includes("error") || lower.includes("null") || lower.includes("check")) return "correctness"; + if (lower.includes("approach") || lower.includes("instead") || lower.includes("better") || lower.includes("should")) return "approach"; + return "general"; +} + +function applyDecay(store: LessonsStore): void { + const now = Date.now(); + for (const lesson of store.lessons) { + const ageMs = now - new Date(lesson.learnedAt).getTime(); + const ageDays = ageMs / (1000 * 60 * 60 * 24); + lesson.decayScore = Math.pow(0.5, ageDays / HALF_LIFE_DAYS); + } + + // Remove lessons with very low decay scores + store.lessons = store.lessons.filter((l) => l.decayScore > 0.1); +} + +function deduplicateLessons(store: LessonsStore): void { + const seen = new Set(); + store.lessons = store.lessons.filter((l) => { + const key = l.lesson.substring(0, 100).toLowerCase(); + if (seen.has(key)) return false; + seen.add(key); + return true; + }); +} + +async function loadLessons(path: string): Promise { + try { + const raw = await readFile(path, "utf-8"); + return JSON.parse(raw) as LessonsStore; + } catch { + return { version: 1, lessons: [], lastScanAt: "" }; + } +} + +async function saveLessons(path: string, store: LessonsStore): Promise { + const { dirname } = await import("node:path"); + await mkdir(dirname(path), { recursive: true }); + await writeFile(path, JSON.stringify(store, null, 2), "utf-8"); +} + +function printLessons(store: LessonsStore): void { + if (store.lessons.length === 0) { + console.log(chalk.gray("\n No lessons learned yet.")); + console.log(chalk.gray(" Create some PRs with `dispatch run`, get them reviewed, then run `dispatch learn`.")); + console.log(); + return; + } + + console.log(); + console.log(chalk.bold(` Lessons Learned (${store.lessons.length}):`)); + console.log(); + + // Sort by decay score (most relevant first) + const sorted = [...store.lessons].sort((a, b) => b.decayScore - a.decayScore); + + for (const lesson of sorted.slice(0, 15)) { + const relevance = lesson.decayScore > 0.8 + ? chalk.green("HIGH") + : lesson.decayScore > 0.5 + ? chalk.yellow("MED") + : chalk.gray("LOW"); + + const source = chalk.gray(lesson.source); + const category = chalk.cyan(`[${lesson.category}]`); + + console.log(` ${relevance} ${category} ${source}`); + console.log(` ${lesson.lesson.substring(0, 120)}`); + console.log(); + } + + if (sorted.length > 15) { + console.log(chalk.gray(` ... and ${sorted.length - 15} more lessons`)); + console.log(); + } + + if (store.lastScanAt) { + console.log(chalk.gray(` Last scanned: ${store.lastScanAt}`)); + console.log(); + } +} + +/** Load lessons for injection into solve prompts */ +export async function loadLessonsForPrompt(cwd: string, stateDir: string): Promise { + const path = join(cwd, stateDir, LESSONS_FILE); + const store = await loadLessons(path); + + if (store.lessons.length === 0) return ""; + + applyDecay(store); + + // Get top lessons by relevance + const topLessons = store.lessons + .filter((l) => l.decayScore > 0.3) + .sort((a, b) => b.decayScore - a.decayScore) + .slice(0, 10); + + if (topLessons.length === 0) return ""; + + const lines = [ + "## Lessons from Previous PR Reviews", + "", + "These are patterns learned from human code review of Dispatch's past PRs:", + "", + ]; + + for (const lesson of topLessons) { + lines.push(`- [${lesson.category}] ${lesson.lesson.substring(0, 200)}`); + } + + return lines.join("\n"); +} diff --git a/src/commands/providers.ts b/src/commands/providers.ts new file mode 100644 index 0000000..96b228b --- /dev/null +++ b/src/commands/providers.ts @@ -0,0 +1,95 @@ +import { Command } from "commander"; +import chalk from "chalk"; +import { loadConfig } from "../utils/config.js"; +import { log } from "../utils/logger.js"; + +export function registerProvidersCommand(program: Command) { + program + .command("providers") + .description("Show detected AI providers and model routing configuration") + .action(async () => { + try { + const cwd = process.cwd(); + const config = await loadConfig(cwd); + + const { getDetectedProvidersSummary, ModelRouter, MODEL_REGISTRY } = await import("../router/index.js"); + const providers = getDetectedProvidersSummary(); + + console.log(); + console.log(chalk.bold(" Detected AI Providers:")); + console.log(); + + for (const p of providers) { + const icon = p.available ? chalk.green(" +") : chalk.red(" -"); + const name = p.provider.padEnd(16); + const reason = chalk.gray(p.reason || ""); + console.log(`${icon} ${chalk.bold(name)} ${reason}`); + } + + const available = providers.filter((p) => p.available); + if (available.length === 0) { + console.log(); + console.log(chalk.yellow(" No providers detected. Set at least one API key:")); + console.log(chalk.gray(" ANTHROPIC_API_KEY, GEMINI_API_KEY, GITHUB_TOKEN, or OPENAI_API_KEY")); + console.log(); + return; + } + + // Show model routing + console.log(); + console.log(chalk.bold(` Model Routing (strategy: ${config.routingStrategy}):`)); + console.log(); + + try { + const router = new ModelRouter({ + strategy: config.routingStrategy as any, + preferredProvider: config.provider !== "auto" ? config.provider as any : undefined, + }); + + const phases = ["classify", "solve", "score", "create-issue"] as const; + for (const phase of phases) { + const model = router.getModelForPhase(phase); + const estCost = phase === "solve" + ? (50_000 / 1_000_000 * model.inputCostPer1M + 5_000 / 1_000_000 * model.outputCostPer1M) + : (2_000 / 1_000_000 * model.inputCostPer1M + 200 / 1_000_000 * model.outputCostPer1M); + + console.log( + ` ${chalk.cyan(phase.padEnd(14))} ${chalk.white(model.displayName.padEnd(30))} ` + + `${chalk.gray(`~$${estCost.toFixed(4)}/call`)}` + ); + } + } catch (err) { + console.log(chalk.yellow(` Could not initialize router: ${err instanceof Error ? err.message : err}`)); + } + + // Show registered models + console.log(); + console.log(chalk.bold(" Registered Models:")); + console.log(); + + const byProvider = new Map(); + for (const model of MODEL_REGISTRY) { + const list = byProvider.get(model.provider) || []; + list.push(model); + byProvider.set(model.provider, list); + } + + for (const [provider, models] of byProvider) { + const isAvailable = providers.find((p) => p.provider === provider)?.available; + const status = isAvailable ? chalk.green("[available]") : chalk.gray("[no key]"); + console.log(` ${chalk.bold(provider)} ${status}`); + for (const model of models) { + const costStr = model.inputCostPer1M > 0 + ? `$${model.inputCostPer1M}/$${model.outputCostPer1M} per 1M tokens` + : "free tier"; + console.log(` ${model.displayName.padEnd(28)} ${chalk.gray(costStr)}`); + } + } + + console.log(); + } catch (err) { + log.error(`Providers check failed: ${err instanceof Error ? err.message : err}`); + process.exit(1); + } + }); +} diff --git a/src/commands/run.ts b/src/commands/run.ts index 270735a..5dac0fd 100644 --- a/src/commands/run.ts +++ b/src/commands/run.ts @@ -6,8 +6,9 @@ import { GitHubClient } from "../github/client.js"; import { ClaudeEngine } from "../engine/claude.js"; import { GitHubModelsEngine } from "../engine/github-models.js"; import { GeminiEngine } from "../engine/gemini.js"; -import { runPipeline } from "../orchestrator/pipeline.js"; +import { runPipeline, loadCheckpoint } from "../orchestrator/pipeline.js"; import { log } from "../utils/logger.js"; +import type { ModelRouter as ModelRouterType } from "../router/router.js"; export function registerRunCommand(program: Command) { program @@ -24,6 +25,10 @@ export function registerRunCommand(program: Command) { .option("--base-branch ", "Base branch for PRs (default: main)") .option("--concurrency ", "Number of issues to process in parallel", parseInt) .option("--no-telemetry", "Disable anonymous telemetry for this run") + .option("--provider ", "AI provider: anthropic, gemini, github-models, openai (default: auto)") + .option("--strategy ", "Model routing: auto, provider-locked, pinned (default: auto)") + .option("--no-memory", "Disable memory system (codebase context + cross-issue learning)") + .option("--resume", "Resume from last checkpoint (skip already-processed issues)") .action(async (options) => { try { const cwd = process.cwd(); @@ -42,31 +47,71 @@ export function registerRunCommand(program: Command) { log.info(`Filtering: ${config.labels.join(", ")}`); } - // Create engine + // Create engine via ModelRouter (backward-compatible with --engine flag) + const { ModelRouter } = await import("../router/index.js"); + let engine; - if (config.engine === "claude") { - engine = new ClaudeEngine({ - model: config.model, - maxTurns: config.maxTurnsPerIssue, - }); - } else if (config.engine === "github-models") { - engine = new GitHubModelsEngine({ - model: config.model, - maxTurns: config.maxTurnsPerIssue, - }); - } else if (config.engine === "gemini") { - engine = new GeminiEngine({ - model: config.model === "sonnet" ? "gemini-2.5-pro" : config.model, - maxTurns: config.maxTurnsPerIssue, - }); + let router: ModelRouterType | undefined; + + // Backward compatibility: if --engine is explicitly set, use the old path + if (options.engine) { + log.debug(`Using legacy --engine flag: ${config.engine}`); + if (config.engine === "claude") { + engine = new ClaudeEngine({ model: config.model, maxTurns: config.maxTurnsPerIssue }); + } else if (config.engine === "github-models") { + engine = new GitHubModelsEngine({ model: config.model, maxTurns: config.maxTurnsPerIssue }); + } else if (config.engine === "gemini") { + engine = new GeminiEngine({ + model: config.model === "sonnet" ? "gemini-2.5-pro" : config.model, + maxTurns: config.maxTurnsPerIssue, + }); + } else { + log.error(`Engine "${config.engine}" is not supported.`); + process.exit(1); + } } else { - log.error(`Engine "${config.engine}" is not supported. Use "claude", "github-models", or "gemini".`); - process.exit(1); + // New path: use ModelRouter with provider detection + router = new ModelRouter({ + strategy: config.routingStrategy as any, + preferredProvider: config.provider !== "auto" ? config.provider as any : undefined, + }); + + const solveModel = router.getModelForPhase("solve"); + log.info(`ModelRouter: ${solveModel.displayName} for solving (strategy: ${config.routingStrategy})`); + + // Create engine based on selected provider + const { OpenAIEngine } = await import("../engine/openai.js"); + switch (solveModel.provider) { + case "anthropic": + engine = new ClaudeEngine({ model: solveModel.modelId, maxTurns: config.maxTurnsPerIssue }); + break; + case "gemini": + engine = new GeminiEngine({ model: solveModel.modelId, maxTurns: config.maxTurnsPerIssue }); + break; + case "github-models": + engine = new GitHubModelsEngine({ model: solveModel.modelId, maxTurns: config.maxTurnsPerIssue }); + break; + case "openai": + engine = new OpenAIEngine({ model: solveModel.modelId, maxTurns: config.maxTurnsPerIssue }); + break; + default: + log.error(`Provider "${solveModel.provider}" is not yet supported.`); + process.exit(1); + } } // Create GitHub client const github = await GitHubClient.create(owner, repo); + // Load checkpoint for resume + let skipIssues: number[] = []; + if (options.resume) { + skipIssues = await loadCheckpoint(cwd, config.stateDir); + if (skipIssues.length > 0) { + log.info(`Resuming: ${skipIssues.length} issues already processed`); + } + } + // Run the pipeline const summary = await runPipeline({ config, @@ -74,6 +119,8 @@ export function registerRunCommand(program: Command) { github, cwd, dryRun: options.dryRun, + router, + skipIssues, }); // Exit with appropriate code diff --git a/src/commands/status.ts b/src/commands/status.ts index e22df87..f7ba2de 100644 --- a/src/commands/status.ts +++ b/src/commands/status.ts @@ -1,5 +1,7 @@ import { Command } from "commander"; import chalk from "chalk"; +import { readFile, access } from "node:fs/promises"; +import { join } from "node:path"; import { loadConfig } from "../utils/config.js"; import { loadLastSummary, formatMorningSummary } from "../reporter/summary.js"; import { log } from "../utils/logger.js"; @@ -9,6 +11,7 @@ export function registerStatusCommand(program: Command) { .command("status") .description("View the results of the last dispatch run") .option("--json", "Output as JSON") + .option("--memory", "Show memory system details") .action(async (options) => { try { const cwd = process.cwd(); @@ -30,9 +33,72 @@ export function registerStatusCommand(program: Command) { // Pretty print the morning summary console.log(); console.log(formatMorningSummary(summary)); + + // Show memory details if requested or if there's interesting memory state + if (options.memory || summary.memoryStats) { + await showMemoryDetails(cwd, config.stateDir); + } + + // Show provider info + await showProviderInfo(config); } catch (err) { log.error(`Status failed: ${err instanceof Error ? err.message : err}`); process.exit(1); } }); } + +async function showMemoryDetails(cwd: string, stateDir: string): Promise { + console.log(chalk.bold(" Memory System:")); + + // Check codebase context cache + try { + const contextPath = join(cwd, stateDir, "memory", "context.json"); + await access(contextPath); + const raw = await readFile(contextPath, "utf-8"); + const ctx = JSON.parse(raw) as { generatedAt: string; commitHash: string; tokenEstimate: number }; + const age = Date.now() - new Date(ctx.generatedAt).getTime(); + const ageMinutes = Math.round(age / 1000 / 60); + const ageStr = ageMinutes < 60 + ? `${ageMinutes}m ago` + : ageMinutes < 1440 + ? `${Math.round(ageMinutes / 60)}h ago` + : `${Math.round(ageMinutes / 1440)}d ago`; + + console.log(` ${chalk.green("+")} Codebase context: cached (${ageStr}, ~${ctx.tokenEstimate} tokens, commit ${ctx.commitHash.substring(0, 7)})`); + } catch { + console.log(` ${chalk.gray("-")} Codebase context: not cached`); + } + + // Check lessons + try { + const lessonsPath = join(cwd, stateDir, "memory", "lessons.json"); + await access(lessonsPath); + const raw = await readFile(lessonsPath, "utf-8"); + const store = JSON.parse(raw) as { lessons: unknown[]; lastScanAt: string }; + console.log(` ${chalk.green("+")} Lessons: ${store.lessons.length} learned (last scan: ${store.lastScanAt || "never"})`); + } catch { + console.log(` ${chalk.gray("-")} Lessons: none (run ${chalk.yellow("dispatch learn")} to scan PRs)`); + } + + // Check checkpoint + try { + const cpPath = join(cwd, stateDir, "checkpoint.json"); + await access(cpPath); + const raw = await readFile(cpPath, "utf-8"); + const cp = JSON.parse(raw) as { processedIssues: number[]; timestamp: string }; + console.log(` ${chalk.yellow("!")} Checkpoint: ${cp.processedIssues.length} issues processed (${cp.timestamp})`); + console.log(` ${chalk.gray(`Use ${chalk.yellow("dispatch run --resume")} to continue`)}`); + } catch { + // No checkpoint — that's normal + } + + console.log(); +} + +async function showProviderInfo(config: { provider: string; routingStrategy: string }): Promise { + console.log(chalk.bold(" Configuration:")); + console.log(` Provider: ${config.provider === "auto" ? "auto-detect" : config.provider}`); + console.log(` Strategy: ${config.routingStrategy}`); + console.log(); +} diff --git a/src/engine/base.ts b/src/engine/base.ts index 9170a13..a3b9c38 100644 --- a/src/engine/base.ts +++ b/src/engine/base.ts @@ -1,7 +1,10 @@ import type { Issue, IssueClassification } from "./types.js"; -/** Build a rich context prompt from an issue and its comments */ -export function buildIssuePrompt(issue: Issue): string { +/** Build a rich context prompt from an issue, its comments, and optional memory context */ +export function buildIssuePrompt( + issue: Issue, + options?: { codebaseContext?: string; crossIssueInsights?: string }, +): string { let prompt = `# GitHub Issue #${issue.number}: ${issue.title}\n\n`; if (issue.labels.length > 0) { @@ -10,6 +13,16 @@ export function buildIssuePrompt(issue: Issue): string { prompt += `**Author:** ${issue.author}\n`; prompt += `**Created:** ${issue.createdAt}\n\n`; + // Inject codebase context (Tier 1 memory) + if (options?.codebaseContext) { + prompt += `${options.codebaseContext}\n\n`; + } + + // Inject cross-issue insights (Tier 2 memory) + if (options?.crossIssueInsights) { + prompt += `${options.crossIssueInsights}\n\n`; + } + if (issue.body) { prompt += `## Description\n\n${issue.body}\n\n`; } diff --git a/src/engine/claude.ts b/src/engine/claude.ts index 6b355ab..38d9c9c 100644 --- a/src/engine/claude.ts +++ b/src/engine/claude.ts @@ -170,7 +170,10 @@ export class ClaudeEngine implements AIEngine { async solve(issue: Issue, context: RepoContext): Promise { const classification = issue.classification || "code-fix"; const systemPrompt = SYSTEM_PROMPTS[classification]; - const issuePrompt = buildIssuePrompt(issue); + const issuePrompt = buildIssuePrompt(issue, { + codebaseContext: context.codebaseContext, + crossIssueInsights: context.crossIssueInsights, + }); log.info(`Solving #${issue.number} as "${classification}" with Claude (${this.model})...`); @@ -283,8 +286,28 @@ ${CONFIDENCE_PROMPT}`; issue: Issue, changedFiles: string[] ): Promise<{ score: number; uncertainties: string[] }> { - // Confidence is already scored during solve() via self-assessment - // This method exists for re-scoring if needed - return { score: 5, uncertainties: ["Re-scoring not yet implemented"] }; + const prompt = `Review this issue and the files that were changed. Rate the likelihood the changes correctly solve the issue. + +Issue: #${issue.number} — ${issue.title} +${issue.body ? issue.body.substring(0, 500) : "No description"} + +Changed files: ${changedFiles.join(", ")} + +Respond in JSON: { "score": <1-10>, "uncertainties": ["..."] }`; + + try { + const result = await this.runClaude(prompt, { + maxTurns: 1, + allowedTools: [], + timeout: 30_000, + }); + const parsed = this.parseJSON<{ score: number; uncertainties: string[] }>(result); + return { + score: Math.min(10, Math.max(1, Number(parsed.score) || 5)), + uncertainties: parsed.uncertainties || [], + }; + } catch { + return { score: 5, uncertainties: ["Scoring failed — manual review recommended"] }; + } } } diff --git a/src/engine/gemini.ts b/src/engine/gemini.ts index cf5f98f..67b06af 100644 --- a/src/engine/gemini.ts +++ b/src/engine/gemini.ts @@ -120,7 +120,10 @@ export class GeminiEngine implements AIEngine { async solve(issue: Issue, context: RepoContext): Promise { const classification = issue.classification || "code-fix"; const systemPrompt = SYSTEM_PROMPTS[classification]; - const issuePrompt = buildIssuePrompt(issue); + const issuePrompt = buildIssuePrompt(issue, { + codebaseContext: context.codebaseContext, + crossIssueInsights: context.crossIssueInsights, + }); log.info(`Solving #${issue.number} as "${classification}" with Gemini (${this.model})...`); @@ -224,9 +227,33 @@ ${CONFIDENCE_PROMPT}`; } async scoreConfidence( - _issue: Issue, - _changedFiles: string[], + issue: Issue, + changedFiles: string[], ): Promise<{ score: number; uncertainties: string[] }> { - return { score: 5, uncertainties: ["Re-scoring not yet implemented"] }; + const prompt = `Review this issue and the files that were changed. Rate the likelihood the changes correctly solve the issue. + +Issue: #${issue.number} — ${issue.title} +${issue.body ? issue.body.substring(0, 500) : "No description"} + +Changed files: ${changedFiles.join(", ")} + +Respond in JSON only: { "score": <1-10>, "uncertainties": ["..."] }`; + + try { + const response = await this.client.chat.completions.create({ + model: this.model, + messages: [{ role: "user", content: prompt }], + temperature: 0.1, + max_tokens: 200, + }); + const text = response.choices[0]?.message?.content || ""; + const parsed = this.parseJSON<{ score: number; uncertainties: string[] }>(text); + return { + score: Math.min(10, Math.max(1, Number(parsed.score) || 5)), + uncertainties: parsed.uncertainties || [], + }; + } catch { + return { score: 5, uncertainties: ["Scoring failed — manual review recommended"] }; + } } } diff --git a/src/engine/github-models.ts b/src/engine/github-models.ts index d80d74b..f01eff9 100644 --- a/src/engine/github-models.ts +++ b/src/engine/github-models.ts @@ -120,7 +120,10 @@ export class GitHubModelsEngine implements AIEngine { async solve(issue: Issue, context: RepoContext): Promise { const classification = issue.classification || "code-fix"; const systemPrompt = SYSTEM_PROMPTS[classification]; - const issuePrompt = buildIssuePrompt(issue); + const issuePrompt = buildIssuePrompt(issue, { + codebaseContext: context.codebaseContext, + crossIssueInsights: context.crossIssueInsights, + }); log.info(`Solving #${issue.number} as "${classification}" with GitHub Models (${this.model})...`); @@ -224,9 +227,33 @@ ${CONFIDENCE_PROMPT}`; } async scoreConfidence( - _issue: Issue, - _changedFiles: string[], + issue: Issue, + changedFiles: string[], ): Promise<{ score: number; uncertainties: string[] }> { - return { score: 5, uncertainties: ["Re-scoring not yet implemented"] }; + const prompt = `Review this issue and the files that were changed. Rate the likelihood the changes correctly solve the issue. + +Issue: #${issue.number} — ${issue.title} +${issue.body ? issue.body.substring(0, 500) : "No description"} + +Changed files: ${changedFiles.join(", ")} + +Respond in JSON only: { "score": <1-10>, "uncertainties": ["..."] }`; + + try { + const response = await this.client.chat.completions.create({ + model: this.model, + messages: [{ role: "user", content: prompt }], + temperature: 0.1, + max_tokens: 200, + }); + const text = response.choices[0]?.message?.content || ""; + const parsed = this.parseJSON<{ score: number; uncertainties: string[] }>(text); + return { + score: Math.min(10, Math.max(1, Number(parsed.score) || 5)), + uncertainties: parsed.uncertainties || [], + }; + } catch { + return { score: 5, uncertainties: ["Scoring failed — manual review recommended"] }; + } } } diff --git a/src/engine/openai.ts b/src/engine/openai.ts new file mode 100644 index 0000000..b83d3ee --- /dev/null +++ b/src/engine/openai.ts @@ -0,0 +1,254 @@ +import OpenAI from "openai"; +import { writeFile } from "node:fs/promises"; +import type { + AIEngine, + Issue, + IssueClassification, + RepoContext, + SolveResult, + StructuredIssue, +} from "./types.js"; +import { + buildIssuePrompt, + SYSTEM_PROMPTS, + CONFIDENCE_PROMPT, + CLASSIFICATION_PROMPT, + ISSUE_CREATION_PROMPT, +} from "./base.js"; +import { SOLVE_TOOLS, READ_ONLY_TOOLS } from "./tools/definitions.js"; +import { runAgenticLoop } from "./agentic-loop.js"; +import { log } from "../utils/logger.js"; + +interface OpenAIOptions { + model: string; + maxTurns: number; +} + +export class OpenAIEngine implements AIEngine { + readonly name = "openai"; + private client: OpenAI; + private model: string; + private maxTurns: number; + + constructor(options: OpenAIOptions) { + const apiKey = process.env.OPENAI_API_KEY; + if (!apiKey) { + throw new Error( + "OPENAI_API_KEY is required for the openai engine. " + + "Get an API key at: https://platform.openai.com/api-keys" + ); + } + + this.client = new OpenAI({ + baseURL: "https://api.openai.com/v1", + apiKey, + }); + + this.model = options.model; + this.maxTurns = options.maxTurns; + } + + /** Parse JSON from a model response, handling markdown code blocks */ + private parseJSON(text: string): T { + try { + return JSON.parse(text); + } catch { + const jsonMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?```/); + if (jsonMatch) { + return JSON.parse(jsonMatch[1].trim()); + } + + const objectMatch = text.match(/\{[\s\S]*\}/); + if (objectMatch) { + return JSON.parse(objectMatch[0]); + } + + throw new Error(`Could not parse JSON from response: ${text.substring(0, 200)}`); + } + } + + /** Parse the self-assessment JSON from a solve response */ + private parseAssessment( + text: string, + trackedFiles: string[], + issueNumber: number, + ): SolveResult { + try { + const assessment = this.parseJSON<{ + confidence: number; + summary: string; + uncertainties: string[]; + changedFiles: string[]; + commitMessage: string; + }>(text); + + const rawConfidence = Number(assessment.confidence); + const confidence = Number.isFinite(rawConfidence) + ? Math.min(10, Math.max(1, rawConfidence)) + : 5; + + const allFiles = [...new Set([ + ...(assessment.changedFiles || []), + ...trackedFiles, + ])]; + + return { + success: confidence >= 3, + changedFiles: allFiles, + summary: assessment.summary || "Changes made to resolve the issue.", + confidence, + uncertainties: assessment.uncertainties || [], + commitMessage: assessment.commitMessage || `fix: resolve issue #${issueNumber}`, + }; + } catch { + log.warn("Could not parse assessment from OpenAI response, using defaults"); + return { + success: true, + changedFiles: trackedFiles, + summary: "Changes made to resolve the issue. Assessment parsing failed.", + confidence: 5, + uncertainties: ["Could not self-assess — manual review recommended"], + commitMessage: `fix: resolve issue #${issueNumber}`, + }; + } + } + + async solve(issue: Issue, context: RepoContext): Promise { + const classification = issue.classification || "code-fix"; + const systemPrompt = SYSTEM_PROMPTS[classification]; + const issuePrompt = buildIssuePrompt(issue, { + codebaseContext: context.codebaseContext, + crossIssueInsights: context.crossIssueInsights, + }); + + log.info(`Solving #${issue.number} as "${classification}" with OpenAI (${this.model})...`); + + const combinedPrompt = `${issuePrompt} + +Please solve this issue. When done, DO NOT commit — just make the file changes. + +After making all changes, output your self-assessment as a JSON block at the very end of your response. +${CONFIDENCE_PROMPT}`; + + const timeout = context.timeout ?? 10 * 60 * 1000; + + if (context.issueLogFile) { + await writeFile( + context.issueLogFile, + `--- dispatch: openai engine started at ${new Date().toISOString()} ---\n` + + `--- model: ${this.model} | maxTurns: ${this.maxTurns} | timeout: ${Math.round(timeout / 1000)}s ---\n\n`, + ); + } + + const result = await runAgenticLoop({ + client: this.client, + model: this.model, + messages: [ + { role: "system", content: systemPrompt }, + { role: "user", content: combinedPrompt }, + ], + tools: SOLVE_TOOLS, + toolOptions: { + cwd: context.cwd, + timeout: 30_000, + trackedFiles: new Set(), + }, + maxTurns: this.maxTurns, + timeout, + issueLogFile: context.issueLogFile, + logPrefix: "openai", + }); + + log.debug(`[openai] Completed in ${result.totalTurns} turns`); + return this.parseAssessment(result.finalContent, result.changedFiles, issue.number); + } + + async investigate(issue: Issue, context: RepoContext): Promise { + const investigationIssue: Issue = { ...issue, classification: "investigation" as IssueClassification }; + return this.solve(investigationIssue, context); + } + + async createIssue(description: string, context: RepoContext): Promise { + const prompt = ISSUE_CREATION_PROMPT.replace("{description}", () => description); + + const result = await runAgenticLoop({ + client: this.client, + model: this.model, + messages: [{ role: "user", content: prompt }], + tools: READ_ONLY_TOOLS, + toolOptions: { + cwd: context.cwd, + timeout: 15_000, + trackedFiles: new Set(), + }, + maxTurns: 3, + timeout: 60_000, + logPrefix: "openai", + }); + + return this.parseJSON(result.finalContent); + } + + async classifyIssue(issue: Issue): Promise { + const prompt = CLASSIFICATION_PROMPT + .replace("{title}", () => issue.title) + .replace("{body}", () => (issue.body || "").substring(0, 500)); + + const response = await this.client.chat.completions.create({ + model: this.model, + messages: [{ role: "user", content: prompt }], + temperature: 0.1, + max_tokens: 50, + }); + + const text = response.choices[0]?.message?.content?.trim().toLowerCase().replace(/[^a-z-]/g, "") || ""; + + const valid: IssueClassification[] = [ + "code-fix", "feature", "investigation", "documentation", "audit", "refactor", + ]; + + if (valid.includes(text as IssueClassification)) { + return text as IssueClassification; + } + + if (text.includes("fix") || text.includes("bug")) return "code-fix"; + if (text.includes("feat") || text.includes("enhance")) return "feature"; + if (text.includes("invest") || text.includes("research")) return "investigation"; + if (text.includes("doc")) return "documentation"; + if (text.includes("audit") || text.includes("review")) return "audit"; + if (text.includes("refact")) return "refactor"; + + return "unknown"; + } + + async scoreConfidence( + issue: Issue, + changedFiles: string[], + ): Promise<{ score: number; uncertainties: string[] }> { + const prompt = `Review this issue and the files that were changed. Rate the likelihood the changes correctly solve the issue. + +Issue: #${issue.number} — ${issue.title} +${issue.body ? issue.body.substring(0, 500) : "No description"} + +Changed files: ${changedFiles.join(", ")} + +Respond in JSON only: { "score": <1-10>, "uncertainties": ["..."] }`; + + try { + const response = await this.client.chat.completions.create({ + model: this.model, + messages: [{ role: "user", content: prompt }], + temperature: 0.1, + max_tokens: 200, + }); + const text = response.choices[0]?.message?.content || ""; + const parsed = this.parseJSON<{ score: number; uncertainties: string[] }>(text); + return { + score: Math.min(10, Math.max(1, Number(parsed.score) || 5)), + uncertainties: parsed.uncertainties || [], + }; + } catch { + return { score: 5, uncertainties: ["Scoring failed — manual review recommended"] }; + } + } +} diff --git a/src/engine/types.ts b/src/engine/types.ts index 8d78863..10e925e 100644 --- a/src/engine/types.ts +++ b/src/engine/types.ts @@ -39,6 +39,10 @@ export interface RepoContext { timeout?: number; /** Path to per-issue log file for capturing subprocess output */ issueLogFile?: string; + /** Codebase context from memory system (Tier 1) */ + codebaseContext?: string; + /** Cross-issue insights from memory system (Tier 2) */ + crossIssueInsights?: string; } /** Events emitted by the AI engine during solving */ @@ -62,6 +66,8 @@ export interface SolveResult { uncertainties: string[]; /** Commit message used */ commitMessage: string; + /** Insights extracted from this solve for cross-issue learning */ + insights?: string[]; } /** A structured issue ready to be posted to GitHub */ diff --git a/src/github/client.ts b/src/github/client.ts index fe7bfc0..41c0e3c 100644 --- a/src/github/client.ts +++ b/src/github/client.ts @@ -219,6 +219,85 @@ export class GitHubClient { }); } + /** List pull requests created by Dispatch (matching branch prefix) */ + async listDispatchPRs(options: { + branchPrefix: string; + state?: "open" | "closed" | "all"; + maxPRs?: number; + }): Promise> { + const { branchPrefix, state = "all", maxPRs = 100 } = options; + const prs = await this.octokit.paginate( + this.octokit.pulls.list, + { + owner: this.owner, + repo: this.repo, + state: state as "open" | "closed" | "all", + per_page: Math.min(maxPRs, 100), + sort: "created", + direction: "desc", + }, + (response, done) => { + if (response.data.length >= maxPRs) done(); + return response.data; + } + ); + + return prs + .filter((pr) => pr.head.ref.startsWith(branchPrefix)) + .slice(0, maxPRs) + .map((pr) => ({ + number: pr.number, + title: pr.title, + state: pr.state, + merged: !!pr.merged_at, + headBranch: pr.head.ref, + url: pr.html_url, + createdAt: pr.created_at, + closedAt: pr.closed_at, + })); + } + + /** Fetch review comments on a PR */ + async fetchPRReviewComments(prNumber: number): Promise> { + const comments = await this.octokit.paginate( + this.octokit.pulls.listReviewComments, + { + owner: this.owner, + repo: this.repo, + pull_number: prNumber, + per_page: 100, + } + ); + + return comments.map((c) => ({ + author: c.user?.login || "unknown", + body: c.body || "", + createdAt: c.created_at, + })); + } + + /** Fetch issue comments on a PR (general discussion, not inline review) */ + async fetchPRIssueComments(prNumber: number): Promise> { + return this.fetchIssueComments(prNumber); + } + /** Add a label to an issue */ async addLabel(issueNumber: number, label: string): Promise { try { diff --git a/src/index.ts b/src/index.ts index 294272d..4281144 100644 --- a/src/index.ts +++ b/src/index.ts @@ -2,7 +2,12 @@ export { runPipeline } from "./orchestrator/pipeline.js"; export { ClaudeEngine } from "./engine/claude.js"; export { GitHubModelsEngine } from "./engine/github-models.js"; export { GeminiEngine } from "./engine/gemini.js"; +export { OpenAIEngine } from "./engine/openai.js"; export { GitHubClient } from "./github/client.js"; export { loadConfig } from "./utils/config.js"; +export { ModelRouter } from "./router/router.js"; +export { MemoryManager } from "./memory/manager.js"; export type { DispatchConfig } from "./utils/config.js"; export type { AIEngine, EngineEvent, Issue, IssueClassification } from "./engine/types.js"; +export type { RouterConfig, ModelSpec, AIProvider, PipelinePhase } from "./router/types.js"; +export type { MemoryConfig, CodebaseContext } from "./memory/types.js"; diff --git a/src/memory/budget.ts b/src/memory/budget.ts new file mode 100644 index 0000000..5b3086c --- /dev/null +++ b/src/memory/budget.ts @@ -0,0 +1,102 @@ +import type { ModelSpec } from "../router/types.js"; + +/** Priority levels for context budget allocation */ +export type ContextPriority = "critical" | "high" | "medium" | "low"; + +interface BudgetSlot { + name: string; + priority: ContextPriority; + content: string; + tokenEstimate: number; +} + +/** Priority weights — higher priority content is kept, lower is trimmed first */ +const PRIORITY_ORDER: ContextPriority[] = ["critical", "high", "medium", "low"]; + +/** + * Manages token budget across different context sections. + * Ensures the total prompt stays within the model's context window. + * + * Allocation strategy: + * - Reserve 25% for model output + * - Allocate remaining by priority: critical > high > medium > low + * - Trim lowest-priority content first if over budget + */ +export class ContextBudgetManager { + private slots: BudgetSlot[] = []; + private maxTokens: number; + private reservedForOutput: number; + + constructor(model: ModelSpec, outputReserveFraction: number = 0.25) { + this.maxTokens = model.maxContextTokens; + this.reservedForOutput = Math.floor(this.maxTokens * outputReserveFraction); + } + + /** Available tokens for input */ + get availableTokens(): number { + return this.maxTokens - this.reservedForOutput; + } + + /** Currently used tokens */ + get usedTokens(): number { + return this.slots.reduce((sum, s) => sum + s.tokenEstimate, 0); + } + + /** Remaining tokens */ + get remainingTokens(): number { + return this.availableTokens - this.usedTokens; + } + + /** + * Add a content section with a priority level. + * Tokens are estimated at ~4 chars per token. + */ + addSection(name: string, content: string, priority: ContextPriority): void { + const tokenEstimate = Math.ceil(content.length / 4); + this.slots.push({ name, priority, content, tokenEstimate }); + } + + /** + * Build the final prompt, trimming low-priority content if needed. + * Returns sections in priority order (critical first). + */ + build(): string { + // Sort by priority + const sorted = [...this.slots].sort( + (a, b) => PRIORITY_ORDER.indexOf(a.priority) - PRIORITY_ORDER.indexOf(b.priority), + ); + + const result: string[] = []; + let usedTokens = 0; + const budget = this.availableTokens; + + for (const slot of sorted) { + if (usedTokens + slot.tokenEstimate <= budget) { + result.push(slot.content); + usedTokens += slot.tokenEstimate; + } else { + // Try to include a truncated version + const remainingTokens = budget - usedTokens; + if (remainingTokens > 100) { + const maxChars = remainingTokens * 4; + const truncated = slot.content.substring(0, maxChars) + "\n[... truncated for context budget]"; + result.push(truncated); + usedTokens += remainingTokens; + } + // Skip remaining slots + break; + } + } + + return result.join("\n\n"); + } + + /** Get a summary of budget allocation */ + getSummary(): Record { + const summary: Record = {}; + for (const slot of this.slots) { + summary[slot.name] = { tokens: slot.tokenEstimate, priority: slot.priority }; + } + return summary; + } +} diff --git a/src/memory/codebase-context.ts b/src/memory/codebase-context.ts new file mode 100644 index 0000000..44dbdab --- /dev/null +++ b/src/memory/codebase-context.ts @@ -0,0 +1,224 @@ +import { readFile, writeFile, access, mkdir } from "node:fs/promises"; +import { join, dirname } from "node:path"; +import { execFile } from "node:child_process"; +import { promisify } from "node:util"; +import type { CodebaseContext } from "./types.js"; +import { log } from "../utils/logger.js"; + +const exec = promisify(execFile); + +const CACHE_FILE = "memory/context.json"; + +/** + * Load or generate codebase context. + * Returns cached version if fresh, regenerates if stale. + */ +export async function getCodebaseContext( + cwd: string, + stateDir: string, + maxAgeMs: number, +): Promise { + const cachePath = join(cwd, stateDir, CACHE_FILE); + + // Try loading cache + try { + await access(cachePath); + const raw = await readFile(cachePath, "utf-8"); + const cached = JSON.parse(raw) as CodebaseContext; + + const age = Date.now() - new Date(cached.generatedAt).getTime(); + if (age < maxAgeMs) { + // Check if commit hash still matches HEAD + const currentHash = await getCurrentCommitHash(cwd); + if (currentHash === cached.commitHash) { + log.debug("Using cached codebase context"); + return cached; + } + } + } catch { + // No cache or invalid — regenerate + } + + log.info("Generating codebase context..."); + const context = await generateCodebaseContext(cwd); + + // Save cache + try { + await mkdir(dirname(cachePath), { recursive: true }); + await writeFile(cachePath, JSON.stringify(context, null, 2), "utf-8"); + log.debug("Codebase context cached"); + } catch (err) { + log.debug(`Failed to cache context: ${err}`); + } + + return context; +} + +async function getCurrentCommitHash(cwd: string): Promise { + try { + const { stdout } = await exec("git", ["rev-parse", "HEAD"], { cwd, timeout: 5000 }); + return stdout.trim(); + } catch { + return "unknown"; + } +} + +async function generateCodebaseContext(cwd: string): Promise { + const commitHash = await getCurrentCommitHash(cwd); + + // Generate file tree (depth-limited) + const fileTree = await generateFileTree(cwd, 3); + + // Detect patterns from package.json, config files + const patterns = await detectPatterns(cwd); + + // Find key files + const keyFiles = await findKeyFiles(cwd); + + // Get dependencies + const dependencies = await getDependencies(cwd); + + // Build structure summary + const structure = buildStructureSummary(fileTree, patterns, keyFiles); + + // Rough token estimate (4 chars ≈ 1 token) + const totalText = structure + fileTree + JSON.stringify(patterns) + keyFiles.join("\n"); + const tokenEstimate = Math.ceil(totalText.length / 4); + + return { + generatedAt: new Date().toISOString(), + commitHash, + structure, + patterns, + keyFiles, + dependencies, + fileTree, + tokenEstimate, + }; +} + +async function generateFileTree(cwd: string, maxDepth: number): Promise { + try { + const { stdout } = await exec( + "find", [".", "-maxdepth", String(maxDepth), "-type", "f", + "-not", "-path", "*/node_modules/*", + "-not", "-path", "*/.git/*", + "-not", "-path", "*/dist/*", + ], + { cwd, timeout: 10_000, maxBuffer: 512 * 1024 }, + ); + return stdout.trim(); + } catch { + return "(could not generate file tree)"; + } +} + +async function detectPatterns(cwd: string): Promise { + const patterns: CodebaseContext["patterns"] = { + testFramework: null, + moduleSystem: "unknown", + buildTool: null, + linter: null, + language: "unknown", + packageManager: "npm", + }; + + try { + const pkgRaw = await readFile(join(cwd, "package.json"), "utf-8"); + const pkg = JSON.parse(pkgRaw); + + // Detect module system + patterns.moduleSystem = pkg.type === "module" ? "ESM" : "CommonJS"; + + // Detect language + if (pkg.devDependencies?.typescript || pkg.dependencies?.typescript) { + patterns.language = "TypeScript"; + } else { + patterns.language = "JavaScript"; + } + + // Detect test framework + const scripts = pkg.scripts || {}; + if (scripts.test?.includes("node --test")) patterns.testFramework = "node:test"; + else if (scripts.test?.includes("jest")) patterns.testFramework = "jest"; + else if (scripts.test?.includes("vitest")) patterns.testFramework = "vitest"; + else if (scripts.test?.includes("mocha")) patterns.testFramework = "mocha"; + + // Detect build tool + if (scripts.build?.includes("tsc")) patterns.buildTool = "tsc"; + else if (scripts.build?.includes("esbuild")) patterns.buildTool = "esbuild"; + else if (scripts.build?.includes("vite")) patterns.buildTool = "vite"; + + // Detect linter + if (scripts.lint?.includes("eslint")) patterns.linter = "eslint"; + else if (scripts.lint?.includes("tsc")) patterns.linter = "tsc --noEmit"; + + // Detect package manager + try { + await access(join(cwd, "pnpm-lock.yaml")); + patterns.packageManager = "pnpm"; + } catch { + try { + await access(join(cwd, "yarn.lock")); + patterns.packageManager = "yarn"; + } catch { + patterns.packageManager = "npm"; + } + } + } catch { + // No package.json + } + + return patterns; +} + +async function findKeyFiles(cwd: string): Promise { + const keyPatterns = [ + "package.json", "tsconfig.json", ".eslintrc*", ".prettierrc*", + "README.md", "CLAUDE.md", "CONTRIBUTING.md", + "src/index.ts", "src/index.js", "src/main.ts", "src/main.js", + "src/app.ts", "src/app.js", + ]; + + const found: string[] = []; + for (const pattern of keyPatterns) { + try { + await access(join(cwd, pattern)); + found.push(pattern); + } catch { + // not found + } + } + return found; +} + +async function getDependencies(cwd: string): Promise { + try { + const pkgRaw = await readFile(join(cwd, "package.json"), "utf-8"); + const pkg = JSON.parse(pkgRaw); + return Object.keys(pkg.dependencies || {}); + } catch { + return []; + } +} + +function buildStructureSummary( + fileTree: string, + patterns: CodebaseContext["patterns"], + keyFiles: string[], +): string { + const lines = [ + `Language: ${patterns.language}`, + `Module system: ${patterns.moduleSystem}`, + patterns.buildTool ? `Build: ${patterns.buildTool}` : null, + patterns.testFramework ? `Tests: ${patterns.testFramework}` : null, + patterns.linter ? `Linter: ${patterns.linter}` : null, + `Package manager: ${patterns.packageManager}`, + `Key files: ${keyFiles.join(", ")}`, + "", + "Directory structure:", + fileTree, + ].filter(Boolean); + + return lines.join("\n"); +} diff --git a/src/memory/index.ts b/src/memory/index.ts new file mode 100644 index 0000000..f5f6fdc --- /dev/null +++ b/src/memory/index.ts @@ -0,0 +1,10 @@ +export { MemoryManager } from "./manager.js"; +export { InsightCollector } from "./issue-insights.js"; +export { getCodebaseContext } from "./codebase-context.js"; +export { ContextBudgetManager } from "./budget.js"; +export type { + CodebaseContext, + IssueInsight, + RunInsights, + MemoryConfig, +} from "./types.js"; diff --git a/src/memory/issue-insights.ts b/src/memory/issue-insights.ts new file mode 100644 index 0000000..0ab5bf0 --- /dev/null +++ b/src/memory/issue-insights.ts @@ -0,0 +1,91 @@ +import type { IssueInsight, RunInsights } from "./types.js"; +import type { SolveResult } from "../engine/types.js"; +import { log } from "../utils/logger.js"; + +/** + * Manages cross-issue insights within a single run. + * Insights from batch N are fed into batch N+1. + */ +export class InsightCollector { + private insights: IssueInsight[] = []; + private maxInsights: number; + + constructor(maxInsights: number = 20) { + this.maxInsights = maxInsights; + } + + /** + * Extract insights from a completed solve result. + * Called after each successful issue solve. + */ + addFromSolve( + issueNumber: number, + result: SolveResult, + classification: string, + ): void { + // Extract patterns from changed files + const patterns: string[] = []; + const dirs = new Set(result.changedFiles.map((f) => f.split("/").slice(0, -1).join("/"))); + if (dirs.size > 0) { + patterns.push(`Modified directories: ${[...dirs].join(", ")}`); + } + + // Extract insight from summary + const insight: IssueInsight = { + fromIssue: issueNumber, + insight: `Issue #${issueNumber} (${classification}): ${result.summary}`, + relevantFiles: result.changedFiles.slice(0, 10), // Cap at 10 + patterns, + confidence: result.confidence, + timestamp: Date.now(), + }; + + this.insights.push(insight); + + // Keep only the most recent/high-confidence insights + if (this.insights.length > this.maxInsights) { + this.insights.sort((a, b) => b.confidence - a.confidence); + this.insights = this.insights.slice(0, this.maxInsights); + } + + log.debug(`[memory] Added insight from issue #${issueNumber} (total: ${this.insights.length})`); + } + + /** + * Format insights for injection into an AI prompt. + * Returns a concise summary suitable for the context window. + */ + formatForPrompt(): string { + if (this.insights.length === 0) return ""; + + const lines = [ + "## Insights from previously solved issues in this run", + "", + ]; + + for (const insight of this.insights) { + lines.push(`- ${insight.insight}`); + if (insight.relevantFiles.length > 0) { + lines.push(` Files: ${insight.relevantFiles.join(", ")}`); + } + if (insight.patterns.length > 0) { + lines.push(` ${insight.patterns.join("; ")}`); + } + } + + return lines.join("\n"); + } + + /** Get raw insights data */ + getInsights(): RunInsights { + return { + insights: [...this.insights], + issuesProcessed: this.insights.length, + }; + } + + /** Get count of collected insights */ + get count(): number { + return this.insights.length; + } +} diff --git a/src/memory/manager.ts b/src/memory/manager.ts new file mode 100644 index 0000000..bbed52b --- /dev/null +++ b/src/memory/manager.ts @@ -0,0 +1,137 @@ +import type { CodebaseContext, MemoryConfig, RunInsights } from "./types.js"; +import type { SolveResult } from "../engine/types.js"; +import { getCodebaseContext } from "./codebase-context.js"; +import { InsightCollector } from "./issue-insights.js"; +import { log } from "../utils/logger.js"; +import { loadLessonsForPrompt } from "../commands/learn.js"; + +const DEFAULT_MEMORY_CONFIG: MemoryConfig = { + enableCodebaseContext: true, + enableCrossIssue: true, + cacheMaxAgeMs: 60 * 60 * 1000, // 1 hour + maxInsights: 20, + stateDir: ".dispatch", +}; + +/** + * MemoryManager — manages Tier 1 (codebase context) and Tier 2 (cross-issue insights). + * + * Usage: + * const memory = new MemoryManager(config); + * await memory.initialize(cwd); // loads/generates codebase context + * memory.addInsight(issueNum, result, classification); // after each solve + * const context = memory.getContextForIssue(issue); // before each solve + */ +export class MemoryManager { + private config: MemoryConfig; + private codebaseContext: CodebaseContext | null = null; + private insightCollector: InsightCollector; + private lessonsPrompt: string = ""; + private initialized = false; + + constructor(config: Partial = {}) { + this.config = { ...DEFAULT_MEMORY_CONFIG, ...config }; + this.insightCollector = new InsightCollector(this.config.maxInsights); + } + + /** Initialize memory — load or generate codebase context */ + async initialize(cwd: string): Promise { + if (this.config.enableCodebaseContext) { + try { + this.codebaseContext = await getCodebaseContext( + cwd, + this.config.stateDir, + this.config.cacheMaxAgeMs, + ); + log.debug(`[memory] Codebase context loaded (${this.codebaseContext.tokenEstimate} tokens)`); + } catch (err) { + log.warn(`[memory] Failed to generate codebase context: ${err}`); + } + } + // Load lessons from previous runs (Tier 3 local) + try { + this.lessonsPrompt = await loadLessonsForPrompt(cwd, this.config.stateDir); + if (this.lessonsPrompt) { + log.debug(`[memory] Loaded lessons from previous PR reviews`); + } + } catch { + // No lessons yet — that's fine + } + + this.initialized = true; + } + + /** Get the count of loaded lessons */ + get lessonsCount(): number { + return this.lessonsPrompt ? this.lessonsPrompt.split("\n- ").length - 1 : 0; + } + + /** Get formatted codebase context for prompt injection */ + getCodebaseContextPrompt(): string { + if (!this.codebaseContext) return ""; + + return [ + "## Codebase Context", + "", + this.codebaseContext.structure, + "", + `Dependencies: ${this.codebaseContext.dependencies.join(", ")}`, + ].join("\n"); + } + + /** Get formatted cross-issue insights for prompt injection */ + getInsightsPrompt(): string { + if (!this.config.enableCrossIssue) return ""; + return this.insightCollector.formatForPrompt(); + } + + /** Get lessons prompt */ + getLessonsPrompt(): string { + return this.lessonsPrompt; + } + + /** Get all memory context formatted for prompt injection */ + getFullContextPrompt(): string { + const parts: string[] = []; + + const codebase = this.getCodebaseContextPrompt(); + if (codebase) parts.push(codebase); + + const insights = this.getInsightsPrompt(); + if (insights) parts.push(insights); + + if (this.lessonsPrompt) parts.push(this.lessonsPrompt); + + return parts.join("\n\n"); + } + + /** Add an insight after solving an issue */ + addInsight( + issueNumber: number, + result: SolveResult, + classification: string, + ): void { + if (!this.config.enableCrossIssue) return; + this.insightCollector.addFromSolve(issueNumber, result, classification); + } + + /** Get the insight collector (for advanced usage) */ + getInsightCollector(): InsightCollector { + return this.insightCollector; + } + + /** Get the cached codebase context object */ + getCodebaseContext(): CodebaseContext | null { + return this.codebaseContext; + } + + /** Get run insights */ + getRunInsights(): RunInsights { + return this.insightCollector.getInsights(); + } + + /** Check if initialized */ + get isInitialized(): boolean { + return this.initialized; + } +} diff --git a/src/memory/types.ts b/src/memory/types.ts new file mode 100644 index 0000000..556da37 --- /dev/null +++ b/src/memory/types.ts @@ -0,0 +1,69 @@ +/** + * Memory system types for Dispatch v2. + * Tier 1: Codebase context (cached analysis of repo structure) + * Tier 2: Cross-issue insights (learned within a run) + */ + +/** Cached codebase context — regenerated when stale */ +export interface CodebaseContext { + /** When this cache was generated */ + generatedAt: string; + /** Git commit hash when generated */ + commitHash: string; + /** Repository structure summary */ + structure: string; + /** Key patterns detected (test framework, module system, etc.) */ + patterns: { + testFramework: string | null; + moduleSystem: string; + buildTool: string | null; + linter: string | null; + language: string; + packageManager: string; + }; + /** Important files (entry points, configs, key modules) */ + keyFiles: string[]; + /** Dependencies summary */ + dependencies: string[]; + /** File tree (depth-limited) */ + fileTree: string; + /** Approximate token count of this context */ + tokenEstimate: number; +} + +/** An insight learned from solving one issue, available for subsequent issues */ +export interface IssueInsight { + /** Issue number that produced this insight */ + fromIssue: number; + /** What was learned */ + insight: string; + /** Files that were relevant */ + relevantFiles: string[]; + /** Patterns discovered */ + patterns: string[]; + /** Confidence in this insight (1-10) */ + confidence: number; + /** Timestamp */ + timestamp: number; +} + +/** Collection of insights from current run */ +export interface RunInsights { + insights: IssueInsight[]; + /** Total issues processed so far in this run */ + issuesProcessed: number; +} + +/** Configuration for the memory manager */ +export interface MemoryConfig { + /** Enable Tier 1 codebase context caching */ + enableCodebaseContext: boolean; + /** Enable Tier 2 cross-issue learning */ + enableCrossIssue: boolean; + /** Max age for codebase context cache in ms (default: 1 hour) */ + cacheMaxAgeMs: number; + /** Max number of insights to carry forward between batches */ + maxInsights: number; + /** State directory for cache storage */ + stateDir: string; +} diff --git a/src/orchestrator/pipeline.ts b/src/orchestrator/pipeline.ts index aa5b7e5..8f3442b 100644 --- a/src/orchestrator/pipeline.ts +++ b/src/orchestrator/pipeline.ts @@ -3,6 +3,7 @@ import ora from "ora"; import type { AIEngine, Issue, SolveResult } from "../engine/types.js"; import type { GitHubClient } from "../github/client.js"; import type { DispatchConfig } from "../utils/config.js"; +import type { ModelRouter } from "../router/router.js"; import { fetchAndHydrateIssues, prioritizeIssues, slugifyTitle } from "../github/issues.js"; import { createPRForIssue } from "../github/pulls.js"; import { @@ -16,6 +17,7 @@ import { adjustConfidence } from "../orchestrator/scorer.js"; import { Semaphore } from "../utils/semaphore.js"; import { createWorktree, removeWorktree, getWorktreePath, cleanupAllWorktrees } from "../utils/worktree.js"; import { TelemetryCollector } from "../telemetry/collector.js"; +import { MemoryManager } from "../memory/manager.js"; import { getAnonymousId, sendTelemetryEvent } from "../telemetry/remote.js"; import { updateStats } from "../telemetry/store.js"; @@ -25,13 +27,25 @@ export interface PipelineOptions { github: GitHubClient; cwd: string; dryRun?: boolean; + /** Optional ModelRouter for cost tracking and per-phase routing */ + router?: ModelRouter; + /** Optional: skip issues already processed (for resume) */ + skipIssues?: number[]; } export async function runPipeline(options: PipelineOptions): Promise { - const { config, engine, github, cwd, dryRun = false } = options; + const { config, engine, github, cwd, dryRun = false, router, skipIssues = [] } = options; const startTime = Date.now(); const telemetry = new TelemetryCollector(); + // Initialize memory system + const memory = new MemoryManager({ + stateDir: config.stateDir, + enableCodebaseContext: config.enableCodebaseContext, + enableCrossIssue: config.enableCrossIssue, + }); + await memory.initialize(cwd); + // Initialize file-based logging await initFileLogging(config.stateDir, cwd); @@ -90,7 +104,16 @@ export async function runPipeline(options: PipelineOptions): Promise } // 3. Prioritize - const issues = prioritizeIssues(allIssues); + let issues = prioritizeIssues(allIssues); + + // Filter out already-processed issues (for --resume) + if (skipIssues.length > 0) { + const before = issues.length; + issues = issues.filter((i) => !skipIssues.includes(i.number)); + if (before !== issues.length) { + log.info(`Resuming: skipping ${before - issues.length} already-processed issues`); + } + } log.info(`Processing ${issues.length} issues:\n`); for (const issue of issues) { @@ -164,6 +187,8 @@ export async function runPipeline(options: PipelineOptions): Promise cwd: worktreePath, timeout: config.timeoutPerIssue, issueLogFile, + codebaseContext: memory.getCodebaseContextPrompt(), + crossIssueInsights: memory.getInsightsPrompt(), }; if (isInvestigation) { @@ -184,6 +209,25 @@ export async function runPipeline(options: PipelineOptions): Promise // Adjust confidence with heuristics result = adjustConfidence(result); + // Post-solve test verification + try { + const { execFile: execFileSync } = await import("node:child_process"); + const { promisify } = await import("node:util"); + const execAsync = promisify(execFileSync); + await execAsync( + "npm", ["test"], + { cwd: worktreePath, timeout: 60_000, maxBuffer: 5 * 1024 * 1024 }, + ); + log.debug(`[#${issue.number}] Tests passed`); + } catch { + log.warn(`[#${issue.number}] Tests failed after solve — lowering confidence`); + result = { + ...result, + confidence: Math.max(1, result.confidence - 2), + uncertainties: [...result.uncertainties, "Tests failed after changes were made"], + }; + } + // Commit and push (operates within the worktree directory) const { hasChanges } = await commitAndPush(branchName, result.commitMessage, worktreePath); @@ -231,6 +275,9 @@ export async function runPipeline(options: PipelineOptions): Promise issueStatus = "solved"; + // Collect insight for cross-issue learning + memory.addInsight(issue.number, result, issue.classification || "unknown"); + telemetry.recordIssue({ issueNumber: issue.number, classification: issue.classification || "unknown", @@ -253,6 +300,9 @@ export async function runPipeline(options: PipelineOptions): Promise }); log.success(`[#${issue.number}] PR #${pr.number} created (confidence: ${result.confidence}/10)`); + + // Save checkpoint for resume capability + await saveCheckpoint(cwd, config.stateDir, issueSummaries); } catch (err) { const errorMsg = err instanceof Error ? err.message : String(err); log.error(`[#${issue.number}] Failed: ${errorMsg}`); @@ -291,7 +341,23 @@ export async function runPipeline(options: PipelineOptions): Promise } }; - await Promise.all(issues.map(processIssue)); + // Process issues in batches for cross-issue learning + const batchSize = config.concurrency; + for (let i = 0; i < issues.length; i += batchSize) { + const batch = issues.slice(i, i + batchSize); + const batchNum = Math.floor(i / batchSize) + 1; + const totalBatches = Math.ceil(issues.length / batchSize); + + log.info(`\nBatch ${batchNum}/${totalBatches} (${batch.length} issues)`); + + // Process batch in parallel (with semaphore for concurrency control) + await Promise.all(batch.map(processIssue)); + + // Log insight count between batches + if (memory.getInsightCollector().count > 0 && i + batchSize < issues.length) { + log.info(`[memory] ${memory.getInsightCollector().count} insights collected, feeding into next batch`); + } + } // 5. Generate summary const summary: RunSummary = { @@ -302,8 +368,26 @@ export async function runPipeline(options: PipelineOptions): Promise totalSolved: issueSummaries.filter((i) => i.status === "solved").length, totalFailed: issueSummaries.filter((i) => i.status === "failed").length, prsCreated, + costSummary: router ? (() => { + const cs = router.getCostSummary(); + return { + totalCostUSD: cs.totalCostUSD, + byPhase: cs.byPhase as Record, + byProvider: cs.byProvider, + totalInputTokens: cs.totalInputTokens, + totalOutputTokens: cs.totalOutputTokens, + }; + })() : undefined, + memoryStats: { + codebaseContextLoaded: !!memory.getCodebaseContext(), + insightsCollected: memory.getInsightCollector().count, + lessonsLoaded: memory.lessonsCount, + }, }; + // Clear checkpoint on successful completion + await clearCheckpoint(cwd, config.stateDir); + await saveSummary(summary, cwd, config.stateDir); printSummary(summary); @@ -371,4 +455,75 @@ function printSummary(summary: RunSummary) { } console.log(); } + + // Cost breakdown + if (summary.costSummary && summary.costSummary.totalCostUSD > 0) { + console.log(chalk.bold(" Cost Breakdown:")); + for (const [phase, cost] of Object.entries(summary.costSummary.byPhase)) { + if (cost > 0) { + console.log(` ${chalk.cyan(phase.padEnd(14))} $${cost.toFixed(4)}`); + } + } + console.log(chalk.gray(` ${"─".repeat(30)}`)); + console.log(` ${chalk.bold("Total")}${" ".repeat(9)} ${chalk.yellow(`$${summary.costSummary.totalCostUSD.toFixed(4)}`)}`); + console.log(); + } + + // Memory stats + if (summary.memoryStats) { + const m = summary.memoryStats; + const parts: string[] = []; + if (m.codebaseContextLoaded) parts.push("context cached"); + if (m.insightsCollected > 0) parts.push(`${m.insightsCollected} insights`); + if (m.lessonsLoaded > 0) parts.push(`${m.lessonsLoaded} lessons`); + if (parts.length > 0) { + console.log(chalk.gray(` Memory: ${parts.join(", ")}`)); + console.log(); + } + } +} + +/** Save checkpoint for resume capability */ +async function saveCheckpoint( + cwd: string, + stateDir: string, + issueSummaries: IssueSummary[], +): Promise { + try { + const { mkdir, writeFile } = await import("node:fs/promises"); + const { join } = await import("node:path"); + const dir = join(cwd, stateDir); + await mkdir(dir, { recursive: true }); + const checkpoint = { + processedIssues: issueSummaries.map((i) => i.number), + timestamp: new Date().toISOString(), + }; + await writeFile(join(dir, "checkpoint.json"), JSON.stringify(checkpoint, null, 2), "utf-8"); + } catch { + // Non-fatal + } +} + +/** Clear checkpoint after successful run */ +async function clearCheckpoint(cwd: string, stateDir: string): Promise { + try { + const { unlink } = await import("node:fs/promises"); + const { join } = await import("node:path"); + await unlink(join(cwd, stateDir, "checkpoint.json")); + } catch { + // Not found or already cleared + } +} + +/** Load checkpoint for resume */ +export async function loadCheckpoint(cwd: string, stateDir: string): Promise { + try { + const { readFile } = await import("node:fs/promises"); + const { join } = await import("node:path"); + const raw = await readFile(join(cwd, stateDir, "checkpoint.json"), "utf-8"); + const data = JSON.parse(raw) as { processedIssues: number[] }; + return data.processedIssues || []; + } catch { + return []; + } } diff --git a/src/reporter/summary.ts b/src/reporter/summary.ts index 5e137f7..39f2f7c 100644 --- a/src/reporter/summary.ts +++ b/src/reporter/summary.ts @@ -13,6 +13,14 @@ export interface IssueSummary { error?: string; } +export interface CostSummary { + totalCostUSD: number; + byPhase: Record; + byProvider: Record; + totalInputTokens: number; + totalOutputTokens: number; +} + export interface RunSummary { startedAt: string; duration: number; @@ -21,6 +29,12 @@ export interface RunSummary { totalSolved: number; totalFailed: number; prsCreated: Array<{ number: number; url: string; issueNumber: number }>; + costSummary?: CostSummary; + memoryStats?: { + codebaseContextLoaded: boolean; + insightsCollected: number; + lessonsLoaded: number; + }; } export async function saveSummary( @@ -105,12 +119,36 @@ export function formatMorningSummary(summary: RunSummary): string { const noChanges = summary.issues.filter((i) => i.status === "no-changes"); if (noChanges.length > 0) { - lines.push(`⚪ No changes needed:`); + lines.push(` No changes needed:`); for (const issue of noChanges) { lines.push(` #${issue.number}: ${issue.title}`); } lines.push(``); } + if (summary.costSummary && summary.costSummary.totalCostUSD > 0) { + lines.push(` Cost Breakdown:`); + for (const [phase, cost] of Object.entries(summary.costSummary.byPhase)) { + if (cost > 0) { + lines.push(` ${phase.padEnd(14)} $${cost.toFixed(4)}`); + } + } + lines.push(` ${"─".repeat(30)}`); + lines.push(` Total $${summary.costSummary.totalCostUSD.toFixed(4)}`); + lines.push(``); + } + + if (summary.memoryStats) { + const m = summary.memoryStats; + const parts: string[] = []; + if (m.codebaseContextLoaded) parts.push("context cached"); + if (m.insightsCollected > 0) parts.push(`${m.insightsCollected} insights`); + if (m.lessonsLoaded > 0) parts.push(`${m.lessonsLoaded} lessons`); + if (parts.length > 0) { + lines.push(` Memory: ${parts.join(", ")}`); + lines.push(``); + } + } + return lines.join("\n"); } diff --git a/src/router/detect.ts b/src/router/detect.ts new file mode 100644 index 0000000..1e83956 --- /dev/null +++ b/src/router/detect.ts @@ -0,0 +1,106 @@ +import { execFile } from "node:child_process"; +import { promisify } from "node:util"; +import type { AIProvider } from "./types.js"; + +const exec = promisify(execFile); + +interface DetectedProvider { + provider: AIProvider; + available: boolean; + reason?: string; +} + +/** + * Detect which AI providers are available in the current environment. + * Checks API keys and installed CLI tools. + */ +export async function detectProviders(): Promise { + const results: DetectedProvider[] = []; + + // Anthropic: check for ANTHROPIC_API_KEY or claude CLI + const hasAnthropicKey = !!process.env.ANTHROPIC_API_KEY; + let hasClaudeCli = false; + try { + await exec("claude", ["--version"], { timeout: 5000 }); + hasClaudeCli = true; + } catch { + // not installed + } + results.push({ + provider: "anthropic", + available: hasAnthropicKey || hasClaudeCli, + reason: hasAnthropicKey + ? "ANTHROPIC_API_KEY set" + : hasClaudeCli + ? "claude CLI installed" + : "No ANTHROPIC_API_KEY and claude CLI not found", + }); + + // Gemini: check for GEMINI_API_KEY or GOOGLE_API_KEY + const hasGeminiKey = !!(process.env.GEMINI_API_KEY || process.env.GOOGLE_API_KEY); + results.push({ + provider: "gemini", + available: hasGeminiKey, + reason: hasGeminiKey ? "API key set" : "No GEMINI_API_KEY or GOOGLE_API_KEY", + }); + + // GitHub Models: check for GITHUB_TOKEN (also used for repo access) + const hasGHToken = !!process.env.GITHUB_TOKEN; + results.push({ + provider: "github-models", + available: hasGHToken, + reason: hasGHToken ? "GITHUB_TOKEN set" : "No GITHUB_TOKEN", + }); + + // OpenAI: check for OPENAI_API_KEY + const hasOpenAIKey = !!process.env.OPENAI_API_KEY; + results.push({ + provider: "openai", + available: hasOpenAIKey, + reason: hasOpenAIKey ? "OPENAI_API_KEY set" : "No OPENAI_API_KEY", + }); + + return results; +} + +/** Get all detected providers with their status */ +export function getDetectedProvidersSummary(): DetectedProvider[] { + // Sync version for display purposes + const results: DetectedProvider[] = []; + const hasAnthropicKey = !!process.env.ANTHROPIC_API_KEY; + results.push({ + provider: "anthropic", + available: hasAnthropicKey, + reason: hasAnthropicKey ? "ANTHROPIC_API_KEY set" : "No ANTHROPIC_API_KEY", + }); + const hasGeminiKey = !!(process.env.GEMINI_API_KEY || process.env.GOOGLE_API_KEY); + results.push({ + provider: "gemini", + available: hasGeminiKey, + reason: hasGeminiKey ? "API key set" : "No GEMINI_API_KEY or GOOGLE_API_KEY", + }); + const hasGHToken = !!process.env.GITHUB_TOKEN; + results.push({ + provider: "github-models", + available: hasGHToken, + reason: hasGHToken ? "GITHUB_TOKEN set" : "No GITHUB_TOKEN", + }); + const hasOpenAIKey = !!process.env.OPENAI_API_KEY; + results.push({ + provider: "openai", + available: hasOpenAIKey, + reason: hasOpenAIKey ? "OPENAI_API_KEY set" : "No OPENAI_API_KEY", + }); + return results; +} + +/** Get the first available provider, preferring in order: anthropic > gemini > github-models > openai */ +export async function getDefaultProvider(): Promise { + const providers = await detectProviders(); + const preferred: AIProvider[] = ["anthropic", "gemini", "github-models", "openai"]; + for (const p of preferred) { + const match = providers.find((d) => d.provider === p && d.available); + if (match) return match.provider; + } + return null; +} diff --git a/src/router/index.ts b/src/router/index.ts new file mode 100644 index 0000000..f0cad75 --- /dev/null +++ b/src/router/index.ts @@ -0,0 +1,12 @@ +export { ModelRouter } from "./router.js"; +export { detectProviders, getDefaultProvider, getDetectedProvidersSummary } from "./detect.js"; +export { findModel, modelsForProvider, MODEL_REGISTRY } from "./models.js"; +export type { + AIProvider, + CostEntry, + ModelSpec, + PipelinePhase, + RouterConfig, + RoutingStrategy, + RunCostSummary, +} from "./types.js"; diff --git a/src/router/models.ts b/src/router/models.ts new file mode 100644 index 0000000..101427f --- /dev/null +++ b/src/router/models.ts @@ -0,0 +1,128 @@ +import type { ModelSpec } from "./types.js"; + +/** + * Registry of known models with pricing and capabilities. + * Prices are approximate and should be updated periodically. + */ +export const MODEL_REGISTRY: ModelSpec[] = [ + // Anthropic + { + provider: "anthropic", + modelId: "claude-sonnet-4-20250514", + displayName: "Claude Sonnet 4", + inputCostPer1M: 3, + outputCostPer1M: 15, + maxContextTokens: 200_000, + recommendedPhases: ["solve", "create-issue"], + }, + { + provider: "anthropic", + modelId: "claude-haiku-3-5-20241022", + displayName: "Claude Haiku 3.5", + inputCostPer1M: 0.8, + outputCostPer1M: 4, + maxContextTokens: 200_000, + recommendedPhases: ["classify", "score"], + }, + // Gemini + { + provider: "gemini", + modelId: "gemini-2.5-pro", + displayName: "Gemini 2.5 Pro", + inputCostPer1M: 1.25, + outputCostPer1M: 10, + maxContextTokens: 1_000_000, + recommendedPhases: ["solve", "create-issue"], + }, + { + provider: "gemini", + modelId: "gemini-2.5-flash", + displayName: "Gemini 2.5 Flash", + inputCostPer1M: 0.15, + outputCostPer1M: 0.60, + maxContextTokens: 1_000_000, + recommendedPhases: ["classify", "score"], + }, + // GitHub Models (free tier — costs are $0 for rate-limited access) + { + provider: "github-models", + modelId: "openai/gpt-4.1", + displayName: "GPT-4.1 (GitHub Models)", + inputCostPer1M: 0, + outputCostPer1M: 0, + maxContextTokens: 128_000, + recommendedPhases: ["classify", "solve", "score", "create-issue"], + }, + { + provider: "github-models", + modelId: "openai/gpt-4.1-mini", + displayName: "GPT-4.1 Mini (GitHub Models)", + inputCostPer1M: 0, + outputCostPer1M: 0, + maxContextTokens: 128_000, + recommendedPhases: ["classify", "score"], + }, + // OpenAI (direct API) + { + provider: "openai", + modelId: "gpt-4.1", + displayName: "GPT-4.1", + inputCostPer1M: 2, + outputCostPer1M: 8, + maxContextTokens: 1_000_000, + recommendedPhases: ["classify", "solve", "score", "create-issue"], + }, + { + provider: "openai", + modelId: "gpt-4.1-mini", + displayName: "GPT-4.1 Mini", + inputCostPer1M: 0.4, + outputCostPer1M: 1.6, + maxContextTokens: 1_000_000, + recommendedPhases: ["classify", "score"], + }, + { + provider: "openai", + modelId: "o3-mini", + displayName: "o3-mini", + inputCostPer1M: 1.1, + outputCostPer1M: 4.4, + maxContextTokens: 200_000, + recommendedPhases: ["solve", "create-issue"], + }, +]; + +/** Find a model spec by ID (exact match) */ +export function findModel(modelId: string): ModelSpec | undefined { + return MODEL_REGISTRY.find((m) => m.modelId === modelId); +} + +/** Find all models for a provider */ +export function modelsForProvider(provider: string): ModelSpec[] { + return MODEL_REGISTRY.filter((m) => m.provider === provider); +} + +/** Find the cheapest model recommended for a phase from a provider */ +export function cheapestForPhase(phase: string, provider?: string): ModelSpec | undefined { + let candidates = MODEL_REGISTRY.filter((m) => + m.recommendedPhases.includes(phase as any) + ); + if (provider) { + candidates = candidates.filter((m) => m.provider === provider); + } + if (candidates.length === 0) return undefined; + return candidates.sort((a, b) => a.inputCostPer1M - b.inputCostPer1M)[0]; +} + +/** Find the strongest model recommended for a phase from a provider */ +export function strongestForPhase(phase: string, provider?: string): ModelSpec | undefined { + let candidates = MODEL_REGISTRY.filter((m) => + m.recommendedPhases.includes(phase as any) + ); + if (provider) { + candidates = candidates.filter((m) => m.provider === provider); + } + if (candidates.length === 0) return undefined; + // Higher cost = generally stronger + return candidates.sort((a, b) => b.inputCostPer1M - a.inputCostPer1M)[0]; +} diff --git a/src/router/router.ts b/src/router/router.ts new file mode 100644 index 0000000..2a8fad1 --- /dev/null +++ b/src/router/router.ts @@ -0,0 +1,128 @@ +import type { + AIProvider, + CostEntry, + ModelSpec, + PipelinePhase, + RouterConfig, + RunCostSummary, +} from "./types.js"; +import { findModel, cheapestForPhase, strongestForPhase, MODEL_REGISTRY } from "./models.js"; +import { log } from "../utils/logger.js"; + +export class ModelRouter { + private config: RouterConfig; + private costEntries: CostEntry[] = []; + + constructor(config: RouterConfig) { + this.config = config; + } + + /** + * Get the model to use for a given pipeline phase. + * + * Strategy logic: + * - "pinned": Use config.pinnedModel for everything + * - "provider-locked": Use cheap model from preferred provider for classify/score, + * strong model for solve/create-issue + * - "auto": Pick the best model per phase across all available providers + * + * Phase overrides always take precedence. + */ + getModelForPhase(phase: PipelinePhase): ModelSpec { + // Phase overrides always win + if (this.config.phaseOverrides?.[phase]) { + const override = findModel(this.config.phaseOverrides[phase]!); + if (override) return override; + log.warn(`Phase override model "${this.config.phaseOverrides[phase]}" not found in registry, falling back`); + } + + switch (this.config.strategy) { + case "pinned": { + if (this.config.pinnedModel) { + const model = findModel(this.config.pinnedModel); + if (model) return model; + log.warn(`Pinned model "${this.config.pinnedModel}" not found in registry`); + } + // Fall through to auto + return this.autoSelectModel(phase); + } + + case "provider-locked": { + const provider = this.config.preferredProvider; + if (!provider) { + log.warn("provider-locked strategy but no preferredProvider set, falling back to auto"); + return this.autoSelectModel(phase); + } + + // Use cheap model for lightweight phases, strong for heavy phases + if (phase === "classify" || phase === "score") { + const cheap = cheapestForPhase(phase, provider); + if (cheap) return cheap; + } + const strong = strongestForPhase(phase, provider); + if (strong) return strong; + + // Fallback: any model from that provider + const fallback = MODEL_REGISTRY.find((m) => m.provider === provider); + if (fallback) return fallback; + + log.warn(`No models found for provider "${provider}", falling back to auto`); + return this.autoSelectModel(phase); + } + + case "auto": + default: + return this.autoSelectModel(phase); + } + } + + /** Auto-select: cheap for classify/score, strong for solve/create-issue */ + private autoSelectModel(phase: PipelinePhase): ModelSpec { + if (phase === "classify" || phase === "score") { + return cheapestForPhase(phase) || MODEL_REGISTRY[0]; + } + return strongestForPhase(phase) || MODEL_REGISTRY[0]; + } + + /** Record a cost entry after an API call */ + recordCost(entry: Omit): void { + this.costEntries.push({ ...entry, timestamp: Date.now() }); + } + + /** Estimate cost given token counts and a model spec */ + estimateCost(model: ModelSpec, inputTokens: number, outputTokens: number): number { + return ( + (inputTokens / 1_000_000) * model.inputCostPer1M + + (outputTokens / 1_000_000) * model.outputCostPer1M + ); + } + + /** Get the full run cost summary */ + getCostSummary(): RunCostSummary { + const byPhase: Record = {}; + const byProvider: Record = {}; + let totalInputTokens = 0; + let totalOutputTokens = 0; + + for (const entry of this.costEntries) { + byPhase[entry.phase] = (byPhase[entry.phase] || 0) + entry.estimatedCostUSD; + byProvider[entry.provider] = (byProvider[entry.provider] || 0) + entry.estimatedCostUSD; + totalInputTokens += entry.inputTokens; + totalOutputTokens += entry.outputTokens; + } + + return { + totalCostUSD: this.costEntries.reduce((sum, e) => sum + e.estimatedCostUSD, 0), + byPhase: byPhase as Record, + byProvider, + totalInputTokens, + totalOutputTokens, + entries: [...this.costEntries], + }; + } + + /** Reset cost tracking (between runs) */ + resetCosts(): void { + this.costEntries = []; + } +} diff --git a/src/router/types.ts b/src/router/types.ts new file mode 100644 index 0000000..66c51b3 --- /dev/null +++ b/src/router/types.ts @@ -0,0 +1,64 @@ +/** + * Model routing types for Dispatch v2. + * The ModelRouter selects the optimal model per pipeline phase. + */ + +/** Pipeline phases that need model selection */ +export type PipelinePhase = "classify" | "solve" | "score" | "create-issue"; + +/** Supported AI providers */ +export type AIProvider = "anthropic" | "gemini" | "github-models" | "openai"; + +/** A specific model with its metadata */ +export interface ModelSpec { + /** Provider identifier */ + provider: AIProvider; + /** Model identifier (e.g., "claude-sonnet-4-20250514", "gemini-2.5-pro") */ + modelId: string; + /** Human-friendly display name */ + displayName: string; + /** Cost per 1M input tokens in USD */ + inputCostPer1M: number; + /** Cost per 1M output tokens in USD */ + outputCostPer1M: number; + /** Max context window in tokens */ + maxContextTokens: number; + /** Recommended phases for this model */ + recommendedPhases: PipelinePhase[]; +} + +/** Routing strategy */ +export type RoutingStrategy = "auto" | "provider-locked" | "pinned"; + +/** ModelRouter configuration */ +export interface RouterConfig { + /** Routing strategy */ + strategy: RoutingStrategy; + /** When strategy is "provider-locked", which provider to use */ + preferredProvider?: AIProvider; + /** When strategy is "pinned", model ID to use for all phases */ + pinnedModel?: string; + /** Override model for a specific phase */ + phaseOverrides?: Partial>; +} + +/** Cost tracking for a single API call */ +export interface CostEntry { + phase: PipelinePhase; + provider: AIProvider; + model: string; + inputTokens: number; + outputTokens: number; + estimatedCostUSD: number; + timestamp: number; +} + +/** Aggregated cost for a run */ +export interface RunCostSummary { + totalCostUSD: number; + byPhase: Record; + byProvider: Record; + totalInputTokens: number; + totalOutputTokens: number; + entries: CostEntry[]; +} diff --git a/src/utils/config.ts b/src/utils/config.ts index 3d2657b..6564a5a 100644 --- a/src/utils/config.ts +++ b/src/utils/config.ts @@ -31,6 +31,14 @@ export interface DispatchConfig { timeoutPerIssue: number; /** Number of issues to process in parallel (default: 3) */ concurrency: number; + /** AI provider: "anthropic", "gemini", "github-models", "openai" (default: auto-detect) */ + provider: string; + /** Model routing strategy: "auto" | "provider-locked" | "pinned" (default: "auto") */ + routingStrategy: string; + /** Enable codebase context caching (Tier 1 memory) */ + enableCodebaseContext: boolean; + /** Enable cross-issue learning (Tier 2 memory) */ + enableCrossIssue: boolean; /** Enable anonymous telemetry (default: true). Set to false to opt out of remote analytics. */ telemetry: boolean; /** PostHog host URL for self-hosted instances (default: https://app.posthog.com) */ @@ -54,6 +62,10 @@ const DEFAULT_CONFIG: DispatchConfig = { stateDir: ".dispatch", timeoutPerIssue: 10 * 60 * 1000, // 10 minutes concurrency: 3, + provider: "auto", + routingStrategy: "auto", + enableCodebaseContext: true, + enableCrossIssue: true, telemetry: true, posthogHost: "https://app.posthog.com", // Write-only key — safe to embed. Can only send events, not read data. @@ -131,6 +143,13 @@ export function applyCliOverrides(config: DispatchConfig, options: Record { + it("returns empty array when no checkpoint exists", async () => { + const result = await loadCheckpoint("/tmp/nonexistent-dispatch-test", ".dispatch"); + assert.deepStrictEqual(result, []); + }); + + it("loads checkpoint with processed issues", async () => { + const testDir = join(tmpdir(), `dispatch-checkpoint-test-${Date.now()}`); + const stateDir = join(testDir, ".dispatch"); + await mkdir(stateDir, { recursive: true }); + + const checkpoint = { + processedIssues: [1, 3, 7], + timestamp: new Date().toISOString(), + }; + await writeFile(join(stateDir, "checkpoint.json"), JSON.stringify(checkpoint), "utf-8"); + + const result = await loadCheckpoint(testDir, ".dispatch"); + assert.deepStrictEqual(result, [1, 3, 7]); + + await rm(testDir, { recursive: true, force: true }); + }); + + it("handles malformed checkpoint gracefully", async () => { + const testDir = join(tmpdir(), `dispatch-checkpoint-test-${Date.now()}`); + const stateDir = join(testDir, ".dispatch"); + await mkdir(stateDir, { recursive: true }); + + await writeFile(join(stateDir, "checkpoint.json"), "not-valid-json", "utf-8"); + + const result = await loadCheckpoint(testDir, ".dispatch"); + assert.deepStrictEqual(result, []); + + await rm(testDir, { recursive: true, force: true }); + }); +}); diff --git a/test/learn.test.ts b/test/learn.test.ts new file mode 100644 index 0000000..015ca29 --- /dev/null +++ b/test/learn.test.ts @@ -0,0 +1,92 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { loadLessonsForPrompt } from "../src/commands/learn.js"; +import { writeFile, mkdir, rm } from "node:fs/promises"; +import { join } from "node:path"; +import { tmpdir } from "node:os"; + +describe("Learn — loadLessonsForPrompt", () => { + it("returns empty string when no lessons file exists", async () => { + const result = await loadLessonsForPrompt("/tmp/nonexistent-dispatch-test", ".dispatch"); + assert.equal(result, ""); + }); + + it("returns formatted lessons when file exists", async () => { + const testDir = join(tmpdir(), `dispatch-test-${Date.now()}`); + const memoryDir = join(testDir, ".dispatch", "memory"); + await mkdir(memoryDir, { recursive: true }); + + const store = { + version: 1, + lessons: [ + { + source: "PR #1 review", + prNumber: 1, + issueNumber: 42, + lesson: "Always add null checks on optional parameters", + category: "correctness", + learnedAt: new Date().toISOString(), + decayScore: 0.9, + }, + { + source: "PR #2 (rejected)", + prNumber: 2, + issueNumber: 43, + lesson: "Do not modify migration files directly", + category: "approach", + learnedAt: new Date().toISOString(), + decayScore: 0.8, + }, + ], + lastScanAt: new Date().toISOString(), + }; + + await writeFile(join(memoryDir, "lessons.json"), JSON.stringify(store), "utf-8"); + + const result = await loadLessonsForPrompt(testDir, ".dispatch"); + assert.ok(result.includes("Lessons from Previous PR Reviews")); + assert.ok(result.includes("null checks")); + assert.ok(result.includes("migration files")); + + // Cleanup + await rm(testDir, { recursive: true, force: true }); + }); + + it("filters out low-relevance lessons", async () => { + const testDir = join(tmpdir(), `dispatch-test-${Date.now()}`); + const memoryDir = join(testDir, ".dispatch", "memory"); + await mkdir(memoryDir, { recursive: true }); + + const store = { + version: 1, + lessons: [ + { + source: "PR #1", + prNumber: 1, + issueNumber: 1, + lesson: "Old lesson with low decay", + category: "general", + learnedAt: new Date(Date.now() - 90 * 24 * 60 * 60 * 1000).toISOString(), // 90 days ago + decayScore: 0.1, + }, + ], + lastScanAt: new Date().toISOString(), + }; + + await writeFile(join(memoryDir, "lessons.json"), JSON.stringify(store), "utf-8"); + + const result = await loadLessonsForPrompt(testDir, ".dispatch"); + // Very old lessons should be filtered out + assert.equal(result, ""); + + await rm(testDir, { recursive: true, force: true }); + }); +}); + +describe("Learn — lesson categorization", () => { + // These are tested indirectly through the learn command behavior + it("placeholder for learn integration tests", () => { + // Integration tests would require a mock GitHub API + assert.ok(true); + }); +}); diff --git a/test/memory.test.ts b/test/memory.test.ts new file mode 100644 index 0000000..19bd321 --- /dev/null +++ b/test/memory.test.ts @@ -0,0 +1,101 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { InsightCollector } from "../src/memory/issue-insights.js"; +import { ContextBudgetManager } from "../src/memory/budget.js"; +import type { SolveResult } from "../src/engine/types.js"; + +describe("InsightCollector", () => { + it("collects insights from solved issues", () => { + const collector = new InsightCollector(10); + + const result: SolveResult = { + success: true, + changedFiles: ["src/foo.ts", "src/bar.ts"], + summary: "Fixed the null check", + confidence: 8, + uncertainties: [], + commitMessage: "fix: null check", + }; + + collector.addFromSolve(42, result, "code-fix"); + assert.equal(collector.count, 1); + + const formatted = collector.formatForPrompt(); + assert.ok(formatted.includes("Issue #42")); + assert.ok(formatted.includes("Fixed the null check")); + }); + + it("limits insights to maxInsights", () => { + const collector = new InsightCollector(3); + + for (let i = 0; i < 5; i++) { + collector.addFromSolve(i, { + success: true, + changedFiles: [`file${i}.ts`], + summary: `Fixed issue ${i}`, + confidence: i + 1, + uncertainties: [], + commitMessage: `fix: issue ${i}`, + }, "code-fix"); + } + + assert.equal(collector.count, 3); + + // Should keep highest confidence + const insights = collector.getInsights(); + const confidences = insights.insights.map((i) => i.confidence); + assert.ok(confidences.every((c) => c >= 3)); + }); + + it("returns empty string when no insights", () => { + const collector = new InsightCollector(10); + assert.equal(collector.formatForPrompt(), ""); + }); +}); + +describe("ContextBudgetManager", () => { + const mockModel = { + provider: "anthropic" as const, + modelId: "test", + displayName: "Test", + inputCostPer1M: 3, + outputCostPer1M: 15, + maxContextTokens: 1000, // Small for testing + recommendedPhases: ["solve" as const], + }; + + it("tracks available tokens", () => { + const budget = new ContextBudgetManager(mockModel, 0.25); + assert.equal(budget.availableTokens, 750); // 1000 - 25% + }); + + it("adds sections and tracks usage", () => { + const budget = new ContextBudgetManager(mockModel, 0.25); + budget.addSection("system", "You are a helpful assistant.", "critical"); + assert.ok(budget.usedTokens > 0); + assert.ok(budget.remainingTokens < budget.availableTokens); + }); + + it("respects priority ordering in build output", () => { + const budget = new ContextBudgetManager(mockModel, 0.25); + budget.addSection("low", "low priority content", "low"); + budget.addSection("critical", "critical content", "critical"); + + const output = budget.build(); + // Critical should appear before low + const criticalIdx = output.indexOf("critical content"); + const lowIdx = output.indexOf("low priority content"); + assert.ok(criticalIdx < lowIdx); + }); + + it("truncates low-priority content when over budget", () => { + const budget = new ContextBudgetManager(mockModel, 0.25); + + // Fill up most of the budget with critical content + budget.addSection("critical", "x".repeat(2800), "critical"); // ~700 tokens + budget.addSection("low", "y".repeat(400), "low"); // ~100 tokens, won't fully fit + + const output = budget.build(); + assert.ok(output.includes("x".repeat(100))); // Critical content preserved + }); +}); diff --git a/test/openai.test.ts b/test/openai.test.ts new file mode 100644 index 0000000..73899dd --- /dev/null +++ b/test/openai.test.ts @@ -0,0 +1,30 @@ +import { describe, it, beforeEach, afterEach } from "node:test"; +import assert from "node:assert/strict"; + +describe("OpenAIEngine", () => { + const originalKey = process.env.OPENAI_API_KEY; + + afterEach(() => { + if (originalKey) { + process.env.OPENAI_API_KEY = originalKey; + } else { + delete process.env.OPENAI_API_KEY; + } + }); + + it("throws if OPENAI_API_KEY is not set", async () => { + delete process.env.OPENAI_API_KEY; + const { OpenAIEngine } = await import("../src/engine/openai.js"); + assert.throws( + () => new OpenAIEngine({ model: "gpt-4.1", maxTurns: 5 }), + /OPENAI_API_KEY/ + ); + }); + + it("creates engine when OPENAI_API_KEY is set", async () => { + process.env.OPENAI_API_KEY = "sk-test-key"; + const { OpenAIEngine } = await import("../src/engine/openai.js"); + const engine = new OpenAIEngine({ model: "gpt-4.1", maxTurns: 5 }); + assert.equal(engine.name, "openai"); + }); +}); diff --git a/test/pipeline-integration.test.ts b/test/pipeline-integration.test.ts new file mode 100644 index 0000000..0cf8cee --- /dev/null +++ b/test/pipeline-integration.test.ts @@ -0,0 +1,79 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { MemoryManager } from "../src/memory/manager.js"; +import { InsightCollector } from "../src/memory/issue-insights.js"; + +describe("Pipeline Integration", () => { + describe("MemoryManager", () => { + it("initializes without errors when no repo context", async () => { + const memory = new MemoryManager({ + enableCodebaseContext: false, + enableCrossIssue: true, + cacheMaxAgeMs: 60000, + maxInsights: 10, + stateDir: ".dispatch", + }); + + await memory.initialize("/tmp/nonexistent"); + assert.ok(memory.isInitialized); + }); + + it("returns empty context when disabled", async () => { + const memory = new MemoryManager({ + enableCodebaseContext: false, + enableCrossIssue: false, + cacheMaxAgeMs: 60000, + maxInsights: 10, + stateDir: ".dispatch", + }); + + await memory.initialize("/tmp/nonexistent"); + + assert.equal(memory.getCodebaseContextPrompt(), ""); + assert.equal(memory.getInsightsPrompt(), ""); + assert.equal(memory.getFullContextPrompt(), ""); + }); + + it("collects and surfaces insights", async () => { + const memory = new MemoryManager({ + enableCodebaseContext: false, + enableCrossIssue: true, + cacheMaxAgeMs: 60000, + maxInsights: 10, + stateDir: ".dispatch", + }); + + await memory.initialize("/tmp/nonexistent"); + + memory.addInsight(1, { + success: true, + changedFiles: ["src/a.ts"], + summary: "Fixed bug in module A", + confidence: 8, + uncertainties: [], + commitMessage: "fix: module A", + }, "code-fix"); + + const prompt = memory.getInsightsPrompt(); + assert.ok(prompt.includes("Fixed bug in module A")); + assert.ok(prompt.includes("Issue #1")); + }); + }); + + describe("Batch Processing Logic", () => { + it("creates correct batch sizes", () => { + const items = [1, 2, 3, 4, 5, 6, 7]; + const batchSize = 3; + const batches: number[][] = []; + + for (let i = 0; i < items.length; i += batchSize) { + batches.push(items.slice(i, i + batchSize)); + } + + assert.equal(batches.length, 3); + assert.deepEqual(batches[0], [1, 2, 3]); + assert.deepEqual(batches[1], [4, 5, 6]); + assert.deepEqual(batches[2], [7]); + }); + }); +}); diff --git a/test/providers.test.ts b/test/providers.test.ts new file mode 100644 index 0000000..67915b3 --- /dev/null +++ b/test/providers.test.ts @@ -0,0 +1,119 @@ +import { describe, it, beforeEach, afterEach } from "node:test"; +import assert from "node:assert/strict"; +import { getDetectedProvidersSummary } from "../src/router/detect.js"; +import { MODEL_REGISTRY, findModel, modelsForProvider } from "../src/router/models.js"; + +describe("Provider Detection", () => { + const originalEnv: Record = {}; + + beforeEach(() => { + // Save and clear all provider keys + for (const key of ["ANTHROPIC_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY", "GITHUB_TOKEN", "OPENAI_API_KEY"]) { + originalEnv[key] = process.env[key]; + delete process.env[key]; + } + }); + + afterEach(() => { + // Restore env + for (const [key, value] of Object.entries(originalEnv)) { + if (value === undefined) { + delete process.env[key]; + } else { + process.env[key] = value; + } + } + }); + + it("detects no providers when no keys are set", () => { + const providers = getDetectedProvidersSummary(); + assert.ok(providers.every((p) => !p.available)); + }); + + it("detects Anthropic when ANTHROPIC_API_KEY is set", () => { + process.env.ANTHROPIC_API_KEY = "test-key"; + const providers = getDetectedProvidersSummary(); + const anthropic = providers.find((p) => p.provider === "anthropic"); + assert.ok(anthropic?.available); + }); + + it("detects Gemini when GEMINI_API_KEY is set", () => { + process.env.GEMINI_API_KEY = "test-key"; + const providers = getDetectedProvidersSummary(); + const gemini = providers.find((p) => p.provider === "gemini"); + assert.ok(gemini?.available); + }); + + it("detects Gemini when GOOGLE_API_KEY is set", () => { + process.env.GOOGLE_API_KEY = "test-key"; + const providers = getDetectedProvidersSummary(); + const gemini = providers.find((p) => p.provider === "gemini"); + assert.ok(gemini?.available); + }); + + it("detects GitHub Models when GITHUB_TOKEN is set", () => { + process.env.GITHUB_TOKEN = "test-token"; + const providers = getDetectedProvidersSummary(); + const gh = providers.find((p) => p.provider === "github-models"); + assert.ok(gh?.available); + }); + + it("detects OpenAI when OPENAI_API_KEY is set", () => { + process.env.OPENAI_API_KEY = "test-key"; + const providers = getDetectedProvidersSummary(); + const openai = providers.find((p) => p.provider === "openai"); + assert.ok(openai?.available); + }); + + it("detects multiple providers simultaneously", () => { + process.env.ANTHROPIC_API_KEY = "test-key"; + process.env.OPENAI_API_KEY = "test-key"; + const providers = getDetectedProvidersSummary(); + const available = providers.filter((p) => p.available); + assert.ok(available.length >= 2); + }); +}); + +describe("Model Registry — OpenAI models", () => { + it("includes OpenAI provider models", () => { + const openaiModels = modelsForProvider("openai"); + assert.ok(openaiModels.length >= 2, "Should have at least 2 OpenAI models"); + }); + + it("findModel returns gpt-4.1", () => { + const model = findModel("gpt-4.1"); + assert.ok(model); + assert.equal(model.provider, "openai"); + }); + + it("findModel returns gpt-4.1-mini", () => { + const model = findModel("gpt-4.1-mini"); + assert.ok(model); + assert.equal(model.provider, "openai"); + }); + + it("findModel returns o3-mini", () => { + const model = findModel("o3-mini"); + assert.ok(model); + assert.equal(model.provider, "openai"); + }); + + it("all models have valid cost data", () => { + for (const model of MODEL_REGISTRY) { + assert.ok(model.inputCostPer1M >= 0, `${model.modelId} has negative input cost`); + assert.ok(model.outputCostPer1M >= 0, `${model.modelId} has negative output cost`); + assert.ok(model.maxContextTokens > 0, `${model.modelId} has invalid context window`); + assert.ok(model.recommendedPhases.length > 0, `${model.modelId} has no recommended phases`); + } + }); + + it("every provider has at least one model recommended for solve", () => { + const providers = new Set(MODEL_REGISTRY.map((m) => m.provider)); + for (const provider of providers) { + const solveModels = MODEL_REGISTRY.filter( + (m) => m.provider === provider && m.recommendedPhases.includes("solve") + ); + assert.ok(solveModels.length > 0, `Provider "${provider}" has no solve-capable models`); + } + }); +}); diff --git a/test/router.test.ts b/test/router.test.ts new file mode 100644 index 0000000..59339bb --- /dev/null +++ b/test/router.test.ts @@ -0,0 +1,130 @@ +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; +import { ModelRouter } from "../src/router/router.js"; +import { findModel, cheapestForPhase, strongestForPhase, MODEL_REGISTRY } from "../src/router/models.js"; +import type { RouterConfig } from "../src/router/types.js"; + +describe("ModelRouter", () => { + describe("pinned strategy", () => { + it("returns the pinned model for all phases", () => { + const router = new ModelRouter({ + strategy: "pinned", + pinnedModel: "gemini-2.5-pro", + }); + + const classify = router.getModelForPhase("classify"); + const solve = router.getModelForPhase("solve"); + assert.equal(classify.modelId, "gemini-2.5-pro"); + assert.equal(solve.modelId, "gemini-2.5-pro"); + }); + + it("falls back to auto if pinned model not found", () => { + const router = new ModelRouter({ + strategy: "pinned", + pinnedModel: "nonexistent-model", + }); + + const model = router.getModelForPhase("solve"); + assert.ok(model.modelId); // Should get some model + }); + }); + + describe("provider-locked strategy", () => { + it("uses cheap model for classify, strong for solve", () => { + const router = new ModelRouter({ + strategy: "provider-locked", + preferredProvider: "gemini", + }); + + const classify = router.getModelForPhase("classify"); + const solve = router.getModelForPhase("solve"); + + assert.equal(classify.provider, "gemini"); + assert.equal(solve.provider, "gemini"); + // Classify should use cheaper model + assert.ok(classify.inputCostPer1M <= solve.inputCostPer1M); + }); + }); + + describe("auto strategy", () => { + it("selects cheap models for classify/score", () => { + const router = new ModelRouter({ strategy: "auto" }); + + const classify = router.getModelForPhase("classify"); + const solve = router.getModelForPhase("solve"); + + // Classify model should be cheaper than solve model + assert.ok(classify.inputCostPer1M <= solve.inputCostPer1M); + }); + }); + + describe("phase overrides", () => { + it("overrides take precedence over strategy", () => { + const router = new ModelRouter({ + strategy: "pinned", + pinnedModel: "gemini-2.5-pro", + phaseOverrides: { + classify: "gemini-2.5-flash", + }, + }); + + const classify = router.getModelForPhase("classify"); + assert.equal(classify.modelId, "gemini-2.5-flash"); + + // Non-overridden phase uses pinned + const solve = router.getModelForPhase("solve"); + assert.equal(solve.modelId, "gemini-2.5-pro"); + }); + }); + + describe("cost tracking", () => { + it("records and summarizes costs", () => { + const router = new ModelRouter({ strategy: "auto" }); + + router.recordCost({ + phase: "classify", + provider: "gemini", + model: "gemini-2.5-flash", + inputTokens: 1000, + outputTokens: 50, + estimatedCostUSD: 0.001, + }); + + router.recordCost({ + phase: "solve", + provider: "anthropic", + model: "claude-sonnet-4-20250514", + inputTokens: 50000, + outputTokens: 5000, + estimatedCostUSD: 0.225, + }); + + const summary = router.getCostSummary(); + assert.ok(summary.totalCostUSD > 0); + assert.equal(summary.entries.length, 2); + assert.equal(summary.totalInputTokens, 51000); + }); + }); +}); + +describe("Model Registry", () => { + it("has models for all providers", () => { + const providers = new Set(MODEL_REGISTRY.map((m) => m.provider)); + assert.ok(providers.has("anthropic")); + assert.ok(providers.has("gemini")); + assert.ok(providers.has("github-models")); + }); + + it("findModel returns correct model", () => { + const model = findModel("gemini-2.5-pro"); + assert.ok(model); + assert.equal(model.provider, "gemini"); + }); + + it("cheapestForPhase returns cheapest", () => { + const model = cheapestForPhase("classify"); + assert.ok(model); + // Should be one of the cheaper models + assert.ok(model.inputCostPer1M <= 1); + }); +});