112 changes: 112 additions & 0 deletions docs/ai/design/feature-gemini-cli-adapter.md
@@ -0,0 +1,112 @@
---
phase: design
title: "Gemini CLI Adapter in @ai-devkit/agent-manager - Design"
feature: gemini-cli-adapter
description: Architecture and implementation design for introducing Gemini CLI adapter support in the shared agent manager package
---

# Design: Gemini CLI Adapter for @ai-devkit/agent-manager

## Architecture Overview

```mermaid
graph TD
User[User runs ai-devkit agent list/open] --> Cmd[packages/cli/src/commands/agent.ts]
Cmd --> Manager[AgentManager]

subgraph Pkg[@ai-devkit/agent-manager]
Manager --> Claude[ClaudeCodeAdapter]
Manager --> Codex[CodexAdapter]
Manager --> Gemini[GeminiCliAdapter]
Gemini --> Proc[process utils - node argv scan]
Gemini --> File[fs read of ~/.gemini/tmp]
Gemini --> Hash[sha256 projectHash]
Gemini --> Types[AgentAdapter/AgentInfo/AgentStatus]
Focus[TerminalFocusManager]
end

Cmd --> Focus
Cmd --> Output[CLI table/json rendering]
```

Responsibilities:
- `GeminiCliAdapter`: discover running Gemini processes, match them to local session files, and emit `AgentInfo`
- `AgentManager`: aggregate Gemini + Claude + Codex results in parallel
- CLI command: register adapter, render results, invoke open/focus

## Data Models

- Reuse `AgentAdapter`, `AgentInfo`, `AgentStatus`, and `AgentType`
- `AgentType` already supports `gemini`; adapter emits `type: 'gemini'`
- Gemini raw session shape (on disk):
  - `sessionId`: uuid string
  - `projectHash`: sha256 of the project root path (walked to nearest `.git` boundary)
  - `startTime`, `lastUpdated`: ISO timestamps
  - `messages[]`: entries with `id`, `timestamp`, `type` (`user` | `gemini` | `thought` | `tool`), and `content` / `displayContent`
    - `content` is polymorphic: `string` for assistant-side messages, `Part[]` (e.g. `[{text: "..."}]`) for user-side messages
  - `directories[]`, `kind`
- Normalized into `AgentInfo`:
  - `id`: `gemini-<sessionId prefix>`
  - `name`: derived from session project directory (basename of registry path)
  - `cwd`: project root
  - `sessionStart`: parsed from `startTime`
  - `status`: computed from `lastUpdated` vs shared status thresholds
  - `pid`: matched live `node` process running the `gemini` script
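
The shapes above can be written down as types. The following is a minimal sketch, not the package's actual definitions: `SketchAgentInfo`, its `"active" | "idle"` status values, the 8-character id prefix, and the 5-minute idle threshold are all assumptions made for illustration, and which raw fields are optional is a guess.

```typescript
import * as path from "node:path";

// One element of user-side content, e.g. { text: "..." }.
interface Part {
  text?: string;
}

interface GeminiMessage {
  id: string;
  timestamp: string; // ISO timestamp
  type: "user" | "gemini" | "thought" | "tool";
  content?: string | Part[];
  displayContent?: string | Part[];
}

interface GeminiSession {
  sessionId: string;
  projectHash: string;
  startTime: string; // ISO timestamp
  lastUpdated: string; // ISO timestamp
  messages: GeminiMessage[];
  directories?: string[];
  kind?: string;
}

// Hypothetical stand-in for the shared AgentInfo type.
interface SketchAgentInfo {
  id: string;
  type: "gemini";
  name: string;
  cwd: string;
  sessionStart: Date;
  status: "active" | "idle"; // placeholder values, not the real AgentStatus
  pid: number;
}

const ID_PREFIX_LENGTH = 8; // assumption: length of the sessionId prefix
const IDLE_AFTER_MS = 5 * 60 * 1000; // assumption: stand-in for the shared thresholds

function toAgentInfo(
  session: GeminiSession,
  projectRoot: string,
  pid: number,
  now: Date = new Date(),
): SketchAgentInfo {
  // Status is derived purely from how long ago the session file was updated.
  const ageMs = now.getTime() - new Date(session.lastUpdated).getTime();
  return {
    id: `gemini-${session.sessionId.slice(0, ID_PREFIX_LENGTH)}`,
    type: "gemini",
    name: path.basename(projectRoot),
    cwd: projectRoot,
    sessionStart: new Date(session.startTime),
    status: ageMs > IDLE_AFTER_MS ? "idle" : "active",
    pid,
  };
}
```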

## API Design

### Package Exports
- Add `GeminiCliAdapter` to:
  - `packages/agent-manager/src/adapters/index.ts`
  - `packages/agent-manager/src/index.ts`

### CLI Integration
- Update `packages/cli/src/commands/agent.ts` to register `GeminiCliAdapter` alongside `ClaudeCodeAdapter` and `CodexAdapter`
- No presentation logic moves into the package; CLI retains formatting

## Component Breakdown

1. `packages/agent-manager/src/adapters/GeminiCliAdapter.ts`
   - Implements adapter contract
   - Detects live Gemini processes by scanning `node` processes for a `gemini` argv token
   - Resolves the project-to-shortId mapping from `~/.gemini/projects.json`
   - Reads session files from `~/.gemini/tmp/<shortId>/chats/session-*.json`
   - Computes `projectHash` by walking the candidate project root up to `.git` and hashing with sha256
   - Normalizes `content`/`displayContent` via a shared `resolveContent(string | Part[])` helper
   - Exposes `getConversation` for reading message history

2. `packages/agent-manager/src/__tests__/adapters/GeminiCliAdapter.test.ts`
   - Unit tests for process filtering, session parsing, array-of-parts content, empty/malformed cases, and status mapping

3. `packages/agent-manager/src/adapters/index.ts` and `src/index.ts`
   - Export adapter class

4. `packages/cli/src/commands/agent.ts`
   - Register Gemini adapter in manager setup paths

5. `README.md`
   - Flip Gemini CLI row to ✅ in the "Agent Control Support" matrix
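
The `projectHash` step listed for the adapter (walk up to `.git`, then sha256) might look roughly like the sketch below. The function names `findProjectRoot` and `computeProjectHash` are hypothetical, and hex encoding of the digest and hashing the root path string as-is are assumptions; the source only states that the hash is sha256 of the project root. Note that `fs.existsSync` also matches a `.git` file (as used by worktrees), which may or may not match Gemini CLI's own check.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Walk from startDir toward the filesystem root and return the first
// directory containing a `.git` entry; fall back to startDir if none exists.
function findProjectRoot(startDir: string): string {
  const start = path.resolve(startDir);
  let current = start;
  while (true) {
    if (fs.existsSync(path.join(current, ".git"))) return current;
    const parent = path.dirname(current);
    if (parent === current) return start; // reached the root without a match
    current = parent;
  }
}

// Assumption: the hash input is the root path string and the output is hex.
function computeProjectHash(projectRoot: string): string {
  return createHash("sha256").update(projectRoot).digest("hex");
}

// Usage: build a throwaway repo layout and resolve the root from a nested cwd.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "gemini-hash-"));
fs.mkdirSync(path.join(demoRoot, ".git"));
fs.mkdirSync(path.join(demoRoot, "src", "nested"), { recursive: true });
const resolvedRoot = findProjectRoot(path.join(demoRoot, "src", "nested"));
const demoHash = computeProjectHash(resolvedRoot);
```

Comparing `demoHash` against the `projectHash` field of a real session file is the kind of check the design describes for validating the algorithm.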

## Design Decisions

- Decision: Detect Gemini CLI by scanning `node` processes and filtering argv for a `gemini`/`gemini.exe`/`gemini.js` token.
  - Rationale: Gemini CLI is distributed as a pure Node script (unlike Claude's native binary or Codex's Node wrapper around Rust). `listAgentProcesses('gemini')` returns empty on macOS/Linux because `argv[0]` is `node`.
- Decision: Compute `projectHash` by walking from the candidate cwd up to the nearest `.git` directory, falling back to the starting directory when no `.git` is found.
  - Rationale: matches the algorithm Gemini CLI uses internally (verified by sha256-hashing against a live session file's `projectHash`).
- Decision: Normalize polymorphic `content` through a single `resolveContent(string | Part[])` helper that extracts `.text` from each part and concatenates.
  - Rationale: user messages are stored as `Part[]`; calling `.trim()` on an array throws `.trim is not a function`, which earlier caused `detectAgents` to throw and `AgentManager` to return an empty list.
- Decision: Keep `displayContent` preferred over `content` when both are present.
  - Rationale: `displayContent` is the user-visible rendered string; `content` can include raw tool/thought payloads.
- Decision: Gate list membership on running `node` processes that match the Gemini token filter (process-first, like Codex).
  - Rationale: stale session files from previous runs should not surface as active agents.
- Decision: Keep parsing resilient: adapter-level failures are caught and translated to empty results.
  - Rationale: a malformed session file must not break the entire `agent list` command.
- Decision: Follow `CodexAdapter` structure for method names, helper extraction, and error handling.
  - Rationale: maintainer guidance ("don't customize too much") is to reduce cognitive load across adapters and keep the extension path uniform.
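
The argv token scan from the first decision can be illustrated with a short sketch. It assumes argv is already available as an array of strings; how the real `isGeminiExecutable` receives and splits the process command line is not shown in this document, so treat the signature as illustrative.

```typescript
import * as path from "node:path";

// Basenames that identify the Gemini CLI entry script.
const GEMINI_BASENAMES = new Set(["gemini", "gemini.exe", "gemini.js"]);

// True when any argv token's basename is one of the Gemini entry names.
// Assumption: argv has already been split into individual tokens.
function isGeminiExecutable(argv: string[]): boolean {
  return argv.some((token) => GEMINI_BASENAMES.has(path.basename(token)));
}

// Usage: filter candidate `node` processes down to Gemini sessions.
const candidates: string[][] = [
  ["node", "/usr/local/bin/gemini"],
  ["node", "/srv/app/server.js", "--name=gemini-proxy"],
  ["node", "/opt/tools/gemini.js", "--prompt", "hi"],
];
const geminiProcesses = candidates.filter(isGeminiExecutable);
```

Matching on the basename rather than a substring is what keeps unrelated Node processes such as the `gemini-proxy` server above out of the result.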

## Non-Functional Requirements

- Performance: adapter aggregation remains bounded by existing manager patterns; session file reads are limited to the directories of live processes.
- Reliability: Gemini adapter failures must be isolated so Claude/Codex entries still render.
- Maintainability: code structure mirrors Codex adapter for consistency.
- Security: only reads local files under `~/.gemini` and local `ps` output already permitted by existing adapters.
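
The reliability requirement (one adapter failing must not hide the others) can be sketched as follows. This is not the actual `AgentManager` code: `SketchAdapter`, `SketchAgent`, and `listAllAgents` are hypothetical names, and the real manager may isolate failures differently.

```typescript
// Hypothetical minimal shapes, standing in for AgentInfo and AgentAdapter.
interface SketchAgent {
  id: string;
  type: string;
}

interface SketchAdapter {
  type: string;
  detectAgents(): Promise<SketchAgent[]>;
}

// Run every adapter in parallel; a synchronous throw or a rejected promise
// from one adapter becomes an empty list instead of failing the whole call.
async function listAllAgents(adapters: SketchAdapter[]): Promise<SketchAgent[]> {
  const perAdapter = await Promise.all(
    adapters.map((adapter) =>
      Promise.resolve()
        .then(() => adapter.detectAgents())
        .catch((): SketchAgent[] => []),
    ),
  );
  const merged: SketchAgent[] = [];
  for (const agents of perAdapter) merged.push(...agents);
  return merged;
}
```

Wrapping each call in `Promise.resolve().then(...)` is a deliberate choice in this sketch: it also catches adapters that throw before returning a promise.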
92 changes: 92 additions & 0 deletions docs/ai/implementation/feature-gemini-cli-adapter.md
@@ -0,0 +1,92 @@
---
phase: implementation
title: "Gemini CLI Adapter in @ai-devkit/agent-manager - Implementation"
feature: gemini-cli-adapter
description: Implementation notes for Gemini CLI adapter support in the package agent manager and CLI integration
---

# Implementation Guide: Gemini CLI Adapter in @ai-devkit/agent-manager

## Development Setup

- Branch: `feat/gemini-cli-adapter`
- Install dependencies with `npm ci`
- Build + lint + test with:
  - `npx nx run agent-manager:build`
  - `npx nx run agent-manager:lint`
  - `npx nx run agent-manager:test`
  - `npx nx run cli:test -- --runInBand src/__tests__/commands/agent.test.ts`

## Code Structure

- Package adapter implementation:
  - `packages/agent-manager/src/adapters/GeminiCliAdapter.ts`
- Package exports:
  - `packages/agent-manager/src/adapters/index.ts`
  - `packages/agent-manager/src/index.ts`
- CLI wiring:
  - `packages/cli/src/commands/agent.ts`
- Tests:
  - `packages/agent-manager/src/__tests__/adapters/GeminiCliAdapter.test.ts`
  - `packages/cli/src/__tests__/commands/agent.test.ts`
- README matrix update:
  - `README.md` (Agent Control Support row for Gemini CLI)

## Implementation Notes

### Core Features
- Adapter contract: `type = 'gemini'`, plus `canHandle`, `detectAgents`, `getConversation`.
- Process detection: `listAgentProcesses('node')` + `isGeminiExecutable(argv)` token-scan for basename `gemini` / `gemini.exe` / `gemini.js`.
- Project-to-session mapping:
  - Walk from the process cwd to the nearest `.git` boundary (fallback: starting cwd)
  - Compute sha256 of the resolved project root → `projectHash`
  - Cross-check against `~/.gemini/projects.json` for the `shortId` used in the session path
- Session file discovery: `~/.gemini/tmp/<shortId>/chats/session-*.json`, filtered to the matching `projectHash`.
- Content normalization: `resolveContent(content)` accepts `string | Part[]` and returns a concatenated string of `part.text` values; non-text parts are dropped.
- `messageText(entry)` prefers `displayContent` over `content`.
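
The two content helpers described above could be sketched like this. The behavior follows the notes (concatenate `part.text`, drop non-text parts, prefer `displayContent`), but the exact types, the empty-string fallback, and the final `.trim()` in `messageText` are assumptions rather than the shipped implementation.

```typescript
// A content part; only `text` is read here, other keys are ignored.
interface Part {
  text?: string;
  [key: string]: unknown;
}

type MessageContent = string | Part[];

// Collapse string-or-parts content into one string so callers can safely
// call string methods such as .trim() on the result.
function resolveContent(content: MessageContent | undefined): string {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  return content
    .map((part) => (typeof part?.text === "string" ? part.text : ""))
    .join("");
}

// Prefer the user-visible displayContent; fall back to raw content.
function messageText(entry: {
  content?: MessageContent;
  displayContent?: MessageContent;
}): string {
  const display = resolveContent(entry.displayContent);
  return (display !== "" ? display : resolveContent(entry.content)).trim();
}
```

With this shape, a user message stored as `[{ text: "fix " }, { text: "the bug" }]` resolves to a plain string before any trimming happens, which is the case that previously crashed.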

### Patterns & Best Practices
- Follow `CodexAdapter` structure for helper extraction and error handling.
- Keep parsing inside the adapter; keep CLI-side formatting unchanged.
- Fail soft: malformed session entries are skipped; adapter-level exceptions return empty results so other adapters still render.
- Avoid adapter-specific customization the maintainer flagged as unnecessary (e.g., Windows-specific basename handling was reverted).

## Integration Points

- `AgentManager` parallel aggregation across Claude + Codex + Gemini
- `TerminalFocusManager` open/focus flow reused without Gemini-specific branches
- CLI list/json output mapping unchanged

## Error Handling

- Missing `~/.gemini/projects.json` or `~/.gemini/tmp` → empty result, no throw.
- Malformed session JSON → skip that file, continue with the rest.
- Polymorphic `content` → handled by `resolveContent` so `.trim()` never runs on an array.
- Adapter-level throw is caught at the manager layer, isolating the failure.
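
A fail-soft session reader along the lines of the first two rules might look like the sketch below. `chatsDirFor`, `readSessions`, and `SessionHeader` are hypothetical names, and the validity check (a string `sessionId` plus a matching `projectHash`) is an assumption about what counts as a usable session.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Only the fields this sketch needs from a session file.
interface SessionHeader {
  sessionId: string;
  projectHash: string;
  lastUpdated: string;
}

// Session directory for a project, following ~/.gemini/tmp/<shortId>/chats.
function chatsDirFor(shortId: string): string {
  return path.join(os.homedir(), ".gemini", "tmp", shortId, "chats");
}

// Read every session-*.json in chatsDir that belongs to projectHash.
// A missing directory yields []; unreadable or malformed files are skipped.
function readSessions(chatsDir: string, projectHash: string): SessionHeader[] {
  let names: string[];
  try {
    names = fs.readdirSync(chatsDir);
  } catch {
    return []; // directory missing or unreadable: empty result, no throw
  }
  const sessions: SessionHeader[] = [];
  for (const name of names) {
    if (!/^session-.*\.json$/.test(name)) continue;
    try {
      const parsed = JSON.parse(fs.readFileSync(path.join(chatsDir, name), "utf8"));
      if (parsed && typeof parsed.sessionId === "string" && parsed.projectHash === projectHash) {
        sessions.push(parsed as SessionHeader);
      }
    } catch {
      // malformed JSON or read error: skip this file and continue
    }
  }
  return sessions;
}
```

A caller would typically pass `chatsDirFor(shortId)` together with the hash computed for the live process's project root.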

## Performance Considerations

- Process detection is bounded by the number of live `node` processes on the host.
- Session file reads are scoped to the `shortId` resolved from `projects.json` for each live Gemini process, not a full `~/.gemini/tmp` scan.
- Reuses existing async aggregation model in `AgentManager`.

## Security Notes

- Reads only local files under `~/.gemini` and local process metadata already permitted by existing adapters.
- No external network calls; no execution of user content.

## Implementation Status

- Completed:
  - `packages/agent-manager/src/adapters/GeminiCliAdapter.ts`
  - Package exports in `packages/agent-manager/src/adapters/index.ts` and `src/index.ts`
  - `packages/cli/src/commands/agent.ts` registers `GeminiCliAdapter` for list and open
  - README "Agent Control Support" row for Gemini CLI flipped to ✅
  - Unit tests (42 total) including the 5 added after maintainer review for array-shaped content
- Review-iteration fixes:
  - Introduced `resolveContent(string | Part[])` + `messageText(entry)` helpers to handle Gemini's `Part[]` user content
  - Reverted Windows-specific `path.win32.basename` customization (per the maintainer's guidance, "don't customize too much")
- Commands verified:
  - `npx nx run-many -t build test lint` ✅
  - `npx nx run agent-manager:test -- --runInBand src/__tests__/adapters/GeminiCliAdapter.test.ts` ✅ (42 passed)
  - End-to-end: real Gemini CLI session surfaced correctly in `ai-devkit agent list`
89 changes: 89 additions & 0 deletions docs/ai/planning/feature-gemini-cli-adapter.md
@@ -0,0 +1,89 @@
---
phase: planning
title: "Gemini CLI Adapter in @ai-devkit/agent-manager - Planning"
feature: gemini-cli-adapter
description: Task plan for adding Gemini CLI adapter support and integrating it into CLI agent commands
---

# Planning: Gemini CLI Adapter in @ai-devkit/agent-manager

## Milestones

- [x] Milestone 1: Research Gemini CLI distribution, session schema, and projectHash algorithm
- [x] Milestone 2: Adapter implementation, package exports, CLI registration, README flip
- [x] Milestone 3: Unit tests + real-Gemini end-to-end verification + maintainer review iteration

## Task Breakdown

### Phase 1: Research & Foundation
- [x] Task 1.1: Investigate Gemini CLI process shape
  - Confirmed Gemini runs as `node /path/to/gemini` (pure Node script), not a native binary
  - `listAgentProcesses('gemini')` returns empty; need `listAgentProcesses('node')` + token-scan filter
- [x] Task 1.2: Reverse-engineer session storage
  - Sessions live at `~/.gemini/tmp/<shortId>/chats/session-*.json`
  - `~/.gemini/projects.json` maps `projectRoot → shortId`
  - Session schema: `{sessionId, projectHash, startTime, lastUpdated, messages, directories, kind}`
- [x] Task 1.3: Confirm `projectHash` algorithm
  - sha256 of project root walked to nearest `.git` boundary
  - Verified against a real session written by Gemini CLI
- [x] Task 1.4: Scaffold adapter + test files following `CodexAdapter` structure

### Phase 2: Core Implementation
- [x] Task 2.1: Implement process detection
  - `isGeminiExecutable(argv)` token-scans for basename matching `gemini`/`gemini.exe`/`gemini.js`
  - Enumerate candidate project roots from cwd walk
- [x] Task 2.2: Implement session parsing and mapping
  - Normalize message `content` through `resolveContent(string | Part[])` helper
  - Prefer `displayContent` over `content` when present
  - Compute status from `lastUpdated` using shared thresholds
- [x] Task 2.3: Register adapter
  - Add to `packages/agent-manager/src/adapters/index.ts` and `src/index.ts`
  - Wire into `packages/cli/src/commands/agent.ts` for list/open paths
- [x] Task 2.4: Flip README "Agent Control Support" matrix for Gemini CLI

### Phase 3: Testing & Review Iteration
- [x] Task 3.1: Unit tests (42 total)
  - Process filtering with mixed `node` processes
  - Session parsing: string content, array content, mixed, missing
  - Status mapping, empty directory handling, malformed JSON
- [x] Task 3.2: End-to-end verification with real Gemini CLI
  - User authenticated and ran a live Gemini chat session
  - Verified `ai-devkit agent list` surfaces the Gemini process with correct cwd/sessionId mapping
- [x] Task 3.3: Address maintainer review feedback
  - Fixed `.trim is not a function` crash on array-shaped user content (introduced `resolveContent` helper + 5 new tests)
  - Reverted earlier Windows-specific basename customization (per maintainer: "don't customize too much")
- [x] Task 3.4: Produce docs/ai artifacts per repo `dev-lifecycle` skill

## Dependencies

- Existing `@ai-devkit/agent-manager` adapter contract and utilities
- Existing CLI agent command integration points
- A live Gemini CLI install for end-to-end verification (provided by user auth during review)

## Timeline & Estimates

- Task 1.1–1.4 (research + scaffold): 0.5 day
- Task 2.1–2.4 (implementation): 1.0 day
- Task 3.1–3.4 (tests + E2E + review iteration + docs): 1.0 day
- Total: ~2.5 days across PR iterations

## Risks & Mitigation

- Risk: Gemini CLI session schema may evolve across versions.
  - Mitigation: defensive parsing, tests for partial/malformed fixtures, polymorphic `content` handling.
- Risk: `node` argv scanning is broad and could false-positive on unrelated Node processes.
  - Mitigation: strict token-scan requiring a `gemini`-basename match in argv.
- Risk: `projectHash` algorithm could drift if Gemini CLI changes its boundary-detection logic.
  - Mitigation: the walk to the nearest `.git` falls back to the starting directory when none is found; the result was verified against a live session.
- Risk: Adding a third adapter increases list latency.
  - Mitigation: existing parallel aggregation pattern, bounded file reads, early exits on no live processes.

## Resources Needed

- `CodexAdapter` as implementation template
- Live Gemini CLI session for verification
- Maintainer review cycle on `codeaholicguy/ai-devkit` PR #70

## Progress Summary

Implementation is complete. `GeminiCliAdapter` ships in `@ai-devkit/agent-manager`, is exported through package entry points, and is registered in CLI `list`/`open` flows. Gemini CLI is now ✅ in the README "Agent Control Support" matrix. The maintainer review surfaced one regression (`.trim` on array content) and a suggestion to run the work through the repo's `dev-lifecycle` skill; both were addressed. End-to-end verification against a live Gemini CLI session confirmed the mapping between live `node` processes, session files, and `AgentInfo` output.