Commit 81d4f7b

feat(agent-manager): add Gemini CLI agent adapter (#70)
Adds a GeminiCliAdapter alongside the existing Claude Code and Codex adapters so `ai-devkit agent list` and `ai-devkit agent detail` can discover and inspect running Gemini CLI sessions. The adapter follows the canHandle / detectAgents / getConversation contract and registers in every agent.ts entrypoint that composes an AgentManager.

Process detection enumerates `node` processes and filters argv for a `gemini` / `gemini.exe` / `gemini.js` basename, because Gemini CLI is distributed as a pure Node script (unlike Claude's native binary or Codex's Node wrapper around Rust). Earlier Windows-specific basename handling was removed to stay close to the existing adapter pattern.

Session discovery walks `~/.gemini/tmp/<shortId>/chats/session-*.json` across every project short-id directory Gemini maintains. Each session JSON carries its own `projectHash`: the sha256 of the project root that Gemini CLI resolved at write time via its `.git`-bounded walk. The adapter enumerates every ancestor of each running process's CWD, hashes each candidate, and matches on projectHash to populate `resolvedCwd` on a SessionFile. The shared `matchProcessesToSessions()` performs the usual CWD + birthtime 1:1 greedy assignment, and processes without a matching session fall back to the process-only AgentInfo shape.

`getConversation` parses Gemini's single-JSON-per-file layout (not JSONL): `messages` is an array with `type` of 'user' or 'gemini' for visible turns; 'thought' and 'tool' entries are hidden by default and surface as `system` role under `--verbose`. Message `content` is polymorphic: assistant messages store a string, user messages store `Part[]` (e.g. `[{text: '...'}]`). A `resolveContent` helper normalizes both shapes so `.trim()` never runs on an array, and `displayContent` takes priority over `content` when both are present.

Test coverage mirrors CodexAdapter: 42 unit tests across initialization, canHandle, detectAgents, discoverSessions (including parent-of-cwd git-root cases), determineStatus, parseSession (string + array content + non-text parts), and getConversation. Also updates the jest mock in the CLI agent command test so `agent detail` can route `gemini_cli` agents through the new adapter.

Docs follow the repo's dev-lifecycle skill: `docs/ai/{requirements,design,planning,implementation,testing}/feature-gemini-cli-adapter.md` capture the Node-script distribution rationale, session schema, projectHash algorithm, content-normalization decisions, and the real-Gemini end-to-end verification.
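The ancestor-hash matching described in the session discovery paragraph can be sketched roughly as follows. This is an illustration, not the adapter's actual code: `hashProjectRoot`, `enumerateAncestors`, and `resolveCwdForSession` are hypothetical names, and hashing the plain absolute path string is an assumption about how `projectHash` is derived.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

/** sha256 hex digest of a path string (assumed to mirror how `projectHash` is derived). */
function hashProjectRoot(projectRoot: string): string {
  return createHash("sha256").update(projectRoot).digest("hex");
}

/** Lists `cwd` followed by each of its ancestors, ending at the filesystem root. */
function enumerateAncestors(cwd: string): string[] {
  const ancestors: string[] = [];
  let current = path.resolve(cwd);
  for (;;) {
    ancestors.push(current);
    const parent = path.dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  return ancestors;
}

/** Returns the ancestor of `cwd` whose hash equals `projectHash`, or undefined if none matches. */
function resolveCwdForSession(cwd: string, projectHash: string): string | undefined {
  return enumerateAncestors(cwd).find((dir) => hashProjectRoot(dir) === projectHash);
}
```

Hashing every ancestor avoids touching the filesystem to look for `.git`, at the cost of one digest per path segment.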
1 parent f50ac86 commit 81d4f7b

11 files changed

Lines changed: 1765 additions & 0 deletions

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
---
phase: design
title: "Gemini CLI Adapter in @ai-devkit/agent-manager - Design"
feature: gemini-cli-adapter
description: Architecture and implementation design for introducing Gemini CLI adapter support in the shared agent manager package
---

# Design: Gemini CLI Adapter for @ai-devkit/agent-manager

## Architecture Overview

```mermaid
graph TD
  User[User runs ai-devkit agent list/open] --> Cmd[packages/cli/src/commands/agent.ts]
  Cmd --> Manager[AgentManager]

  subgraph Pkg["@ai-devkit/agent-manager"]
    Manager --> Claude[ClaudeCodeAdapter]
    Manager --> Codex[CodexAdapter]
    Manager --> Gemini[GeminiCliAdapter]
    Gemini --> Proc[process utils - node argv scan]
    Gemini --> File[fs read of ~/.gemini/tmp]
    Gemini --> Hash[sha256 projectHash]
    Gemini --> Types[AgentAdapter/AgentInfo/AgentStatus]
    Focus[TerminalFocusManager]
  end

  Cmd --> Focus
  Cmd --> Output[CLI table/json rendering]
```

Responsibilities:
- `GeminiCliAdapter`: discover running Gemini processes, match them to local session files, and emit `AgentInfo`
- `AgentManager`: aggregate Gemini + Claude + Codex results in parallel
- CLI command: register adapter, render results, invoke open/focus

## Data Models

- Reuse `AgentAdapter`, `AgentInfo`, `AgentStatus`, and `AgentType`
- `AgentType` already supports `gemini`; adapter emits `type: 'gemini'`
- Gemini raw session shape (on disk):
  - `sessionId`: uuid string
  - `projectHash`: sha256 of the project root path (walked to nearest `.git` boundary)
  - `startTime`, `lastUpdated`: ISO timestamps
  - `messages[]`: entries with `id`, `timestamp`, `type` (`user` | `gemini` | `thought` | `tool`), and `content` / `displayContent`
    - `content` is polymorphic: `string` for assistant-side, `Part[]` (e.g. `[{text: "..."}]`) for user-side
  - `directories[]`, `kind`
- Normalized into `AgentInfo`:
  - `id`: `gemini-<sessionId prefix>`
  - `name`: derived from session project directory (basename of registry path)
  - `cwd`: project root
  - `sessionStart`: parsed from `startTime`
  - `status`: computed from `lastUpdated` vs shared status thresholds
  - `pid`: matched live `node` process running the `gemini` script
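
As a rough illustration of the two shapes listed above, the following sketch types the on-disk session and maps it onto a reduced `AgentInfo`-like object. The interface names, the optional markers, and the 8-character `sessionId` prefix are assumptions made for the example; the package's real `AgentInfo` type has its own definition, and `status` is left out here.

```typescript
// Hypothetical types mirroring the field list above; not the package's real definitions.
type Part = { text?: string };

interface GeminiMessage {
  id: string;
  timestamp: string;
  type: "user" | "gemini" | "thought" | "tool";
  content?: string | Part[];
  displayContent?: string | Part[];
}

interface GeminiSession {
  sessionId: string;
  projectHash: string;
  startTime: string;
  lastUpdated: string;
  messages: GeminiMessage[];
  directories?: string[];
  kind?: string;
}

interface AgentInfoSketch {
  type: "gemini";
  id: string;
  name: string;
  cwd: string;
  sessionStart: Date;
  pid: number;
}

/** Maps a parsed session plus its matched process onto the normalized shape. */
function toAgentInfo(session: GeminiSession, projectRoot: string, pid: number): AgentInfoSketch {
  // Last non-empty segment of a POSIX-style path; falls back to the full path.
  const name = projectRoot.split("/").filter(Boolean).pop() ?? projectRoot;
  return {
    type: "gemini",
    id: `gemini-${session.sessionId.slice(0, 8)}`, // prefix length is an assumption
    name,
    cwd: projectRoot,
    sessionStart: new Date(session.startTime),
    pid,
  };
}
```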

## API Design

### Package Exports
- Add `GeminiCliAdapter` to:
  - `packages/agent-manager/src/adapters/index.ts`
  - `packages/agent-manager/src/index.ts`

### CLI Integration
- Update `packages/cli/src/commands/agent.ts` to register `GeminiCliAdapter` alongside `ClaudeCodeAdapter` and `CodexAdapter`
- No presentation logic moves into the package; CLI retains formatting

## Component Breakdown

1. `packages/agent-manager/src/adapters/GeminiCliAdapter.ts`
   - Implements adapter contract
   - Detects live Gemini processes by scanning `node` processes for a `gemini` argv token
   - Resolves the project-to-shortId mapping from `~/.gemini/projects.json`
   - Reads session files from `~/.gemini/tmp/<shortId>/chats/session-*.json`
   - Computes `projectHash` by walking the candidate project root up to `.git` and hashing with sha256
   - Normalizes `content`/`displayContent` via a shared `resolveContent(string | Part[])` helper
   - Exposes `getConversation` for reading message history

2. `packages/agent-manager/src/__tests__/adapters/GeminiCliAdapter.test.ts`
   - Unit tests for process filtering, session parsing, array-of-parts content, empty/malformed cases, and status mapping

3. `packages/agent-manager/src/adapters/index.ts` and `src/index.ts`
   - Export adapter class

4. `packages/cli/src/commands/agent.ts`
   - Register Gemini adapter in manager setup paths

5. `README.md`
   - Flip Gemini CLI row to ✅ in the "Agent Control Support" matrix
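
The argv token scan named in item 1 might look like the sketch below. The function name `isGeminiExecutable` appears in the implementation notes, but the `string[]` parameter and the body are assumptions; the real helper may parse the process command line differently.

```typescript
import * as path from "node:path";

// Basenames treated as the Gemini CLI entry script.
const GEMINI_BASENAMES = new Set(["gemini", "gemini.exe", "gemini.js"]);

/** True when any argv token's basename is one of the known Gemini entry names. */
function isGeminiExecutable(argv: string[]): boolean {
  return argv.some((token) => GEMINI_BASENAMES.has(path.basename(token)));
}
```

Matching on the exact basename, rather than a substring, keeps tokens such as `gemini-helper.js` from being picked up. A bare argument equal to `gemini` would still match, so this stays a heuristic.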

## Design Decisions

- Decision: Detect Gemini CLI by scanning `node` processes and filtering argv for a `gemini`/`gemini.exe`/`gemini.js` token.
  - Rationale: Gemini CLI is distributed as a pure Node script (unlike Claude's native binary or Codex's Node wrapper around Rust). `listAgentProcesses('gemini')` returns empty on macOS/Linux because `argv[0]` is `node`.
- Decision: Compute `projectHash` by walking from the candidate cwd up to the nearest `.git` directory, falling back to the starting directory when no `.git` is found.
  - Rationale: matches the algorithm Gemini CLI uses internally (verified by comparing the computed sha256 against a live session file's `projectHash`).
- Decision: Normalize polymorphic `content` through a single `resolveContent(string | Part[])` helper that extracts `.text` from each part and concatenates.
  - Rationale: user messages are stored as `Part[]`; calling `.trim()` on an array throws a `TypeError` (`.trim is not a function`), which earlier caused `detectAgents` to throw and `AgentManager` to return an empty list.
- Decision: Keep `displayContent` preferred over `content` when both are present.
  - Rationale: `displayContent` is the user-visible rendered string; `content` can include raw tool/thought payloads.
- Decision: Gate list membership on running `node` processes that match the Gemini token filter (process-first, like Codex).
  - Rationale: stale session files from previous runs should not surface as active agents.
- Decision: Keep parsing resilient: adapter-level failures are caught and translated to empty results.
  - Rationale: a malformed session file must not break the entire `agent list` command.
- Decision: Follow `CodexAdapter` structure for method names, helper extraction, and error handling.
  - Rationale: maintainer guidance was "don't customize too much", which reduces cognitive load across adapters and keeps the extension path uniform.
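
A minimal sketch of the content-normalization decision follows. `resolveContent` and `messageText` are named in these docs, but the bodies here are illustrative: the empty-string join, the handling of `undefined`, the `inlineData` field on `Part`, and the fallback to `content` when `displayContent` is blank are all assumptions.

```typescript
// `Part` is reduced to the one field the helper reads; `inlineData` stands in for any non-text part.
type Part = { text?: string; inlineData?: unknown };
type Content = string | Part[];

/** Collapses string-or-parts content to a plain string; non-text parts are dropped. */
function resolveContent(content: Content | undefined): string {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  return content.map((part) => (typeof part.text === "string" ? part.text : "")).join("");
}

/** Prefers `displayContent`; falls back to `content` when the display text is empty. */
function messageText(entry: { content?: Content; displayContent?: Content }): string {
  const display = resolveContent(entry.displayContent).trim();
  return display !== "" ? display : resolveContent(entry.content).trim();
}
```

Because `.trim()` only ever runs on the string returned by `resolveContent`, an array-shaped user message can no longer raise the `TypeError` described above.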

## Non-Functional Requirements

- Performance: adapter aggregation remains bounded by existing manager patterns; session file reads are limited to the directories of live processes.
- Reliability: Gemini adapter failures must be isolated so Claude/Codex entries still render.
- Maintainability: code structure mirrors Codex adapter for consistency.
- Security: only reads local files under `~/.gemini` and local `ps` output already permitted by existing adapters.
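
One way to get the isolation the reliability requirement asks for is to aggregate adapters with `Promise.allSettled`, as in this sketch. It does not claim to show how `AgentManager` is actually written; `AdapterSketch`, `AgentInfoSketch`, and `listAllAgents` are hypothetical names.

```typescript
// Hypothetical minimal shapes; the real AgentAdapter/AgentInfo types carry more fields.
interface AgentInfoSketch {
  id: string;
  type: string;
}

interface AdapterSketch {
  type: string;
  detectAgents(): Promise<AgentInfoSketch[]>;
}

/** Runs all adapters in parallel; a rejected adapter contributes no entries. */
async function listAllAgents(adapters: AdapterSketch[]): Promise<AgentInfoSketch[]> {
  const settled = await Promise.allSettled(adapters.map((adapter) => adapter.detectAgents()));
  return settled.flatMap((result) => (result.status === "fulfilled" ? result.value : []));
}
```

With this shape, a throwing Gemini adapter yields an empty contribution while the Claude and Codex results still come back.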
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
---
phase: implementation
title: "Gemini CLI Adapter in @ai-devkit/agent-manager - Implementation"
feature: gemini-cli-adapter
description: Implementation notes for Gemini CLI adapter support in the package agent manager and CLI integration
---

# Implementation Guide: Gemini CLI Adapter in @ai-devkit/agent-manager

## Development Setup

- Branch: `feat/gemini-cli-adapter`
- Install dependencies with `npm ci`
- Build + lint + test with:
  - `npx nx run agent-manager:build`
  - `npx nx run agent-manager:lint`
  - `npx nx run agent-manager:test`
  - `npx nx run cli:test -- --runInBand src/__tests__/commands/agent.test.ts`

## Code Structure

- Package adapter implementation:
  - `packages/agent-manager/src/adapters/GeminiCliAdapter.ts`
- Package exports:
  - `packages/agent-manager/src/adapters/index.ts`
  - `packages/agent-manager/src/index.ts`
- CLI wiring:
  - `packages/cli/src/commands/agent.ts`
- Tests:
  - `packages/agent-manager/src/__tests__/adapters/GeminiCliAdapter.test.ts`
  - `packages/cli/src/__tests__/commands/agent.test.ts`
- README matrix update:
  - `README.md` (Agent Control Support row for Gemini CLI)

## Implementation Notes

### Core Features
- Adapter contract: `type = 'gemini'`, plus `canHandle`, `detectAgents`, `getConversation`.
- Process detection: `listAgentProcesses('node')` + `isGeminiExecutable(argv)` token-scan for basename `gemini` / `gemini.exe` / `gemini.js`.
- Project-to-session mapping:
  - Walk from the process cwd to the nearest `.git` boundary (fallback: starting cwd)
  - Compute sha256 of the resolved project root → `projectHash`
  - Cross-check against `~/.gemini/projects.json` for the `shortId` used in the session path
- Session file discovery: `~/.gemini/tmp/<shortId>/chats/session-*.json`, filtered to the matching `projectHash`.
- Content normalization: `resolveContent(content)` accepts `string | Part[]` and returns a concatenated string of `part.text` values; non-text parts are dropped.
- `messageText(entry)` prefers `displayContent` over `content`.
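
The project-to-session mapping steps above (walk to `.git`, then hash) can be sketched as follows. `findProjectRoot` and `computeProjectHash` are hypothetical names, and hashing the absolute path string as-is is an assumption; the docs only state that the result was checked against a live session file.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

/** Walks upward from `startDir` to the nearest directory containing `.git`. */
function findProjectRoot(startDir: string): string {
  const start = path.resolve(startDir);
  let current = start;
  for (;;) {
    // `.git` may be a directory or, for worktrees, a file; existsSync covers both.
    if (fs.existsSync(path.join(current, ".git"))) return current;
    const parent = path.dirname(current);
    if (parent === current) return start; // no `.git` found: fall back to the starting directory
    current = parent;
  }
}

/** sha256 hex digest of the resolved project root path. */
function computeProjectHash(startDir: string): string {
  return createHash("sha256").update(findProjectRoot(startDir)).digest("hex");
}
```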

### Patterns & Best Practices
- Follow `CodexAdapter` structure for helper extraction and error handling.
- Keep parsing inside the adapter; keep CLI-side formatting unchanged.
- Fail soft: malformed session entries are skipped; adapter-level exceptions return empty results so other adapters still render.
- Avoid adapter-specific customization the maintainer flagged as unnecessary (e.g., Windows-specific basename handling was reverted).

## Integration Points

- `AgentManager` parallel aggregation across Claude + Codex + Gemini
- `TerminalFocusManager` open/focus flow reused without Gemini-specific branches
- CLI list/json output mapping unchanged

## Error Handling

- Missing `~/.gemini/projects.json` or `~/.gemini/tmp` → empty result, no throw.
- Malformed session JSON → skip that file, continue with the rest.
- Polymorphic `content` → handled by `resolveContent` so `.trim()` never runs on an array.
- Adapter-level throw is caught at the manager layer, isolating the failure.
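
The first two rules above (missing directory yields an empty result, malformed file is skipped) could be implemented along these lines. `readSessionsSafe`, `isSession`, and `SessionSketch` are hypothetical names, and the validated fields are a minimal assumption rather than the adapter's real checks.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical reduced session shape; only the fields validated below.
interface SessionSketch {
  sessionId: string;
  projectHash: string;
  messages: unknown[];
}

/** Narrowing check for the few fields this sketch relies on. */
function isSession(value: unknown): value is SessionSketch {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.sessionId === "string" && typeof v.projectHash === "string" && Array.isArray(v.messages);
}

/** Reads every `session-*.json` in `chatsDir`, skipping unreadable or malformed files. */
function readSessionsSafe(chatsDir: string): SessionSketch[] {
  let names: string[];
  try {
    names = fs.readdirSync(chatsDir);
  } catch {
    return []; // missing or unreadable directory: empty result, no throw
  }
  const sessions: SessionSketch[] = [];
  for (const name of names) {
    if (!/^session-.*\.json$/.test(name)) continue;
    try {
      const parsed: unknown = JSON.parse(fs.readFileSync(path.join(chatsDir, name), "utf8"));
      if (isSession(parsed)) sessions.push(parsed);
    } catch {
      // malformed JSON or read error: skip this file and continue with the rest
    }
  }
  return sessions;
}
```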

## Performance Considerations

- Process detection is bounded by the number of live `node` processes on the host.
- Session file reads are scoped to the `shortId` resolved from `projects.json` for each live Gemini process, not a full `~/.gemini/tmp` scan.
- Reuses existing async aggregation model in `AgentManager`.

## Security Notes

- Reads only local files under `~/.gemini` and local process metadata already permitted by existing adapters.
- No external network calls; no execution of user content.

## Implementation Status

- Completed:
  - `packages/agent-manager/src/adapters/GeminiCliAdapter.ts`
  - Package exports in `packages/agent-manager/src/adapters/index.ts` and `src/index.ts`
  - `packages/cli/src/commands/agent.ts` registers `GeminiCliAdapter` for list and open
  - README "Agent Control Support" row for Gemini CLI flipped to ✅
  - Unit tests (42 total) including the 5 added after maintainer review for array-shaped content
- Review-iteration fixes:
  - Introduced `resolveContent(string | Part[])` + `messageText(entry)` helpers to handle Gemini's `Part[]` user content
  - Reverted Windows-specific `path.win32.basename` customization (per maintainer guidance: "don't customize too much")
- Commands verified:
  - `npx nx run-many -t build test lint`
  - `npx nx run agent-manager:test -- --runInBand src/__tests__/adapters/GeminiCliAdapter.test.ts` ✅ (42 passed)
  - End-to-end: real Gemini CLI session surfaced correctly in `ai-devkit agent list`
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
---
phase: planning
title: "Gemini CLI Adapter in @ai-devkit/agent-manager - Planning"
feature: gemini-cli-adapter
description: Task plan for adding Gemini CLI adapter support and integrating it into CLI agent commands
---

# Planning: Gemini CLI Adapter in @ai-devkit/agent-manager

## Milestones

- [x] Milestone 1: Research Gemini CLI distribution, session schema, and projectHash algorithm
- [x] Milestone 2: Adapter implementation, package exports, CLI registration, README flip
- [x] Milestone 3: Unit tests + real-Gemini end-to-end verification + maintainer review iteration

## Task Breakdown

### Phase 1: Research & Foundation
- [x] Task 1.1: Investigate Gemini CLI process shape
  - Confirmed Gemini runs as `node /path/to/gemini` (pure Node script), not a native binary
  - `listAgentProcesses('gemini')` returns empty; need `listAgentProcesses('node')` + token-scan filter
- [x] Task 1.2: Reverse-engineer session storage
  - Sessions live at `~/.gemini/tmp/<shortId>/chats/session-*.json`
  - `~/.gemini/projects.json` maps `projectRoot → shortId`
  - Session schema: `{sessionId, projectHash, startTime, lastUpdated, messages, directories, kind}`
- [x] Task 1.3: Confirm `projectHash` algorithm
  - sha256 of project root walked to nearest `.git` boundary
  - Verified against a real session written by Gemini CLI
- [x] Task 1.4: Scaffold adapter + test files following `CodexAdapter` structure

### Phase 2: Core Implementation
- [x] Task 2.1: Implement process detection
  - `isGeminiExecutable(argv)` token-scans for basename matching `gemini`/`gemini.exe`/`gemini.js`
  - Enumerate candidate project roots from cwd walk
- [x] Task 2.2: Implement session parsing and mapping
  - Normalize message `content` through `resolveContent(string | Part[])` helper
  - Prefer `displayContent` over `content` when present
  - Compute status from `lastUpdated` using shared thresholds
- [x] Task 2.3: Register adapter
  - Add to `packages/agent-manager/src/adapters/index.ts` and `src/index.ts`
  - Wire into `packages/cli/src/commands/agent.ts` for list/open paths
- [x] Task 2.4: Flip README "Agent Control Support" matrix for Gemini CLI
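
To make the status step in Task 2.2 concrete, here is a sketch of deriving a status from `lastUpdated`. The status names and the five-minute threshold are placeholders invented for this example; the shared thresholds and the real `AgentStatus` values live in the package and are not reproduced here.

```typescript
// Placeholder status values and threshold; both are assumptions for illustration.
type StatusSketch = "active" | "idle" | "unknown";
const IDLE_AFTER_MS = 5 * 60 * 1000;

/** Classifies a session by how long ago its `lastUpdated` timestamp was written. */
function determineStatus(lastUpdated: string, now: Date = new Date()): StatusSketch {
  const updatedAt = Date.parse(lastUpdated);
  if (Number.isNaN(updatedAt)) return "unknown"; // unparsable timestamp
  return now.getTime() - updatedAt <= IDLE_AFTER_MS ? "active" : "idle";
}
```

Passing `now` as a parameter keeps the function deterministic in unit tests.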

### Phase 3: Testing & Review Iteration
- [x] Task 3.1: Unit tests (42 total)
  - Process filtering with mixed `node` processes
  - Session parsing: string content, array content, mixed, missing
  - Status mapping, empty directory handling, malformed JSON
- [x] Task 3.2: End-to-end verification with real Gemini CLI
  - User authenticated and ran a live Gemini chat session
  - Verified `ai-devkit agent list` surfaces the Gemini process with correct cwd/sessionId mapping
- [x] Task 3.3: Address maintainer review feedback
  - Fixed `.trim is not a function` crash on array-shaped user content (introduced `resolveContent` helper + 5 new tests)
  - Reverted earlier Windows-specific basename customization (per maintainer: "don't customize too much")
- [x] Task 3.4: Produce docs/ai artifacts per repo `dev-lifecycle` skill

## Dependencies

- Existing `@ai-devkit/agent-manager` adapter contract and utilities
- Existing CLI agent command integration points
- A live Gemini CLI install for end-to-end verification (provided by user auth during review)

## Timeline & Estimates

- Task 1.1–1.4 (research + scaffold): 0.5 day
- Task 2.1–2.4 (implementation): 1.0 day
- Task 3.1–3.4 (tests + E2E + review iteration + docs): 1.0 day
- Total: ~2.5 days across PR iterations

## Risks & Mitigation

- Risk: Gemini CLI session schema may evolve across versions.
  - Mitigation: defensive parsing, tests for partial/malformed fixtures, polymorphic `content` handling.
- Risk: `node` argv scanning is broad and could false-positive on unrelated Node processes.
  - Mitigation: strict token-scan requiring a `gemini`-basename match in argv.
- Risk: `projectHash` algorithm could drift if Gemini CLI changes its boundary-detection logic.
  - Mitigation: walk to the nearest `.git`, falling back to the starting directory; verified against a live session.
- Risk: Adding a third adapter increases list latency.
  - Mitigation: existing parallel aggregation pattern, bounded file reads, early exits when no live processes are found.

## Resources Needed

- `CodexAdapter` as implementation template
- Live Gemini CLI session for verification
- Maintainer review cycle on `codeaholicguy/ai-devkit` PR #70

## Progress Summary

Implementation is complete. `GeminiCliAdapter` ships in `@ai-devkit/agent-manager`, is exported through package entry points, and is registered in CLI `list`/`open` flows. Gemini CLI is now ✅ in the README "Agent Control Support" matrix. The maintainer review surfaced one regression (`.trim` on array content) and a suggestion to run the work through the repo's `dev-lifecycle` skill; both were addressed. End-to-end verification against a live Gemini CLI session confirmed the mapping between live `node` processes, session files, and `AgentInfo` output.
