A sophisticated Go-based automation tool that leverages AI to interpret high-level development requests, generate a structured plan, and execute it efficiently.
Go AI Agent is a command-line tool designed to streamline development workflows. You provide a goal in natural language, and the agent uses a large language model (like Google's Gemini) to create a detailed, step-by-step plan. This plan is structured as a Directed Acyclic Graph (DAG), allowing for parallel execution of independent tasks, significantly speeding up complex operations.
- AI-Powered Plan Generation & Self-Correction: Translates natural language requests into a structured, executable plan (DAG) of shell commands. When tasks fail, the agent intelligently consults the AI for potential fixes and retries, enhancing resilience.
- Parallel Task Execution: Utilizes a DAG to run independent tasks concurrently, optimizing for speed and efficiency.
- Configurable AI Executors & Models: Directly interfaces with various AI APIs (e.g., Google's Gemini) and allows users to specify the AI model to use, enabling flexible and future-proof integration.
- Persistent Caching: Caches AI API responses to disk, reducing latency and API costs across multiple runs.
- Interactive Plan Recovery: Detects existing unfinished execution plans and prompts the user to either resume the previous session or start a fresh one.
- Comprehensive Session Summary: Provides a detailed summary at the end of each session, including accurate task completion statistics, success rates, executor and model information, and token usage.
Follow these instructions to get the project up and running on your local machine.
- Go: The project requires Go to be installed. The specific version is listed in the go.mod file (e.g., Go 1.22 or newer).
- API Key: You need an API key from your AI executor (e.g., Gemini). You must set it as an environment variable:
export GEMINI_API_KEY="your_api_key_here"
- Clone the repository to your local machine:

      git clone https://github.com/your-username/go-ai-agent.git
      cd go-ai-agent

- Build the binary:

      go build .

  This will create an executable file named go-ai-agent in the root directory.
The go-ai-agent tool takes a natural language prompt and executes a plan to achieve the specified goal.
./go-ai-agent [flags]

| Flag | Shorthand | Type | Default | Description |
|---|---|---|---|---|
| --prompt | | string | | User prompt describing the software development task |
| --directory | -d | string | . | Target project directory |
| --executor | -e | string | | AI executor to use (e.g., gemini, mock) |
| --concurrency | -c | int | 4 | Maximum number of parallel tasks |
| --fresh | | bool | false | Start with a fresh working directory |
| --git-enable | | bool | true | Enable Git operations |
| --git-push | | bool | false | Push changes to remote repository |
| --git-branch | | string | generated | Git branch to use |
| --api-keys | | string | | API keys for AI executors (e.g., GEMINI_API_KEY=your_key) |
| --executor-model | | string | | Specific model to use with the chosen executor (e.g., gemini-pro-latest) |
| --config-file | | string | | Configuration file path |
| --verbose | -v | bool | false | Enable verbose logging |
| --exit | | bool | false | Exit after the first run |
- Create and Run a Simple HTTP Server:

      ./go-ai-agent --prompt "create a new http server in a file called server.go, initialize a go module for it, and then run it" --executor "gemini" --executor-model "gemini-pro-latest" --fresh --exit

- Resume an Existing Plan (Interactive): If you run the agent and it finds an unfinished plan.json in the .go-agent-work directory, it will prompt you to either continue the existing plan or start fresh.

      ./go-ai-agent --prompt "fix the bug in main.go" --executor "gemini"

  (The agent will ask: "Do you want to continue with the existing plan? (y/n):")

- Enable Verbose Logging and Git Push:

      ./go-ai-agent --prompt "add a new endpoint /health to the server.go" --executor "gemini" --verbose --git-push

- Specify a Custom Working Directory:

      ./go-ai-agent -d /path/to/my/project --prompt "refactor the database connection" --executor "gemini"
For more detailed command information and additional options, please refer to the Commands Documentation.
The go-ai-agent orchestrates task execution based on a meticulously generated plan, represented as a Directed Acyclic Graph (DAG). This ensures tasks are performed in the correct order, respecting all dependencies, and leveraging parallelism for efficiency.
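As a rough illustration of what "correct order, respecting all dependencies" requires, the sketch below checks that a set of task dependencies is acyclic and produces one valid order using Kahn's algorithm. This is a standard technique chosen for the example, not necessarily what the `/dag` package does; `TopoOrder` and the task IDs are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// TopoOrder takes a map of task ID -> IDs it depends on and returns one
// valid execution order, or an error if a dependency is unknown or the
// graph contains a cycle. Hypothetical helper for illustration.
func TopoOrder(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string, len(deps))
	for id, ds := range deps {
		indegree[id] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("task %q depends on unknown task %q", id, d)
			}
			dependents[d] = append(dependents[d], id)
		}
	}

	// Start with tasks that have no dependencies; sort for a stable result.
	var queue []string
	for id, n := range indegree {
		if n == 0 {
			queue = append(queue, id)
		}
	}
	sort.Strings(queue)

	var order []string
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		order = append(order, id)

		next := dependents[id]
		sort.Strings(next)
		for _, n := range next {
			indegree[n]--
			if indegree[n] == 0 {
				queue = append(queue, n)
			}
		}
	}

	// Any task never reaching indegree 0 is part of (or behind) a cycle.
	if len(order) != len(deps) {
		return nil, errors.New("plan contains a dependency cycle")
	}
	return order, nil
}

func main() {
	deps := map[string][]string{
		"init":  nil,
		"write": {"init"},
		"build": {"write"},
		"test":  {"write"},
	}
	order, err := TopoOrder(deps)
	if err != nil {
		panic(err)
	}
	fmt.Println(order) // prints "[init write build test]"
}
```

In this example `build` and `test` depend only on `write`, so a parallel scheduler would be free to run them at the same time.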
Here's a detailed breakdown of the execution process:
- Plan Generation:
  - Upon receiving a user prompt, the agent communicates with the configured AI executor (e.g., Gemini) to generate a comprehensive ExecutionPlan.
  - This plan is a DAG, where each node is a Task with defined dependencies on other tasks.
  - The generated plan is saved to /.go-agent-work/plan.json for persistence and recovery.
- Interactive Recovery (if applicable):
  - If an existing plan.json is found, the agent prompts the user to either continue the previous session or start fresh.
  - If continuing, the agent validates the status of each task in the loaded plan. Tasks whose validationCommand succeeds are marked as completed. Tasks previously marked running are reset to pending to allow re-execution.
- Parallel Execution Loop:
  - The agent enters a continuous loop, managed by the /parallel component, until all tasks in the plan are either completed or failed.
  - In each iteration, the dag component identifies "ready" tasks: these are tasks whose status is pending and all of their dependencies have a completed status.
  - Ready tasks are then executed concurrently as goroutines, up to the configured concurrency limit.
- Task Execution (/runner):
  - Each task is handled by the /runner component.
  - WriteFile Intent: If a task's intent is WriteFile, the agent uses the AI executor to generate the content and writes it to the specified outputFiles. The absolute path is constructed to prevent "no such file or directory" errors.
  - RunCommand Intent: If a task's intent is RunCommand, the agent executes the command in a shell.
  - Validation: After executing the main command, if a validationCommand is specified, it is also executed. The success or failure of this command determines the task's outcome.
- AI-Powered Self-Correction:
  - If a RunCommand task fails and has remaining attempts, the agent consults the AI again.
  - A new prompt is generated, including the failed task's details, the original command, and the error output.
  - The AI suggests a fixedCommand, which then replaces the original command for the next retry. This allows the agent to adapt and self-correct.
- Status Updates & Persistence:
  - As tasks are executed, their status (e.g., running, completed, pending for retry, failed) and attempts count are updated in the stateManager.
  - The updated plan is continuously saved to /.go-agent-work/plan.json, ensuring persistence across interruptions.
- Session Summary:
  - Upon completion of all tasks or user exit, a comprehensive session summary is displayed.
  - This summary includes: total interactions, task completion statistics (completed/failed), overall success rate, executor and model used, total input/output tokens consumed, and a git diff --stat of changes made.
The project is organized into several key directories:
- /cache: Implements a persistent, file-based caching mechanism for AI API responses, improving performance and reducing costs.
- /cmd: Contains the command-line interface logic, built using the Cobra library. The main run command is defined here.
- /config: Manages application configuration, including loading settings from files, environment variables, and command-line flags.
- /context: Gathers and processes project-specific context to inform the AI's planning process.
- /dag: Provides utilities for working with Directed Acyclic Graphs, used to represent and manage task dependencies.
- /docs: Contains supplementary project documentation.
- /executor: Implements the logic for interacting with various AI models (e.g., Gemini) to generate responses, including caching mechanisms for efficiency.
- /git: Handles Git operations, such as committing and pushing changes, as part of the automated workflow.
- /logging: Provides structured logging functionalities for consistent and informative output.
- /parallel: Orchestrates the parallel execution of tasks based on their dependencies, optimizing for speed.
- /planner: Responsible for communicating with the AI model to generate the task plan (DAG) based on the user's request.
- /prompt: Manages the parsing and processing of user prompts, including file references.
- /runner: Executes individual tasks defined in the plan, primarily by running shell commands and handling their output.
- /state: Manages the state of the execution plan, including saving and loading the plan, and updating task statuses.
- /types: Defines the core data structures used throughout the application, such as Task and ExecutionPlan.
- /utils: Contains various utility functions and helpers used across the project.