weverkley/go-ai-agent

Go AI Agent

A sophisticated Go-based automation tool that leverages AI to interpret high-level development requests, generate a structured plan, and execute it efficiently.

Introduction

Go AI Agent is a command-line tool designed to streamline development workflows. You provide a goal in natural language, and the agent uses a large language model (such as Google's Gemini) to create a detailed, step-by-step plan. The plan is structured as a Directed Acyclic Graph (DAG), so independent tasks can run in parallel, which shortens complex multi-step operations.

Features

  • AI-Powered Plan Generation & Self-Correction: Translates natural language requests into a structured, executable plan (a DAG of tasks such as shell commands and file writes). When a task fails, the agent sends the error output to the AI, asks for a corrected command, and retries.
  • Parallel Task Execution: Utilizes a DAG to run independent tasks concurrently, optimizing for speed and efficiency.
  • Configurable AI Executors & Models: Interfaces directly with AI APIs (e.g., Google's Gemini) and lets you choose the model through the --executor and --executor-model flags.
  • Persistent Caching: Caches AI API responses to disk, reducing latency and API costs across repeated runs.
  • Interactive Plan Recovery: Detects existing unfinished execution plans and prompts the user to either resume the previous session or start a fresh one.
  • Comprehensive Session Summary: Provides a detailed summary at the end of each session, including accurate task completion statistics, success rates, executor and model information, and token usage.

Getting Started

Follow these instructions to get the project up and running on your local machine.

Prerequisites

  • Go: The project requires Go to be installed. The specific version is listed in the go.mod file (e.g., Go 1.22 or newer).
  • API Key: You need an API key from your AI executor (e.g., Gemini). Set it as an environment variable:
    export GEMINI_API_KEY="your_api_key_here"

Installation

  1. Clone the repository to your local machine:

    git clone https://github.com/your-username/go-ai-agent.git
    cd go-ai-agent
  2. Build the binary:

    go build .

    This will create an executable file named go-ai-agent in the root directory.

Usage

The go-ai-agent tool takes a natural language prompt and executes a plan to achieve the specified goal.

Command Structure

./go-ai-agent [flags]

Arguments

| Flag | Shorthand | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --prompt | | string | | User prompt describing the software development task |
| --directory | -d | string | . | Target project directory |
| --executor | -e | string | | AI executor to use (e.g., gemini, mock) |
| --concurrency | -c | int | 4 | Maximum number of parallel tasks |
| --fresh | | bool | false | Start with a fresh working directory |
| --git-enable | | bool | true | Enable Git operations |
| --git-push | | bool | false | Push changes to remote repository |
| --git-branch | | string | generated | Git branch to use |
| --api-keys | | string | | API keys for AI executors (e.g., GEMINI_API_KEY=your_key) |
| --executor-model | | string | | Specific model to use with the chosen executor (e.g., gemini-pro-latest) |
| --config-file | | string | | Configuration file path |
| --verbose | -v | bool | false | Enable verbose logging |
| --exit | | bool | false | Exit after the first run |

Examples

  1. Create and Run a Simple HTTP Server:

    ./go-ai-agent --prompt "create a new http server in a file called server.go, initialize a go module for it, and then run it" --executor "gemini" --executor-model "gemini-pro-latest" --fresh --exit
  2. Resume an Existing Plan (Interactive): If you run the agent and it finds an unfinished plan.json in the .go-agent-work directory, it will prompt you to either continue the existing plan or start fresh.

    ./go-ai-agent --prompt "fix the bug in main.go" --executor "gemini"

    (The agent will ask: "Do you want to continue with the existing plan? (y/n):")

  3. Enable Verbose Logging and Git Push:

    ./go-ai-agent --prompt "add a new endpoint /health to the server.go" --executor "gemini" --verbose --git-push
  4. Specify a Custom Working Directory:

    ./go-ai-agent -d /path/to/my/project --prompt "refactor the database connection" --executor "gemini"

For more detailed command information and additional options, please refer to the Commands Documentation.

Execution Flow

The go-ai-agent orchestrates task execution according to the generated plan, represented as a Directed Acyclic Graph (DAG). This ensures tasks run in the correct order, respecting all dependencies, while independent tasks run in parallel.

Here's a detailed breakdown of the execution process:

  1. Plan Generation:

    • Upon receiving a user prompt, the agent communicates with the configured AI executor (e.g., Gemini) to generate a comprehensive ExecutionPlan.
    • This plan is a DAG, where each node is a Task with defined dependencies on other tasks.
    • The generated plan is saved to /.go-agent-work/plan.json for persistence and recovery.
  2. Interactive Recovery (if applicable):

    • If an existing plan.json is found, the agent prompts the user to either continue the previous session or start fresh.
    • If continuing, the agent validates the status of each task in the loaded plan. Tasks whose validationCommand succeeds are marked as completed. Tasks previously marked running are reset to pending so they can be re-executed.
  3. Parallel Execution Loop:

    • The agent enters a continuous loop, managed by the /parallel component, until all tasks in the plan are either completed or failed.
    • In each iteration, the dag component identifies "ready" tasks: tasks whose status is pending and whose dependencies all have a completed status.
    • Ready tasks are then executed concurrently as goroutines, up to the configured concurrency limit.
  4. Task Execution (/runner):

    • Each task is handled by the /runner component.
    • WriteFile Intent: If a task's intent is WriteFile, the agent uses the AI executor to generate the content and writes it to the specified outputFiles. The absolute path is constructed to prevent "no such file or directory" errors.
    • RunCommand Intent: If a task's intent is RunCommand, the agent executes the command in a shell.
    • Validation: After executing the main command, if a validationCommand is specified, it is also executed. The success or failure of this command determines the task's outcome.
  5. AI-Powered Self-Correction:

    • If a RunCommand task fails and has remaining attempts, the agent consults the AI again.
    • A new prompt is generated, including the failed task's details, the original command, and the error output.
    • The AI suggests a fixedCommand, which then replaces the original command for the next retry. This allows the agent to adapt and self-correct.
  6. Status Updates & Persistence:

    • As tasks are executed, their status (e.g., running, completed, pending for retry, failed) and attempts count are updated in the stateManager.
    • The updated plan is continuously saved to /.go-agent-work/plan.json, ensuring persistence across interruptions.
  7. Session Summary:

    • Upon completion of all tasks or user exit, a comprehensive session summary is displayed.
    • This summary includes: total interactions, task completion statistics (completed/failed), overall success rate, executor and model used, total input/output tokens consumed, and a git diff --stat of changes made.

Project Structure

The project is organized into several key directories:

  • /cache: Implements a persistent, file-based caching mechanism for AI API responses, improving performance and reducing costs.
  • /cmd: Contains the command-line interface logic, built using the Cobra library. The main run command is defined here.
  • /config: Manages application configuration, including loading settings from files, environment variables, and command-line flags.
  • /context: Gathers and processes project-specific context to inform the AI's planning process.
  • /dag: Provides utilities for working with Directed Acyclic Graphs, used to represent and manage task dependencies.
  • /docs: Contains supplementary project documentation.
  • /executor: Implements the logic for interacting with various AI models (e.g., Gemini) to generate responses, including caching mechanisms for efficiency.
  • /git: Handles Git operations, such as committing and pushing changes, as part of the automated workflow.
  • /logging: Provides structured logging functionalities for consistent and informative output.
  • /parallel: Orchestrates the parallel execution of tasks based on their dependencies, optimizing for speed.
  • /planner: Responsible for communicating with the AI model to generate the task plan (DAG) based on the user's request.
  • /prompt: Manages the parsing and processing of user prompts, including file references.
  • /runner: Executes individual tasks defined in the plan, primarily by running shell commands and handling their output.
  • /state: Manages the state of the execution plan, including saving and loading the plan, and updating task statuses.
  • /types: Defines the core data structures used throughout the application, such as Task and ExecutionPlan.
  • /utils: Contains various utility functions and helpers used across the project.
