This project implements a foundational, minimal, and fully functional AI agent capable of:
- basic filename and preview‑based code search
- loading selective file excerpts into LLM context
- executing build and test commands on the local nlohmann/json project
- parsing compiler errors and locating referenced source lines
- maintaining a strict 5,000‑token rolling context buffer
- Recursively scans the `nlohmann/json` directory
- Stores:
  - file paths
  - file sizes
  - first ~30 lines as a preview
- Supports substring filename queries (e.g., `"json_pointer"`, `"binary_reader"`)
- Hybrid ranking:
  - heuristic prefilter (path/basename/preview token hits) to shortlist top‑K candidates
  - LLM ranks only the shortlisted previews (reduces prompt size)
  - LLM‑returned paths are normalized and matched against the actual indexed `json/...` paths
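
The prefilter and path normalization above can be sketched as follows. This is an illustrative reconstruction, not the actual `indexer.py` code: the function names, the index entry layout (`"path"`/`"preview"` dicts), and the scoring weights are assumptions.

```python
import os

def prefilter(index, query, top_k=10):
    """Heuristic prefilter: score files by query-token hits in the
    basename, full path, and stored preview, then keep the top-K."""
    tokens = [t for t in query.lower().split() if t]
    scored = []
    for entry in index:
        path = entry["path"].lower()
        base = os.path.basename(path)
        preview = entry.get("preview", "").lower()
        # Weight basename hits highest, then path, then preview text.
        score = sum(3 * (t in base) + 2 * (t in path) + (t in preview)
                    for t in tokens)
        if score:
            scored.append((score, entry["path"]))
    scored.sort(key=lambda s: -s[0])
    return [p for _, p in scored[:top_k]]

def normalize(llm_path, indexed_paths):
    """Match an LLM-returned path against the real indexed paths by
    suffix, so partial or relative paths still resolve."""
    llm_path = llm_path.strip().lstrip("./")
    for real in indexed_paths:
        if real.endswith(llm_path):
            return real
    return None
```

Only the shortlisted entries' previews are then sent to the LLM for ranking, which keeps the prompt small.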
- Maintains a rolling 5,000‑token buffer
- Eviction policy:
  - preferentially evicts low‑value `"system"` log/tool output first
  - falls back to FIFO eviction when no `"system"` items remain
- Stores:
  - file excerpts
  - build/test logs
  - user and system messages
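
The eviction policy above can be sketched like this. The class and field names are illustrative, and token counts are passed in directly here; the real `context_manager.py` counts tokens with tiktoken.

```python
class RollingContext:
    """Sketch of a rolling token buffer that evicts "system" items
    first and falls back to FIFO when none remain."""

    def __init__(self, max_tokens=5000):
        self.max_tokens = max_tokens
        self.items = []  # each: {"role": ..., "text": ..., "tokens": int}

    def total_tokens(self):
        return sum(i["tokens"] for i in self.items)

    def add(self, role, text, tokens):
        self.items.append({"role": role, "text": text, "tokens": tokens})
        self._evict()

    def _evict(self):
        while self.total_tokens() > self.max_tokens and self.items:
            # Prefer the oldest low-value "system" log/tool item...
            idx = next(
                (i for i, it in enumerate(self.items) if it["role"] == "system"),
                0,  # ...otherwise fall back to FIFO (oldest item).
            )
            self.items.pop(idx)
```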
Runs basic build actions:

```bash
cmake -S json -B build
cmake --build build
```

- Output is returned and may be inserted into context as needed.

Runs tests using:

```bash
ctest --test-dir build
```

- Returns raw output (minimal formatting) for consistent analysis.
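
A minimal sketch of how such commands might be executed; the function name and signature are illustrative, not the actual `actions.py` API.

```python
import subprocess

def run_command(cmd, timeout=600):
    """Run a build/test command and return (exit code, raw output).
    Combining stdout and stderr keeps downstream parsing consistent."""
    result = subprocess.run(cmd, capture_output=True, text=True,
                            timeout=timeout)
    return result.returncode, result.stdout + result.stderr

# Usage (assumes cmake and the json/ checkout are present):
# code, output = run_command(["cmake", "-S", "json", "-B", "build"])
```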
Given compiler output, the agent:
- routes to compiler‑error analysis only when a `:line:col` pattern is detected (reduces false positives)
- extracts file / line / column data when present (de‑duplicated and capped to a small number of locations)
- loads a small excerpt around the failing region
- returns structured context for LLM reasoning
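
The `:line:col` detection and extraction above can be sketched with a small regex. This is a hypothetical reconstruction; the real parser in `actions.py` may use a different pattern or cap.

```python
import re

# GCC/Clang-style diagnostics look like: path/file.cpp:12:5: error: message
ERROR_RE = re.compile(r"([^\s:]+):(\d+):(\d+):\s+(?:fatal\s+)?error:")

def extract_error_locations(output, cap=5):
    """Pull (file, line, col) triples from compiler output,
    de-duplicated and capped to a small number of locations."""
    seen, locations = set(), []
    for m in ERROR_RE.finditer(output):
        loc = (m.group(1), int(m.group(2)), int(m.group(3)))
        if loc not in seen:
            seen.add(loc)
            locations.append(loc)
        if len(locations) >= cap:
            break
    return locations
```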
```
agent/
  indexer.py          # Minimal metadata index (paths, sizes, previews)
  token_utils.py      # True token counting via tiktoken
  context_manager.py  # Rolling 5k-token buffer with eviction
  actions.py          # Build/test execution and error parsing
  agent_loop.py       # Minimal routing and LLM-based ranking
  openai_client.py    # Minimal wrapper around the GPT-4o-mini model (caches system prompt + client)
demo/
  run_agent.py        # CLI entry point
json/                 # Local nlohmann/json checkout (must exist locally)
```
- Install dependencies:

  ```bash
  pip install -r agent/requirements.txt
  sudo apt-get update
  sudo apt-get install -y build-essential cmake
  ```

- Ensure `nlohmann/json` exists at `./json/`
- Set the OpenAI API key:

  ```bash
  export OPENAI_API_KEY="insert_your_key_here"
  ```

- Launch interactive mode:

  ```bash
  python3 demo/run_agent.py
  ```
Example queries once running:

```
what is your role?
what are your capabilities?
can you tell me about the codebase?
build
run tests
json_pointer
Where is json_pointer defined?
Explain binary_reader.hpp
which files are responsible for JSON serialization?
i get this error json/src/modules/json.cppm:12:5: error: expected ';'
```

Type `exit` or `quit` to end the session.
Note: this agent was designed and tested on Ubuntu Linux 22.04.
- Each demo run generates a timestamped log file: `demo/logs/run_YYYYMMDD_HHMMSS.txt`
- `run_agent.py` uses a `log_print()` wrapper that writes to:
  - stdout
  - the active log file
- All banners, queries, errors, and responses are logged.
- Logging is isolated to the demo runner and does not modify internal agent components.
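
A `log_print()` tee can be sketched like this; this is a hypothetical reconstruction, not the demo runner's exact code.

```python
def make_log_print(log_path):
    """Build a log_print() that writes each message to both stdout
    and the active log file."""
    log_file = open(log_path, "a", encoding="utf-8")

    def log_print(*args, **kwargs):
        print(*args, **kwargs)                 # stdout
        print(*args, **kwargs, file=log_file)  # active log file
        log_file.flush()                       # keep the log current on crash

    return log_print
```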
- Every OpenAI call prints:
  - prompt tokens
  - completion tokens
  - total tokens
- Implemented inside `agent/openai_client.py`
- Fully isolated from core agent logic.
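
The per-call printout might be produced roughly like this. The helper name and exact wording are illustrative; the OpenAI SDK's `response.usage` object does expose the three fields used here.

```python
def format_usage(usage):
    """Format token counts from an OpenAI response's usage object
    (prompt_tokens, completion_tokens, total_tokens)."""
    return (f"[tokens] prompt={usage.prompt_tokens} "
            f"completion={usage.completion_tokens} "
            f"total={usage.total_tokens}")
```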
A dedicated `system_prompt.txt` enforces strict agent behavior.

Stored at `agent/system_prompt.txt` and loaded automatically for every LLM call (cached at startup in `OpenAIClient`):

```python
# OpenAIClient.__init__()
with open("agent/system_prompt.txt", "r", encoding="utf-8") as f:
    self.system_prompt = f.read()

# OpenAIClient.chat()
messages = [{"role": "system", "content": self.system_prompt}] + messages
```
Ensures:
- consistent assignment‑compliant behavior
- strict C++ codebase analysis
- prevention of accidental prompt omission
- deterministic enforcement of allowed commands and context rules
- minimal complexity
- correctness over heavyweight features
- disciplined context usage
- predictable and deterministic command execution
A full architectural walkthrough is available in design.md, covering system design, context management, and rationale for all major decisions.
Advanced, out‑of‑scope features (semantic search, multi‑file summarization, agent‑of‑thought loops, etc.) are intentionally omitted.
This project is released under the MIT License.
See the LICENSE file for details.