---
title: NEXON-AI
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

NEXUS-AI 🛡️

Autonomous Incident Investigation Dashboard

Python · FastAPI · React · Tailwind · Ollama

Status: Active Simulation Pipeline
Architecture: Real-time WebSockets + Multi-Agent Consensus


📖 What is NEXUS-AI?

NEXUS is an autonomous dual-agent environment designed to investigate and validate software incidents in real time. An Investigator agent and a Validator agent form hypotheses, execute system tools, evaluate system behavior, and must reach strict consensus on the root cause.

Manual debugging demands constant context switching between tools, which leads to tool fatigue. NEXUS addresses this through:

  1. Dual-Agent Autonomy: Two specialized models whose exchange is streamed word by word over WebSockets.
  2. Dynamic Tool Execution: Integrated system terminals that let agents run sandboxed validation scripts.
  3. Semantic Reward Engine: Scores conversational drift numerically using embedding similarity, with embeddings computed through Ollama on the GPU.

The result: an AI "Incident Response Team" that navigates servers, traces logs, and fixes bugs the way a human SRE would.


🖼️ Application Screenshots

📊 Simulation Dashboard

The core command center. Features live agent terminals, a dual-agent consensus log, and a reward graph plotting investigation confidence.

[Screenshot: Simulation Dashboard]

🎛️ Scenario Registry & Core Settings

The system is built for quick reconfiguration: switch LLM providers and inject custom threat models entirely from the frontend.

| Screenshot | Panel | Description |
| --- | --- | --- |
| Scenario Browser | Scenario Registry | A persistent, LocalStorage-backed grid of tactical simulations. Users can inject custom, infrastructure-specific incidents directly into the agent pipeline. |
| Hardware Configuration | Runtime Configuration | Lists the locally installed Ollama models, allowing the user to pair models (e.g., Qwen vs Dolphin-Phi) with fully independent parameters. |

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    CLIENT BROWSER                               │
│          React SPA (Tailwind + Framer Motion)                   │
│          localhost:5173                                         │
└───────────┬─────────────────────────────────┬───────────────────┘
            │ HTTP (REST)                     │ ws://
            ▼                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│              FASTAPI BACKEND (localhost:7860)                   │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐    │
│  │ /config  │ │/scenarios│ │  /reset  │ │  ws:// Simulator │    │
│  │ Env Sync │ │ DB Cache │ │ Injection│ │  Live Stream Sync│    │
│  └──────────┘ └──────────┘ └──────────┘ └──────────────────┘    │
└───────────┬───────────────────────────────────┬─────────────────┘
            │                                   │
            ▼                                   ▼
┌─────────────────────────────────────────────────────────────────┐
│                  OLLAMA ENGINE / LLM PIPELINE                   │
│  Agent A (Investigator)   ◄──────►   Agent B (Validator)        │
│  - Generates Hypotheses              - Challenges Assertions    │
│  - Runs System Tools                 - Requires Proof           │
└─────────────────────────────────────────────────────────────────┘

Execution Environments

NEXUS-AI supports two distinct execution models for agent tools, toggleable via the Settings dashboard:

1. Simulated Mode (Safe Sandbox)

  • Default Mode: Agents interact with a pre-defined clue_map within the scenario YAML.
  • No System Impact: Commands like read_logs or check_service return mocked data.
  • Use Case: Training, logic validation, and "what-if" analysis without infrastructure risk.
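As a rough illustration of Simulated Mode, the sketch below resolves a tool call against a clue_map and returns canned output. The layout of the map (tool name, then argument, then mocked output), the sample log text, and the helper name run_simulated_tool are assumptions made for this example; the actual scenario YAML schema is not shown in this README.

```python
from typing import Dict

# Hypothetical clue_map: tool name -> argument -> canned output.
# The real scenario YAML schema may differ.
CLUE_MAP: Dict[str, Dict[str, str]] = {
    "read_logs": {"app.log": "upstream timed out while connecting to upstream"},
    "check_service": {"nginx": "active (running)"},
}


def run_simulated_tool(tool: str, argument: str,
                       clue_map: Dict[str, Dict[str, str]]) -> str:
    """Return mocked output for a tool call without touching any real system."""
    outputs = clue_map.get(tool)
    if outputs is None:
        return f"unknown tool: {tool}"
    # Unknown arguments get a neutral message instead of raising.
    return outputs.get(argument, f"no data for {tool}({argument})")
```

Because every answer comes from the map, an agent can issue any command in this mode without affecting infrastructure.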

2. SSH Lab Node (Real-World Execution)

  • Live Connection: Commands are executed in real-time on a remote Linux server via SSH.
  • Autonomous Terminal: Agents use the run_terminal_command tool to browse logs, check systemd status, and inspect real configs.
  • Security: Includes a command blocklist to prevent highly destructive operations (e.g., rm -rf /).
  • Use Case: Actual incident response on isolated Lab/Staging nodes.
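The command blocklist mentioned above could look something like the following sketch. The patterns and the function name is_command_allowed are hypothetical; the project's real blocklist is not shown in this README and may be stricter or structured differently.

```python
import re
from typing import List

# Hypothetical patterns, chosen only to illustrate the idea.
BLOCKED_PATTERNS: List[str] = [
    r"\brm\s+(-[a-zA-Z]*\s+)*/(\s|$)",  # rm aimed at the filesystem root
    r"\bmkfs(\.\w+)?\b",                # formatting a filesystem
    r"\bdd\s+.*\bof=/dev/",             # raw writes to a device node
    r"\b(shutdown|reboot|halt)\b",      # taking the node offline
]


def is_command_allowed(command: str) -> bool:
    """Return False when the command matches any blocked pattern."""
    # Collapse repeated whitespace so "rm   -rf  /" is still caught.
    normalized = " ".join(command.split())
    return not any(re.search(pattern, normalized) for pattern in BLOCKED_PATTERNS)
```

Pattern matching of this kind is easy to evade and can produce false positives (the last pattern also rejects a harmless grep for "reboot"), so it complements the isolated Lab/Staging node rather than replacing it.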

OpenEnv Specification

NEXUS-AI strictly adheres to the OpenEnv 1.0 standard for agent-environment interaction.

🎮 Action Space

The environment accepts a typed NexusAction: a text message plus optional structured tool calls.

  • agent_id: string ("agent_a" or "agent_b")
  • message: string (The natural language reasoning/communication)
  • tool_calls: List[ToolCall] (Optional structured calls like TOOL: read_logs(file='app.log'))
  • confidence: float (0.0 - 1.0)
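The fields above can be modeled as a small typed structure. The sketch below uses standard-library dataclasses; the ToolCall field names (name, arguments) and the parse_tool_calls helper are assumptions for illustration, since only the textual form TOOL: read_logs(file='app.log') appears in this README.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolCall:
    # Assumed shape; the README only shows the textual representation.
    name: str
    arguments: Dict[str, str] = field(default_factory=dict)


@dataclass
class NexusAction:
    agent_id: str
    message: str
    tool_calls: List[ToolCall] = field(default_factory=list)
    confidence: float = 0.0

    def __post_init__(self) -> None:
        if self.agent_id not in ("agent_a", "agent_b"):
            raise ValueError(f"unknown agent_id: {self.agent_id!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


TOOL_PATTERN = re.compile(r"TOOL:\s*(\w+)\((.*?)\)")
ARG_PATTERN = re.compile(r"(\w+)\s*=\s*'([^']*)'")


def parse_tool_calls(message: str) -> List[ToolCall]:
    """Extract TOOL: name(key='value') calls embedded in a message."""
    return [
        ToolCall(name=name, arguments=dict(ARG_PATTERN.findall(raw_args)))
        for name, raw_args in TOOL_PATTERN.findall(message)
    ]
```

Validating agent_id and confidence at construction time rejects malformed actions before they reach the environment.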

Observation Space

The environment returns a structured NexusObservation summarizing the system state.

  • scenario_description: string (High-level objective)
  • scenario_context: string (Background telemetry/environment info)
  • partner_message: string (The last message from the other agent)
  • tool_results: List[ToolResult] (Output of any executed system tools)
  • clues_found: List[string] (Accumulated evidence identified by the Reward Engine)
  • investigation_stage: string (investigating, narrowing, found, verified)
  • round: integer (Current episode round)
  • available_tools: List[string] (List of permitted tools for the current mode)
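A matching sketch for the observation, again with dataclasses. The field names follow the list above; the ToolResult shape (tool, output) and the default values are assumptions made for this example.

```python
from dataclasses import asdict, dataclass, field
from typing import List

# Stage names as listed in the specification above.
STAGES = ("investigating", "narrowing", "found", "verified")


@dataclass
class ToolResult:
    # Assumed shape; the README does not define ToolResult's fields.
    tool: str
    output: str


@dataclass
class NexusObservation:
    scenario_description: str
    scenario_context: str
    partner_message: str = ""
    tool_results: List[ToolResult] = field(default_factory=list)
    clues_found: List[str] = field(default_factory=list)
    investigation_stage: str = "investigating"
    round: int = 0
    available_tools: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.investigation_stage not in STAGES:
            raise ValueError(f"unknown stage: {self.investigation_stage!r}")
```

Calling asdict(observation) converts the structure, including nested tool results, into plain dictionaries that can be serialized to JSON.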

Task Registry & Difficulty

| Task Name | Difficulty | Objective | Grader Method |
| --- | --- | --- | --- |
| software-incident | Easy | Fix Nginx 503 rate-limit misconfiguration | State Check: nginx-proxy.rate_limit |
| business-process-failure | Medium | Resolve inventory stockout logic error | State Check: stock_threshold + Red Herring Penalty |
| cascade-system-failure | Hard | Fix Postgres connection exhaustion | Multi-Step: Query Termination + Config Update |
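To make the "State Check + Red Herring Penalty" idea concrete, here is a minimal sketch of such a grader. The function name, the flat state dictionary, and the 0.1 penalty per red herring are all hypothetical; the project's actual grading logic and weights are not shown in this README.

```python
from typing import Any, Dict, Iterable


def grade_state_check(state: Dict[str, Any], key: str, expected: Any,
                      red_herrings_touched: Iterable[str] = (),
                      penalty: float = 0.1) -> float:
    """Score 1.0 when state[key] matches, minus a penalty per distinct red herring."""
    base = 1.0 if state.get(key) == expected else 0.0
    # Count each red herring once, however often the agents chased it.
    score = base - penalty * len(set(red_herrings_touched))
    # Clamp so the result stays within the 0.00 to 1.00 range.
    return max(0.0, min(1.0, score))
```

With these assumed values, a correct fix that chased one red herring scores lower than a clean fix, and an incorrect final state scores zero regardless of penalties.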

📈 Baseline Benchmarks

Validated using inference.py (Phi-3-mini & Qwen2.5-1.5B).

  • Software Incident: 0.88 / 1.00
  • Business Process Failure: 0.72 / 1.00
  • Cascade System Failure: 0.48 / 1.00

🧠 The AI Pipeline Deep-Dive

Step 1: Scenario Injection & Bootstrapping

# The EpisodeManager receives the custom scenario JSON from the frontend,
# then broadcasts 'episode_start' over the WebSocket so the UI can synchronize.
await broadcast("episode_start", {
    "scenario": active_scenario,
    "agent_a_model": settings.AGENT_A_MODEL
})
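The broadcast helper itself is not shown here. One way it could work is sketched below: a manager keeps the open connections and sends the same JSON envelope to each. The class name ConnectionManager, the {"type", "data"} envelope, and the FakeSocket stand-in are assumptions for illustration; send_text mirrors the method of the same name on FastAPI/Starlette WebSocket objects.

```python
import json
from typing import Any, Dict, List


class ConnectionManager:
    """Tracks connected sockets and fans one event out to all of them."""

    def __init__(self) -> None:
        self.connections: List[Any] = []

    def register(self, websocket: Any) -> None:
        self.connections.append(websocket)

    async def broadcast(self, event: str, payload: Dict[str, Any]) -> None:
        # Assumed envelope format; the real wire format may differ.
        message = json.dumps({"type": event, "data": payload})
        for websocket in list(self.connections):
            try:
                await websocket.send_text(message)
            except Exception:
                # Drop sockets that fail so one dead client does not block the rest.
                self.connections.remove(websocket)


class FakeSocket:
    """In-memory stand-in for a WebSocket, used only to exercise the sketch."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    async def send_text(self, text: str) -> None:
        self.sent.append(text)
```

Iterating over a copy of the connection list keeps removal safe while the loop is running.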

Step 2: Agent Consensus Loop

# Agents interact sequentially. The Investigator attempts a solution
# while the Validator challenges it. Both agents have access to dynamic system execution.
client, model_name = model_manager.get_client(agent_id)
stream = await client.chat.completions.create(
    model=model_name,
    messages=injected_history,
    tools=available_tools,  # e.g. fix_proposer, run_terminal_command
    stream=True
)
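What happens to the stream afterwards is not shown. A plausible continuation, sketched below, forwards each text fragment as it arrives and returns the full message at the end. The helper names relay_stream and fake_stream and the send callback are hypothetical; the chunk layout (choices[0].delta.content, which may be None) follows the OpenAI-compatible streaming format that the call above appears to use.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, AsyncIterator, Awaitable, Callable, List, Optional


async def relay_stream(stream: AsyncIterator[Any],
                       send: Callable[[str], Awaitable[None]]) -> str:
    """Forward each streamed fragment via send() and return the joined text."""
    parts: List[str] = []
    async for chunk in stream:
        if not chunk.choices:
            continue  # some chunks carry no choices
        delta = chunk.choices[0].delta.content
        if delta:  # skip None or empty fragments (e.g. role or tool-call deltas)
            parts.append(delta)
            await send(delta)
    return "".join(parts)


async def fake_stream(tokens: List[Optional[str]]) -> AsyncIterator[Any]:
    """Stand-in for a model stream, yielding chunks shaped like the real ones."""
    for token in tokens:
        delta = SimpleNamespace(content=token)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```

In the backend, send would typically wrap a WebSocket send so the UI can render the reply word by word while the complete message is kept for the conversation history.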

Step 3: Fast GPU Embeddings (Similarity Evaluation)

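from functools import lru_cache
from typing import List

import httpx

# Embeddings are computed by the Ollama server (GPU-accelerated when available)
# rather than in-process; lru_cache skips the HTTP round trip for repeated text.
@lru_cache(maxsize=256)
def get_embedding(text: str) -> List[float]:
    response = httpx.post("http://localhost:11434/api/embeddings", json={
        "model": "all-minilm",
        "prompt": text
    }, timeout=60.0)
    response.raise_for_status()  # surface HTTP errors instead of caching an empty result
    return response.json().get("embedding", [])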
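The similarity step that consumes these embeddings is not shown. A common choice is cosine similarity, sketched below in plain Python; treating it as the metric used by the Reward Engine is an assumption, and the function name is illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, or 0.0 if either has no magnitude."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Also covers the empty list get_embedding returns when no embedding is present.
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
```

Scores near 1.0 indicate that two messages point in the same semantic direction, while scores near 0.0 suggest the conversation has drifted.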

🛠️ Full Technology Stack

| Layer | Technology | Why |
| --- | --- | --- |
| Frontend Framework | React 18 (Vite) | Fast HMR, component isolation |
| Frontend Styling | Tailwind CSS | Utility-first styling for the glassmorphism UI |
| Backend Framework | FastAPI | Async Python, explicit endpoint mapping |
| Transport Layer | WebSockets | Word-by-word streaming to the UI |
| Local AI Engine | Ollama | Local hardware acceleration; data stays on the machine |
| Remote Provider | HuggingFace Inference API | Drop-in hosted alternative |
| SSH Connectivity | Paramiko | Secure remote shell execution for Lab Nodes |
| Data Persistence | LocalStorage & .env Injection | Avoids the overhead of a SQL database |

🚀 How to Run This Project (Full Step-by-Step Guide)

📋 Prerequisites

  • Python 3.10+
  • Node.js 18+
  • Ollama (installed locally for model hosting)
  • Optional: A remote Linux VM (Ubuntu/Kali) with SSH enabled for Lab Node mode

1️⃣ Backend Setup (FastAPI / Python)

cd backend

# Create a virtual environment
python -m venv venv

# Activate it (run the line that matches your platform)
source venv/bin/activate       # Linux/macOS
venv\Scripts\activate          # Windows

# Install all dependencies
pip install -r requirements.txt

Start the Backend Engine

# This exposes the core REST API and the WebSocket simulation tunnel
python main.py

2️⃣ Frontend Setup (React)

Open a new terminal tab:

cd frontend

# Install Node.js dependencies
npm install

# Start the Vite development server
npm run dev

The application is now fully accessible at http://localhost:5173.


3️⃣ Pulling Models

To run the simulation locally without cloud API keys, pull the required models through Ollama:

ollama pull qwen2.5:3b     # Compact model, suited to the Validator role
ollama pull dolphin-llama3 # Uncensored model, suited to the Investigator role
ollama pull all-minilm     # Required for semantic similarity scoring
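Before starting a simulation it can help to confirm that these models are actually installed. The sketch below checks a payload shaped like the response of Ollama's GET /api/tags endpoint, which lists local models under a "models" key. The helper names and the REQUIRED_MODELS tuple are assumptions for this example and are not part of the project.

```python
from typing import Any, Dict, Iterable, List

# Model names taken from the commands above.
REQUIRED_MODELS = ("qwen2.5:3b", "dolphin-llama3", "all-minilm")


def _normalize(name: str) -> str:
    """Ollama reports untagged models as 'name:latest'; compare on that form."""
    return name if ":" in name else f"{name}:latest"


def missing_models(tags_payload: Dict[str, Any],
                   required: Iterable[str] = REQUIRED_MODELS) -> List[str]:
    """Return the required models that do not appear in the /api/tags payload."""
    installed = {
        _normalize(model.get("name", ""))
        for model in tags_payload.get("models", [])
    }
    return [name for name in required if _normalize(name) not in installed]
```

The payload can be fetched with httpx.get("http://localhost:11434/api/tags").json(); an empty result from missing_models means all three models are available.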

🧪 Automated Testing

NEXUS-AI includes a comprehensive test suite to ensure environment stability and specification compliance.

# Run the OpenEnv specification validator
python openenv_validator.py

# Run unit tests for core logic
pip install pytest
pytest tests/

Authors

Developed by: Ashish Menon & Vector