
🔬 Autonomous Deep Research Agent

Production-Grade Multi-Agent Research System

Python 3.11+ LangGraph Streamlit License: MIT

An AI-powered research assistant that autonomously investigates any topic using specialized agents, parallel web searches, and quality-controlled report generation.

Features • Quick Start • Architecture • Usage • API Reference


✨ Features

🤖 Multi-Agent Architecture

Four specialized AI agents work together:

  • Planner β€” Creates targeted search strategies
  • Researcher β€” Executes parallel web searches
  • Critic β€” Evaluates quality & completeness
  • Writer β€” Generates structured reports

⚡ High Performance

  • Parallel Execution — 3-5x faster research
  • Smart Caching — SQLite-based result caching
  • Async I/O — Non-blocking operations
  • Rate Limiting — Respects API limits

πŸ” Advanced Research

  • Multi-Provider Search β€” Tavily + Wikipedia + Serper
  • Quality Scoring β€” 1-10 relevance ratings
  • Fact Checking β€” Cross-reference validation
  • Iterative Refinement β€” Auto-improves weak results

🎨 Modern Interface

  • Streamlit Web UI β€” Beautiful dark theme
  • Real-time Streaming β€” Live progress updates
  • CLI Support β€” Full command-line interface
  • REST API β€” FastAPI with WebSocket

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • A Tavily API key, plus an Anthropic or OpenAI API key (see Configuration below)

Installation

# Clone the repository
git clone https://github.com/yourusername/autonomous-research-agent.git
cd autonomous-research-agent

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure API keys
cp .env.example .env
# Edit .env with your API keys

Run the Application

🎨 Web Interface

streamlit run app.py

Opens at http://localhost:8501

💻 Command Line

python main.py "Your research topic"

🌐 API Server

uvicorn api.main:app --reload

Opens at http://localhost:8000


πŸ—οΈ Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                        🎯 WORKFLOW ORCHESTRATOR                      │
│                    (LangGraph State Machine)                         │
└──────────────────────────────────────────────────────────────────────┘
                                    │
        ┌───────────────────────────┼───────────────────────────┐
        ▼                           ▼                           ▼
┌───────────────┐           ┌───────────────┐           ┌───────────────┐
│  📋 PLANNER   │           │ 🔍 RESEARCHER │           │  🔬 CRITIC    │
│               │           │               │           │               │
│ • Analyze     │     ┌────▶│ • Parallel    │           │ • Score       │
│   topic       │     │     │   search      │           │   quality     │
│ • Generate    │─────┘     │ • Multi-      │──────────▶│ • Check       │
│   queries     │           │   provider    │           │   coverage    │
│ • Strategy    │           │ • Rate &      │           │ • Suggest     │
│   planning    │           │   cache       │           │   refinements │
└───────────────┘           └───────────────┘           └───────────────┘
                                                                │
                            ┌───────────────┐                   │
                            │  📝 WRITER    │◀──────────────────┘
                            │               │
                            │ • Structure   │
                            │   report      │
                            │ • Citations   │
                            │ • Formatting  │
                            └───────────────┘

Workflow

  1. Planning → Planner breaks topic into 3-5 targeted search queries
  2. Research → Researcher executes queries in parallel via Tavily + Wikipedia
  3. Evaluation → Critic scores quality (completeness, diversity, consistency)
  4. Refinement → If score < 7/10, loops back with improvement suggestions
  5. Writing → Writer compiles sources into structured markdown report
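
The loop above can be sketched in plain Python. This is a simplified stand-in for the LangGraph state machine, with hypothetical stub agents; only the 7/10 threshold and the revision cap mirror the workflow described above.

```python
MIN_SCORE = 7      # quality threshold from step 4
MAX_REVISIONS = 2  # default refinement budget (see CLI options)

def plan(topic, feedback=None):
    """Stub planner: break the topic into targeted queries (step 1)."""
    queries = [f"{topic} overview", f"{topic} recent developments", f"{topic} criticisms"]
    return queries + ([feedback] if feedback else [])

def research(queries):
    """Stub researcher: one source per query (step 2)."""
    return [{"query": q, "content": f"notes on {q}"} for q in queries]

def critique(sources):
    """Stub critic: a real one would score via an LLM (step 3)."""
    score = min(10, 5 + len(sources))
    feedback = None if score >= MIN_SCORE else "broaden source diversity"
    return score, feedback

def write(topic, sources):
    """Stub writer: compile sources into a markdown report (step 5)."""
    return f"# {topic}\n\n" + "\n".join(s["content"] for s in sources)

def run_pipeline(topic):
    feedback, revisions = None, 0
    while True:
        sources = research(plan(topic, feedback))
        score, feedback = critique(sources)
        # Step 4: loop back with feedback unless quality or budget is reached
        if score >= MIN_SCORE or revisions >= MAX_REVISIONS:
            return write(topic, sources), score
        revisions += 1

report, score = run_pipeline("AI in healthcare")
```

The real system expresses this loop as conditional edges in a LangGraph graph rather than a `while` statement, but the control flow is the same.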

πŸ“ Project Structure

autonomous-research-agent/
│
├── 🎨 app.py                 # Streamlit Web UI
├── 💻 main.py                # CLI Entry Point
├── 📦 pyproject.toml         # Project configuration
│
├── src/                      # Core Package
│   ├── config.py             # Configuration management
│   ├── state.py              # State definitions
│   ├── graph.py              # LangGraph workflow
│   │
│   ├── agents/               # Specialized Agents
│   │   ├── base.py           # Base agent class
│   │   ├── planner.py        # Research planning
│   │   ├── researcher.py     # Parallel search
│   │   ├── critic.py         # Quality evaluation
│   │   └── writer.py         # Report generation
│   │
│   └── tools/                # Utilities
│       ├── search.py         # Search providers
│       └── cache.py          # SQLite caching
│
├── api/                      # REST API
│   └── main.py               # FastAPI + WebSocket
│
├── tests/                    # Test Suite
├── reports/                  # Generated Reports
└── data/                     # Cache Storage

💻 Usage

Web Interface

The Streamlit UI provides the most user-friendly experience:

streamlit run app.py

Features:

  • πŸŒ™ Modern dark theme with glassmorphism
  • πŸ“Š Real-time quality metrics
  • πŸ“‹ Live agent activity log
  • πŸ“₯ One-click report download

Command Line

# Basic usage
python main.py "Impact of quantum computing on cryptography"

# With options
python main.py --output ./my_reports --max-revisions 3 "AI in healthcare"

Options:

Flag                  Description            Default
--output, -o          Output directory       reports/
--max-revisions, -r   Max refinement loops   2

Programmatic API

from src.graph import run_research

# Run research
result = run_research("Climate change mitigation strategies")

# Access results
print(result["final_report"])
print(f"Quality: {result['quality_report']['overall_score']}/10")
print(f"Sources: {len(result['sources'])}")

🌐 API Reference

REST Endpoints

Method   Endpoint                          Description
POST     /api/research/start               Start new research session
GET      /api/research/{id}                Get session status
POST     /api/research/{id}/approve        Approve research plan
GET      /api/research/{id}/report         Get final report
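
A minimal Python client for the endpoints above might look like the following. The JSON field names (`id`, `status`, `report`) are assumptions, not a documented schema, and the transport is injectable so the flow can be shown without a running server.

```python
import json
from urllib import request

BASE = "http://localhost:8000"

def start_research(topic, transport):
    body = json.dumps({"topic": topic}).encode()
    return transport("POST", f"{BASE}/api/research/start", body)

def get_status(session_id, transport):
    return transport("GET", f"{BASE}/api/research/{session_id}", None)

def get_report(session_id, transport):
    return transport("GET", f"{BASE}/api/research/{session_id}/report", None)

def http_transport(method, url, body):
    """Real transport: send the request and decode the JSON response."""
    req = request.Request(url, data=body, method=method,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

# Offline demo: a stub transport standing in for the running API server,
# returning made-up payloads in the assumed shape.
def stub_transport(method, url, body):
    if url.endswith("/start"):
        return {"id": "abc123", "status": "running"}
    if url.endswith("/report"):
        return {"report": "# Final report"}
    return {"id": "abc123", "status": "complete"}

session = start_research("AI in healthcare", stub_transport)
if get_status(session["id"], stub_transport)["status"] == "complete":
    report = get_report(session["id"], stub_transport)["report"]
```

Swap `stub_transport` for `http_transport` to talk to the real server; for live progress, prefer the WebSocket channel below over polling.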

WebSocket

const ws = new WebSocket('ws://localhost:8000/ws/research/{session_id}');

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  // Types: 'message', 'status', 'plan', 'quality', 'complete'
  console.log(data.type, data.content);
};

βš™οΈ Configuration

Environment Variables

Variable            Description                 Required
TAVILY_API_KEY      Tavily search API           ✅
ANTHROPIC_API_KEY   Claude API key              One of these two required
OPENAI_API_KEY      OpenAI API key              One of these two required
LLM_PROVIDER        anthropic or openai         Default: anthropic
SERPER_API_KEY      Google Search (optional)    ❌

Advanced Configuration

from src.config import get_config

config = get_config()

# Search settings
config.search.max_results_per_query = 10
config.search.max_parallel_searches = 8

# Quality thresholds
config.quality.min_quality_score = 8.0
config.quality.max_refinement_iterations = 3

# Cache settings
config.cache.ttl_hours = 48
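
The SQLite cache behind `config.cache.ttl_hours` could be approximated like this; the schema and class name are illustrative assumptions, not the contents of `src/tools/cache.py`.

```python
import json
import sqlite3
import time

class SearchCache:
    """Illustrative TTL cache in the spirit of src/tools/cache.py."""

    def __init__(self, path=":memory:", ttl_hours=24):
        self.ttl = ttl_hours * 3600
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "query TEXT PRIMARY KEY, payload TEXT, created REAL)"
        )

    def get(self, query):
        row = self.db.execute(
            "SELECT payload, created FROM cache WHERE query = ?", (query,)
        ).fetchone()
        if row and time.time() - row[1] < self.ttl:
            return json.loads(row[0])  # fresh hit
        return None                    # miss or expired entry

    def put(self, query, results):
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (query, json.dumps(results), time.time()),
        )
        self.db.commit()

cache = SearchCache(ttl_hours=48)
cache.put("quantum computing", [{"title": "Intro"}])
hit = cache.get("quantum computing")
```

Keying on the query string means repeated searches within the TTL window skip the provider entirely, which is where the speedup and API-quota savings come from.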

🧪 Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❀️ using LangGraph, Streamlit, and Tavily

⭐ Star this repo if you find it useful!
