Autonomous ML Agent 🤖

An end-to-end autonomous machine learning agent that ingests datasets, evaluates multiple models, and recommends the best predictor based on evaluation metrics.
This tool is designed for developers who want a quick, automated way to test and compare ML models without writing boilerplate code.

🎥 Demo Video

🚀 Features

Automated data preprocessing (handling missing values, scaling, encoding).
Supports multiple machine learning models out of the box.
Evaluation and ranking of models using standard metrics (accuracy, F1-score, RMSE, etc.).
Visualizations for performance comparison.
Modular design that is easy to extend with new models or metrics.
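To make the preprocessing and ranking features above concrete, here is a minimal sketch of how imputation, scaling, encoding, and a metric-based leaderboard can be combined with scikit-learn. It is not the code the agent generates: the function names (`build_preprocessor`, `rank_models`), the two candidate models, and the toy dataset are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Hypothetical helper: impute + scale numeric columns, impute + one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])


def rank_models(X, y, numeric_cols, categorical_cols, cv=3):
    """Hypothetical helper: cross-validate each candidate and return a leaderboard sorted by macro F1."""
    candidates = {  # assumed candidates; the real agent may evaluate others
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    rows = []
    for name, model in candidates.items():
        pipe = Pipeline([
            ("prep", build_preprocessor(numeric_cols, categorical_cols)),
            ("model", model),
        ])
        scores = cross_validate(pipe, X, y, cv=cv, scoring=["accuracy", "f1_macro"])
        rows.append({
            "model": name,
            "accuracy": scores["test_accuracy"].mean(),
            "f1_macro": scores["test_f1_macro"].mean(),
        })
    return pd.DataFrame(rows).sort_values("f1_macro", ascending=False).reset_index(drop=True)


# Toy data (made up for this example) with some missing numeric values.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "age": rng.normal(40, 10, 60),
    "city": rng.choice(["A", "B", "C"], 60),
})
toy["target"] = (toy["age"] > 40).astype(int)
toy.loc[::7, "age"] = np.nan

leaderboard = rank_models(toy[["age", "city"]], toy["target"], ["age"], ["city"])
print(leaderboard)
```

Wrapping preprocessing and the model in one `Pipeline` keeps imputation and scaling inside each cross-validation fold, which avoids leaking test-fold statistics into training.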

🛠️ Tech Stack

Python 3.10+
scikit-learn – training & evaluation
pandas / numpy – data handling
matplotlib – visualization
OpenRouter – runs LLM prompts that generate Python scripts
E2B sandbox – executes the generated Python scripts

⚡ Quick Start

Follow these steps to set up and run the Autonomous ML Agent locally:

Clone the repository

git clone https://github.com/utkarshkumar7/Autonomous-ML-Agent.git
cd Autonomous-ML-Agent

Create a virtual environment

python -m venv venv
source venv/bin/activate   # On macOS/Linux
venv\Scripts\activate      # On Windows

Install required dependencies

pip install -r requirements.txt

Add your OpenRouter and E2B API keys to a .env file (keep this file private and do not commit it)

OPENROUTER_API_KEY=your_openrouter_api_key
E2B_API_KEY=your_e2b_api_key
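Before launching the app, it can help to confirm that both keys are actually visible to Python. The sketch below is an illustrative check, not part of the repository: `missing_api_keys` is a hypothetical helper, and it assumes the .env values have already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`, if the project uses it).

```python
import os

# Key names taken from the .env step above.
REQUIRED_KEYS = ("OPENROUTER_API_KEY", "E2B_API_KEY")


def missing_api_keys(env=None):
    """Hypothetical helper: return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(missing_api_keys({"OPENROUTER_API_KEY": "sk-example"}))  # → ['E2B_API_KEY']
```

Calling `missing_api_keys()` with no arguments checks the real environment; an empty list means both keys are set.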

Run the application locally!

# It will run on http://localhost:8501 by default
streamlit run main.py

🧭 Architecture & Process Flow

Below is the end-to-end flow for the Autonomous ML Agent. It uses an LLM (e.g., DeepSeek via OpenRouter) to generate Python code, executes it inside two isolated E2B sandboxes, and then produces both a metrics dataframe and a natural-language summary with a model leaderboard.
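The three stages described above (generate code, run it in two sandboxes, summarize) can be sketched as a small orchestration function. This is a simplified illustration under stated assumptions: `run_pipeline` and its callable parameters are hypothetical names, and the stubs stand in for the real OpenRouter and E2B calls, whose actual interfaces in this repository are not shown here.

```python
import json
from typing import Any, Callable, Dict


def run_pipeline(
    dataset_csv: str,
    target: str,
    generate_script: Callable[[str], str],      # stand-in for the LLM call
    run_in_sandbox: Callable[[str, str], str],  # stand-in for one sandbox run
    summarize: Callable[[str], str],            # stand-in for the LLM summarizer
) -> Dict[str, Any]:
    """Hypothetical orchestrator: clean, then train and evaluate, then summarize."""
    # Stage 1: generate a cleaning script and run it in the first sandbox.
    cleaning_script = generate_script(f"clean the data; target column is {target}")
    cleaned_csv = run_in_sandbox(cleaning_script, dataset_csv)

    # Stage 2: generate a training script and run it on the cleaned data in a second sandbox.
    training_script = generate_script(f"train and evaluate models; target column is {target}")
    results_json = run_in_sandbox(training_script, cleaned_csv)

    # Stage 3: turn the raw metrics into a natural-language summary.
    return {
        "cleaned_csv": cleaned_csv,
        "results": json.loads(results_json),
        "summary": summarize(results_json),
    }


# Stubs so the sketch runs without network access or API keys.
def fake_llm(prompt: str) -> str:
    return f"# script for: {prompt}"


def fake_sandbox(script: str, input_data: str) -> str:
    if "clean" in script:
        return input_data.strip()
    return json.dumps([{"model": "baseline", "accuracy": 0.5}])


def fake_summary(results_json: str) -> str:
    return f"{len(json.loads(results_json))} model(s) evaluated"


report = run_pipeline("a,b,label\n1,2,0\n", "label", fake_llm, fake_sandbox, fake_summary)
print(report["summary"])  # → 1 model(s) evaluated
```

Passing the LLM and sandbox as callables keeps the control flow testable offline; the real implementations would be swapped in at the call site.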

Flow Diagram

flowchart TD
  U[User / Streamlit UI]
  REG[Agent Orchestrator]
  OR[OpenRouter LLM]
  GEN[LLM Code Generation]
  SB1[E2B Sandbox 1 - Data Cleaning]
  SB2[E2B Sandbox 2 - Modeling and Evaluation]
  ART[Artifacts: cleaned.csv, model_results_df, charts]
  SUM[LLM Summarizer]
  VIZ[Leaderboard and Report]

  U -->|dataset.csv + prediction column selection on UI| REG
  REG -->|llm_client.py sends prompts from prompts.py| OR
  OR --> GEN
  GEN -->|script JSON unpacked as .py script| REG

  REG -->|run cleaning script| SB1
  SB1 -->|cleaned.csv + logs| ART

  REG -->|run training script with cleaned.csv| SB2
  SB2 -->|model results dataframe as JSON object| ART

  REG -->|model_results_df| SUM
  SUM -->|natural language summary + insights + leaderboard| VIZ

  ART --> VIZ
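The "script JSON unpacked as .py script" step in the diagram can be illustrated with a short sketch. The JSON schema is an assumption: the `"script"` key, the `unpack_script` helper, and the `clean_data.py` filename are hypothetical and may differ from what `llm_client.py` actually expects.

```python
import json
import tempfile
from pathlib import Path


def unpack_script(llm_response: str, out_path: Path, key: str = "script") -> Path:
    """Hypothetical helper: extract Python source from a JSON response and write it to a .py file."""
    payload = json.loads(llm_response)  # raises a ValueError subclass on malformed JSON
    source = payload.get(key) if isinstance(payload, dict) else None
    if not isinstance(source, str) or not source.strip():
        raise ValueError(f"response has no non-empty '{key}' field")
    # Syntax-check only; compile() does not execute the generated code.
    compile(source, str(out_path), "exec")
    out_path.write_text(source, encoding="utf-8")
    return out_path


# Example response shaped the way this sketch assumes.
response = json.dumps({"script": "import pandas as pd\nprint('cleaning...')\n"})
with tempfile.TemporaryDirectory() as tmp:
    path = unpack_script(response, Path(tmp) / "clean_data.py")
    written = path.read_text(encoding="utf-8")

print(written.splitlines()[0])  # → import pandas as pd
```

Checking syntax before upload catches truncated or malformed LLM output early, so a sandbox run is not spent on a script that cannot start.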
