51 commits

- c701a9f: First LLM in terra baselines (Jul 21, 2025)
- 518717e: Improve partitioning prompt (Jul 22, 2025)
- 5ef679d: Better prompt (Jul 22, 2025)
- a84de49: Adapt other prompt (Jul 23, 2025)
- 4a7c8e5: Improved delegation (Jul 23, 2025)
- ad81d37: Improve configuration file LLM (Jul 23, 2025)
- edfa078: Small fixes (Jul 23, 2025)
- 45de420: Improved name of prompt files (Jul 23, 2025)
- 0537647: Improved README (Jul 23, 2025)
- 84233b4: Change config file name (Jul 23, 2025)
- 7741f9a: Improved README (Jul 23, 2025)
- 62c8d7d: remove old google models (Jul 23, 2025)
- 4dc485e: Improved readme (Jul 23, 2025)
- c69502a: Parallel draft (Jul 23, 2025)
- f6b7d9f: Best partition draft (Jul 23, 2025)
- 469c6dc: parallel version (Jul 24, 2025)
- 2c0419f: aggregator (Jul 24, 2025)
- 5aca57c: fix parallel (Jul 24, 2025)
- 709a8ea: early stopping (Jul 24, 2025)
- 832fe28: small fixes (Jul 24, 2025)
- 4a71c6f: correct index (Jul 24, 2025)
- dde0406: README.md (Jul 24, 2025)
- 82c647e: Improved README (Jul 24, 2025)
- bfc788e: Move file to baselines and remove files from terra (Jul 24, 2025)
- 0dba335: Adk fix (Jul 25, 2025)
- dd290de: Add some functions in utils for the game (Jul 25, 2025)
- 99cac27: First simplified LLM game version (Jul 25, 2025)
- ba444d0: Improve game and small fixes (Jul 29, 2025)
- ebbe066: Change partitioning prompt (multiple!) to avoid area not fuly covered (Jul 31, 2025)
- 564b80f: Support big maps (Aug 2, 2025)
- c0982c6: Big maps improvement (Aug 4, 2025)
- b3d4727: Improved big maps (Aug 5, 2025)
- 0dd5cdb: Merge branch 'leggedrobotics:master' into llm_moving (gioelemo, Aug 5, 2025)
- 3975f35: remove printing (Aug 5, 2025)
- e734cfb: Fix issue (Aug 18, 2025)
- 60f602b: Remove old prompts (Aug 18, 2025)
- 579b609: Add support for GPT 5 (Aug 18, 2025)
- 16dcef8: Add gpt 5 also for the game (Aug 19, 2025)
- a2a2a30: Fix README (Aug 19, 2025)
- 80f365f: Add code for random partitioning (Aug 19, 2025)
- 36c954d: remove file (Aug 19, 2025)
- fffa89c: fix issue (Aug 19, 2025)
- 4c6bfd0: Use exclusive assignment for target in multiple partitions as option … (Aug 19, 2025)
- d197c3b: Remove old commented code (Aug 19, 2025)
- 94f5758: cleaning part 1 (Aug 19, 2025)
- 6644d0a: cleaning (Aug 20, 2025)
- 3242aaf: cleaning (Aug 20, 2025)
- 99a9c03: Remove unused functions (Aug 20, 2025)
- 3657aba: small fix (Aug 20, 2025)
- debed65: cleaning (Aug 20, 2025)
- a1a17a3: Add back pkl files (Aug 20, 2025)
Binary file added .DS_Store
Binary file not shown.
245 changes: 245 additions & 0 deletions llm/README.md
@@ -0,0 +1,245 @@
# Terra: Hybrid LLM and RL Approach

A sophisticated framework combining Large Language Models (LLMs) and Reinforcement Learning (RL) for intelligent map exploration and excavation tasks.

## 📋 Table of Contents

- [Overview](#overview)
- [Architecture](#architecture)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Basic Usage](#basic-usage)
- [Running on Clusters](#running-on-clusters)
- [Supported Models](#supported-models)
- [Level Index Reference](#level-index-reference)
- [Prompts Documentation](#prompts-documentation)
- [Project Structure](#project-structure)
- [Contributing](#contributing)

## 🎯 Overview

This module implements a hybrid approach that leverages both LLMs and RL policies to efficiently process excavation tasks across partitioned maps. The system intelligently delegates between fast RL policies and more sophisticated LLM-based decision making based on the complexity of each partition.

## 🏗️ Architecture

![Hybrid VLM Architecture](assets/VLM_Schema.png)

The system operates in four main phases:

### 1. **Map Partitioning**
The map is divided into manageable sections using one of three methods:
- **Manual**: Direct specification of partition boundaries
- **Random**: Automated partitioning with configurable constraints:
  - Minimum width/height (percentage-based)
  - Minimum target count per partition
- **LLM-based**: Intelligent partitioning using language models
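
As a rough illustration of the random strategy, a recursive splitter that respects a minimum side length might look like the sketch below. This is a hedged illustration only: `split_rect` and its signature are hypothetical, and the real implementation also enforces the minimum target count, which is omitted here.

```python
import random

def split_rect(rect, min_w, min_h, depth=2):
    """Recursively split (x, y, w, h) into sub-rectangles whose sides
    stay above the minimum sizes (hypothetical sketch)."""
    x, y, w, h = rect
    if depth == 0:
        return [rect]
    can_v = w >= 2 * min_w   # a vertical cut keeps both widths >= min_w
    can_h = h >= 2 * min_h   # a horizontal cut keeps both heights >= min_h
    if not (can_v or can_h):
        return [rect]        # too small to split further
    if can_v and (not can_h or random.random() < 0.5):
        cut = random.randint(min_w, w - min_w)
        halves = [(x, y, cut, h), (x + cut, y, w - cut, h)]
    else:
        cut = random.randint(min_h, h - min_h)
        halves = [(x, y, w, cut), (x, y + cut, w, h - cut)]
    return [r for half in halves for r in split_rect(half, min_w, min_h, depth - 1)]

parts = split_rect((0, 0, 64, 64), min_w=16, min_h=16)
# The partitions tile the map: their areas sum to the full map area.
assert sum(w * h for _, _, w, h in parts) == 64 * 64
```

The final assertion captures the coverage property the prompts were tuned for: partitions must tile the whole map, with no area left uncovered.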

### 2. **Partition Processing**
Each partition is managed by a master LLM agent that decides between:
- **RL Policy** (`delegate_to_RL`): Fast, pre-trained policy for routine tasks
- **LLM Policy** (`delegate_to_LLM`): Sophisticated decision-making for complex scenarios
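
Conceptually, the delegation step reduces to a dispatch on the master agent's choice. The decision names `delegate_to_RL` and `delegate_to_LLM` come from the prompts; everything else below is an illustrative sketch, not the module's actual API:

```python
def process_partition(partition, master_decision, rl_policy, llm_policy):
    """Route a partition to the fast RL policy or the LLM policy,
    depending on the master agent's choice (illustrative sketch)."""
    if master_decision == "delegate_to_RL":
        return rl_policy(partition)    # fast, pre-trained policy
    if master_decision == "delegate_to_LLM":
        return llm_policy(partition)   # slower, more deliberate reasoning
    raise ValueError(f"unknown decision: {master_decision!r}")

# Usage with stand-in policies:
result = process_partition({"id": 0}, "delegate_to_RL",
                           rl_policy=lambda p: "rl",
                           llm_policy=lambda p: "llm")
```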

### 3. **Synchronization**
- Global map updates after partition completion
- Cross-partition information synchronization
- Periodic re-evaluation of delegation strategy
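
For intuition, the global map update can be pictured as copying each finished partition's local grid back into its rectangle of the global grid. This is a minimal sketch assuming row-major 2D lists; the real code's data structures may differ:

```python
def sync_global_map(global_map, partitions):
    """Copy each partition's local cells into the global grid.
    `partitions` maps (x, y, w, h) rectangles to local row-major grids."""
    for (x, y, w, h), local in partitions.items():
        for row in range(h):
            global_map[y + row][x:x + w] = local[row]
    return global_map

world = [[0] * 4 for _ in range(4)]
done = {(0, 0, 2, 4): [[1, 1] for _ in range(4)]}  # left half excavated
world = sync_global_map(world, done)
```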

### 4. **Iteration**
- Process continues until map completion
- Automatic progression to next map with fresh partitioning

## 🚀 Installation

### Prerequisites

1. Ensure Terra and Terra baselines are installed
2. Install additional dependencies from `environment_llm.yaml`:
```bash
conda env create -f environment_llm.yaml
conda activate terra-llm
```

### API Keys Setup

Export the API keys for your chosen model providers:

```bash
# Google Models (Gemini)
export GOOGLE_API_KEY="your-api-key-here"

# OpenAI Models (GPT, o3)
export OPENAI_API_KEY="your-api-key-here"

# Anthropic Models (Claude)
export ANTHROPIC_API_KEY="your-api-key-here"
```

## ⚙️ Configuration

The main configuration file is [`config_llm.yaml`](config_llm.yaml). Key parameters include:

- Partitioning strategy and constraints
- Model selection and API settings
- RL policy paths
- Iteration and synchronization intervals
- Logging and debugging options

Prompts can be customized by modifying files in the [`prompts`](prompts) folder.
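
The general shape of the file looks roughly like the fragment below. This is an illustrative sketch only: the key names here are hypothetical, so consult `config_llm.yaml` itself for the actual schema.

```yaml
# Hypothetical fragment; see config_llm.yaml for the real keys.
partitioning:
  strategy: random            # one of: manual | random | llm
  min_width_pct: 25           # minimum partition width, % of map width
  min_targets: 3              # minimum target count per partition
model:
  name: gemini-2.5-flash      # any model from the Supported Models list
  provider: gemini            # gpt | gemini | claude
rl_policy:
  path: path/to/checkpoint    # pre-trained RL policy location
sync_interval: 10             # steps between global map synchronizations
```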

## 📖 Usage

### Basic Usage

Run the main script with the following command:

```bash
DATASET_PATH=<path> DATASET_SIZE=<size> python -m llm.main_llm \
--model_name <model> \
--model_key <key> \
--num_timesteps <steps> \
-s <seed> \
-n <num_env> \
-run <policy_path> \
--level_index <index>
```

#### Parameters

| Parameter | Description |
|-----------|-------------|
| `DATASET_PATH` | Path to the Terra-generated map dataset |
| `DATASET_SIZE` | Number of maps in the dataset |
| `--model_name` | LLM model identifier (see [Supported Models](#supported-models)) |
| `--model_key` | Provider key: `gpt`, `gemini`, or `claude` |
| `--num_timesteps` | Maximum steps per episode |
| `-s` | Random seed for reproducibility |
| `-n` | Number of parallel environments |
| `-run` | Path to pre-trained RL policy |
| `--level_index` | Map difficulty level (see [Level Index](#level-index-reference)) |

### Running Big Maps
After generating big maps (currently tested only at 128x128), run the code exactly as described in the previous section.

> [!WARNING]
> Make sure to set the new map size in the `config_llm.yaml` file and adapt the visualization in Terra's `terra/viz/game/setting.py`. For 128x128 maps a value of 384 can be used (2x the value used for 64x64).

### Running on Clusters

Ensure API keys are properly configured in your cluster environment. Consult your cluster's [documentation](https://scicomp.ethz.ch/wiki/Main_Page) for specific setup instructions.

For SLURM-based clusters (e.g., ETH Zürich Euler):

```bash
sbatch run_levels.slurm
```

Adapt the script's parameters to your actual values. The script runs the evaluation for each level on a separate node (see the mapping in [Level Index Reference](#level-index-reference)).

For random partitioning, selecting the best partition (the one with the highest coverage) among all trials, use

```bash
sbatch run_levels_random.slurm
```

where:
- `--n_maps` is the number of test maps, and
- `--n_partitions_per_map` is the number of trials for the random partitions.

Note that with this script you need to run each level independently and also change the `config_llm.yaml` file to use random partitions.

To aggregate the results into the final benchmarks, run the `result_aggregator.py` script.
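
Conceptually, choosing the best random partition is an arg-max over trial coverage, along the lines of the sketch below (`best_partition` and the trial tuples are hypothetical, not `result_aggregator.py`'s real interface):

```python
def best_partition(trials):
    """Pick the trial with the highest target coverage.
    `trials` is a list of (partition, covered_targets, total_targets)."""
    return max(trials, key=lambda t: t[1] / t[2] if t[2] else 0.0)

trials = [
    ("partition_a", 18, 24),   # 75% coverage
    ("partition_b", 22, 24),   # ~92% coverage, selected
    ("partition_c", 12, 24),   # 50% coverage
]
best = best_partition(trials)
```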


## 🤖 Supported Models

The framework supports multiple LLM providers through [LiteLLM](https://docs.litellm.ai/docs/):

### OpenAI
- `gpt-4o`
- `gpt-4.1`
- `gpt-5`
- `o4-mini`
- `o3`
- `o3-mini`

### Google
- `gemini-1.5-flash-latest`
- `gemini-2.0-flash`
- `gemini-2.5-pro`
- `gemini-2.5-flash`

### Anthropic
- `claude-3-haiku-20240307`
- `claude-3-7-sonnet-20250219`
- `claude-opus-4-20250514`
- `claude-sonnet-4-20250514`

For the latest supported models, refer to the [LiteLLM providers documentation](https://docs.litellm.ai/docs/providers).

## 📊 Level Index Reference

| Level Name | Index | Description |
|------------|-------|-------------|
| All levels | None | Run all available levels |
| Foundations | 0 | Basic excavation tasks |
| Single Trenches | 1 | Simple linear excavations |
| Double Trenches | 2 | Parallel excavation paths |
| Double Diagonal | 3 | Angled parallel paths |
| Triple Trenches | 4 | Complex parallel structures |
| Triple Diagonal | 5 | Advanced angled patterns |
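
When scripting runs over levels, the table above maps directly onto a small lookup, for example as below (illustrative only; the codebase may encode this differently):

```python
# Level index mapping, taken from the table above.
LEVELS = {
    0: "Foundations",
    1: "Single Trenches",
    2: "Double Trenches",
    3: "Double Diagonal",
    4: "Triple Trenches",
    5: "Triple Diagonal",
}

def level_name(index):
    """Resolve a --level_index value; None means run all levels."""
    if index is None:
        return list(LEVELS.values())
    return LEVELS[index]
```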

## 📝 Prompts Documentation

The system uses three types of specialized prompts:

### 1. **Partitioning Agent**
- [`partitioning.txt`](prompts/partitioning.txt): Standard adaptive partitioning
- [`partitioning_exact.txt`](prompts/partitioning_exact.txt): Fixed excavator count (experimental)

### 2. **Delegation Agent**
- [`delegation_no_intervention.txt`](prompts/delegation_no_intervention.txt): Production-ready autonomous delegation
- [`delegation.txt`](prompts/delegation.txt): Experimental intervention mode (not fully tested)

### 3. **Excavator Agent**
- [`excavator_llm_simple.txt`](prompts/excavator_llm_simple.txt): System prompt for LLM excavator control
- [`excavator_action.txt`](prompts/excavator_action.txt): Context-aware status updates

## 📁 Project Structure

```
llm/
├── assets/                  # Media and documentation assets
├── prompts/                 # Customizable prompt templates
│   ├── delegation_no_intervention.txt
│   ├── delegation.txt
│   ├── excavator_action.txt
│   ├── excavator_llm_simple.txt
│   ├── partitioning_exact.txt
│   └── partitioning.txt
├── __init__.py
├── config_llm.yaml          # Main configuration file
├── env_llm.py               # Individual environment management
├── env_manager_llm.py       # Global environment orchestration
├── eval_llm.py              # Benchmarking utilities
├── main_llm.py              # Entry point
├── prompt_manager_llm.py    # Prompt loading and management
├── session_manager_llm.py   # LLM agent lifecycle management
└── utils_llm.py             # Helper functions and utilities
```

## 🤝 Contributing

Found a bug or have a feature request? Please open an issue and tag @gioelemo. We welcome contributions that improve the framework's capabilities or documentation.

### Development Guidelines

1. Follow existing code style and conventions
2. Add tests for new functionality
3. Update documentation as needed
4. Submit pull requests with clear descriptions

---

**Note**: This is an active research project. Performance may vary based on model selection and task complexity.
Empty file added llm/__init__.py
Empty file.