# README.md

OpenEvolve implements a comprehensive evolutionary coding system with:
- **Evolutionary Coding Agent**: LLM-guided evolution of entire code files (not just functions)
- **Distributed Controller Loop**: Asynchronous pipeline coordinating LLMs, evaluators, and databases
- **Program Database**: Storage and sampling of evolved programs with evaluation metrics
- **Prompt Sampling**: Context-rich prompts with past programs, scores, and problem descriptions
- **LLM Ensemble**: Multiple language models working together for code generation
- **Multi-objective Optimization**: Simultaneous optimization of multiple evaluation metrics
- **Checkpoint System**: Automatic saving and resuming of evolution state

#### 🔬 **Scientific Reproducibility**

- **Comprehensive Seeding**: Full deterministic reproduction with hash-based component isolation
- **Default Reproducibility**: Seed=42 by default for immediate reproducible results
- **Granular Control**: Per-component seeding for LLMs, database, and evaluation pipeline

#### 🤖 **Advanced LLM Integration**

- **Ensemble Sophistication**: Weighted model combinations with intelligent fallback strategies
- **Test-Time Compute**: Integration with [optillm](https://github.com/codelion/optillm) for Mixture of Agents (MoA) and enhanced reasoning
- **Universal API Support**: Works with any OpenAI-compatible endpoint (Anthropic, Google, local models)
- **Plugin Ecosystem**: Support for optillm plugins (readurls, executecode, z3_solver, etc.)

#### 🧬 **Evolution Algorithm Innovations**

- **MAP-Elites Implementation**: Quality-diversity algorithm for balanced exploration/exploitation
- **Island-Based Evolution**: Multiple populations with periodic migration for diversity maintenance
- **Inspiration vs Performance**: Sophisticated prompt engineering separating top performers from diverse inspirations
- **Multi-Strategy Selection**: Elite, diverse, and exploratory program sampling strategies
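
As a toy illustration of the MAP-Elites idea (not OpenEvolve's actual implementation), an archive that keeps one elite per feature-grid cell might be sketched like this; the feature names mirror the sample configuration later in this README:

```python
import random

def feature_bin(program, bins=10):
    # Map a program to a grid cell by its feature descriptors
    # (here: score and complexity, both assumed normalized to [0, 1]).
    return (
        min(int(program["score"] * bins), bins - 1),
        min(int(program["complexity"] * bins), bins - 1),
    )

def map_elites_insert(archive, program):
    """Keep only the best-scoring program per feature cell."""
    cell = feature_bin(program)
    incumbent = archive.get(cell)
    if incumbent is None or program["score"] > incumbent["score"]:
        archive[cell] = program

archive = {}
random.seed(42)  # deterministic, in the spirit of random_seed: 42
for _ in range(100):
    candidate = {"score": random.random(), "complexity": random.random()}
    map_elites_insert(archive, candidate)

# The archive holds at most one elite per cell, preserving diversity.
print(len(archive))
```

Because cells are filled by diverse candidates rather than a single global leaderboard, low-scoring but structurally different programs survive as stepping stones.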

#### 📊 **Evaluation & Feedback Systems**

- **Artifacts Side-Channel**: Capture build errors, profiling data, and execution feedback for LLM improvement
- **Cascade Evaluation**: Multi-stage testing with progressive complexity for efficient resource usage
- **LLM-Based Feedback**: Automated code quality assessment and reasoning capture
- **Comprehensive Error Handling**: Graceful recovery from evaluation failures with detailed diagnostics
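
Cascade evaluation can be pictured as a chain of progressively more expensive checks, where a candidate must pass each stage before the next one runs. A simplified sketch (stage names, evaluators, and thresholds are illustrative, not OpenEvolve's API):

```python
def cascade_evaluate(program, stages):
    """Run stages in order; stop at the first failure to save resources."""
    metrics = {}
    for name, evaluate, threshold in stages:
        score = evaluate(program)
        metrics[name] = score
        if score < threshold:
            break  # later, costlier stages never run
    return metrics

stages = [
    ("syntax", lambda p: 1.0 if "def " in p else 0.0, 1.0),  # cheap check
    ("unit_tests", lambda p: 0.8, 0.5),                      # moderate cost
    ("benchmark", lambda p: 0.9, 0.0),                       # expensive
]

print(cascade_evaluate("def solve(): pass", stages))
print(cascade_evaluate("not code", stages))
```

The second call stops after the syntax stage, which is the point: most broken candidates are rejected before any expensive benchmark runs.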

#### 🌐 **Multi-Language & Platform Support**

- **Language Agnostic**: Python, Rust, R, Metal shaders, and more
- **Platform Optimization**: Apple Silicon GPU kernels, CUDA optimization, CPU-specific tuning
- **Framework Integration**: MLX, PyTorch, scientific computing libraries

#### 🔧 **Developer Experience & Tooling**

- **Real-Time Visualization**: Interactive web-based evolution tree viewer with performance analytics
- **Advanced CLI**: Rich command-line interface with checkpoint management and configuration override
- **Comprehensive Examples**: 12+ diverse examples spanning optimization, ML, systems programming, and scientific computing
- **Error Recovery**: Robust checkpoint loading with automatic fix for common serialization issues

#### 🚀 **Performance & Scalability**

- **Threaded Parallelism**: High-throughput asynchronous evaluation pipeline
- **Resource Management**: Memory limits, timeouts, and resource monitoring
- **Efficient Storage**: Optimized database with artifact management and cleanup policies
OpenEvolve orchestrates a sophisticated evolutionary pipeline:
### Core Evolution Loop

1. **Enhanced Prompt Sampler**: Creates rich prompts containing:

- Top-performing programs (for optimization guidance)
- Diverse inspiration programs (for creative exploration)
- Execution artifacts and error feedback
- Dynamic documentation fetching (via optillm plugins)

2. **Intelligent LLM Ensemble**:

- Weighted model combinations for quality/speed tradeoffs
- Test-time compute techniques (MoA, chain-of-thought, reflection)
- Deterministic selection with comprehensive seeding

3. **Advanced Evaluator Pool**:

- Multi-stage cascade evaluation
- Artifact collection for detailed feedback
- LLM-based code quality assessment
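
One iteration of the loop described above might be sketched like this (a self-contained toy model; the real component interfaces in OpenEvolve differ):

```python
import random

class ToyDatabase:
    def __init__(self):
        self.programs = []  # (code, metrics) pairs

    def sample(self):
        # Parent = current best; inspirations = random picks for diversity.
        best = max(self.programs, key=lambda p: p[1]["score"])
        others = random.sample(self.programs, k=min(2, len(self.programs)))
        return best, others

    def add(self, code, metrics):
        self.programs.append((code, metrics))

def evolution_step(db, build_prompt, generate, evaluate):
    parent, inspirations = db.sample()           # 1. prompt sampling
    prompt = build_prompt(parent, inspirations)
    child = generate(prompt)                     # 2. LLM ensemble
    db.add(child, evaluate(child))               # 3. evaluation + storage

random.seed(42)
db = ToyDatabase()
db.add("x = 0", {"score": 0.1})
evolution_step(
    db,
    build_prompt=lambda parent, insp: f"Improve:\n{parent[0]}",
    generate=lambda prompt: "x = 1",             # stand-in for a real LLM call
    evaluate=lambda code: {"score": 0.5},
)
print(len(db.programs))
```

Every generated child is added back to the database, so later prompts can sample it as a parent or as inspiration.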
### Installation

To install natively, use:

```bash
git clone https://github.com/codelion/openevolve.git
cd openevolve
pip install -e .
```
OpenEvolve uses the OpenAI SDK, which means it works with any LLM provider that supports an OpenAI-compatible API:

1. **Set the API Key**: Export the `OPENAI_API_KEY` environment variable:

```bash
export OPENAI_API_KEY=your-api-key-here
```

2. **Using Alternative LLM Providers**:
- For providers other than OpenAI (e.g., Anthropic, Cohere, local models), update the `api_base` in your config.yaml:
```yaml
llm:
api_base: "https://your-provider-endpoint.com/v1"
```

3. **Maximum Flexibility with optillm**:
- For advanced routing, rate limiting, or using multiple providers, we recommend [optillm](https://github.com/codelion/optillm)
- optillm acts as a proxy that can route requests to different LLMs based on your rules
- Simply point `api_base` to your optillm instance:
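
   For example, if optillm is running locally (it listens on port 8000 by default; adjust the host and port to your deployment), the config fragment might look like:

   ```yaml
   llm:
     api_base: "http://localhost:8000/v1"
   ```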
### Quick Start

```python
import os

from openevolve import OpenEvolve

if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError("Set OPENAI_API_KEY before running")

# Initialize the system
evolve = OpenEvolve(
    initial_program_paths=["path/to/initial_program.py"],
    evaluation_file="path/to/evaluator.py",
    config_path="path/to/config.yaml"
)
```

To resume from a saved checkpoint, for example:

```bash
python openevolve-run.py path/to/initial_program.py path/to/evaluator.py \
  --checkpoint path/to/checkpoint_directory \
  --iterations 50
```

When resuming from a checkpoint:

- The system loads all previously evolved programs and their metrics
- Checkpoint numbering continues from where it left off (e.g., if loaded from checkpoint_50, the next checkpoint will be checkpoint_60)
- All evolution state is preserved (best programs, feature maps, archives, etc.)
```bash
python scripts/visualizer.py --path examples/function_minimization/openevolve_output
```

In the visualization UI, you can:

- see the branching of your program evolution in a network visualization, with node radius chosen by the program fitness (= the currently selected metric),
- see the parent-child relationship of nodes and click through them in the sidebar (use the yellow locator icon in the sidebar to center the node in the graph),
- select the metric of interest (with the available metric choices depending on your data set),
### Docker

You can also install and execute via Docker:

```bash
docker build -t openevolve .
docker run --rm -v $(pwd):/app --network="host" openevolve examples/function_minimization/initial_program.py examples/function_minimization/evaluator.py --config examples/function_minimization/config.yaml --iterations 1000
```

### Configuration

OpenEvolve is highly configurable with advanced options:
```yaml
# Example configuration showcasing advanced features
max_iterations: 1000
random_seed: 42                       # Full reproducibility by default

llm:
  # Advanced ensemble configuration
  models:
    - name: "gemini-2.0-flash-lite"
      weight: 0.7
    - name: "moa&readurls-gemini-2.0-flash"  # optillm test-time compute
      weight: 0.3
  temperature: 0.7

database:
  # MAP-Elites configuration
  population_size: 500
  num_islands: 5                      # Island-based evolution
  migration_interval: 20
  feature_dimensions: ["score", "complexity"]  # Quality-diversity features

evaluator:
  # Advanced evaluation features
  enable_artifacts: true              # Capture execution feedback
  cascade_evaluation: true            # Multi-stage testing
  use_llm_feedback: true              # AI-based code quality assessment

prompt:
  # Sophisticated prompt engineering
  num_top_programs: 3                 # Performance examples
  num_diverse_programs: 2             # Creative inspiration
  include_artifacts: true             # Execution feedback
```
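
The `num_top_programs` / `num_diverse_programs` split can be sketched as follows (illustrative only; the real sampler also weighs islands and feature cells):

```python
import random

def sample_for_prompt(programs, num_top=3, num_diverse=2, seed=42):
    """Pick elite programs for guidance plus random ones for inspiration."""
    rng = random.Random(seed)  # deterministic, mirroring random_seed: 42
    ranked = sorted(programs, key=lambda p: p["score"], reverse=True)
    top = ranked[:num_top]
    rest = ranked[num_top:]
    diverse = rng.sample(rest, k=min(num_diverse, len(rest)))
    return top, diverse

programs = [{"id": i, "score": i / 10} for i in range(10)]
top, diverse = sample_for_prompt(programs)
print([p["id"] for p in top])  # the three highest-scoring programs
```

Diverse picks are drawn only from the non-elite remainder, so the prompt never wastes both slots on the same programs.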

Sample configuration files are available in the `configs/` directory:

- `default_config.yaml`: Comprehensive configuration with all available options
- `island_config_example.yaml`: Advanced island-based evolution setup

Expand Down Expand Up @@ -317,18 +332,23 @@ return EvaluationResult(
```

The next generation prompt will include:

```markdown
## Last Execution Output

### Stderr

SyntaxError: invalid syntax (line 15)

### Traceback

...
```
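
A renderer that turns the artifacts dict into that prompt section could look roughly like this (the exact template OpenEvolve uses may differ):

```python
def render_artifacts(artifacts, max_bytes=4096):
    """Format execution artifacts as a markdown section for the next prompt."""
    lines = ["## Last Execution Output"]
    for key, value in artifacts.items():
        text = str(value)[:max_bytes]  # mirrors the max_artifact_bytes limit
        lines.append(f"\n### {key.capitalize()}\n\n{text}")
    return "\n".join(lines)

section = render_artifacts({
    "stderr": "SyntaxError: invalid syntax (line 15)",
    "traceback": "...",
})
print(section)
```

Truncating each artifact keeps a noisy stack trace from crowding the rest of the prompt context.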

## Example: LLM Feedback

An example of an LLM artifact side-channel is part of the default evaluation template, which ends with:

```markdown
Return your evaluation as a JSON object with the following format:
{{
Expand All @@ -338,6 +358,7 @@ Return your evaluation as a JSON object with the following format:
"reasoning": "[brief explanation of scores]"
}}
```

Non-float values (in this case the "reasoning" key of the JSON response generated by the evaluator LLM) are made available within the next generation prompt.
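
A sketch of how such a response might be split into numeric metrics and prompt-visible text (assumed handling; the actual parsing lives inside the evaluator):

```python
import json

def split_llm_feedback(response_text):
    """Floats become metrics; everything else flows back into the next prompt."""
    data = json.loads(response_text)
    metrics = {k: v for k, v in data.items() if isinstance(v, (int, float))}
    artifacts = {k: v for k, v in data.items() if k not in metrics}
    return metrics, artifacts

metrics, artifacts = split_llm_feedback(
    '{"readability": 0.9, "efficiency": 0.7, "reasoning": "clear structure"}'
)
print(metrics)    # numeric scores usable for selection
print(artifacts)  # e.g. the "reasoning" text, surfaced in the next prompt
```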

### Configuration
```yaml
evaluator:
  enable_artifacts: true

prompt:
include_artifacts: true
  max_artifact_bytes: 4096  # 4KB limit in prompts
artifact_security_filter: true
```

## Examples

See the `examples/` directory for complete examples of using OpenEvolve on various problems:
### Mathematical Optimization

#### [Function Minimization](examples/function_minimization/)

A comprehensive example demonstrating evolution from random search to sophisticated simulated annealing.

#### [Circle Packing](examples/circle_packing/)

Our implementation of the circle packing problem. For the n=26 case, we achieve state-of-the-art results matching published benchmarks.

Below is the optimal packing found by OpenEvolve after 800 iterations:

![circle-packing-result](https://github.com/user-attachments/assets/00100f9e-2ac3-445b-9266-0398b7174193)

### Advanced AI & LLM Integration

#### [Web Scraper with optillm](examples/web_scraper_optillm/)

Demonstrates integration with [optillm](https://github.com/codelion/optillm) for test-time compute optimization, including:

- **readurls plugin**: Automatic documentation fetching
- **Mixture of Agents (MoA)**: Multi-response synthesis for improved accuracy
- **Local model optimization**: Enhanced reasoning with smaller models

#### [LLM Prompt Optimization](examples/llm_prompt_optimazation/)

Evolving prompts themselves for better LLM performance, demonstrating self-improving AI systems.

### Systems & Performance Optimization

#### [MLX Metal Kernel Optimization](examples/mlx_metal_kernel_opt/)

Automated discovery of custom GPU kernels for Apple Silicon, achieving:

- **2-3x speedup** over baseline attention implementations
- **Hardware-aware optimizations** for unified memory architecture
- **Metal shader evolution** with numerical correctness validation

#### [Rust Adaptive Sort](examples/rust_adaptive_sort/)

Evolution of sorting algorithms that adapt to data patterns, showcasing OpenEvolve's language-agnostic capabilities.

### Scientific Computing & Discovery

#### [Symbolic Regression](examples/symbolic_regression/)

A comprehensive example demonstrating automated discovery of mathematical expressions from scientific datasets using the LLM-SRBench benchmark.

Key features:

- Automatic generation of initial programs from benchmark tasks
- Evolution from simple linear models to complex mathematical expressions
- Evaluation on physics, chemistry, biology, and material science datasets
- Competitive results compared to state-of-the-art symbolic regression methods

#### [R Robust Regression](examples/r_robust_regression/)

Developing robust regression methods resistant to outliers using R language support.

#### [Signal Processing](examples/signal_processing/)

Automated design of digital filters with superior performance characteristics.

### Web and Integration Examples

#### [Online Judge Programming](examples/online_judge_programming/)

Automated competitive programming solution generation with external evaluation systems.

#### [LM-Eval Integration](examples/lm_eval/)

Working with standard ML evaluation harnesses for automated benchmark improvement.

## Preparing Your Own Problems

If you use OpenEvolve in your research, please cite:

```bibtex
@software{openevolve,
url = {https://github.com/codelion/openevolve}
}
```


# configs/README.md

This directory contains configuration files for OpenEvolve with examples for different use cases.
## Configuration Files

### `default_config.yaml`

The main configuration file containing all available options with sensible defaults. This file includes:

- Complete documentation for all configuration parameters
- Default values for all settings
- **Island-based evolution parameters** for proper evolutionary diversity

Use this file as a template for your own configurations.

### `island_config_example.yaml`

A practical example configuration demonstrating proper island-based evolution setup. Shows:

- Recommended island settings for most use cases
- Balanced migration parameters
- Complete working configuration

### `island_examples.yaml`

Multiple example configurations for different scenarios:

- **Maximum Diversity**: Many islands, frequent migration
- **Focused Exploration**: Few islands, rare migration
- **Balanced Approach**: Default recommended settings
- **Quick Exploration**: Small-scale rapid testing
- **Large-Scale Evolution**: Complex optimization runs
The key new parameters for proper evolutionary diversity are:

```yaml
database:
  num_islands: 5           # Number of separate populations
  migration_interval: 50   # Migrate every N generations
  migration_rate: 0.1      # Fraction of top programs to migrate
```
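
The interplay of these three parameters can be sketched as follows (a toy ring-migration model, not OpenEvolve's actual implementation):

```python
import random

def maybe_migrate(islands, generation, migration_interval=50, migration_rate=0.1):
    """Every migration_interval generations, copy each island's top
    fraction of programs to the next island in a ring."""
    if generation % migration_interval != 0:
        return
    n = len(islands)
    for i, island in enumerate(islands):
        island.sort(key=lambda p: p["score"], reverse=True)
        k = max(1, int(len(island) * migration_rate))
        islands[(i + 1) % n].extend(island[:k])  # migrants keep their scores

random.seed(0)
islands = [[{"score": random.random()} for _ in range(20)] for _ in range(5)]
maybe_migrate(islands, generation=50)
print([len(isl) for isl in islands])  # each island gained 2 migrants
```

Between migrations the islands evolve in isolation, which is what maintains diversity; migration then spreads each island's best discoveries to its neighbor.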

### Parameter Guidelines
Then use with OpenEvolve:
```python
from openevolve import OpenEvolve
evolve = OpenEvolve(
    initial_program_paths=["program.py"],
    evaluation_file="evaluator.py",
config_path="my_config.yaml"
)
```