Welcome to Curie! This tutorial will walk you through the complete process of using Curie for automated machine learning (ML) experimentation. We'll cover everything from setup to running experiments and analyzing results.
First, install Docker if you haven't already. Then grant access to the Docker socket and verify the installation:

```bash
sudo chmod 666 /var/run/docker.sock
docker ps  # Verify Docker installation
```
Install Curie using pip:

```bash
pip install curie-ai
```
Verify the installation:

```bash
python -c "import curie; print(curie.__version__)"
```
We support a wide range of API key providers; please open an issue if you run into API setup problems.
```python
key_dict = {
    "MODEL": "claude-3-7-sonnet-20250219",
    "ANTHROPIC_API_KEY": "your-anthropic-key",
    # "MODEL": "openai/gpt-4o-mini",
    # "OPENAI_API_KEY": "your-openai-key",
}
```

Here we download the MNIST dataset to `/data` as an example:
```bash
sudo mkdir -p /data
wget https://raw.githubusercontent.com/fgnt/mnist/master/train-images-idx3-ubyte.gz
wget https://raw.githubusercontent.com/fgnt/mnist/master/train-labels-idx1-ubyte.gz
wget https://raw.githubusercontent.com/fgnt/mnist/master/t10k-images-idx3-ubyte.gz
wget https://raw.githubusercontent.com/fgnt/mnist/master/t10k-labels-idx1-ubyte.gz
sudo mv *.gz /data
sudo gunzip /data/*.gz
```
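To sanity-check the downloaded files, you can parse the raw IDX format directly. The sketch below is our own illustration (the `load_idx_images` helper is not part of Curie); it reads the big-endian IDX header and reshapes the pixel bytes into an image array:

```python
import struct

import numpy as np


def load_idx_images(path):
    """Load a raw IDX image file (e.g. /data/train-images-idx3-ubyte) into a numpy array."""
    with open(path, "rb") as f:
        # Big-endian header: magic number (2051 for images), image count, rows, cols.
        magic, n, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError(f"not an IDX image file (magic={magic})")
        pixels = np.frombuffer(f.read(), dtype=np.uint8)
    return pixels.reshape(n, rows, cols)
```

For the MNIST training images above, `load_idx_images("/data/train-images-idx3-ubyte")` should return an array of shape `(60000, 28, 28)`.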
Generally, the question input is divided into two parts. The first part is your question itself: the idea, the task, an introduction to the network structure, and other general requirements. The second part is the code instructions: descriptions of what the important files do, the expected output format, and other implementation details. The code instruction part is enclosed by an opening tag <CODE_INSTRUCTION> and a closing tag </CODE_INSTRUCTION>. Please refer to the sample question in sample_question.txt under the docs folder for how to separate the question and code instructions properly.
Provide as many details as you can in your question.
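As a hypothetical illustration of the split described above (the flags and file names inside the tags are made up for this example, not required by Curie):

```python
# A question string with the general request first, and concrete code
# instructions enclosed in <CODE_INSTRUCTION> tags at the end.
question = """\
Among Logistic Regression, MLP, and CNN, which model achieves the highest
prediction accuracy on my MNIST dataset? Run each model with 3 random seeds.

<CODE_INSTRUCTION>
- `train.py` accepts `--model {logreg,mlp,cnn}` and `--seed N`.
- Write the final accuracy of each run as JSON to `results/<model>_<seed>.json`.
</CODE_INSTRUCTION>
"""
```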
```python
import curie

result = curie.experiment(
    api_keys=key_dict,
    question="What is the best model among Logistic Regression, MLP, and CNN for my MNIST dataset?",
    dataset_dir="/data",  # absolute path to the dataset
)
```
Work with your starter code:

- Starter code: You can prepare starter code for Curie to build on. This is especially important if your dataset needs a specialized data loader.

```
/abs/path/starter_code/
├── train.py         # Python script for training
└── description.md   # instructions on how to run your experiments
                     # name it exactly `description.md` or `README.md`
```

```python
import curie

result = curie.experiment(
    api_keys=key_dict,
    question="Among Logistic Regression, MLP, and CNN, which model achieves the highest prediction accuracy on my MNIST dataset?",
    dataset_dir="/data",
    codebase_dir="/abs/path/starter_code/",  # Change this to the path of your starter code
    code_instructions="`data_loader.py` will assist you in loading the dataset.",
    max_global_steps=50,  # control your compute budget
)
```
Provide your research paper: To give Curie more context, you can mention the relevant paper (txt, pdf, ...) in the question. Please put the paper in the same directory as your starter code.
If you are using the AWS Bedrock API, please grant permission to the model `amazon.titan-embed-text-v2:0`.

```
/abs/path/starter_code/
├── train.py    # Python script for training
└── paper.pdf   # Research paper detailing the approach
```
```python
import curie

result = curie.experiment(
    api_keys=key_dict,
    question="Refer to the evaluation setup in `paper.pdf`. Among Logistic Regression, MLP, and CNN, which model achieves the highest prediction accuracy on my MNIST dataset?",
    dataset_dir="/data",
    codebase_dir="/abs/path/starter_code",  # Change this to the path of your starter code
    max_global_steps=50,  # control your compute budget
)
```
Provide your own complex environment: You can supply an environment requirements file or a pre-configured micromamba/miniconda environment. This lets you pin the exact package versions and dependencies your research needs, and saves Curie the time of figuring out the dependencies by herself.

- Option 1: Put your environment requirements file `requirements.txt` under the `codebase_dir`:

```
/abs/path/starter_code/
├── train.py           # Python script for training
└── requirements.txt   # containing `package==version` lines
```

Or you can specify it separately:

```python
result = curie.experiment(
    api_keys=key_dict,
    question="How does the choice of sorting algorithm impact runtime performance across different input distributions?",
    env_requirements="/abs/path/requirements.txt",
)
```
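For illustration, a minimal `requirements.txt` for the MNIST model comparison above might look like the fragment below (the package choices and version numbers are assumptions; pin whatever your own code actually uses):

```text
numpy==1.26.4
scikit-learn==1.4.2
torch==2.3.0
```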
- Option 2: Pre-configure your environment, name it exactly `venv`, and put it under your starter_code:

```
starter_code/
├── venv/   # exactly named `venv`
└── ...     # the rest of your codebase
```
Generate an experiment report in the middle of Curie's experimentation process:
If you'd like to monitor progress partway through Curie's experimentation, or if the experiment wasn't run end-to-end, you can still generate a report from the available data:

```python
curie.generate_report(
    api_keys=key_dict,
    log_dir="/abs/path/logs/research_20250605231023_iter1/",
    workspace_dir="/abs/path/workspace/",
)
```
Customize the agent to your workload: each agent and experiment stage is coupled with a system prompt, which you can fine-tune to help Curie understand your context better.
```python
import curie

task_config = {
    "supervisor_system_prompt_filename": "/home/ubuntu/prompt.txt",
    # "control_worker_system_prompt_filename": "/path/to/your/new/prompt",
    # "patcher_system_prompt_filename": "/path/to/your/new/prompt",
    # "llm_verifier_system_prompt_filename": "/path/to/your/new/prompt",
    # "coding_prompt_filename": "/path/to/your/new/prompt",
    # "worker_system_prompt_filename": "/path/to/your/new/prompt",
}

result = curie.experiment(
    api_keys=key_dict,
    question="Among Logistic Regression, MLP, and CNN, which model achieves the highest prediction accuracy on my MNIST dataset?",
    dataset_dir="/data",
    max_global_steps=50,  # control your compute budget
    task_config=task_config,
)
```
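Before running the call above, the prompt file referenced in `task_config` must exist on disk. A minimal sketch of preparing one (the path and prompt wording here are our own assumptions, not Curie defaults):

```python
from pathlib import Path

# Hypothetical custom supervisor prompt; tailor the text to your workload
# and point task_config["supervisor_system_prompt_filename"] at this path.
prompt = """\
You are supervising ML experiments on handwritten-digit classification.
Prefer simple baselines first, and report accuracy for each model variant.
"""

prompt_path = Path("/tmp/supervisor_prompt.txt")
prompt_path.write_text(prompt)
```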