SakanaAI · Shivamkak19 · Jan 20, 2025 · Jan 21, 2025 · Jan 21, 2025 · Jan 25, 2025
diff --git a/.gitmodules b/.gitmodules
@@ -0,0 +1,3 @@
+[submodule "templates/amp/DeepMimic"]
+	path = templates/amp/DeepMimic
+	url = https://github.com/Shivamkak19/DeepMimic.git
diff --git a/templates/amp/DeepMimic b/templates/amp/DeepMimic
diff --git a/templates/amp/README.md b/templates/amp/README.md
@@ -0,0 +1,107 @@
+# AMP Template Setup Guide
+
+## Overview
+This guide briefly introduces the AMP algorithm and gives step-by-step instructions for setting up and running the AI Scientist with the AMP template and its DeepMimic integration.
+
+## What Is AMP?
+Adversarial Motion Priors (AMP) ```(Peng et al., 2021)``` is an unsupervised reinforcement learning approach to character animation that learns from unstructured motion data to produce natural behaviors in simulated characters.
+
+Paper Website available here:
+```
+https://xbpeng.github.io/projects/AMP/index.html
+```
+
+The paper was released with the ```DeepMimic``` library as a framework for training AMP agents. This template for the AI-Scientist allows users to experiment with modifications to the base AMP algorithm within the DeepMimic library.
+
+```DeepMimic``` requires a somewhat complicated build process, so I wrote a bash script ```DeepMimic/auto_setup.sh``` that handles the entire setup process.
+
+The ```experiment.py``` file implements a simple training run on an AMP agent for 3 different motion files:
+```
+"DeepMimic/data/motions/humanoid3d_walk.txt"
+"DeepMimic/data/motions/humanoid3d_jog.txt"
+"DeepMimic/data/motions/humanoid3d_run.txt"
+```
+
+Another popular (and more recent) option for experimenting with AMP is the [ProtoMotions](https://github.com/NVlabs/ProtoMotions) library, which uses NVIDIA's IsaacGym as its backbone. For this reason, I chose DeepMimic as a more lightweight alternative that still lets users test and evaluate experimental conditions on the base AMP algorithm.
+
+Please follow the section below for specific setup instructions, and see ```templates/amp/examples/``` for example paper generations. Note that the Semantic Scholar API was not used for any of these generations, as I am on the waiting list for an API key.
+
+I generated the given example papers on a fresh, out-of-the-box A100 (40 GB SXM4) instance on Lambda Labs by following the instructions in [Step-by-Step Setup Instructions](#setup-instructions).
+
+## Prerequisites
+Before beginning the setup process, ensure that Miniconda3 is installed on your system at ```/home/ubuntu/miniconda3```. If it is not already installed, the ```DeepMimic/auto_setup.sh``` script installs it automatically. The install location matters because this path is used to build the Python wrapper of DeepMimic in
+```DeepMimic/DeepMimicCore/Makefile.auto```.
+
+<a id="setup-instructions"></a>
+## Step-by-Step Setup Instructions
+
+
+### Global Environment Setup
+```bash
+# Create and activate a new conda environment
+conda create -n ai_scientist python=3.11
+conda activate ai_scientist
+
+# Install LaTeX dependencies
+sudo apt-get install texlive-full
+
+# Install required Python packages from AI-Scientist root
+pip install -r requirements.txt
+```
+
+### DeepMimic Configuration
+```bash
+# Initialize and update the DeepMimic submodule
+git submodule update --init
+
+# Navigate to DeepMimic directory
+cd templates/amp/DeepMimic
+
+# Build the Python wrapper for DeepMimicCore
+bash auto_setup.sh
+
+# Make sure Conda was exported to PATH if installed through auto_setup.sh
+export PATH="/home/ubuntu/miniconda3/bin:$PATH"
+echo 'export PATH="/home/ubuntu/miniconda3/bin:$PATH"' >> ~/.bashrc
+source ~/.bashrc
+```
+
+
+### Running Experiments
+```bash
+# Move to the AMP template directory
+cd ../
+
+# Execute the experiment
+python experiment.py
+
+# Generate visualization plots
+python plot.py
+```
+
+### Launching AI Scientist
+```bash
+# Go to AI-Scientist Root Directory
+cd ../../
+
+# Ensure you're in the ai_scientist environment
+conda activate ai_scientist
+
+# Launch the AI Scientist with specified parameters
+python launch_scientist.py --model "gpt-4o-2024-05-13" --experiment amp --num-ideas 2
+
+python launch_scientist.py --model "claude-3-5-sonnet-20241022" --experiment amp --num-ideas 2
+```
+
+## Relevant Directory Subset
+```
+AI-Scientist/
+├── launch_scientist.py
+├── requirements.txt
+├── templates/
+│   └── amp/
+│       ├── DeepMimic/
+│       │   └── auto_setup.sh
+│       ├── experiment.py
+│       └── plot.py
+```
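The directory subset listed at the end of the README can double as a quick pre-flight check before launching the AI Scientist. The following is a minimal sketch, not part of this PR: the function name `find_missing_paths` and the constant `EXPECTED_PATHS` are hypothetical, and the path list is copied from the README tree only, so it does not cover everything `auto_setup.sh` builds.

```python
from pathlib import Path

# Paths taken from the "Relevant Directory Subset" tree in the README,
# relative to the AI-Scientist root. The list is an assumption about what
# a minimal working checkout contains, not an exhaustive manifest.
EXPECTED_PATHS = [
    "launch_scientist.py",
    "requirements.txt",
    "templates/amp/DeepMimic/auto_setup.sh",
    "templates/amp/experiment.py",
    "templates/amp/plot.py",
]


def find_missing_paths(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing_paths(".")
    if missing:
        print("Missing files (was the submodule initialized?):")
        for path in missing:
            print("  " + path)
    else:
        print("All expected AMP template files are present.")
```

Run from the AI-Scientist root, a missing `templates/amp/DeepMimic/auto_setup.sh` would most likely mean that `git submodule update --init` has not been run yet.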
diff --git a/templates/amp/environment_handler.py b/templates/amp/environment_handler.py
@@ -0,0 +1,27 @@
+import subprocess
+import os
+import sys
+import atexit
+
+def run_in_conda_env(env_name):
+    """
+    Re-run the current script in the specified conda environment
+    """
+    conda_path = os.environ.get('CONDA_EXE', 'conda')
+    current_env = os.environ.get('CONDA_DEFAULT_ENV')
+    print("Current environment:", current_env)
+
+    if current_env != env_name:
+        script_path = os.path.abspath(sys.argv[0])
+        # Build an argument list (no shell) so arguments containing spaces or quotes survive intact
+        cmd = [conda_path, "run", "-n", env_name, "python", script_path, *sys.argv[1:]]
+
+        print(f"Switching to {env_name} environment...")
+        try:
+            process = subprocess.run(cmd, check=True)
+            sys.exit(process.returncode)
+        except subprocess.CalledProcessError as e:
+            print(f"Error running script in {env_name}: {str(e)}")
+            sys.exit(1)
+        finally:
+            print(f"Switching back to {current_env} environment...")
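The two decisions inside `run_in_conda_env` (whether to switch environments, and which command to run) can be separated into pure functions so they are testable without conda installed. This is a sketch under assumptions: the helper names `needs_switch` and `build_conda_run_command` and the environment name `amp_env` are hypothetical and do not appear in this PR. Only the `conda run -n <env>` invocation and the `CONDA_EXE` / `CONDA_DEFAULT_ENV` variables are taken from the file above.

```python
import os


def needs_switch(env_name, environ=None):
    """True when the active conda environment differs from `env_name`."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") != env_name


def build_conda_run_command(env_name, script_path, script_args, conda_path=None):
    """Build an argument list for `conda run`.

    Returning a list (rather than one shell string) means arguments that
    contain spaces or quotes are passed through unchanged.
    """
    conda_path = conda_path or os.environ.get("CONDA_EXE", "conda")
    return [conda_path, "run", "-n", env_name, "python", script_path, *script_args]


# Hypothetical values, for illustration only.
cmd = build_conda_run_command(
    "amp_env", "/tmp/experiment.py", ["--out_dir", "run 0"], conda_path="conda"
)
print(cmd)
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` without `shell=True`. Note that `"run 0"` stays a single argument, which a space-joined command string would split in two.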
diff --git a/templates/amp/examples/generation1/graph.png b/templates/amp/examples/generation1/graph.png
diff --git a/templates/amp/examples/generation1/ideas.json b/templates/amp/examples/generation1/ideas.json
@@ -0,0 +1,47 @@
+[
+    {
+        "Name": "adaptive_reward_weighting",
+        "Title": "Dynamic Loss Balancing in AMP: Adaptive Reward Weighting for Improved Motion Imitation",
+        "Experiment": "Implement an adaptive reward weighting system that dynamically adjusts the balance between task and style rewards based on their relative magnitudes during training. The weights should be updated using a moving average of loss ratios. Compare pose error and task performance against the baseline AMP implementation.",
+        "Interestingness": 7,
+        "Feasibility": 8,
+        "Novelty": 6,
+        "novel": true
+    },
+    {
+        "Name": "hierarchical_discriminator",
+        "Title": "Multi-Scale Motion Assessment: Hierarchical Discriminators for Natural Movement Generation",
+        "Experiment": "Modify the AMP discriminator to use a hierarchical architecture with both local and global motion discriminators. The local discriminator focuses on frame-level features while the global discriminator assesses longer temporal sequences. Compare the quality of generated motions against the baseline single discriminator approach.",
+        "Interestingness": 8,
+        "Feasibility": 7,
+        "Novelty": 7,
+        "novel": true
+    },
+    {
+        "Name": "dataset_specialization",
+        "Title": "Targeted Motion Priors: Investigating Dataset Specialization in AMP",
+        "Experiment": "Compare the performance of AMP when trained on specialized motion datasets (e.g., HumanEva focused on basic locomotion) versus general motion capture collections (e.g., full AMASS dataset). Evaluate pose error and motion naturalness for specific tasks like walking.",
+        "Interestingness": 6,
+        "Feasibility": 9,
+        "Novelty": 5,
+        "novel": true
+    },
+    {
+        "Name": "momentum_features",
+        "Title": "Physics-Aware Motion Priors: Velocity Feature Enhancement for Natural Movement Synthesis",
+        "Experiment": "Modify calc_pose_error() to compute joint velocities as frame-to-frame differences. Add velocity_error component weighted at 0.3 relative to existing pose error terms. Extend discriminator input to include joint velocity features. Compare against baseline using both pose error and new velocity error metric. Evaluate on the full range of motions, with particular focus on transitions between different movement types.",
+        "Interestingness": 8,
+        "Feasibility": 8,
+        "Novelty": 8,
+        "novel": true
+    },
+    {
+        "Name": "progressive_discriminator",
+        "Title": "Progressive Growing of Discriminators for Refined Motion Imitation",
+        "Experiment": "Implement two-stage discriminator growth: Start with 2 layers of 512 units, then expand to full 1024 units when disc_reward_mean stabilizes (variance < threshold over 1000 steps). Modify AMPAgent to track reward stability and handle architecture transition. Compare training curves, final pose error, and motion quality against baseline. Analyze impact on early training stability and final motion precision.",
+        "Interestingness": 8,
+        "Feasibility": 8,
+        "Novelty": 8,
+        "novel": true
+    }
+]
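For readers who want to compare entries in an `ideas.json` file, the sketch below loads the records and orders them by score. The ranking rule (sum of `Interestingness`, `Feasibility`, and `Novelty`) is an assumption made for illustration and is not claimed to be how the AI Scientist selects ideas. The function name `rank_ideas` is hypothetical, and the sample is abridged from the file above with the `Title` and `Experiment` fields omitted.

```python
import json


def rank_ideas(ideas, novel_only=True):
    """Sort ideas by the sum of their three self-assessed scores, best first."""
    pool = [i for i in ideas if i["novel"]] if novel_only else list(ideas)
    return sorted(
        pool,
        key=lambda i: i["Interestingness"] + i["Feasibility"] + i["Novelty"],
        reverse=True,
    )


# Two entries abridged from generation1/ideas.json.
sample = json.loads("""
[
  {"Name": "adaptive_reward_weighting", "Interestingness": 7, "Feasibility": 8, "Novelty": 6, "novel": true},
  {"Name": "momentum_features", "Interestingness": 8, "Feasibility": 8, "Novelty": 8, "novel": true}
]
""")

for idea in rank_ideas(sample):
    print(idea["Name"])
```

With `novel_only=True`, entries whose `novel` flag is `false` are dropped before ranking.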
diff --git a/templates/amp/examples/generation1/paper.pdf b/templates/amp/examples/generation1/paper.pdf
diff --git a/templates/amp/examples/generation1/review.txt b/templates/amp/examples/generation1/review.txt
@@ -0,0 +1,5 @@
+Processing report1.pdf...
+[========================================] (7/7)
+4
+Reject
+['The method relies on manually tuned motion-specific bounds and hyperparameters (temperature, decay rate), which may introduce biases and limit its applicability to other motion types or domains.', 'Certain sections of the paper, particularly the experimental setup and the autoencoder aggregator, lack clarity, making it difficult to fully evaluate the robustness and reproducibility of the results.', 'The paper does not adequately address potential negative societal impacts or provide detailed qualitative analyses to support the quantitative results.', 'The novelty of the proposed approach may be limited as it builds on existing dynamic weighting techniques in a somewhat incremental manner.', 'The experimental results may not be generalizable due to the limited scope of motion types tested (only walking, jogging, and running).', 'The paper lacks a thorough comparison with a wider range of state-of-the-art methods in the field, limiting the understanding of its relative performance.']
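The example `review.txt` files share a five-line layout: a processing line, a progress bar, a numeric score, a decision, and a Python list literal of weaknesses. A minimal parsing sketch follows, assuming that layout holds for every review file. The function name `parse_review` is hypothetical, reading the number as the overall score is an inference from the examples, and the sample is abridged to a single weakness.

```python
import ast


def parse_review(text):
    """Parse a review.txt that follows the five-line layout of the examples."""
    lines = [line for line in text.splitlines() if line.strip()]
    # The last three non-empty lines are: score, decision, weaknesses.
    score, decision, weaknesses = lines[-3:]
    return {
        "score": int(score),
        "decision": decision.strip(),
        # The weaknesses line is a Python list literal; literal_eval parses
        # it without executing arbitrary code.
        "weaknesses": ast.literal_eval(weaknesses),
    }


# Abridged from generation1/review.txt (one weakness kept).
sample = """Processing report1.pdf...
[========================================] (7/7)
4
Reject
['The experimental results may not be generalizable due to the limited scope of motion types tested (only walking, jogging, and running).']
"""

review = parse_review(sample)
print(review["score"], review["decision"], len(review["weaknesses"]))
```

Parsing from the end of the file keeps the sketch independent of how many progress lines precede the score.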
diff --git a/templates/amp/examples/generation2/graph.png b/templates/amp/examples/generation2/graph.png
diff --git a/templates/amp/examples/generation2/ideas.json b/templates/amp/examples/generation2/ideas.json
@@ -0,0 +1,47 @@
+[
+    {
+        "Name": "adaptive_reward_weighting",
+        "Title": "Dynamic Loss Balancing in AMP: Adaptive Reward Weighting for Improved Motion Imitation",
+        "Experiment": "Implement an adaptive reward weighting system that dynamically adjusts the balance between task and style rewards based on their relative magnitudes during training. The weights should be updated using a moving average of loss ratios. Compare pose error and task performance against the baseline AMP implementation.",
+        "Interestingness": 7,
+        "Feasibility": 8,
+        "Novelty": 6,
+        "novel": true
+    },
+    {
+        "Name": "hierarchical_discriminator",
+        "Title": "Multi-Scale Motion Assessment: Hierarchical Discriminators for Natural Movement Generation",
+        "Experiment": "Modify the AMP discriminator to use a hierarchical architecture with both local and global motion discriminators. The local discriminator focuses on frame-level features while the global discriminator assesses longer temporal sequences. Compare the quality of generated motions against the baseline single discriminator approach.",
+        "Interestingness": 8,
+        "Feasibility": 7,
+        "Novelty": 7,
+        "novel": true
+    },
+    {
+        "Name": "dataset_specialization",
+        "Title": "Targeted Motion Priors: Investigating Dataset Specialization in AMP",
+        "Experiment": "Compare the performance of AMP when trained on specialized motion datasets (e.g., HumanEva focused on basic locomotion) versus general motion capture collections (e.g., full AMASS dataset). Evaluate pose error and motion naturalness for specific tasks like walking.",
+        "Interestingness": 6,
+        "Feasibility": 9,
+        "Novelty": 5,
+        "novel": true
+    },
+    {
+        "Name": "temporal_motion_discriminator",
+        "Title": "Learning Natural Motion Dynamics: Temporal Discriminators for Physics-Based Character Animation",
+        "Experiment": "Implement a simplified temporal discriminator using a fixed 15-frame window. Modifications: 1) Create a sequence buffer that maintains recent state history 2) Add two 1D conv layers (kernel size 5, 64 channels) followed by the existing fully connected layers 3) Compute temporal features: joint velocities and accelerations over the window. Compare against baseline using: a) average joint velocity consistency b) pose error c) discriminator loss convergence. Focus on walking motion as primary test case.",
+        "Interestingness": 9,
+        "Feasibility": 8,
+        "Novelty": 8,
+        "novel": true
+    },
+    {
+        "Name": "curriculum_adversarial_training",
+        "Title": "Balanced Curriculum Learning for Motion Imitation Networks",
+        "Experiment": "Modify AMPAgent to implement dynamic learning rate adjustment: 1) Track discriminator accuracy using average output probabilities for real/fake samples 2) When discriminator accuracy exceeds 0.8, reduce policy learning rate by 50% and increase discriminator learning rate by 50% 3) When discriminator accuracy drops below 0.6, do the opposite 4) Compare against baseline using: time to convergence, final pose error, and learning stability measured by reward variance. Test on walk, run, and jump motions to verify generalization",
+        "Interestingness": 8,
+        "Feasibility": 9,
+        "Novelty": 7,
+        "novel": true
+    }
+]
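The `curriculum_adversarial_training` entry describes a threshold rule on discriminator accuracy. The sketch below spells that rule out as a pure function. It is an illustration of the idea text only, not code from DeepMimic or from any generated paper, and the name `adjust_learning_rates` is hypothetical. "Do the opposite" is read here as raising the policy learning rate by 50% and lowering the discriminator learning rate by 50%, which is an assumption.

```python
def adjust_learning_rates(disc_accuracy, policy_lr, disc_lr,
                          high=0.8, low=0.6, factor=0.5):
    """Rebalance policy and discriminator learning rates from discriminator accuracy."""
    if disc_accuracy > high:
        # As stated in the idea: policy lr reduced by 50%, discriminator lr raised by 50%.
        return policy_lr * (1 - factor), disc_lr * (1 + factor)
    if disc_accuracy < low:
        # Assumed meaning of "do the opposite": policy lr +50%, discriminator lr -50%.
        return policy_lr * (1 + factor), disc_lr * (1 - factor)
    # Between the two thresholds the rates are left unchanged.
    return policy_lr, disc_lr


# Round numbers are used so the effect is easy to read; real rates would be much smaller.
print(adjust_learning_rates(0.9, 2.0, 4.0))
print(adjust_learning_rates(0.5, 2.0, 4.0))
print(adjust_learning_rates(0.7, 2.0, 4.0))
```

Because the rule is multiplicative, applying it on every step while the accuracy stays above 0.8 would shrink the policy rate geometrically, so a real implementation would probably need bounds or a cooldown between adjustments.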
diff --git a/templates/amp/examples/generation2/paper.pdf b/templates/amp/examples/generation2/paper.pdf
diff --git a/templates/amp/examples/generation2/review.txt b/templates/amp/examples/generation2/review.txt
@@ -0,0 +1,5 @@
+Processing report2.pdf...
+[========================================] (7/7)
+5
+Reject
+['The methodology lacks detailed explanations, particularly around the implementation specifics and dynamic weight adaptation constraints.', 'The evaluation is limited to three motion types and does not explore other motion styles or more complex scenarios.', 'The approach relies heavily on empirical tuning of parameters, which may limit its generalizability.', 'The clarity of the presentation could be improved, particularly in the description of the reward tracking and weight adjustment mechanisms.']
diff --git a/templates/amp/examples/generation3/graph.png b/templates/amp/examples/generation3/graph.png
diff --git a/templates/amp/examples/generation3/ideas.json b/templates/amp/examples/generation3/ideas.json
@@ -0,0 +1,29 @@
+[
+    {
+        "Name": "adaptive_reward_weighting",
+        "Title": "Dynamic Loss Balancing in AMP: Adaptive Reward Weighting for Improved Motion Imitation",
+        "Experiment": "Implement an adaptive reward weighting system that dynamically adjusts the balance between task and style rewards based on their relative magnitudes during training. The weights should be updated using a moving average of loss ratios. Compare pose error and task performance against the baseline AMP implementation.",
+        "Interestingness": 7,
+        "Feasibility": 8,
+        "Novelty": 6,
+        "novel": false
+    },
+    {
+        "Name": "hierarchical_discriminator",
+        "Title": "Multi-Scale Motion Assessment: Hierarchical Discriminators for Natural Movement Generation",
+        "Experiment": "Modify the AMP discriminator to use a hierarchical architecture with both local and global motion discriminators. The local discriminator focuses on frame-level features while the global discriminator assesses longer temporal sequences. Compare the quality of generated motions against the baseline single discriminator approach.",
+        "Interestingness": 8,
+        "Feasibility": 7,
+        "Novelty": 7,
+        "novel": false
+    },
+    {
+        "Name": "dataset_specialization",
+        "Title": "Targeted Motion Priors: Investigating Dataset Specialization in AMP",
+        "Experiment": "Compare the performance of AMP when trained on specialized motion datasets (e.g., HumanEva focused on basic locomotion) versus general motion capture collections (e.g., full AMASS dataset). Evaluate pose error and motion naturalness for specific tasks like walking.",
+        "Interestingness": 6,
+        "Feasibility": 9,
+        "Novelty": 5,
+        "novel": false
+    }
+]
diff --git a/templates/amp/examples/generation3/paper.pdf b/templates/amp/examples/generation3/paper.pdf
diff --git a/templates/amp/examples/generation3/review.txt b/templates/amp/examples/generation3/review.txt
@@ -0,0 +1,5 @@
+Processing report3.pdf...
+[========================================] (8/8)
+5
+Reject
+['The paper lacks sufficient details in the methodology section, particularly concerning the training process and the architecture of the discriminators.', 'The ablation studies, while useful, could be expanded to explore more variations and provide a deeper understanding of the proposed method.', 'The computational overhead and the need for manual tuning of window sizes are significant limitations that are not comprehensively addressed.', 'The novelty of hierarchical discriminators in adversarial learning is limited; the paper does not sufficiently differentiate its contributions from existing work.', 'The choice of window sizes seems arbitrary, with no strong theoretical justification provided.', 'The paper is dense and difficult to follow in parts, especially in the description of the hierarchical discriminator architecture and the training process.', 'Jogging performance remains problematic, and the paper does not sufficiently address this issue.']
diff --git a/templates/amp/examples/generation4/graph.png b/templates/amp/examples/generation4/graph.png
diff --git a/templates/amp/examples/generation4/ideas.json b/templates/amp/examples/generation4/ideas.json
@@ -0,0 +1,29 @@
+[
+    {
+        "Name": "adaptive_reward_weighting",
+        "Title": "Dynamic Loss Balancing in AMP: Adaptive Reward Weighting for Improved Motion Imitation",
+        "Experiment": "Implement an adaptive reward weighting system that dynamically adjusts the balance between task and style rewards based on their relative magnitudes during training. The weights should be updated using a moving average of loss ratios. Compare pose error and task performance against the baseline AMP implementation.",
+        "Interestingness": 7,
+        "Feasibility": 8,
+        "Novelty": 6,
+        "novel": false
+    },
+    {
+        "Name": "hierarchical_discriminator",
+        "Title": "Multi-Scale Motion Assessment: Hierarchical Discriminators for Natural Movement Generation",
+        "Experiment": "Modify the AMP discriminator to use a hierarchical architecture with both local and global motion discriminators. The local discriminator focuses on frame-level features while the global discriminator assesses longer temporal sequences. Compare the quality of generated motions against the baseline single discriminator approach.",
+        "Interestingness": 8,
+        "Feasibility": 7,
+        "Novelty": 7,
+        "novel": false
+    },
+    {
+        "Name": "dataset_specialization",
+        "Title": "Targeted Motion Priors: Investigating Dataset Specialization in AMP",
+        "Experiment": "Compare the performance of AMP when trained on specialized motion datasets (e.g., HumanEva focused on basic locomotion) versus general motion capture collections (e.g., full AMASS dataset). Evaluate pose error and motion naturalness for specific tasks like walking.",
+        "Interestingness": 6,
+        "Feasibility": 9,
+        "Novelty": 5,
+        "novel": false
+    }
+]
diff --git a/templates/amp/examples/generation4/paper.pdf b/templates/amp/examples/generation4/paper.pdf
diff --git a/templates/amp/examples/generation4/review.txt b/templates/amp/examples/generation4/review.txt
@@ -0,0 +1,5 @@
+Processing report4.pdf...
+[========================================] (7/7)
+4
+Reject
+['The scope is somewhat narrow, focusing primarily on locomotion tasks. It would be beneficial to extend the study to other types of motions or tasks.', 'The methodology section lacks sufficient detail, particularly in the description of the autoencoder aggregator and its role in the framework.', 'The evaluation metrics and their links to the research questions need to be more clearly defined.', 'The paper lacks a comprehensive ablation study to thoroughly investigate the impact of different components of the proposed approach.', 'The experiments are conducted within a specific framework and setup, which may limit the generalizability of the findings.', 'The training duration (10,000 steps) might not be sufficient to observe long-term effects.', 'More theoretical insights into why certain motions interfere with each other would strengthen the claims.']
diff --git a/templates/amp/examples/generation5/graph.png b/templates/amp/examples/generation5/graph.png