Reverse-Engineering Memory in DreamerV3: From Sparse Representations to Functional Circuits

Jan Sobotka, Auke Ijspeert, Guillaume Bellegarda


This repository contains the code for our paper "Reverse-Engineering Memory in DreamerV3: From Sparse Representations to Functional Circuits" (NeurIPS 2025, Spotlight at the Mech Interp Workshop). The project focuses on the analysis of memory representations in recurrent reinforcement learning agents, DreamerV3 in particular.

Index

  • Environment setup
  • Directory structure
  • Running the code
  • Analysis
  • Acknowledgements

Environment setup

Set up an environment from the environment.yaml file and activate it (Miniconda):

conda env create -f environment.yaml
conda activate rlm

Install the local rlm package:

pip install -e .

If you want to run the agent in the Memory Maze environment, also install the modified memory_maze package from the pkgs directory (a modified version of the original environment):

pip install -e pkgs/memory_maze

Create a .env file in the root directory according to the .env.example file, and make sure to set the paths to existing directories where the data and checkpoints will be saved (DATA_DIR, RUNS_DIR). You might need to load the environment variable(s) from the .env file manually in the terminal:

export $(cat .env | xargs)
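
For reference, a minimal .env could look like the sketch below. It only covers the two variables mentioned above and uses placeholder paths; .env.example is the authoritative list of required variables.

# Example values only -- point these at directories that exist on your machine.
# DATA_DIR holds the data, RUNS_DIR holds run outputs such as checkpoints.
DATA_DIR=/path/to/data
RUNS_DIR=/path/to/runs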

Directory structure

.
├── rlm
│   ├── agents
│   │   └── dreamer  # Modified code of the DreamerV3 agent
│   ├── envs
│   │   ├── wrappers.py  # Environment wrappers
│   │   ├── minigrid.py  # MiniGrid-Memory environment
│   │   ├── switching_minigrid_memory.py  # MiniGrid-Switching-Memory environment
│   │   ├── four_room_minigrid_memory.py  # MiniGrid-Four-Room-Memory environment
│   │   └── memory_maze.py  # Memory Maze environment wrapper
│   ├── analysis  # Code used in the initial exploration of Recurrent MFRL
│   │   ├── mask
│   │   │   ├── masked_model.py  # Binary-weight-mask model wrapper
│   │   │   └── trainer.py  # Trainer for the binary weight mask
│   │   ├── notebooks  # Jupyter notebooks with analysis
│   │   ├── intervention.py  # Hidden-state intervention code
│   │   ├── in_out_tracker.py  # Wrapper for input/output tracking of PyTorch models
│   │   └── utils.py  # Utility functions for analysis
│   ├── configs  # Configuration files for Hydra
│   │   ├── defaults.yaml  # Main configuration file
│   │   └── ...  # Other configuration files
│   ├── utils.py  # Utility functions for the project
│   └── main.py  # Main script to run the training
├── pkgs
│   └── memory_maze  # Modified memory_maze package
├── setup.py
├── environment.yaml
├── .env.example
├── .gitignore
├── LICENSE
└── README.md

Running the code

To train the DreamerV3 agent on the MiniGrid environment and save the results to the specified directory, run:

python rlm/main.py

Hydra is used for configuration management; all configuration files live in rlm/configs.

The training configuration rlm/configs/defaults.yaml composes several other configuration files. Please see the Hydra documentation and make changes to the configuration files as needed; individual options can also be overridden from the command line, as sketched below.
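
The following is a hypothetical example of Hydra's key=value command-line overrides. The option names are illustrative and not taken from the repository's configuration files; check rlm/configs/defaults.yaml for the actual keys.

# Illustrative override names (seed, logdir) -- substitute keys that exist in defaults.yaml.
python rlm/main.py seed=1 logdir=/path/to/runs/my_experiment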

Analysis

All analysis code is in the rlm/analysis directory. The analysis was performed in Jupyter notebooks, which are located in the rlm/analysis/notebooks directory:

  • intervention.ipynb: Hidden-state intervention analysis (also used in circuit finding and editing notebooks).
  • circuit_finding.ipynb: Optimization and analysis of the binary-weight-masked model.
  • circuit_development.ipynb: Analysis of the memory circuit development during training.
  • circuit_editing.ipynb: Model (circuit) editing in the MiniGrid-Memory environment.
  • circuit_editing_switching.ipynb: Model (circuit) editing in the MiniGrid-Switching-Memory environment.
  • wandb_plotting.ipynb: Plotting of results from the Weights & Biases (W&B) dashboard. Used for analysis of the hidden-state size.
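
To open these notebooks, you can launch Jupyter from the repository root. This assumes a Jupyter front end (e.g. JupyterLab) is available in the rlm environment; if it is not listed in environment.yaml, install it separately first.

# Opens the analysis notebooks in JupyterLab (install it first if needed).
jupyter lab rlm/analysis/notebooks/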

Acknowledgements
