This is the official implementation of the NeurIPS 2024 paper "On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability".
Set up the conda environment:

    conda env create -f environment.yaml

The detailed hyperparameter configuration can be found in Appendix B of the paper.
Run training, then plot the results:

    bash main_train_ar.sh  # with the hyperparameters from Appendix B
    python plot.py         # specify the output
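
The commands above can be chained in a small wrapper script. The following is only a sketch and not part of the repository: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here for illustration. By default it only prints each command, so the sequence can be checked before anything is executed. The environment name is defined in `environment.yaml` and is not assumed here, so the activation step is left as a comment.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with the repository): runs the README
# steps in order and stops at the first failure.
set -euo pipefail

# Assumption: DRY_RUN=1 (the default) only prints commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Echo the command, then execute it unless in dry-run mode.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# 1. Create the conda environment from the provided specification.
run conda env create -f environment.yaml

# Activate the environment before the next steps; its name is set in
# environment.yaml and is deliberately not guessed here.

# 2. Train with the hyperparameters described in Appendix B.
run bash main_train_ar.sh

# 3. Plot the results (the output still has to be specified for plot.py).
run python plot.py
```

Running the script as-is prints the three commands prefixed with `+`. Setting `DRY_RUN=0` executes them, which assumes conda is installed and the environment has been activated between steps 1 and 2.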