In this work, we introduce task selection based on prior experience into a meta-learning algorithm by conceptualizing the learner and the active meta-learning setting using a probabilistic latent variable model.
This repository implements the models and algorithms necessary to reproduce experiments (i)-(iii).
To reproduce the results, you can run batch.sh (please do not change default values for certain parameters in run.py).
The core components of the repository are:
- run.py: script to run the PAML algorithm including all parameters
- env: directory for configuring and observing environments
  - controls.py: generates control signals
  - environment_configurator.py: configures dm_control environments
  - to.py: observes trajectories of the environments given controls
- models: directory for the PAML model
  - meta_learner.py: trains the model and infers latent task variables
  - mlgp.py: the meta-learning (sparse variational) Gaussian process model
  - tp.py: predicts trajectories (for evaluation)
- utility_functions: directory for the utility functions and baselines used in the paper
  - paml.py: PAML
  - lhs.py: Latin Hypercube Sampling
  - uni.py: Uniform sampling
- utils: directory for miscellaneous tools
  - algorithm_utils.py: separated key steps of the PAML algorithm
  - dataset.py: stores and prepares trajectory observations
  - evaluation.py: evaluates the model's performance on test tasks
This code was tested in Python 3.7.
The dependencies can be found in requirements.txt.
- Download and install MuJoCo Pro 2.00
  - You need a license and you can request a trial license for 30 days
  - At installation time, dm_control looks for the MuJoCo headers in ~/.mujoco/mujoco200_$PLATFORM/include
  - At runtime, dm_control looks for the MuJoCo license key file at ~/.mujoco/mjkey.txt
- Install all dependencies with
pip install -r requirements.txt
# Under-specified cart-pole environment
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1 --under_specified_system True --observed_config_space_dim=1
# Fully-specified cart-pole environment
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1
# Fully-specified pendubot environment
python3 run.py --env_name="pendubot" --utility_function="PAML" --seed=1
# Fully-specified cart-double-pole environment
python3 run.py --env_name="cartdoublepole" --utility_function="PAML" --seed=1
# Over-specified cart-pole environment
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1 --over_specified_system True --observed_config_space_dim=3 --config_space_dim=2
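batch.sh remains the reference way to reproduce the results. If you want to run your own sweep over environments, utility functions, and seeds, the commands above can be generated programmatically. The following is a minimal sketch, not part of the repository: `build_commands` is a hypothetical helper, the seed values are chosen for illustration only, and it uses only the flags shown above.

```python
import itertools
import shlex

ENVS = ["cartpole", "pendubot", "cartdoublepole"]
UTILITY_FUNCTIONS = ["PAML", "LHS", "UNI"]
SEEDS = [1, 2, 3]  # assumption: illustrative seeds, not the ones used in the paper


def build_commands(envs, utility_functions, seeds):
    """Return one run.py argument list per (env, utility function, seed)."""
    commands = []
    for env, utility_function, seed in itertools.product(envs, utility_functions, seeds):
        commands.append([
            "python3", "run.py",
            f"--env_name={env}",
            f"--utility_function={utility_function}",
            f"--seed={seed}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be inspected
    # or piped into a scheduler of your choice.
    for command in build_commands(ENVS, UTILITY_FUNCTIONS, SEEDS):
        print(" ".join(shlex.quote(part) for part in command))
```

Printing rather than launching the runs keeps the sketch free of side effects; each argument list can be passed to `subprocess.run` if you want to execute it directly.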
Parameters that require string values:
- --env_name: 'cartpole', 'cartdoublepole', 'pendubot'
- --utility_function: 'PAML', 'LHS', 'UNI'
- --policy: 'ALTERNATE'
- --initial_training_configurations: 'LHS', 'UNI'
Parameters that require boolean values:
- --verbose: printing additional information
- --evaluation: evaluation of the MLGP on a test task grid
- --under_specified_system: enables an unobserved, stochastic configuration dimension
- --oracle: initial training on the test task grid
- --data_normalization: normalization of training data over all dimensions
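Boolean parameters are passed with an explicit value, as in `--under_specified_system True`. How run.py parses them is not shown in this README. The sketch below assumes an argparse-based parser and illustrates one common way to handle such flags: `str2bool` and `build_parser` are hypothetical names, and the defaults are placeholders rather than the repository's defaults. An explicit converter is needed because `type=bool` would treat any non-empty string, including "False", as true.

```python
import argparse


def str2bool(value):
    """Convert a command-line string such as 'True' or 'false' to a bool."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")


def build_parser():
    """Illustrative parser covering a few of the documented parameters."""
    parser = argparse.ArgumentParser(description="Illustrative parser, not the one in run.py")
    # Assumption: the defaults below are placeholders for this example.
    parser.add_argument("--env_name", type=str, default="cartpole",
                        choices=["cartpole", "cartdoublepole", "pendubot"])
    parser.add_argument("--utility_function", type=str, default="PAML",
                        choices=["PAML", "LHS", "UNI"])
    parser.add_argument("--verbose", type=str2bool, default=False)
    parser.add_argument("--under_specified_system", type=str2bool, default=False)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--env_name=cartpole", "--under_specified_system", "True"]
    )
    print(args.env_name, args.under_specified_system)
```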
The task parameter interval can be specified through the console, e.g.,
# By default, the following command runs an experiment with cart-pole tasks with pendulum mass in [0.5, 3.0] kg
python3 run.py --env_name="cartpole" --utility_function="PAML" --seed=1 --config_interval_lower_bound_dim_1=0.5 --config_interval_upper_bound_dim_1=3.0
To change the environment's parameterization (e.g., which configuration interval dimension corresponds to mass, length, radius, etc.), see env/environment_configurator.py.
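To make the role of the configuration interval concrete, the sketch below contrasts the two baseline strategies named above, uniform sampling and Latin Hypercube Sampling, on a box given by lower and upper bounds. It is written from the textbook definitions using only the standard library and is not the code in utility_functions/lhs.py or utility_functions/uni.py; the function names are hypothetical.

```python
import random


def uniform_samples(lower, upper, n, rng):
    """Draw n points independently and uniformly from the box [lower, upper]."""
    return [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(n)]


def latin_hypercube_samples(lower, upper, n, rng):
    """Draw n points so that each dimension has exactly one point per stratum."""
    dims = len(lower)
    columns = []
    for d in range(dims):
        width = (upper[d] - lower[d]) / n
        # One point inside each of the n equal-width strata of dimension d.
        column = [lower[d] + (i + rng.random()) * width for i in range(n)]
        rng.shuffle(column)  # decouple the stratum order across dimensions
        columns.append(column)
    return [[columns[d][i] for d in range(dims)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Same interval as in the command above: pendulum mass in [0.5, 3.0] kg.
    lower, upper = [0.5], [3.0]
    print("UNI:", uniform_samples(lower, upper, 5, rng))
    print("LHS:", latin_hypercube_samples(lower, upper, 5, rng))
```

With five samples on [0.5, 3.0], the Latin hypercube variant places exactly one task in each sub-interval of width 0.5, whereas uniform sampling may leave some sub-intervals empty.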
@inproceedings{kaddour2020paml,
title={Probabilistic Active Meta-Learning},
author={Kaddour, Jean and Saemundsson, Steindor and Deisenroth, Marc Peter},
booktitle={Advances in Neural Information Processing Systems},
year={2020}
}