diff --git a/docs/src/concepts/fine-tuning.rst b/docs/src/concepts/fine-tuning.rst
new file mode 100644
index 0000000000..3a26a92d8c
--- /dev/null
+++ b/docs/src/concepts/fine-tuning.rst
@@ -0,0 +1,226 @@
+.. _label_fine_tuning_concept:
+
+Fine-tune a pre-trained model
+=============================
+
+.. warning::
+
+    Fine-tuning may not be supported by every architecture, and where it is supported,
+    the syntax for starting a fine-tuning run may differ from the one explained here.
+
+This section describes the process of fine-tuning a pre-trained model to adapt it to
+new tasks or datasets. Fine-tuning is a common technique in machine learning, where a
+model trained on a large dataset is trained further on a smaller dataset to improve
+its performance on specific tasks. So far, the fine-tuning capabilities are only
+available for the PET model.
+
+There is a complete example in the tutorial section
+:ref:`sphx_glr_generated_examples_0-beginner_02-fine-tuning.py`.
+
+.. note::
+
+    Please note that the fine-tuning recommendations in this section are not universal
+    and require testing on your specific dataset to achieve the best results. You might
+    need to experiment with different fine-tuning strategies depending on your needs.
+
+
+Basic Fine-tuning
+-----------------
+
+The basic way to fine-tune a model is to use the ``mtt train`` command with the
+available pre-trained model defined in an ``options.yaml`` file. In this case, all the
+weights of the model will be adapted to the new dataset. In contrast to training
+continuation, the optimizer and scheduler state will be reset. You can still adjust
+the training hyperparameters in the ``options.yaml`` file, but the model architecture
+will be taken from the checkpoint.
+To set the path to the pre-trained model checkpoint, you need to specify the
+``read_from`` parameter in the ``options.yaml`` file:
+
+.. code-block:: yaml
+
+    architecture:
+      training:
+        finetune:
+          method: "full"  # This stands for the full fine-tuning
+          read_from: path/to/checkpoint.ckpt
+
+We recommend using a lower learning rate than the one used for the original training,
+as this will help stabilize the training process. For example, if the default learning
+rate is ``1e-4``, you can set it to ``1e-5`` or even lower, using the following in the
+``options.yaml`` file:
+
+.. code-block:: yaml
+
+    architecture:
+      training:
+        learning_rate: 1e-5
+
+Please note that in most use cases you should create a new energy head by specifying a
+new energy variant. A variant is a version of a target quantity, such as ``energy``. A
+model can have multiple variants that can be selected during training and inference.
+More on variants can be found in the `metatomic`_ documentation.
+
+.. _metatomic: https://docs.metatensor.org/metatomic/latest/engines/index.html
+
+Variant names follow the simple pattern ``energy/{variantname}``, where we used
+``energy`` as the target quantity. A reasonable name could be the energy functional or
+level of theory your fine-tuning dataset was computed with, e.g. ``energy/pbe``,
+``energy/SCAN`` or even ``energy/dataset1``. Further, we recommend adding a short
+description for the new variant, which you can specify in the ``description`` field of
+your ``options.yaml`` file.
+
+.. code-block:: yaml
+
+    training_set:
+      systems:
+        read_from: path/to/dataset.xyz
+        length_unit: angstrom
+      targets:
+        energy/:  # complete with your variant name, e.g. energy/pbe
+          quantity: energy
+          key:  # key of the target in your dataset file
+          unit:  # unit of the target, e.g. eV
+          description: "description of your variant"
+
+The new energy variant can be selected for evaluation with ``mtt eval`` by specifying
+it in the ``options.yaml`` file used for evaluation:
+
+.. code-block:: yaml
+
+    systems: path/to/dataset.xyz
+    targets:
+      energy/:  # your variant name
+        key:
+        unit:
+        forces:
+          key: forces
+
+When using the fine-tuned model in simulation engines such as ASE and LAMMPS, the
+default target name expected by the ``metatomic`` package is ``energy``. When loading
+the model in ``metatomic``, you therefore have to specify which variant should be used
+for energy and force prediction, as sketched below. You can find a full example in the
+tutorial (see :ref:`sphx_glr_generated_examples_0-beginner_02-fine-tuning.py`) and more
+in the `metatomic documentation`_.
+
+.. _metatomic documentation: https://docs.metatensor.org/metatomic/latest/engines/index.html
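+For instance, with the ASE interface the variant is selected when the calculator is
+created. The import and the ``variants`` argument below follow the usage shown in the
+fine-tuning tutorial of this repository; the model file and the variant name ``pbe``
+are only illustrations and have to match your own fine-tuned model:
+
+.. code-block:: python
+
+    from metatomic.torch.ase_calculator import MetatomicCalculator
+
+    # Load the exported fine-tuned model and select which energy variant
+    # is used for energy and force predictions.
+    calc = MetatomicCalculator(
+        "model-ft.pt",
+        variants={"energy": "pbe"},  # illustrative variant name
+    )
+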
+Up to this point, our fine-tuning run trains all weights of the model and creates a
+new energy head as well as a new composition model.
+
+The basic fine-tuning strategy is a good choice for most use cases. Below, we present
+a few more advanced topics.
+
+Inheriting weights from existing heads
+--------------------------------------
+
+In some cases, the new targets might be similar to the existing targets
+in the pre-trained model. For example, if the pre-trained model is trained
+on energies and forces computed with the PBE functional, and the new targets
+are energies and forces coming from PBE0 calculations, it might be beneficial
+to initialize the new PBE0 heads and last layers with the weights of the PBE
+heads and last layers. This can be done by specifying the ``inherit_heads``
+parameter in the ``options.yaml`` file:
+
+.. code-block:: yaml
+
+    architecture:
+      training:
+        finetune:
+          method: full
+          read_from: path/to/checkpoint.ckpt
+          inherit_heads:
+            energy/: energy  # inherit weights from the "energy" head
+
+The ``inherit_heads`` parameter is a dictionary mapping the new trainable
+targets specified in the ``training_set/targets`` section to the existing
+targets in the pre-trained model. The weights of the corresponding heads and
+last layers will be copied from the source heads to the destination heads
+instead of being randomly initialized. These weights are still trainable and
+will be adapted to the new dataset during the training process.
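+For the PBE-to-PBE0 scenario described above, the mapping could look as follows; the
+variant name ``energy/pbe0`` is only illustrative and has to match a target declared
+in your ``training_set`` section:
+
+.. code-block:: yaml
+
+    architecture:
+      training:
+        finetune:
+          method: full
+          read_from: path/to/checkpoint.ckpt
+          inherit_heads:
+            # initialize the new PBE0 head from the pre-trained "energy" head
+            energy/pbe0: energy
+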
+
+Multi-fidelity training
+-----------------------
+
+So far, the old head is left untouched, but it is rendered useless because the deeper
+weights of the model change. If you want to fine-tune and retain multiple functional
+heads, the recommended way is to do full fine-tuning on a new target, but keep
+training the old energy head as well. This will leave you with a model capable of
+using different variants for energy and force prediction. Again, you are able to
+select the preferred head in ``LAMMPS`` or when creating a ``metatomic`` calculator
+object. Thus, you should specify both variants in the ``targets`` section of your
+``options.yaml``. In the code snippet below, we additionally assume that the energy
+labels come from different datasets. Please note that if you have both references in
+one file, they can be selected via the corresponding keys of the same system.
+
+.. code-block:: yaml
+
+    training_set:
+      - systems:
+          read_from: dataset_1.xyz
+          length_unit: angstrom
+        targets:
+          energy/variant1:  # illustrative variant name
+            quantity: energy
+            key: my_energy_label1
+            unit: eV
+            description: 'my variant1 description'
+      - systems:
+          read_from: dataset_2.xyz
+          length_unit: angstrom
+        targets:
+          energy/variant2:  # illustrative variant name
+            quantity: energy
+            key: my_energy_label2
+            unit: eV
+            description: 'my variant2 description'
+
+You can find more about setting up training with multiple files in the
+:ref:`Training YAML reference `.
+
+Training only the head weights can be an alternative if one wants to keep the old
+energy head, but the reference data it was trained on is not available. In that case,
+the internal model weights are frozen, and only the weights of the new target are
+trained.
+
+
+Fine-tuning model Heads only
+----------------------------
+
+Adapting all the model weights to a new dataset is not always the best approach. If
+the new dataset consists of the same or similar data computed with a slightly
+different level of theory compared to the pre-trained model's dataset, you might want
+to keep the learned representations of the crystal structures and only adapt the
+readout layers (i.e. the model heads) to the new dataset.
+In this case, the ``mtt train`` command needs to be accompanied by the specific
+training options in the ``options.yaml`` file. The following options need to be set:
+
+.. code-block:: yaml
+
+    architecture:
+      training:
+        finetune:
+          method: "heads"
+          read_from: path/to/checkpoint.ckpt
+          config:
+            head_modules: ['node_heads', 'edge_heads']
+            last_layer_modules: ['node_last_layers', 'edge_last_layers']
+
+The ``method`` parameter specifies the fine-tuning method to be used and the
+``read_from`` parameter specifies the path to the pre-trained model checkpoint. The
+``head_modules`` and ``last_layer_modules`` parameters specify the modules to be
+fine-tuned. Here, the ``node_*`` and ``edge_*`` modules represent different parts of
+the model readout layers related to the atom-based and bond-based features. The
+``*_last_layers`` modules are the last layers of the corresponding heads, implemented
+as multi-layer perceptrons (MLPs). You can select different combinations of the node
+and edge heads and last layers to be fine-tuned.
+
+We recommend first starting the fine-tuning with all the modules listed above and
+experimenting with different combinations if needed. You might also consider using a
+lower learning rate, e.g. ``1e-5`` or even lower, to stabilize the training process.
diff --git a/docs/src/concepts/index.rst b/docs/src/concepts/index.rst
index db689d7518..06372e535d 100644
--- a/docs/src/concepts/index.rst
+++ b/docs/src/concepts/index.rst
@@ -9,5 +9,6 @@ such as output naming, auxiliary outputs, and wrapper models.
    :maxdepth: 1
 
    output-naming
+   fine-tuning
    loss-functions
    auxiliary-outputs
diff --git a/docs/src/getting-started/finetuning-example.rst b/docs/src/getting-started/finetuning-example.rst
deleted file mode 100644
index f13ec6e556..0000000000
--- a/docs/src/getting-started/finetuning-example.rst
+++ /dev/null
@@ -1,110 +0,0 @@
-.. _fine-tuning-example:
-
-Finetuning example
-==================
-
-.. warning::
-
-    Finetuning is currently only available for the PET architecture.
-
-
-This is a simple example for fine-tuning PET-MAD (or a general PET model), that
-can be used as a template for general fine-tuning with metatrain.
-Fine-tuning a pretrained model allows you to obtain a model better suited for
-your specific system.
-You need to provide a dataset of structures that have
-been evaluated at a higher reference level of theory, usually DFT. Fine-tuning
-a universal model such as PET-MAD allows for reasonable model performance even if
-little training data is available.
-It requires using a pre-trained model checkpoint with the ``mtt train`` command and
-setting the new targets corresponding to the new level of theory in the
-``options.yaml`` file.
-
-
-In order to obtain a pretrained model, you can use a PET-MAD checkpoint from huggingface
-
-.. code-block:: bash
-
-    wget https://huggingface.co/lab-cosmo/pet-mad/resolve/v1.1.0/models/pet-mad-v1.1.0.ckpt
-
-Next, we set up the ``options.yaml`` file. We can specify the fine-tuning method
-in the ``finetune`` block in the ``training`` options of the ``architecture``.
-Here, the basic ``full`` option is chosen, which finetunes all weights of the model.
-All available fine-tuning methods are found in the advanced concepts
-:ref:`Fine-tuning `. This section discusses implementation details,
-options and recommended use cases. Other fine-tuning options can be simply
-substituted in this script, by changing the ``finetune`` block.
-
-Furthermore, you need to specify the checkpoint, that you want to fine-tune in
-the ``read_from`` option.
-
-A simple ``options.yaml`` file for this task could look like this:
-
-Training on a new level of theory is a common use case for transfer learning. Let's
-
-.. code-block:: yaml
-
-    architecture:
-      name: pet
-      training:
-        num_epochs: 1000
-        learning_rate: 1e-5
-        finetune:
-          method: full
-          read_from: path/to/checkpoint.ckpt
-
-    training_set:
-      systems:
-        read_from: dataset.xyz
-        reader: ase
-        length_unit: angstrom
-      targets:
-        energy:
-          quantity: energy
-          read_from: dataset.xyz
-          reader: ase
-          key: energy
-          unit: eV
-          forces:
-            read_from: dataset.xyz
-            reader: ase
-            key: forces
-          stress:
-            read_from: dataset.xyz
-            reader: ase
-            key: stress
-
-    test_set: 0.1
-    validation_set: 0.1
-
-In this example, we specified generic but reasonable ``num_epochs`` and
-``learning_rate`` parameters. The ``learning_rate`` is chosen to be relatively low to
-stabilise training.
-
-.. warning::
-
-    Note that in ``targets`` we use the PET-MAD ``energy`` head. This means, that
-    there won't be a new head for the new reference energies provided in your dataset.
-    This can lead to bad performance, if the reference energies differ from the ones
-    used in pretraining (different levels of theory, or different electronic structure
-    software used). In future it is recommended to create a new ``energy`` target for
-    the new level of theory. Find more about this in :ref:`Transfer-Learning `
-
-
-We assumed that the pre-trained model is trained on the dataset ``dataset.xyz`` in which
-energies are written in the ``energy`` key of the ``info`` dictionary of the
-energies. Additionally, forces and stresses should be provided with corresponding keys
-which you can specify in the ``options.yaml`` file under ``targets``.
-Further information on specifying targets can be found in the :ref:`data section of
-the Training YAML Reference `.
-
-.. note::
-
-    It is important that the ``length_unit`` is set to ``angstrom`` and the ``energy``
-    ``unit`` is ``eV`` in order to match the units PET-MAD was trained on. If your
-    dataset has different energy units, it is necessary to convert it to ``eV`` before
-    fine-tuning.
-
-
-After setting up your ``options.yaml`` file, finetuning can then simply be run
-via ``mtt train options.yaml``.
-
-
-Further fine-tuning examples can be found in the
-`AtomisticCookbook <https://atomistic-cookbook.org/software/metatrain.html>`_
diff --git a/docs/src/getting-started/index.rst b/docs/src/getting-started/index.rst
index 1baee9deeb..af6d2f4224 100644
--- a/docs/src/getting-started/index.rst
+++ b/docs/src/getting-started/index.rst
@@ -10,5 +10,4 @@ This sections describes how to install the package, and its most basic
    commands.
 
    train_yaml_config
    override
    checkpoints
-   finetuning-example
    units
diff --git a/docs/src/getting-started/quickstart.rst b/docs/src/getting-started/quickstart.rst
index 4251124e85..77384422d7 100644
--- a/docs/src/getting-started/quickstart.rst
+++ b/docs/src/getting-started/quickstart.rst
@@ -8,6 +8,13 @@ Quickstart
    :start-after:
    :end-before:
 
-For a more detailed description please checkout
-our :ref:`label_basic_usage` and the rest of the
-documentation.
+.. hint::
+
+   If you want to fine-tune an existing model,
+   check out :ref:`label_fine_tuning_concept`.
+
+.. note::
+
+   For a more detailed description of the training process, please check out
+   our :ref:`label_basic_usage` and the rest of the
+   documentation.
diff --git a/examples/0-beginner/02-fine-tuning.py b/examples/0-beginner/02-fine-tuning.py
index e3dd2f5b28..e549538706 100644
--- a/examples/0-beginner/02-fine-tuning.py
+++ b/examples/0-beginner/02-fine-tuning.py
@@ -1,198 +1,296 @@
 r"""
-.. _fine-tuning:
-
-Fine-tune a pre-trained model
-=============================
+Fine-tuning a pre-trained model
+===============================
 
 .. warning::
 
-    This section of the documentation is only relevant for PET model so far.
+    Finetuning is currently only available for the PET architecture.
 
-This section describes the process of fine-tuning a pre-trained model to
-adapt it to new tasks or datasets. Fine-tuning is a common technique used
-in machine learning, where a model is trained on a large dataset and then
-fine-tuned on a smaller dataset to improve its performance on specific tasks.
-So far the fine-tuning capabilities are only available for PET model.
-There is a complete example in :ref:`Fine-tune example `.
+This is a simple example for fine-tuning PET-MAD (or a general PET model) that can be
+used as a template for general fine-tuning with metatrain.
+Fine-tuning a pretrained model allows you to obtain a model better suited for
+your specific system. You need to provide a dataset of structures that have
+been evaluated at a higher reference level of theory, usually DFT. Fine-tuning
+a universal model such as PET-MAD allows for reasonable model performance even if
+little training data is available.
+It requires using a pre-trained model checkpoint with the ``mtt train`` command and
+setting the new targets corresponding to the new level of theory in the
+``options.yaml`` file.
 
-.. note::
-
-    Please note that the fine-tuning recommendations in this section are not universal
-    and require testing on your specific dataset to achieve the best results. You might
-    need to experiment with different fine-tuning strategies depending on your needs.
-
-
-Basic Fine-tuning
------------------
-
-The basic way to fine-tune a model is to use the ``mtt train`` command with the
-available pre-trained model defined in an ``options.yaml`` file. In this case, all the
-weights of the model will be adapted to the new dataset. In contrast to to the
-training continuation, the optimizer and scheduler state will be reset. You can still
-adjust the training hyperparameters in the ``options.yaml`` file, but the model
-architecture will be taken from the checkpoint.
-
-To set the path to the pre-trained model checkpoint, you need to specify the
-``read_from`` parameter in the ``options.yaml`` file:
-
-.. code-block:: yaml
-
-    architecture:
-      training:
-        finetune:
-          method: "full"  # This stands for the full fine-tuning
-          read_from: path/to/checkpoint.ckpt
-
-We recommend to use a lower learning rate than the one used for the original training,
-as this will help stabilizing the training process. I.e. if the default learning rate is
-``1e-4``, you can set it to ``1e-5`` or even lower, using the following in the
-``options.yaml`` file:
-
-.. code-block:: yaml
-
-    architecture:
-      training:
-        learning_rate: 1e-5
-
-Please note, that in the case of the basic fine-tuning, the composition model weights
-will be taken from the checkpoint and not adapted to the new dataset.
-
-The basic fine-tuning strategy is a good choice in the case when the level of theory
-which is used for the original training is the same, or at least similar to the one used
-for the new dataset. However, since this is not always the case, we also provide more
-advanced fine-tuning strategies described below.
-
-Here is the specification for the inputs to pass to the
-``architecture.training.finetune`` parameter in case of the basic fine-tuning:
-
-.. autoclass:: metatrain.pet.modules.finetuning.FullFinetuneHypers
-    :members:
-    :undoc-members:
-
-
-Fine-tuning model Heads
------------------------
-
-Adapting all the model weights to a new dataset is not always the best approach. If the
-new dataset consist of the same or similar data computed with a slightly different level
-of theory compared to the pre-trained models' dataset, you might want to keep the
-learned representations of the crystal structures and only adapt the readout layers
-(i.e. the model heads) to the new dataset.
-
-In this case, the ``mtt train`` command needs to be accompanied by the specific training
-options in the ``options.yaml`` file. The following options need to be set:
+In order to obtain a pretrained model, you can use a PET-MAD checkpoint from
+Hugging Face:
+
+.. code-block:: bash
+
+    wget https://huggingface.co/lab-cosmo/pet-mad/resolve/v1.1.0/models/pet-mad-v1.1.0.ckpt
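+Alternatively, the checkpoint can be downloaded from Python. The following is a
+sketch using the ``huggingface_hub`` package (assuming it is installed; the
+repository, file name and revision are taken from the URL above):
+
+.. code-block:: python
+
+    from huggingface_hub import hf_hub_download
+
+    # Download the PET-MAD checkpoint at the v1.1.0 revision
+    ckpt_path = hf_hub_download(
+        repo_id="lab-cosmo/pet-mad",
+        filename="models/pet-mad-v1.1.0.ckpt",
+        revision="v1.1.0",
+    )
+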
+Next, we set up the ``options-ft.yaml`` file. Here we fine-tune on a small dataset
+containing structures of ethanol, labelled with energies and forces.
+We can specify the fine-tuning method in the ``finetune`` block in the ``training``
+options of the ``architecture``. Here, the basic ``full`` option is chosen, which
+finetunes all weights of the model. All available fine-tuning methods are found in the
+concepts page :ref:`Fine-tuning `. That section discusses
+implementation details, options and recommended use cases. Other fine-tuning options
+can simply be substituted in this script by changing the ``finetune`` block.
+
+.. note::
+
+    Since our dataset has energies and forces obtained from reference calculations
+    different from the reference of the pretrained model, it is recommended to create
+    a new energy head. Such a so-called energy variant is invoked simply by requesting
+    a new target in the options file, following the nomenclature ``energy/{yourname}``.
+
+Furthermore, you need to specify the checkpoint that you want to fine-tune in
+the ``read_from`` option.
+
+A simple ``options-ft.yaml`` file for this task could look like this:
+
+.. code-block:: yaml
+
+    architecture:
+      name: pet
+      training:
+        batch_size: 8
+        num_epochs: 10
+        learning_rate: 1e-3
+        warmup_fraction: 0.01
+        finetune:
+          method: full
+          read_from: pet-mad-v1.1.0.ckpt
+          inherit_heads:
+            energy/finetune: energy  # inherit weights from the "energy" head
+
+    training_set:
+      systems:
+        read_from: ethanol_reduced_100.xyz
+        reader: ase
+        length_unit: angstrom
+      targets:
+        energy/finetune:
+          quantity: energy
+          read_from: ethanol_reduced_100.xyz
+          reader: ase
+          key: energy
+          unit: eV
+          description: "pbe energy ethanol"
+          forces:
+            read_from: ethanol_reduced_100.xyz
+            reader: ase
+            key: forces
+
+    validation_set: 0.1
+    test_set: 0.1
+
+In this example, we specified a low number of :attr:`num_epochs` and a relatively high
+:attr:`learning_rate` to keep the runtime of this example short. Usually, the
+``learning_rate`` is chosen to be relatively low, typically lower than the one the
+model has been pre-trained with, to stabilise training.
+
+.. warning::
+
+    Note that in ``targets`` we use the ``energy/finetune`` head, differing from the
+    default ``energy`` head. This means that the model creates a new head with a new
+    composition model for the new reference energies provided in your dataset. While
+    the old energy head is still available, it is rendered useless because we train
+    all weights of the model. If you want to obtain a model with multiple usable
+    energy heads, you can simply train on multiple energy references simultaneously.
+    This and other more advanced fine-tuning strategies are discussed in
+    :ref:`Fine-tuning concepts `.
-
-.. code-block:: yaml
-
-    architecture:
-      training:
-        finetune:
-          method: "heads"
-          read_from: path/to/checkpoint.ckpt
-          config:
-            head_modules: ['node_heads', 'edge_heads']
-            last_layer_modules: ['node_last_layers', 'edge_last_layers']
-
-
-The ``method`` parameter specifies the fine-tuning method to be used and the
-``read_from`` parameter specifies the path to the pre-trained model checkpoint. The
-``head_modules`` and ``last_layer_modules`` parameters specify the modules to be
-fine-tuned. Here, the ``node_*`` and ``edge_*`` modules represent different parts of the
-model readout layers related to the atom-based and bond-based features. The
-``*_last_layer`` modules are the last layers of the corresponding heads, implemented as
-multi-layer perceptron (MLPs). You can select different combinations of the node and
-edge heads and last layers to be fine-tuned.
-
-We recommend to first start the fine-tuning including all the modules listed above and
-experiment with their different combinations if needed. You might also consider using a
-lower learning rate, e.g. ``1e-5`` or even lower, to stabilize the training process.
-
-Here is the specification for the inputs to pass to the
-``architecture.training.finetune`` parameter in case of ``"heads"`` fine-tuning:
-
-.. autoclass:: metatrain.pet.modules.finetuning.HeadsFinetuneHypers
-    :members:
-    :undoc-members:
-
-.. autoclass:: metatrain.pet.modules.finetuning.HeadsFinetuneConfig
-    :members:
-    :undoc-members:
-
-
-LoRA Fine-tuning
-----------------
-
-If the conceptually new type of structures is introduced in the new dataset, tuning only
-the model heads might not be sufficient. In this case, you might need to adapt the
-internal representations of the crystal structures. This can be done using the LoRA
-technique. However, in this case the model heads will be not adapted to the new dataset,
-so conceptually the level of theory should be consistent with the one used for the
-pre-trained model.
-
-What is LoRA?
-^^^^^^^^^^^^^ -LoRA (Low-Rank Adaptation) stands for a Parameter-Efficient Fine-Tuning (PEFT) -technique used to adapt pre-trained models to new tasks by introducing low-rank -matrices into the model's architecture. +We assumed that the pre-trained model is trained on the dataset +``ethanol_reduced_100.xyz`` in which energies are written in the ``energy`` key of +the ``info`` dictionary of the dataset. +Additionally, forces should be provided with corresponding keys +which you can specify in the ``options-ft.yaml`` file under ``targets``. +Further information on specifying targets can be found in the :ref:`data section of +the Training YAML Reference `. -Given a pre-trained model with the weights matrix :math:`W_0`, LoRA introduces -low-rank matrices :math:`A` and :math:`B` of a rank :math:`r` such that the -new weights matrix :math:`W` is computed as: +.. note:: -.. math:: + It is important that the ``length_unit`` is set to ``angstrom`` and the ``energy`` + ``unit`` is ``eV`` in order to match the units of your reference data. - W = W_0 + \frac{\alpha}{r} A B -where :math:`\alpha` is a regularization factor that controls the influence -of the low-rank matrices on the model's weights. By adjusting the rank :math:`r` -and the regularization factor :math:`\alpha`, you can fine-tune the model -to achieve better performance on specific tasks. +After setting up your ``options-ft.yaml`` file, you can then simply run: -To use LoRA for fine-tuning, you need to provide the pre-trained model checkpoint with -the ``mtt train`` command and specify the LoRA parameters in the ``options.yaml`` file: +.. code-block:: bash -.. code-block:: yaml + mtt train options-ft.yaml -o model-ft.pt - architecture: - training: - finetune: - method: "lora" - read_from: path/to/pre-trained-model.ckpt - config: - alpha: 0.1 - rank: 4 - -These parameters control the rank of the low-rank matrices introduced by LoRA -(``rank``), and the regularization factor for the low-rank matrices (``alpha``). -By selecting the LoRA rank and the regularization factor, you can control the -amount of adaptation to the new dataset. Using lower values of the rank and -the regularization factor will lead to a more conservative adaptation, which can help -balancing the performance of the model on the original and new datasets. - -We recommend to start with the LoRA parameters listed above and experiment with -different values if needed. You might also consider using a lower learning rate, -e.g. ``1e-5`` or even lower, to stabilize the training process. - -Here is the specification for the inputs to pass to the -``architecture.training.finetune`` parameter in case of ``"lora"`` fine-tuning: - -.. autoclass:: metatrain.pet.modules.finetuning.LoRaFinetuneHypers - :members: - :undoc-members: - -.. autoclass:: metatrain.pet.modules.finetuning.LoRaFinetuneConfig - :members: - :undoc-members: - -Fine-tuning on a new level of theory ------------------------------------- - -If the new dataset is computed with a totally different level of theory compared to the -pre-trained model, which includes, for instance, the different composition energies, or -you want to fine-tune the model on a completely new target, you might need to consider -the transfer learning approach and introduce a new target in the ``options.yaml`` file. -More details about this approach can be found in the :ref:`Transfer Learning -` section of the documentation. +You can check finetuning training curves by parsing the ``train.csv`` that is written +by ``mtt train``. 
+We remove the ``outputs`` folder left over from other examples; this is not
+necessary for normal usage.
+"""
+
+# %%
+#
+import glob
+import subprocess
+
+import ase.io
+import matplotlib.pyplot as plt
+import numpy as np
+from metatomic.torch.ase_calculator import MetatomicCalculator
+
+
+# %%
+#
+
+# Here, we download the PET-MAD checkpoint, remove the stale ``outputs`` folder from
+# other examples, and run ``mtt train`` as a subprocess.
+subprocess.run(
+    [
+        "wget",
+        "https://huggingface.co/lab-cosmo/pet-mad/resolve/v1.1.0/models/pet-mad-v1.1.0.ckpt",
+    ]
+)
+subprocess.run(["rm", "-rf", "outputs"])
+subprocess.run(["mtt", "train", "options-ft.yaml", "-o", "model-ft.pt"], check=True)
+
+# %%
+#
+# ``mtt train`` writes a timestamped directory per run; sorting the matches lets us
+# pick the most recent ``train.csv``.
+csv_path = sorted(glob.glob("outputs/*/*/train.csv"))[-1]
+with open(csv_path, "r") as f:
+    header = f.readline().strip().split(",")
+    f.readline()  # skip units row
+
+# Build a structured dtype with one float field per CSV column
+dtype = [(h, float) for h in header]
+
+# Load the data as a plain float array, skipping the header and units rows
+data = np.loadtxt(csv_path, delimiter=",", skiprows=2)
+
+# Convert to a structured array so that columns can be accessed by name
+structured = np.zeros(data.shape[0], dtype=dtype)
+for i, h in enumerate(header):
+    structured[h] = data[:, i]
+
+# %%
+#
+# Now, let's plot the learning curves.
+
+# %%
+#
+training_energy_RMSE = structured["training energy/finetune RMSE (per atom)"]
+training_forces_MAE = structured["training forces[energy/finetune] MAE"]
+validation_energy_RMSE = structured["validation energy/finetune RMSE (per atom)"]
+validation_forces_MAE = structured["validation forces[energy/finetune] MAE"]
+
+fig, axs = plt.subplots(1, 2, figsize=(12, 5))
+
+axs[0].plot(training_energy_RMSE, label="training energy/finetune RMSE (per atom)")
+axs[0].plot(validation_energy_RMSE, label="validation energy/finetune RMSE (per atom)")
+axs[0].set_xlabel("Epochs")
+axs[0].set_ylabel("energy / meV")
+axs[0].set_xscale("log")
+axs[0].set_yscale("log")
+axs[0].legend()
+axs[1].plot(training_forces_MAE, label="training forces[energy/finetune] MAE")
+axs[1].plot(validation_forces_MAE, label="validation forces[energy/finetune] MAE")
+axs[1].set_ylabel("force / meV/A")
+axs[1].set_xlabel("Epochs")
+axs[1].set_xscale("log")
+axs[1].set_yscale("log")
+axs[1].legend()
+plt.tight_layout()
+plt.show()
+
+# %%
+#
+# You can see that the validation loss is still decreasing; however, for the sake of
+# brevity we only fine-tuned for a few epochs. As a further check of how well your
+# fine-tuned model performs on a dataset of choice, we can look at the parity plots
+# for energies and forces
+# (see :ref:`sphx_glr_generated_examples_0-beginner_04-parity_plot.py`).
+# For evaluation, we can compare the performance of our fine-tuned model with that of
+# the base PET-MAD model. Using ``mtt eval`` we can simply evaluate our new energy
+# head by specifying it in ``options-ft-eval.yaml``:
+#
+# .. code-block:: yaml
+#
+#     systems: ethanol_reduced_100.xyz
+#     targets:
+#       energy/finetune:
+#         key: energy
+#         unit: eV
+#         forces:
+#           key: forces
+#
+# and then run
+#
+# .. code-block:: bash
+#
+#     mtt eval model-ft.pt options-ft-eval.yaml -o output-ft.xyz
+#
+# You can then simply read the predicted energies from the headers of the xyz file.
+# Another possibility is to load your fine-tuned model ``model-ft.pt`` as a
+# ``metatomic`` model and evaluate energies and forces with ASE in Python.
+# + +# %% +# +targets = ase.io.read( + "ethanol_reduced_100.xyz", + format="extxyz", + index=":", +) +calc_ft = MetatomicCalculator( + "model-ft.pt", variants={"energy": "finetune"}, extensions_directory=None +) # specify variant suffix here + +e_targets = np.array( + [frame.get_total_energy() / len(frame) for frame in targets] +) # target energies +f_targets = np.array( + [frame.get_forces().flatten() for frame in targets] +).flatten() # target forces + +for frame in targets: + frame.set_calculator(calc_ft) + +e_predictions = np.array( + [frame.get_total_energy() / len(frame) for frame in targets] +) # predicted energies +f_predictions = np.array( + [frame.get_forces().flatten() for frame in targets] +).flatten() # predicted forces + +# %% +# +fig, axs = plt.subplots(1, 2, figsize=(12, 5)) + +# Parity plot for energies +axs[0].scatter(e_targets, e_predictions, label="FT") +axs[0].axline((np.min(e_targets), np.min(e_targets)), slope=1, ls="--", color="red") +axs[0].set_xlabel("Target energy / meV") +axs[0].set_ylabel("Predicted energy / meV") +min_e = np.min(np.array([e_targets, e_predictions])) - 2 +max_e = np.max(np.array([e_targets, e_predictions])) + 2 +axs[0].set_title("Energy Parity Plot") +axs[0].set_xlim(min_e, max_e) +axs[0].set_ylim(min_e, max_e) + +# Parity plot for forces +axs[1].scatter(f_targets, f_predictions, alpha=0.5, label="FT") +axs[1].axline((np.min(f_targets), np.min(f_targets)), slope=1, ls="--", color="red") +axs[1].set_xlabel("Target force / meV/Å") +axs[1].set_ylabel("Predicted force / meV/Å") +min_f = np.min(np.array([f_targets, f_predictions])) - 2 +max_f = np.max(np.array([f_targets, f_predictions])) + 2 +axs[1].set_title("Force Parity Plot") +axs[1].set_xlim(min_f, max_f) +axs[1].set_ylim(min_f, max_f) +fig.tight_layout() +plt.show() + +# %% +# +# Further fine-tuning examples can be found in the +# `AtomisticCookbook `_ diff --git a/examples/0-beginner/04-parity_plot.py b/examples/0-beginner/04-parity_plot.py index ccb7948105..6c6ed88033 100644 --- a/examples/0-beginner/04-parity_plot.py +++ b/examples/0-beginner/04-parity_plot.py @@ -1,4 +1,5 @@ """ + Model validation with parity plots ================================== diff --git a/examples/0-beginner/options-ft-eval.yaml b/examples/0-beginner/options-ft-eval.yaml new file mode 100644 index 0000000000..c81a193341 --- /dev/null +++ b/examples/0-beginner/options-ft-eval.yaml @@ -0,0 +1,8 @@ +systems: ethanol_reduced_100.xyz + +targets: + energy/finetune: + key: energy + unit: eV + forces: + key: forces diff --git a/examples/0-beginner/options-ft.yaml b/examples/0-beginner/options-ft.yaml new file mode 100644 index 0000000000..6035788aa9 --- /dev/null +++ b/examples/0-beginner/options-ft.yaml @@ -0,0 +1,31 @@ +architecture: + name: pet + training: + batch_size: 8 + num_epochs: 10 + learning_rate: 1e-3 + warmup_fraction: 0.01 + finetune: + method: full + read_from: pet-mad-v1.1.0.ckpt + +training_set: + systems: + read_from: ethanol_reduced_100.xyz + reader: ase + length_unit: angstrom + targets: + energy/finetune: + quantity: energy + read_from: ethanol_reduced_100.xyz + reader: ase + key: energy + unit: eV + description: pbe energy ethanol + forces: + read_from: ethanol_reduced_100.xyz + reader: ase + key: forces + +validation_set: 0.1 +test_set: 0.1 diff --git a/src/metatrain/pet/documentation.py b/src/metatrain/pet/documentation.py index db32c610b9..fe60466a44 100644 --- a/src/metatrain/pet/documentation.py +++ b/src/metatrain/pet/documentation.py @@ -202,5 +202,5 @@ class TrainerHypers(TypedDict): } 
"""Parameters for fine-tuning trained PET models. - See :ref:`fine-tuning` for more details. + See :ref:`label_fine_tuning_concept` for more details. """