Merged
5 changes: 3 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -19,7 +19,8 @@

## 📖 Overview

We provide efficient and streamlined implementations of the TOFU, MUSE and WMDP unlearning benchmarks while supporting 11+ unlearning methods, 5+ datasets, 10+ evaluation metrics, and 7+ LLM architectures. Each of these can be easily extended to incorporate more variants.
We provide efficient and streamlined implementations of the TOFU, MUSE and WMDP unlearning benchmarks while supporting 12+ unlearning methods, 5+ datasets, 10+ evaluation metrics, and 7+ LLM architectures. Each of these can be easily extended to incorporate more variants.


We invite the LLM unlearning community to collaborate by adding new benchmarks, unlearning methods, datasets and evaluation metrics here to expand OpenUnlearning's features, gain feedback from wider usage and drive progress in the field.

@@ -77,7 +78,7 @@ We provide several variants for each of the components in the unlearning pipelin
| **Component** | **Available Options** |
|------------------------|----------------------|
| **Benchmarks** | [TOFU](https://arxiv.org/abs/2401.06121), [MUSE](https://muse-bench.github.io/), [WMDP](https://www.wmdp.ai/) |
| **Unlearning Methods** | GradAscent, GradDiff, NPO, SimNPO, DPO, RMU, UNDIAL, AltPO, SatImp, WGA, CE-U |
| **Unlearning Methods** | GradAscent, GradDiff, NPO, SimNPO, DPO, RMU, UNDIAL, AltPO, SatImp, WGA, CE-U, PDU |
| **Evaluation Metrics** | Verbatim Probability, Verbatim ROUGE, Knowledge QA-ROUGE, Model Utility, Forget Quality, TruthRatio, Extraction Strength, Exact Memorization, 6 MIA attacks, [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) |
| **Datasets** | MUSE-News (BBC), MUSE-Books (Harry Potter), TOFU (different splits), WMDP-Bio, WMDP-Cyber |
| **Model Families** | TOFU: Llama-3.2, Llama-3.1, Llama-2; MUSE: Llama-2; Additional: Phi-3.5, Phi-1.5, Gemma, Zephyr |
47 changes: 47 additions & 0 deletions community/methods/PDU/README.md
@@ -0,0 +1,47 @@
# Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models


We propose a new formulation of LLM unlearning
as a constrained optimization problem: forgetting is enforced via a novel logit-margin flattening loss
that explicitly drives the output distribution toward uniformity on a designated forget set,
while retention is preserved through a hard constraint on a separate retain set.
We solve the constrained problem using a scalable primal-dual algorithm that exposes the
trade-off between forgetting and retention through the dynamics of the dual variable.
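
As a rough illustration of the primal-dual idea described above, the sketch below runs gradient descent on a primal variable and projected gradient ascent on a dual variable for a toy scalar problem. It is not the PDU trainer: the toy losses, the function names, and the step sizes are all assumptions chosen for illustration. Only the overall structure (minimize a forget objective subject to `retain_loss <= eps`) mirrors the formulation.

```python
# Toy primal-dual (gradient descent-ascent) loop.
# Every name and number here is an illustrative assumption, not the PDU code.

def forget_loss_grad(x):
    # Gradient of the toy "forget" objective f(x) = (x - 3)^2.
    return 2.0 * (x - 3.0)


def retain_loss(x):
    # Toy "retain" loss g(x) = x^2, which the constraint keeps below eps.
    return x * x


def retain_loss_grad(x):
    return 2.0 * x


def primal_dual(eps, lr=0.05, dual_step_size=0.05, steps=5000):
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the Lagrangian f(x) + lam * (g(x) - eps) with respect to x.
        grad_x = forget_loss_grad(x) + lam * retain_loss_grad(x)
        # Constraint violation at the current iterate drives the dual variable.
        violation = retain_loss(x) - eps
        x -= lr * grad_x                                   # primal descent
        lam = max(0.0, lam + dual_step_size * violation)   # projected dual ascent
    return x, lam


x, lam = primal_dual(eps=1.0)
print(f"x={x:.3f}, lambda={lam:.3f}, retain_loss={retain_loss(x):.3f}")
```

For this toy problem the constrained optimum is `x = 1` with multiplier `lambda = 2`, so the loop should settle near those values. The dual variable stays at zero while the constraint is satisfied and grows once it is violated, which is the trade-off signal the paragraph above refers to.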

# Setup

Experimental setup

- **Hyperparameters & Search Space:** Please see the corresponding [paper](https://arxiv.org/abs/2506.05314) for details of the hyperparameters. Importantly,
to produce good results with our method, the hyperparameter `retain_loss_eps` must be set to an appropriate value.
To choose one, measure the retain loss of the pretrained model and pick
a value somewhat larger than this starting value.

Note that our method's loss is a quadratic function of a difference in logit space, so
its value can be large. For this reason, we set the initial retain loss preference
(`trainer.method_args.alpha` in `run.sh`) to 50 or 100.
- **Computational Setup:** Please see the Supplementary Material in the paper.
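
The guidance on `retain_loss_eps` above can be sketched as a small helper. This is only an assumption about how one might automate the choice: the function names, the `relative_slack` knob, and the example loss value are hypothetical and do not come from the paper or the repository.

```python
def suggest_retain_loss_eps(initial_retain_loss, relative_slack=0.2):
    """Return a constraint level somewhat above the starting retain loss.

    `relative_slack` is a hypothetical knob, not a value from the paper.
    """
    if initial_retain_loss < 0 or relative_slack <= 0:
        raise ValueError("expected a non-negative loss and a positive slack")
    return initial_retain_loss * (1.0 + relative_slack)


def is_feasible_at_start(initial_retain_loss, retain_loss_eps):
    # The constraint retain_loss <= eps should already hold before unlearning.
    return initial_retain_loss <= retain_loss_eps


measured = 0.25  # hypothetical retain loss of the starting model
eps = suggest_retain_loss_eps(measured)
print(eps, is_feasible_at_start(measured, eps))
```

In practice the measured value would come from evaluating the starting model on the retain set, and the slack would be tuned per benchmark.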

# Results

Please see the `run.sh` script, which contains all the commands needed to reproduce the final results.

All unlearned models are available under https://huggingface.co/tamarsonha.

# Citation


If you use this work, please cite:


```bibtex
@article{entesari2025constrained,
  title={Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models},
  author={Entesari, Taha and Hatami, Arman and Khaziev, Rinat and Ramakrishna, Anil and Fazlyab, Mahyar},
  journal={arXiv preprint arXiv:2506.05314},
  year={2025}
}
```
130 changes: 130 additions & 0 deletions community/methods/PDU/run.sh
@@ -0,0 +1,130 @@
#!/bin/bash



########################################################################################################################
########################################### Final best parameters #####################################################
########################################################################################################################
# for an 8 GPU system:
num_processes=8


############################################## TOFU #####################################################
per_device_train_batch_size=4
learning_rate=0.00001
dual_warmup_epochs=5

pref=100
dual_step_size=5
retain_loss_eps=0.3

retain_percentages=(90 95 99)
models=(Llama-3.2-1B-Instruct Llama-3.2-3B-Instruct Llama-3.1-8B-Instruct gemma-7b-it)

for model in "${models[@]}"; do
    for retain_percentage in "${retain_percentages[@]}"; do

        # Map the retain percentage to the matching TOFU forget/retain splits.
        if [ "$retain_percentage" = "90" ]; then
            forget_split=forget10
            retain_split=retain90
        elif [ "$retain_percentage" = "95" ]; then
            forget_split=forget05
            retain_split=retain95
        elif [ "$retain_percentage" = "99" ]; then
            forget_split=forget01
            retain_split=retain99
        else
            echo "Invalid retain percentage. Please set it to 90, 95, or 99."
            exit 1
        fi

        # Pick the target checkpoint and the number of epochs for each model.
        if [ "$model" = "Llama-3.2-1B-Instruct" ]; then
            pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-1B-Instruct_full
            num_train_epochs=10
        elif [ "$model" = "Llama-3.2-3B-Instruct" ]; then
            pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-3B-Instruct_full
            num_train_epochs=10
        elif [ "$model" = "Llama-3.1-8B-Instruct" ]; then
            pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.1-8B-Instruct_full
            num_train_epochs=30
        elif [ "$model" = "gemma-7b-it" ]; then
            pretrained_model_name_or_path=tamarsonha/TOFU-target-gemma-7b-it
            num_train_epochs=20
        else
            echo "Invalid model name. Please set it to Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct, Llama-3.1-8B-Instruct, or gemma-7b-it."
            exit 1
        fi

        task_name=PDU-TOFU$retain_split-E$num_train_epochs-lr$learning_rate-P1-$pref-Primal$retain_loss_eps-Step$dual_step_size-Warmup$dual_warmup_epochs-model_$model
        accelerate launch --config_file configs/accelerate/default_config.yaml --num_processes=$num_processes \
            src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
            forget_split=$forget_split retain_split=$retain_split \
            trainer=PDU \
            trainer.args.num_train_epochs=$num_train_epochs \
            trainer.args.eval_on_start=false trainer.args.do_eval=false \
            trainer.args.per_device_train_batch_size=$per_device_train_batch_size \
            trainer.args.learning_rate=$learning_rate \
            trainer.method_args.gamma=1. trainer.method_args.alpha=$pref \
            trainer.method_args.primal_dual=true trainer.method_args.retain_loss_eps=$retain_loss_eps \
            trainer.method_args.dual_step_size=$dual_step_size \
            trainer.method_args.dual_update_upon="step" trainer.method_args.dual_warmup_epochs=$dual_warmup_epochs \
            task_name=$task_name \
            model=$model model.model_args.pretrained_model_name_or_path=$pretrained_model_name_or_path
    done
done


######################################################## MUSE #########################################################
dual_step_size=1
num_train_epochs=10
dual_warmup_epochs=3
data_splits=("News" "Books")
learning_rate=0.00001
dual_update_upon="step"

models=(Llama-2-7b-hf Llama-2-13b-hf)
pref=50

for model in "${models[@]}"; do
    for data_split in "${data_splits[@]}"; do

        # Pick the target checkpoint and the retain-loss bounds for each model.
        if [ "$model" = "Llama-2-7b-hf" ]; then
            pretrained_model_name_or_path=muse-bench/MUSE-${data_split}_target
            epsNews=(1.5)
            epsBooks=(0.1)
        elif [ "$model" = "Llama-2-13b-hf" ]; then
            pretrained_model_name_or_path=tamarsonha/MUSE-${data_split}-target-Llama-2-13b-hf
            epsNews=(0.8)
            epsBooks=(0.6)
        else
            echo "Invalid model name. Please set it to Llama-2-7b-hf or Llama-2-13b-hf."
            exit 1
        fi

        if [ "$data_split" == "News" ]; then
            eps_array=("${epsNews[@]}")
        else
            eps_array=("${epsBooks[@]}")
        fi

        for retain_loss_eps in "${eps_array[@]}"; do
            task_name=PDU-Muse$data_split-E$num_train_epochs-lr$learning_rate-P1-$pref-Primal$retain_loss_eps-Step$dual_step_size-Warmup$dual_warmup_epochs-model$model
            accelerate launch --config_file configs/accelerate/default_config.yaml --num_processes=$num_processes \
                src/train.py --config-name=unlearn.yaml experiment=unlearn/muse/default \
                data_split=$data_split \
                trainer=PDU \
                trainer.args.num_train_epochs=$num_train_epochs \
                trainer.args.eval_on_start=false trainer.args.do_eval=false \
                trainer.args.per_device_train_batch_size=$per_device_train_batch_size \
                trainer.args.learning_rate=$learning_rate \
                trainer.method_args.gamma=1. trainer.method_args.alpha=$pref \
                trainer.method_args.primal_dual=true trainer.method_args.retain_loss_eps=$retain_loss_eps \
                trainer.method_args.dual_step_size=$dual_step_size \
                trainer.method_args.dual_update_upon="step" trainer.method_args.dual_warmup_epochs=$dual_warmup_epochs \
                task_name=$task_name \
                model=$model model.model_args.pretrained_model_name_or_path=$pretrained_model_name_or_path
        done
    done
done
2 changes: 1 addition & 1 deletion community/methods/template/run.sh
@@ -10,4 +10,4 @@
########################################### Final best parameters #####################################################
########################################################################################################################

# Required to replicate your results
# Required to replicate your results
8 changes: 8 additions & 0 deletions configs/data/datasets/MUSE_train.yaml
@@ -0,0 +1,8 @@
MUSE_train:
  handler: PretrainingDataset
  args:
    hf_args:
      path: "tamarsonha/MUSE-News-Train"
      split: "full"
    text_key: "text"
    max_length: 2048
30 changes: 30 additions & 0 deletions configs/experiment/finetune/muse/default.yaml
@@ -0,0 +1,30 @@
# @package _global_

defaults:
  - override /model: Llama-2-13b-hf
  - override /trainer: finetune
  - override /data/datasets@data.train: MUSE_train
  - override /eval: muse
  - override /data: finetune

mode: finetune
data_split: News
data_sub_set: full # full or retain

data:
  train:
    MUSE_train:
      args:
        hf_args:
          path: tamarsonha/MUSE-${data_split}-Train
          split: ${data_sub_set}
# you can find fine-tuned models on https://huggingface.co/tamarsonha

trainer:
  args:
    learning_rate: 1e-5
    weight_decay: 0.01
    warmup_epochs: 1.0 # custom parameter
    num_train_epochs: 10

task_name: muse_news_full
2 changes: 1 addition & 1 deletion configs/experiment/unlearn/wmdp/default.yaml
@@ -50,7 +50,7 @@ trainer:
gamma: 1.0
steering_coeff: 2
retain_loss_type: EMBED_DIFF
alpha: 1
alpha: 1
module_regex: model\.layers\.7
trainable_params_regex:
- model\.layers\.(5|6|7)\.mlp\.down_proj\.weight # If you want to update only these weights (as done in https://github.com/centerforaisafety/wmdp/blob/bc5e1ba0367ea826caeeeaa50656336a1e87acfb/rmu/unlearn.py#L26)
12 changes: 12 additions & 0 deletions configs/model/Llama-2-13b-hf.yaml
@@ -0,0 +1,12 @@
model_args:
  pretrained_model_name_or_path: "meta-llama/Llama-2-13b-hf"
  attn_implementation: 'flash_attention_2'
  torch_dtype: bfloat16
tokenizer_args:
  pretrained_model_name_or_path: "meta-llama/Llama-2-13b-hf"
template_args: # Used in creating prompts for the dataset. See src/data/utils.py#preprocess_chat_instance.
  apply_chat_template: False
  user_start_tag: "Question: "
  user_end_tag: "\n"
  asst_start_tag: "Answer: "
  asst_end_tag: "\n\n"
14 changes: 14 additions & 0 deletions configs/trainer/PDU.yaml
@@ -0,0 +1,14 @@
defaults:
  - GradDiff

handler: PDU
method_args:
  retain_loss_eps: ???
  primal_dual: True
  dual_step_size: 1.0
  dual_update_upon: "step" # "step" or "epoch"
  dual_warmup_epochs: 5
  gamma: 1.0
  alpha: 1.0
  loss_names: ["forget_loss", "retain_loss"]

2 changes: 1 addition & 1 deletion configs/trainer/RMU.yaml
@@ -7,7 +7,7 @@ method_args:
gamma: 1.0
steering_coeff: 2
retain_loss_type: EMBED_DIFF
alpha: 1
alpha: 1
module_regex: model\.layers\.7
trainable_params_regex:
- .* # update all parameters (as done in https://github.com/tmlr-group/G-effect/blob/ef368eea3b2c6dba1e090b9ebb021ac9f047e0ae/dataloader.py#L271)
2 changes: 1 addition & 1 deletion configs/trainer/finetune.yaml
Collaborator

I don't think we need to add new arguments to the default training args. You can add new arguments using `+`, and the attribute doesn't need to already exist in the config. This is documented at https://github.com/locuslab/open-unlearning/blob/main/docs/hydra.md.

Contributor Author

I am aware of the + new arguments. I just felt like this is something that is important and good to have by default in the default parameters. I'll leave the final decision regarding this to you.

Collaborator

Can we revert this? If people set eval_strategy=steps and forget to set eval_steps, this will evaluate on every step. The other option is to set eval_steps: 1.0.

@@ -21,4 +21,4 @@ args:
eval_on_start: True
eval_strategy: epoch
num_train_epochs: 10
seed: 0
seed: 0
1 change: 1 addition & 0 deletions docs/evaluation.md
Collaborator

seems to me this should be a metric. @Dornavineeth thoughts?

@@ -270,3 +270,4 @@ simple_evaluate_args:
system_instruction: null
apply_chat_template: false
```

2 changes: 2 additions & 0 deletions docs/links.md
@@ -37,6 +37,8 @@ Links to research papers and resources corresponding to implemented features in
| SatImp | Paper[📄](https://arxiv.org/pdf/2505.11953), Code [🐙](https://github.com/Puning97/SatImp-for-LLM-Unlearning) |
| WGA (G-effect) | Paper[📄](https://arxiv.org/pdf/2502.19301), Code [🐙](https://github.com/tmlr-group/G-effect) |
| CE-U (Cross-Entropy unlearning) | Paper[📄](https://arxiv.org/pdf/2503.01224) |
| PDU | Paper [📄](https://arxiv.org/abs/2506.05314) |


---

8 changes: 4 additions & 4 deletions docs/repro.md
Collaborator

depends on resolution of other comments

@@ -19,10 +19,10 @@ bash scripts/muse_unlearn.sh

For all the experiments below, we used the following setup

| **Category** | **Details** |
|-------------------------|------------|
| **Hardware** | 2 × L40s GPUs (48GB each) |
| **Distributed Computing** | [DeepSpeed ZeRO Stage 3 (Accelerate)](https://huggingface.co/docs/accelerate/en/usage_guides/deepspeed) |
| **Category** | **Details** |
|-------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Hardware** | 2 × L40s GPUs (48GB each) |
| **Distributed Computing** | [DeepSpeed ZeRO Stage 3 (Accelerate)](https://huggingface.co/docs/accelerate/en/usage_guides/deepspeed) |
| **Hyperparameters** | Learning Rate (lr) = 1e-5 <br> α = 1, γ = 1, β = 0.1 (where applicable) <br> Batch size 32 effectively: 8 per device, 4 grad accum steps <br> Number of Epochs = 10 <br> Optimizer: [paged_adamw_32bit](https://huggingface.co/docs/bitsandbytes/main/en/reference/optim/adamw#bitsandbytes.optim.PagedAdamW) |


2 changes: 2 additions & 0 deletions src/trainer/__init__.py
@@ -14,6 +14,7 @@
from trainer.unlearn.ceu import CEU
from trainer.unlearn.satimp import SatImp
from trainer.unlearn.wga import WGA
from trainer.unlearn.pdu import PDU


import logging
@@ -97,3 +98,4 @@ def load_trainer(
_register_trainer(CEU)
_register_trainer(SatImp)
_register_trainer(WGA)
_register_trainer(PDU)