locuslab · Dornavineeth · Jul 20, 2025 · May 30, 2025 · Jun 6, 2025 · May 30, 2025
diff --git a/README.md b/README.md
@@ -19,7 +19,8 @@
 
 ## 📖 Overview
 
-We provide efficient and streamlined implementations of the TOFU, MUSE and WMDP unlearning benchmarks while supporting 11+ unlearning methods, 5+ datasets, 10+ evaluation metrics, and 7+ LLM architectures. Each of these can be easily extended to incorporate more variants.
+We provide efficient and streamlined implementations of the TOFU, MUSE and WMDP unlearning benchmarks while supporting 12+ unlearning methods, 5+ datasets, 10+ evaluation metrics, and 7+ LLM architectures. Each of these can be easily extended to incorporate more variants.
+
 
 We invite the LLM unlearning community to collaborate by adding new benchmarks, unlearning methods, datasets and evaluation metrics here to expand OpenUnlearning's features, gain feedback from wider usage and drive progress in the field.
 
@@ -77,7 +78,7 @@ We provide several variants for each of the components in the unlearning pipelin
 | **Component**          | **Available Options** |
 |------------------------|----------------------|
 | **Benchmarks**        | [TOFU](https://arxiv.org/abs/2401.06121), [MUSE](https://muse-bench.github.io/), [WMDP](https://www.wmdp.ai/) |
-| **Unlearning Methods** | GradAscent, GradDiff, NPO, SimNPO, DPO, RMU, UNDIAL, AltPO, SatImp, WGA, CE-U |
+| **Unlearning Methods** | GradAscent, GradDiff, NPO, SimNPO, DPO, RMU, UNDIAL, AltPO, SatImp, WGA, CE-U, PDU |
 | **Evaluation Metrics** | Verbatim Probability, Verbatim ROUGE, Knowledge QA-ROUGE, Model Utility, Forget Quality, TruthRatio, Extraction Strength, Exact Memorization, 6 MIA attacks, [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) |
 | **Datasets**          | MUSE-News (BBC), MUSE-Books (Harry Potter), TOFU (different splits), WMDP-Bio, WMDP-Cyber |
 | **Model Families**    | TOFU: Llama-3.2, Llama-3.1, Llama-2; MUSE: Llama-2; Additional: Phi-3.5, Phi-1.5, Gemma, Zephyr |

diff --git a/community/methods/PDU/README.md b/community/methods/PDU/README.md
@@ -0,0 +1,47 @@
+# Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models
+
+
+We propose a new formulation of LLM unlearning as a constrained optimization problem:
+forgetting is enforced via a novel logit-margin flattening loss
+that explicitly drives the output distribution toward uniformity on a designated forget set,
+while retention is preserved through a hard constraint on a separate retain set.
+We solve the constrained problem using a scalable primal-dual algorithm that exposes the
+trade-off between forgetting and retention through the dynamics of the dual variable.
+
+# Setup
+
+Experimental setup
+
+-  **Hyperparameters & Search Space:** Please see the corresponding [paper](https://arxiv.org/abs/2506.05314) for details of the hyperparameters. Importantly,
+    to obtain good results with our method, the hyperparameter `retain_loss_eps` must be set to an appropriate value.
+    To choose it, measure the retain loss of the pretrained model and set `retain_loss_eps` to a value
+    somewhat larger than this starting value.
+
+    Note that our method's loss is a quadratic function of a difference in logit space. Consequently,
+    its value can be large, so it is natural to set the initial retain-loss preference
+    (`pref` in `run.sh`) to 50 or 100.
+-  **Computational Setup:** Please see the Supplementary Material in the paper.
+
+# Results
+
+Please see the `run.sh` script that contains all necessary commands to reproduce the final results.
+
+All unlearned models are available under https://huggingface.co/tamarsonha. 
+
+# Citation
+
+
+If you use this work, please cite:
+
+
+```bibtex
+
+
+@article{entesari2025constrained,
+  title={Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models},
+  author={Entesari, Taha and Hatami, Arman and Khaziev, Rinat and Ramakrishna, Anil and Fazlyab, Mahyar},
+  journal={arXiv preprint arXiv:2506.05314},
+  year={2025}
+}
+
+```
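The primal-dual scheme described in the README above can be illustrated on a toy scalar problem. The sketch below is not the code in `src/trainer/unlearn/pdu.py`: the quadratic `forget_loss` and `retain_loss`, the function `primal_dual`, and its step sizes are assumptions chosen for illustration, and the dual warmup controlled by `dual_warmup_epochs` is not modeled. It shows only the general pattern: a gradient step on the Lagrangian for the parameters, followed by a projected ascent step on the dual variable whenever the retain loss exceeds `retain_loss_eps`.

```python
# Toy primal-dual sketch (assumed losses, not the repository's trainer):
#   minimize forget_loss(theta)  subject to  retain_loss(theta) <= retain_loss_eps


def forget_loss(theta: float) -> float:
    # Hypothetical forget objective, minimized at theta = 0.
    return theta ** 2


def retain_loss(theta: float) -> float:
    # Hypothetical retain objective, minimized at theta = 1 with value 0.1.
    return (theta - 1.0) ** 2 + 0.1


def primal_dual(theta, retain_loss_eps, lr=0.05, dual_step_size=0.05, steps=3000):
    lam = 0.0  # dual variable (Lagrange multiplier), kept non-negative
    for _ in range(steps):
        # Primal step: gradient of forget_loss + lam * (retain_loss - eps).
        grad = 2.0 * theta + lam * 2.0 * (theta - 1.0)
        theta -= lr * grad
        # Dual step: grow lam while the constraint is violated, shrink otherwise.
        violation = retain_loss(theta) - retain_loss_eps
        lam = max(0.0, lam + dual_step_size * violation)
    return theta, lam


theta0 = 1.0                      # stand-in for the model before unlearning
eps = retain_loss(theta0) + 0.25  # threshold set above the starting retain loss
theta, lam = primal_dual(theta0, eps)
print(f"theta={theta:.3f} lam={lam:.3f} retain_loss={retain_loss(theta):.3f}")
```

In this toy setting the iterates settle where the retain constraint is active, and the final value of the dual variable indicates how strongly retention is pushing back against forgetting. Setting `eps` above the starting retain loss mirrors the README's advice for choosing `retain_loss_eps`.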
diff --git a/community/methods/PDU/run.sh b/community/methods/PDU/run.sh
@@ -0,0 +1,130 @@
+#!/bin/bash
+
+
+
+########################################################################################################################
+########################################### Final best parameters #####################################################
+########################################################################################################################
+# for an 8 GPU system:
+num_processes=8
+
+
+##############################################  TOFU #####################################################
+per_device_train_batch_size=4
+learning_rate=0.00001
+dual_warmup_epochs=5
+
+pref=100
+dual_step_size=5
+retain_loss_eps=0.3
+
+retain_percentages=(90 95 99)
+models=(Llama-3.2-1B-Instruct Llama-3.2-3B-Instruct Llama-3.1-8B-Instruct gemma-7b-it)
+
+for model in "${models[@]}"; do
+  for retain_percentage in "${retain_percentages[@]}"; do
+
+    if [ "$retain_percentage" = "90" ]; then
+      forget_split=forget10
+      retain_split=retain90
+    elif [ "$retain_percentage" = "95" ]; then
+      forget_split=forget05
+      retain_split=retain95
+    elif [ "$retain_percentage" = "99" ]; then
+      forget_split=forget01
+      retain_split=retain99
+    else
+      # Guard against unsupported values being added to the list above.
+      echo "Invalid retain percentage. Please set it to 90, 95, or 99."
+      exit 1
+    fi
+
+
+    if [ "$model" = "Llama-3.2-1B-Instruct" ]; then
+      pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-1B-Instruct_full
+      num_train_epochs=10
+    elif [ "$model" = "Llama-3.2-3B-Instruct" ]; then
+      pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-3B-Instruct_full
+      num_train_epochs=10
+    elif [ "$model" = "Llama-3.1-8B-Instruct" ]; then
+      pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.1-8B-Instruct_full
+      num_train_epochs=30
+    elif [ "$model" = "gemma-7b-it" ]; then
+      pretrained_model_name_or_path=tamarsonha/TOFU-target-gemma-7b-it
+      num_train_epochs=20
+    else
+      echo "Invalid model name. Please set it to Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct, Llama-3.1-8B-Instruct, or gemma-7b-it."
+      exit 1
+    fi
+
+    task_name=PDU-TOFU$retain_split-E$num_train_epochs-lr$learning_rate-P1-$pref-Primal$retain_loss_eps-Step$dual_step_size-Warmup$dual_warmup_epochs-model_$model
+    accelerate launch --config_file configs/accelerate/default_config.yaml --num_processes=$num_processes \
+        src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
+        forget_split=$forget_split retain_split=$retain_split\
+        trainer=PDU\
+        trainer.args.num_train_epochs=$num_train_epochs\
+        trainer.args.eval_on_start=false trainer.args.do_eval=false\
+        trainer.args.per_device_train_batch_size=$per_device_train_batch_size\
+        trainer.args.learning_rate=$learning_rate\
+        trainer.method_args.gamma=1. trainer.method_args.alpha=$pref\
+        trainer.method_args.primal_dual=true trainer.method_args.retain_loss_eps=$retain_loss_eps\
+        trainer.method_args.dual_step_size=$dual_step_size\
+        trainer.method_args.dual_update_upon="step" trainer.method_args.dual_warmup_epochs=$dual_warmup_epochs\
+        task_name=$task_name\
+        model=$model model.model_args.pretrained_model_name_or_path=$pretrained_model_name_or_path
+  done
+done
+
+
+######################################################## MUSE #########################################################
+dual_step_size=1
+num_train_epochs=10
+dual_warmup_epochs=3
+data_splits=("News" "Books")
+learning_rate=0.00001
+dual_update_upon="step"
+
+models=(Llama-2-7b-hf Llama-2-13b-hf)
+pref=50
+
+for model in "${models[@]}"; do
+  for data_split in "${data_splits[@]}"; do
+
+    if [ "$model" = "Llama-2-7b-hf" ]; then
+      pretrained_model_name_or_path=muse-bench/MUSE-${data_split}_target
+      epsNews=(1.5)
+      epsBooks=(0.1)
+    elif [ "$model" = "Llama-2-13b-hf" ]; then
+      pretrained_model_name_or_path=tamarsonha/MUSE-${data_split}-target-Llama-2-13b-hf
+      epsNews=(0.8)
+      epsBooks=(0.6)
+    else
+      echo "Invalid model name. Please set it to Llama-2-7b-hf or Llama-2-13b-hf."
+      exit 1
+    fi
+
+    if [ "$data_split" == "News" ]; then
+      eps_array=("${epsNews[@]}")
+    else
+      eps_array=("${epsBooks[@]}")
+    fi
+
+    for retain_loss_eps in "${eps_array[@]}"; do
+      task_name=PDU-Muse$data_split-E$num_train_epochs-lr$learning_rate-P1-$pref-Primal$retain_loss_eps-Step$dual_step_size-Warmup$dual_warmup_epochs-model$model
+      accelerate launch --config_file configs/accelerate/default_config.yaml --num_processes=$num_processes \
+          src/train.py --config-name=unlearn.yaml experiment=unlearn/muse/default \
+          data_split=$data_split\
+          trainer=PDU\
+          trainer.args.num_train_epochs=$num_train_epochs\
+          trainer.args.eval_on_start=false trainer.args.do_eval=false\
+          trainer.args.per_device_train_batch_size=$per_device_train_batch_size\
+          trainer.args.learning_rate=$learning_rate\
+          trainer.method_args.gamma=1. trainer.method_args.alpha=$pref\
+          trainer.method_args.primal_dual=true trainer.method_args.retain_loss_eps=$retain_loss_eps\
+          trainer.method_args.dual_step_size=$dual_step_size\
+          trainer.method_args.dual_update_upon="step" trainer.method_args.dual_warmup_epochs=$dual_warmup_epochs\
+          task_name=$task_name\
+          model=$model model.model_args.pretrained_model_name_or_path=$pretrained_model_name_or_path
+    done
+  done
+done
diff --git a/community/methods/template/run.sh b/community/methods/template/run.sh
@@ -10,4 +10,4 @@
 ########################################### Final best parameters #####################################################
 ########################################################################################################################
 
-# Required to replicate your results
+# Required to replicate your results
diff --git a/configs/data/datasets/MUSE_train.yaml b/configs/data/datasets/MUSE_train.yaml
@@ -0,0 +1,8 @@
+MUSE_train:
+  handler: PretrainingDataset
+  args:
+    hf_args:
+      path: "tamarsonha/MUSE-News-Train"
+      split: "full"
+    text_key: "text"
+    max_length: 2048
diff --git a/configs/experiment/finetune/muse/default.yaml b/configs/experiment/finetune/muse/default.yaml
@@ -0,0 +1,30 @@
+# @package _global_
+
+defaults:
+  - override /model: Llama-2-13b-hf
+  - override /trainer: finetune
+  - override /data/datasets@data.train: MUSE_train
+  - override /eval: muse
+  - override /data: finetune
+
+mode: finetune
+data_split: News
+data_sub_set: full   # full or retain
+
+data:
+  train:
+    MUSE_train:
+      args:
+        hf_args:
+          path: tamarsonha/MUSE-${data_split}-Train
+          split: ${data_sub_set}
+# you can find fine-tuned models on https://huggingface.co/tamarsonha
+
+trainer:
+  args:
+    learning_rate: 1e-5
+    weight_decay: 0.01
+    warmup_epochs: 1.0 # custom parameter
+    num_train_epochs: 10
+
+task_name: muse_news_full
diff --git a/configs/experiment/unlearn/wmdp/default.yaml b/configs/experiment/unlearn/wmdp/default.yaml
@@ -50,7 +50,7 @@ trainer:
     gamma: 1.0
     steering_coeff: 2
     retain_loss_type: EMBED_DIFF
-    alpha: 1 
+    alpha: 1
     module_regex: model\.layers\.7
     trainable_params_regex: 
       - model\.layers\.(5|6|7)\.mlp\.down_proj\.weight # If you want to update only these weights (as done in https://github.com/centerforaisafety/wmdp/blob/bc5e1ba0367ea826caeeeaa50656336a1e87acfb/rmu/unlearn.py#L26)

diff --git a/configs/model/Llama-2-13b-hf.yaml b/configs/model/Llama-2-13b-hf.yaml
@@ -0,0 +1,12 @@
+model_args:
+  pretrained_model_name_or_path: "meta-llama/Llama-2-13b-hf"
+  attn_implementation: 'flash_attention_2'
+  torch_dtype: bfloat16
+tokenizer_args:
+  pretrained_model_name_or_path: "meta-llama/Llama-2-13b-hf"
+template_args:  # Used in creating prompts for the dataset. See src/data/utils.py#preprocess_chat_instance.
+  apply_chat_template: False
+  user_start_tag: "Question: "
+  user_end_tag: "\n"
+  asst_start_tag: "Answer: "
+  asst_end_tag: "\n\n"
diff --git a/configs/trainer/PDU.yaml b/configs/trainer/PDU.yaml
@@ -0,0 +1,14 @@
+defaults:
+  - GradDiff
+
+handler: PDU
+method_args:
+  retain_loss_eps: ??? # mandatory: no default, must be set on the command line (see run.sh)
+  primal_dual: True
+  dual_step_size: 1.0
+  dual_update_upon: "step" # "step" or "epoch"
+  dual_warmup_epochs: 5
+  gamma: 1.0
+  alpha: 1.0
+  loss_names: ["forget_loss", "retain_loss"]
+
diff --git a/configs/trainer/RMU.yaml b/configs/trainer/RMU.yaml
@@ -7,7 +7,7 @@ method_args:
   gamma: 1.0
   steering_coeff: 2
   retain_loss_type: EMBED_DIFF
-  alpha: 1 
+  alpha: 1
   module_regex: model\.layers\.7
   trainable_params_regex: 
     - .* # update all parameters (as done in https://github.com/tmlr-group/G-effect/blob/ef368eea3b2c6dba1e090b9ebb021ac9f047e0ae/dataloader.py#L271)

diff --git a/configs/trainer/finetune.yaml b/configs/trainer/finetune.yaml
@@ -21,4 +21,4 @@ args:
   eval_on_start: True
   eval_strategy: epoch
   num_train_epochs: 10
-  seed: 0
+  seed: 0
diff --git a/docs/evaluation.md b/docs/evaluation.md
@@ -270,3 +270,4 @@ simple_evaluate_args:
   system_instruction: null
   apply_chat_template: false
 ```
+
diff --git a/docs/links.md b/docs/links.md
@@ -37,6 +37,8 @@ Links to research papers and resources corresponding to implemented features in
 | SatImp               | Paper[📄](https://arxiv.org/pdf/2505.11953), Code [🐙](https://github.com/Puning97/SatImp-for-LLM-Unlearning)                                                                                      |
 | WGA (G-effect)       | Paper[📄](https://arxiv.org/pdf/2502.19301), Code [🐙](https://github.com/tmlr-group/G-effect)                                                                                                     |
 | CE-U (Cross-Entropy unlearning)       | Paper[📄](https://arxiv.org/pdf/2503.01224)                                                                                                     |
+| PDU                  | Paper [📄](https://arxiv.org/abs/2506.05314) |
+
 
 ---
 

diff --git a/docs/repro.md b/docs/repro.md
@@ -19,10 +19,10 @@ bash scripts/muse_unlearn.sh
 
 For all the experiments below, we used the following setup
 
-| **Category**            | **Details** |
-|-------------------------|------------|
-| **Hardware**           | 2 × L40s GPUs (48GB each) |
-| **Distributed Computing** | [DeepSpeed ZeRO Stage 3 (Accelerate)](https://huggingface.co/docs/accelerate/en/usage_guides/deepspeed) |
+| **Category**            | **Details**                                                                                                                                                                                                                                                                                                             |
+|-------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| **Hardware**           | 2 × L40s GPUs (48GB each)                                                                                                                                                                                                                                                                                               |
+| **Distributed Computing** | [DeepSpeed ZeRO Stage 3 (Accelerate)](https://huggingface.co/docs/accelerate/en/usage_guides/deepspeed)                                                                                                                                                                                                                 |
 | **Hyperparameters**    | Learning Rate (lr) = 1e-5 <br> α = 1, γ = 1, β = 0.1 (where applicable) <br> Batch size 32 effectively: 8 per device, 4 grad accum steps <br> Number of Epochs = 10 <br> Optimizer: [paged_adamw_32bit](https://huggingface.co/docs/bitsandbytes/main/en/reference/optim/adamw#bitsandbytes.optim.PagedAdamW) |
 
 

diff --git a/src/trainer/__init__.py b/src/trainer/__init__.py
@@ -14,6 +14,7 @@
 from trainer.unlearn.ceu import CEU
 from trainer.unlearn.satimp import SatImp
 from trainer.unlearn.wga import WGA
+from trainer.unlearn.pdu import PDU
 
 
 import logging
@@ -97,3 +98,4 @@ def load_trainer(
 _register_trainer(CEU)
 _register_trainer(SatImp)
 _register_trainer(WGA)
+_register_trainer(PDU)
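The two added lines above follow a registry pattern: the class is imported, registered, and later looked up by the `handler: PDU` name set in `configs/trainer/PDU.yaml`. The sketch below shows one way such a registry could work. It is an assumption for illustration, not the contents of `src/trainer/__init__.py`: `TRAINER_REGISTRY`, `register_trainer`, `get_trainer_class`, and the empty `PDU` class are hypothetical stand-ins.

```python
from typing import Dict, Type

# Hypothetical registry mapping a handler name to a trainer class.
TRAINER_REGISTRY: Dict[str, Type] = {}


def register_trainer(trainer_class: Type) -> Type:
    # Register under the class name, so `handler: PDU` resolves to class PDU.
    TRAINER_REGISTRY[trainer_class.__name__] = trainer_class
    return trainer_class


def get_trainer_class(handler: str) -> Type:
    # Look up a registered trainer; fail clearly for unknown handlers.
    try:
        return TRAINER_REGISTRY[handler]
    except KeyError:
        raise NotImplementedError(
            f"{handler} is not implemented or not registered"
        ) from None


class PDU:
    """Placeholder standing in for the real trainer class."""


register_trainer(PDU)
print(get_trainer_class("PDU").__name__)
```

With this pattern, a new method needs only its class definition, one import, and one registration call, which matches the shape of the change to `src/trainer/__init__.py`.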