
Conversation

@tahaEntesari
Contributor

What does this PR do?

Addresses #118
This PR adds several new features:

  • We add a new multi_loss base trainer class that all multi-loss trainers extend. To simplify the linear scalarization of multiple losses, we aggregate the per-loss weights into a single input, trainer.method_args.preferences, which takes a list, instead of having separate gamma and alpha inputs (see the sketch after this list).
  • We add our new PDU trainer and provide models unlearned with this method at huggingface.co/tamarsonha. For details of the unlearning method, see the paper.
  • We add a new evaluation script to use LLMs as a judge for evaluation.
  • We add a new Llama 2 13B model and provide pretrained checkpoints for the MUSE dataset on huggingface.co/tamarsonha. The models are pretrained for 10 epochs.
  • We add new fine-tuning configurations for the MUSE dataset. To streamline fine-tuning, we provide the merged dataset needed for MUSE fine-tuning at huggingface.co/tamarsonha.
  • To account for the changes introduced by the multi_loss base trainer, we searched the repository for the gamma and alpha keywords and updated the documentation accordingly.
  • We add documentation on PDU in the community folder.
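
As a rough illustration of the preferences interface described in the first bullet above (a minimal sketch with hypothetical function and variable names, not the repository's actual API), linear scalarization with a single weight list might look like this:

```python
import torch

# Hypothetical sketch: combine any number of loss terms with one list of weights,
# instead of hard-coding separate gamma/alpha arguments per loss.
def scalarize(losses: list[torch.Tensor], preferences: list[float]) -> torch.Tensor:
    assert len(losses) == len(preferences), "one preference weight per loss term"
    return sum(w * l for w, l in zip(preferences, losses))

# e.g. a forget loss and a retain loss weighted by preferences = [1.0, 0.5]
forget_loss, retain_loss = torch.tensor(2.0), torch.tensor(1.0)
total_loss = scalarize([forget_loss, retain_loss], [1.0, 0.5])  # tensor(2.5)
```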

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Have you gone through the contributions guide?
  • Are your changes documented? Read documentation guidelines here.

tahaEntesari and others added 2 commits May 30, 2025 16:35
Added llm_judge and all the files and configs required for it
Added multi_loss base trainer and made corresponding changes in other trainers.
Changed linear scalarization parameter input format to a list, allowing more losses to be added in a simple fashion. Made appropriate changes to all trainers and also documentation to reflect this.
Added finetuning for MUSE with the corresponding dataset
Added Llama 2 13B model. We provide checkpoints for this model on huggingface.co/tamarsonha
@Dornavineeth requested review from Dornavineeth and molereddy and removed request for molereddy June 9, 2025 05:10
@armanhtm

Hi dear Dorna and OU team,

I hope you're doing well. Thank you so much for your excellent repository; it has been incredibly helpful for us (the PDU team) in implementing and evaluating our method. I truly appreciate your time and effort.

I wanted to kindly ask if you could review our code and pull request, provide any feedback, and possibly merge it into your repository. We believe that PDU offers a new and interesting perspective on this problem, and we're excited to see more people engage with and use our method to make a meaningful impact.

Thank you once again for your amazing repository!

@molereddy
Collaborator

Hi, we've seen the PR, but didn't get a chance to read it carefully. Both of us maintainers just graduated, so it's been busy for us. Things might be slower as we find and hand over to newer maintainers, but we will try to get to reviewing PDU and get it in. If you have any cleanup or code structure improvements you can make, please do those in the meantime, as that would make it easier for us. Thank you very much for your contribution!

@molereddy
Collaborator

There are 40 changed files, and in particular I see changes to many files unrelated to your PR, as you seem to have done some standardization. Have you checked all the changed points and tested enough of them to know this won't break things? We don't have unit tests in the code and can't run things ourselves right now, so we are relying on contributors to verify all their changes.

Collaborator

(Unless I'm missing something) this is a metric and is implemented in evals, which must only contain evaluators. Look at this example from the same folder: configs/eval/muse.yaml shows what an evaluator is -- it's an evaluation suite containing several individual metrics. The YAML config for an LLM judge handler must be in the metrics folder of the eval suite it belongs to.

Contributor Author

The LLM judge is, in principle, an evaluator, not just a metric. The LLM judge performs its evaluation based on some metrics like accuracy, forgetting success, etc. As such, I do not think it should be a metric.
In the current implementation, the metrics are fixed. However, in a future release, it could be made more modular and contain some other metrics. I unfortunately don't have the time right now to make changes regarding this.

Collaborator
@molereddy Jun 21, 2025

depends on resolution of other comments

Collaborator
@molereddy Jun 21, 2025

It seems you are contributing multiloss as a new base trainer for most unlearning methods. That's not ideal, because there are other open PRs and other users relying on the old structure.

  • The multiloss class mentions primal-dual too many times to serve as a base trainer for other methods, especially ones as simple as GradDiff.

If you make this structural change, it would be better if the implementation were much cleaner: a general base trainer plus a more specific primal-dual class that modifies it further, while also ensuring consistency with other PRs.

Or simply leave the other methods as they are and add PDU separately.
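
A minimal sketch of the structure being suggested (class and method names here are hypothetical, not the PR's actual code): a generic multi-loss base that only scalarizes loss terms, and a PDU subclass that layers the primal-dual specifics on top.

```python
class MultiLossTrainer:
    """Generic base: subclasses only declare their loss terms; the base scalarizes them."""

    def __init__(self, preferences):
        self.preferences = preferences  # one weight per loss term

    def compute_loss_terms(self, model, inputs):
        raise NotImplementedError  # e.g. a GradDiff-style method returns [forget_loss, retain_loss]

    def compute_loss(self, model, inputs):
        terms = self.compute_loss_terms(model, inputs)
        return sum(w * t for w, t in zip(self.preferences, terms))


class PDUTrainer(MultiLossTrainer):
    """Primal-dual specifics (e.g. dual-variable updates) live only in this subclass."""

    def compute_loss(self, model, inputs):
        loss = super().compute_loss(model, inputs)
        # primal-dual bookkeeping would go here, keeping it out of simpler methods
        return loss
```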

Collaborator

I don't think we need to add new arguments to the default training args. You can add new arguments using +; the attribute doesn't need to already exist in the config. https://github.com/locuslab/open-unlearning/blob/main/docs/hydra.md has this documented.
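
For example (the command and argument name below are illustrative, not taken from the docs), an override like +trainer.args.some_new_arg=0.5 appended to the training command adds that attribute on the fly, without it having to exist in the default config.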

Contributor Author

I am aware of the + syntax for new arguments. I just felt this is something important and good to have in the default parameters. I'll leave the final decision regarding this to you.

Collaborator

Can we revert this? If people set eval_strategy=steps and forget to set eval_steps, this will evaluate on every step. The other option is to set eval_steps: 1.0.

Collaborator

depends on resolution of other comments

Collaborator

seems to me this should be a metric. @Dornavineeth thoughts?

Collaborator

depends on resolution of other comments

Collaborator

If this is a default prompt_generator file with a generic create_prompt, it shouldn't be this specific. Also, prompts might be better given through the config files than written here.

Collaborator

Better that we not give the prompts through configs, because a few characters, say """, will be tough to parse in Hydra.

Collaborator

@tahaEntesari It would be better to rename create_prompt to a more specific name.

Collaborator

Refer to the prior comment on the multiloss YAML.

@molereddy
Collaborator

You may want to link your paper here: https://github.com/locuslab/open-unlearning/blob/main/docs/links.md

@molereddy
Collaborator

Thank you for citing our work in your paper. We've just updated the README with a modified citation to our paper on OpenUnlearning. It would be great if you could change the citation to point to that instead!

@tahaEntesari
Contributor Author

I will revert my changes regarding the multi_loss file.
I had been very busy and hadn't been able to check this thread. I'll try to make the required changes soon.

# Conflicts:
#	community/methods/PDU/README.md
#	community/methods/PDU/run.sh
#	configs/trainer/PDU.yaml
#	src/trainer/unlearn/pdu.py
@molereddy
Collaborator

We can decide how to go about things on the basis of @Dornavineeth's review.

@armanhtm

Hi dear Dorna and Anmol,

I hope you are doing well. It looks like Taha has fixed the issues raised in the previous review of the pull request. Is there anything else to do? Please let us know if something is missing so we can address it quickly. Thank you so much for your consideration.

@Dornavineeth
Collaborator

Sorry. This has been delayed from my end. Will soon go over this PR.

@armanhtm

No problem at all. Thank you so much for your attention!

@Dornavineeth
Collaborator

@tahaEntesari Thank you for the considerable work you've put into this. A few thoughts after reviewing:

Trainer/Method

Components look excellent, really well done.

Eval

Regarding the llm_judge, I wonder if it might be cleaner to structure it as a metric instead. I understand your reasoning - framing it as a "judge" allows for multiple metrics under one umbrella, similar to how changing prompts leads to different evaluations. That said, this is also true for metrics like forget_QA_ROUGE and retain_QA_ROUGE, which share an implementation but are defined distinctly in the config.

When implementing MIA, we faced a similar design choice and leaned toward organizing metrics this way. You could implement llm_judge in a similar way. More specifically (see the sketch after this list):

a) Move this into src/evals/metrics/llm_judge/
b) Create a prompts/ subfolder, so it's easy for others to contribute and extend
c) Register prompt variants explicitly in src/evals/metrics/llm_judge/__init__.py
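
A rough sketch of what (c) could look like, assuming a simple registry of prompt variants (the file layout, class, and prompt names here are hypothetical, not the repository's actual API):

```python
# Hypothetical contents of src/evals/metrics/llm_judge/__init__.py

FORGET_PROMPT = "Judge whether the answer reveals the forgotten information:\n{response}"
UTILITY_PROMPT = "Judge the overall quality and correctness of the answer:\n{response}"

class LLMJudge:
    """Placeholder judge handler: each registered variant fixes a different prompt."""

    def __init__(self, prompt: str):
        self.prompt = prompt

    def build_prompt(self, response: str) -> str:
        # a real handler would send this prompt to an LLM and parse the verdict
        return self.prompt.format(response=response)

# explicit registry: adding a new prompt variant is a one-line change
LLM_JUDGE_METRICS = {
    "llm_judge_forget": LLMJudge(FORGET_PROMPT),
    "llm_judge_utility": LLMJudge(UTILITY_PROMPT),
}
```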

One other note: I noticed the current evaluator takes result files (like TOFU_EVAL.json) as arguments. This might restrict flexibility for users who work with different datasets, since we can't expect their results to be in TOFU_EVAL.json files. Ideally, the evaluation interface should accept inputs directly rather than rely on file outputs.

Given the scope of the changes around llm_judge and your availability, it makes sense to land the training parts first and revisit the evals piece in a separate PR. Can you remove the llm_judge from this PR and open a new PR containing the evals? I know it's tedious, and I'm sorry to ask, but this would really help keep the system modular and make things easier for other people.

PRO TIP: It's always better to have a number of smaller PRs, as they are easier to iterate on and review.

Thanks again for all your effort on this!

@tahaEntesari
Contributor Author

@Dornavineeth Thank you for your review and comments.
Ok. Let me remove the LLM judge components and let you know.

@tahaEntesari
Contributor Author

All LLM judge files have been removed from this PR. I also reverted the fine-tune config to its original format.

@Dornavineeth
Collaborator

Wow, this is super quick! Can you please resolve the conflicts (these should be minor)?
We can merge it into main after this. Also, can you open a new PR with the llm_judge so we can iterate on it?

@tahaEntesari
Contributor Author

Fixed the conflicts.
I will need to work on the llm judge, but unfortunately I don't have the time right now. Once things are a bit more settled, I can start looking into it and will take your suggestion about implementing it as a metric into consideration.

Collaborator
@Dornavineeth left a comment

Great Work!

@Dornavineeth changed the title from "Multi loss base, PDU, LLM as a Judge, Llama 2 13B, MUSE finetuning" to "PDU, Llama 2 13B, MUSE finetuning" Jul 20, 2025
@Dornavineeth merged commit fd825ea into locuslab:main Jul 20, 2025
1 check passed
@Dornavineeth
Collaborator

Dornavineeth commented Jul 20, 2025

@tahaEntesari Sounds good. You can open a [WIP] PR for llm_judge so the community can pick it up if interested. I’ve seen folks interested in implementing metrics via prompting LLMs using APIs; this might be a useful starting point for them.
