This repository contains the implementation of our experiments on parameter-efficient fine-tuning (PEFT) methods for adapting BERT to downstream tasks. Read the paper for details.
In this project, we investigated:
- Which PEFT method performs best (LoRA, Houlsby adapters, or Adapter+) in terms of accuracy and parameter efficiency on a binary classification task (CoLA dataset)?
- Do all transformer layers need adapters? Specifically, what happens if we remove adapters from lower layers?
Our main findings:
- Adapter+ is the most effective PEFT method on this binary classification task.
- LoRA is stable but underperforms compared to adapters, even with similar parameter counts.
- Layer ablation suggests that most gains come from the higher transformer layers (see the sketch after this list for the layer-restriction idea).
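For intuition only, here is a minimal PyTorch sketch of the two building blocks compared above and of the layer-restriction idea behind the ablation. The class names, dimensions, rank, and layer indices are assumptions for illustration, not code taken from this repository.

```python
# Illustrative sketch only: class names, dimensions, and the layer-selection
# logic below are assumptions, not this repository's implementation.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Houlsby-style bottleneck adapter: down-project, nonlinearity, up-project, residual."""

    def __init__(self, hidden_dim: int = 768, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the frozen pretrained representation intact.
        return x + self.up(self.act(self.down(x)))


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha / r) * B A x."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)


# Layer-ablation idea: attach adapters only to the upper half of BERT-base's
# 12 layers (indices 6-11) and leave the lower layers untouched.
upper_layer_adapters = {layer: BottleneckAdapter() for layer in range(6, 12)}
```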
To reproduce the experiments, first set up the environment:

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Hugging Face Authentication:
  - Get your Hugging Face token from: https://huggingface.co/settings/tokens
  - In `src/train.py`, replace `login()` with `login("your_token_here")` (see the snippet below)
  - Or use my token (contact me for the token)
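For reference, the authentication call from `huggingface_hub` typically looks like the snippet below; the exact placement inside `src/train.py` may differ.

```python
from huggingface_hub import login

# Paste your personal access token from https://huggingface.co/settings/tokens.
login(token="your_token_here")
```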
Then run training for either experiment:

```bash
python train.py --figure figure4
python train.py --figure figure6
```

Output: results are saved to the `results_figure4/` or `results_figure6/` directory as JSON files containing validation accuracy, parameter counts, and configuration details.
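The exact JSON schema is not documented here; assuming hypothetical keys such as `val_accuracy` and `trainable_params`, the saved results could be inspected with a short script like this:

```python
import json
from pathlib import Path

# "val_accuracy" and "trainable_params" are assumed key names; adjust them to
# whatever the JSON files in results_figure4/ or results_figure6/ actually contain.
for path in sorted(Path("results_figure4").glob("*.json")):
    with open(path) as f:
        result = json.load(f)
    print(path.name, result.get("val_accuracy"), result.get("trainable_params"))
```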
Generate the plots:

```bash
python figure4.py
python figure6.py
```

Output: displays plots showing the parameter-efficiency comparison and the layer-ablation study results.
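If you prefer a custom plot over the provided scripts, a minimal matplotlib sketch (reusing the same assumed key names as above) might look like:

```python
import json
from pathlib import Path

import matplotlib.pyplot as plt

# Collect (trainable parameter count, validation accuracy) pairs from the result files.
points = []
for path in Path("results_figure4").glob("*.json"):
    with open(path) as f:
        result = json.load(f)
    # Key names are assumptions, mirroring the inspection snippet above.
    points.append((result["trainable_params"], result["val_accuracy"]))

points.sort()
xs, ys = zip(*points)

plt.plot(xs, ys, marker="o")
plt.xscale("log")
plt.xlabel("Trainable parameters")
plt.ylabel("Validation accuracy (CoLA)")
plt.title("Parameter efficiency")
plt.show()
```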