This project implements fine-tuning of the Microsoft Phi-2 model using QLoRA (Quantized Low-Rank Adaptation) on the OpenAssistant dataset.
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up your Hugging Face token:
  - Copy the `.env.template` file to `.env`
  - Replace `YOUR_TOKEN_HERE` with your actual Hugging Face token
  - Never commit your `.env` file to version control
 
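For illustration, loading a token from a `.env` file amounts to the following. This is a minimal stdlib sketch of the idea; the project itself likely uses a library such as python-dotenv, and the actual variable name in `.env.template` may differ.

```python
import os

def load_env(path=".env"):
    """Load simple KEY=VALUE pairs from a .env-style file into os.environ.

    Skips blank lines and comments; does not overwrite variables that are
    already set. Variable name and file layout here are assumptions.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())
```

Keeping the token in `.env` (and out of version control) means scripts like `train.py` and `upload_model.py` can read it from the environment rather than hard-coding it.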
To start the training process, simply run:
```bash
python train.py
```

The model is trained using LoRA (Low-Rank Adaptation), which means it produces adapter weights instead of a full model. To use the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(
    base_model,
    "jatingocodeo/phi2-finetuned-openassistant"
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("jatingocodeo/phi2-finetuned-openassistant")

# Generate text using the instruction/response prompt format
text = "### Instruction:\nWhat is machine learning?\n\n### Response:\n"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

To upload your fine-tuned model to the Hugging Face Hub:

```bash
python upload_model.py
```

Make sure you have set up your Hugging Face token in the `.env` file before uploading.
- Uses 4-bit quantization with QLoRA for memory-efficient fine-tuning
- Implements proper conversation formatting for the OpenAssistant dataset
- Uses a maximum sequence length of 2048 tokens
- Implements gradient accumulation and mixed precision training
- Uses the Paged Optimizer for memory efficiency
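The conversation formatting mentioned above can be sketched as follows. The template is inferred from the inference example in this README; the actual formatting in `train.py` may differ (for instance, in how multi-turn OpenAssistant threads are flattened).

```python
def format_example(instruction: str, response: str) -> str:
    """Format one prompt/answer pair into the instruction template
    used in this project's inference example. Illustrative sketch only."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

# Example: turning a dataset record into a training string
print(format_example("What is machine learning?", "Machine learning is..."))
```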
- Base model: microsoft/phi-2
- LoRA Configuration:
  - Rank (r): 16
  - Alpha: 32
  - Target modules: q_proj, k_proj, v_proj, dense
  - Dropout: 0.05
- Training parameters:
  - Epochs: 1
  - Learning rate: 2e-4
  - Batch size: 4 (with gradient accumulation steps of 4)
  - Weight decay: 0.001
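As a back-of-the-envelope check on what this configuration trains, the LoRA parameter count can be estimated from the rank and the projection shapes. This assumes the publicly documented microsoft/phi-2 shape (hidden size 2560, 32 layers, with q_proj, k_proj, v_proj, and dense each being 2560x2560 projections); the real count comes from `model.print_trainable_parameters()` in PEFT.

```python
# Assumed phi-2 dimensions (from the public model config)
hidden = 2560
layers = 32
r = 16
n_targets = 4  # q_proj, k_proj, v_proj, dense

# Each LoRA adapter adds two low-rank matrices: A (r x d_in) and B (d_out x r)
params_per_module = r * (hidden + hidden)
total_lora_params = params_per_module * n_targets * layers
print(f"Trainable LoRA parameters: {total_lora_params:,}")  # ~10.5M

# Effective batch size from the training parameters above
effective_batch = 4 * 4  # per-device batch size x gradient accumulation steps
print(f"Effective batch size: {effective_batch}")
```

Roughly 10.5M trainable parameters against a 2.7B-parameter base is what makes this fine-tune feasible on a single GPU alongside 4-bit quantization.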
The fine-tuned model will be saved in the `phi2-finetuned` directory with the following structure:

```
phi2-finetuned/
├── final_model/
│   ├── adapter_config.json     # LoRA configuration
│   ├── adapter_model.bin       # LoRA weights
│   └── tokenizer files         # Tokenizer configuration and files
└── training_log.md             # Training progress and metrics
```