This repository contains the code, models, and resources for the Language and Technology Practical project. The goal of this project is to explore and fine-tune large language models (LLMs) to create meaningful and emotionally resonant narratives from image descriptions.
Generating emotionally rich, abstract narratives from visual descriptions is a challenging task for LLMs. Existing models often struggle to simultaneously capture emotional depth, maintain contextual relevance, and generate diverse outputs. This project addresses these challenges by fine-tuning the Flan-T5 model on the ArtEmis dataset, which contains human-annotated emotional responses to visual art. We apply Low-Rank Adaptation (LoRA) fine-tuning and Chain-of-Thought (CoT) prompting to generate nuanced stories that balance creativity with coherence.
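As a rough illustration of how CoT prompting can be framed for this task, a prompt might ask the model to reason about the evoked emotion before writing the story. The template below is an assumption for illustration; the actual prompt used in this project lives in cot_wrapper.py and may differ.

```python
def build_cot_prompt(image_description: str) -> str:
    """Build a Chain-of-Thought style prompt (illustrative template,
    not necessarily the one used in cot_wrapper.py)."""
    return (
        "You are given a description of an artwork.\n"
        f"Description: {image_description}\n"
        "Step 1: Identify the emotion the scene most likely evokes.\n"
        "Step 2: Briefly explain why it evokes that emotion.\n"
        "Step 3: Write a short, emotionally resonant story grounded in "
        "that emotion."
    )

prompt = build_cot_prompt("A lone figure walks along a stormy shoreline.")
```

The intermediate "steps" encourage the model to commit to an emotion first, which tends to improve output diversity at some cost to semantic alignment, consistent with the results reported below.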
The primary objectives of this project are:
- Enhance Narrative Generation: Fine-tune LLMs to produce emotionally and contextually rich narratives.
- Optimize Model Performance: Explore and evaluate techniques like LoRA and CoT prompting to balance semantic alignment and diversity.
- Evaluate Outputs: Use both quantitative (BERTScore, dissimilarity) and qualitative (human evaluation) methods to assess model performance.
Key results:

- Baseline Model: Achieved a BERTScore of 0.90 and a dissimilarity score of 0.31, offering balanced performance in terms of coherence and diversity.
- LoRA Fine-Tuned Model: Improved semantic alignment with a BERTScore of 0.91 but showed reduced diversity (dissimilarity score: 0.23).
- CoT Fine-Tuned Model: Generated the most diverse outputs (dissimilarity score: 0.57) with slightly reduced semantic alignment (BERTScore: 0.86).
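For intuition about the diversity numbers above, a dissimilarity score can be approximated as the average pairwise lexical dissimilarity (1 minus Jaccard token overlap) across generated outputs. This is an illustrative proxy only; the exact metric used for the scores reported here is defined in the project's evaluation code and may differ.

```python
from itertools import combinations

def avg_pairwise_dissimilarity(texts):
    """Average pairwise dissimilarity, computed as 1 - Jaccard token
    overlap. Illustrative proxy; the project's exact metric may differ."""
    def jaccard(a, b):
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / len(sa | sb) if sa | sb else 1.0
    pairs = list(combinations(texts, 2))
    return sum(1 - jaccard(a, b) for a, b in pairs) / len(pairs)

score = avg_pairwise_dissimilarity([
    "a quiet storm over the sea",
    "a quiet storm over the hills",
    "children laughing in a sunlit garden",
])
```

Higher values mean the generated stories share fewer tokens with one another, i.e. the model produces more varied outputs.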
To run this project, you need to set up a Conda environment with the necessary dependencies.
Ensure you have Conda installed on your system. You can use either Miniconda (lightweight version) or Anaconda (full distribution).
Once Conda is installed, you can create and activate the environment as follows.
```shell
conda env create --file environment.yml
conda activate ltp_env
```

All the preprocessing scripts and datasets can be found in the Data/ folder. The preprocessing.py file preprocesses the Artemis.csv file by filtering out obscene content using the word list in en.json. The cleaned dataset is saved in the same folder as Artemis-cleaned.csv, which is used to train and evaluate the models.
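The filtering step can be sketched as follows. This is a minimal illustration, assuming en.json holds a JSON array of blocked words; the actual structure of en.json and the logic in preprocessing.py may differ.

```python
import json

def load_blocklist(path="Data/en.json"):
    """Load the obscene-word list. Assumes en.json is a JSON array of
    words; the real file's structure may differ."""
    with open(path, encoding="utf-8") as f:
        return set(json.load(f))

def is_clean(utterance: str, blocklist: set) -> bool:
    """Return True if no token of the utterance appears in the blocklist."""
    return not any(tok in blocklist for tok in utterance.lower().split())

# Usage with an inline blocklist for illustration:
blocklist = {"badword"}
kept = [u for u in ["a serene lake", "some badword here"]
        if is_clean(u, blocklist)]
```

Rows that fail the check are dropped before the cleaned dataset is written out.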
The Model/ folder contains the scripts to run to reproduce the results for the three models developed:

- Baseline model in `baseline.py`,
- LoRA fine-tuned model in `lora.py`,
- CoT fine-tuned model in `cot_wrapper.py`.
To run these files, write the following in the terminal after activating the environment:
```shell
python <model>.py
```

All these files use the models designed in the Modules/ folder. Here, the data_loader.py file loads the ArtEmis dataset for the models, while flan_model.py and story_gen_model.py contain the functions and classes for training and evaluating the baseline model and the LoRA fine-tuned model, respectively.
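The data loading can be sketched as below. The column names (`utterance`, `emotion`) are assumed from the public ArtEmis release; the actual schema read by data_loader.py may differ.

```python
import csv
from io import StringIO

def parse_artemis(fileobj):
    """Yield (utterance, emotion) pairs from an ArtEmis-style CSV.
    Column names assumed from the public ArtEmis release; the project's
    actual schema in Artemis-cleaned.csv may differ."""
    for row in csv.DictReader(fileobj):
        yield row["utterance"], row["emotion"]

# Demo with an in-memory CSV standing in for Data/Artemis-cleaned.csv:
demo = StringIO("utterance,emotion\na calm lake at dusk,contentment\n")
pairs = list(parse_artemis(demo))
```

In the real pipeline the parsed pairs would be tokenized and batched before being passed to the Flan-T5 training loop.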
The results are saved as plots in the Results/ folder.
- Janan Jahed
- Alexandru Cernat
- Andrei Medesan
To create this project, we used the ArtEmis dataset and Hugging Face tools and resources.