🎬 IMDB Sentiment Analysis (NLP)

This project builds a binary sentiment classifier (positive/negative) for IMDB movie reviews using TF-IDF features and three linear classifiers (Logistic Regression, Linear SVM, ComplementNB). It selects the best model by 5-fold cross-validation, saves it to disk, and provides a desktop GUI built with CustomTkinter.

⚙️ Requirements

🐍 Python 3.10+
🪟 Windows PowerShell (commands below assume Windows)

Setup (recommended: virtual environment):

cd C:\Users\Konyar\Desktop\Code\IMDB_Sentiment_Analysis_NLP
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements.txt

📦 Dataset

📁 Location: datasets/movie.csv
🧾 Expected schema: text,label
- text: review text
- label: 0 (negative), 1 (positive)
🔗 Source: Kaggle – IMDB Movie Ratings Sentiment Analysis
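
Before training, it can help to confirm that a CSV matches the schema above. The following is a minimal sketch using only the standard library; the function name validate_rows and the inline sample reviews are hypothetical and not part of the project.

```python
import csv
import io


def validate_rows(handle):
    """Count the records in a CSV with `text` and `label` columns.

    Raises ValueError if a column is missing, a label is not 0 or 1,
    or a review text is empty.
    """
    reader = csv.DictReader(handle)
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    count = 0
    for record_no, row in enumerate(reader, start=1):
        if row["label"] not in {"0", "1"}:
            raise ValueError(f"record {record_no}: label must be 0 or 1, got {row['label']!r}")
        if not (row["text"] or "").strip():
            raise ValueError(f"record {record_no}: empty review text")
        count += 1
    return count


# Hypothetical two-record sample in the expected schema.
sample = io.StringIO('text,label\n"Great film, loved it",1\n"Dull and far too long",0\n')
print(validate_rows(sample))
```

To check the real file, pass an open handle instead, for example open("datasets/movie.csv", newline="", encoding="utf-8"). The UTF-8 encoding is an assumption about the dataset.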

🧠 Train the Model

The command below trains the candidate models, selects the best one by cross-validation, evaluates it on a held-out test split, and saves it.

python IMDB_Sentiment_Analyser.py train --csv datasets\movie.csv --model models\sentiment_pipeline.joblib
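
The selection step can be sketched with scikit-learn as below. This is an illustration of the approach described in this README, not the script's actual code: the toy reviews, the candidate names, and all hyperparameters (left at library defaults apart from max_iter) are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for datasets/movie.csv (hypothetical reviews, 10 per class).
positive = [
    "a wonderful film with a brilliant cast",
    "loved every minute, great story",
    "brilliant acting and a great script",
    "wonderful direction, truly enjoyable",
    "great fun, I loved the ending",
    "an enjoyable and brilliant movie",
    "the cast was wonderful and the story great",
    "loved it, a truly great film",
    "enjoyable from start to finish, brilliant",
    "a great movie with wonderful music",
]
negative = [
    "a terrible film with a boring plot",
    "hated every minute, awful story",
    "boring acting and an awful script",
    "terrible direction, truly dull",
    "awful mess, I hated the ending",
    "a dull and boring movie",
    "the cast was terrible and the story awful",
    "hated it, a truly dull film",
    "boring from start to finish, terrible",
    "an awful movie with dull music",
]
texts = positive + negative
labels = [1] * len(positive) + [0] * len(negative)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "linear_svm": LinearSVC(),
    "complement_nb": ComplementNB(),
}


def make_pipeline(classifier):
    # TF-IDF features feed straight into the linear classifier.
    return Pipeline([("tfidf", TfidfVectorizer()), ("clf", classifier)])


# Mean accuracy over 5 stratified folds for each candidate.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(make_pipeline(clf), texts, labels, cv=cv, scoring="accuracy").mean()
    for name, clf in candidates.items()
}

# Refit the winning candidate on all the data.
best_name = max(scores, key=scores.get)
best_pipeline = make_pipeline(candidates[best_name]).fit(texts, labels)
print(best_name, round(scores[best_name], 3))
```

A fitted pipeline like best_pipeline can be persisted with joblib.dump(best_pipeline, path) and restored with joblib.load(path), which matches the .joblib file this command writes.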

📊 What you will see

✅ Test Accuracy: ... → accuracy on the held-out test split
📄 Classification Report: → precision / recall / F1
🔢 Confusion Matrix: printed in console
🖼️ Confusion matrix image saved to models/confusion_matrix.png
💾 Model saved to models/sentiment_pipeline.joblib
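
To make the relationship between these outputs concrete, here is a small standard-library sketch that derives accuracy, precision, recall, and F1 from a 2x2 confusion matrix. The labels are made up for illustration and are not results from this project; the row/column layout (rows are true labels, columns are predicted labels) follows scikit-learn's convention.

```python
def confusion_matrix(y_true, y_pred):
    """2x2 matrix: rows are true labels (0, 1), columns are predicted labels (0, 1)."""
    matrix = [[0, 0], [0, 0]]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def metrics(matrix):
    """Accuracy plus precision/recall/F1 for the positive class (label 1)."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 4 positive and 4 negative reviews, one mistake in each class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)           # [[3, 1], [1, 3]]
print(metrics(cm))  # every metric is 0.75 for this example
```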

🖥️ Launch the GUI

After training:

python IMDB_Sentiment_Analyser.py gui --model models\sentiment_pipeline.joblib

Enter a review and click "Predict". The GUI shows the prediction (Positive/Negative) and a confidence percentage.
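
Both commands in this README go through a single script with train and gui subcommands. A command-line interface of that shape could be declared with argparse roughly as follows. This is a sketch: whether the real script marks these flags as required or gives them defaults is an assumption, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="IMDB_Sentiment_Analyser.py")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # train: needs the dataset path and the path to write the model to.
    train = subcommands.add_parser("train", help="train, evaluate, and save the model")
    train.add_argument("--csv", required=True, help="path to the text,label CSV")
    train.add_argument("--model", required=True, help="where to save the model")

    # gui: only needs the path of a previously saved model.
    gui = subcommands.add_parser("gui", help="launch the desktop GUI")
    gui.add_argument("--model", required=True, help="path to a saved model")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["gui", "--model", "models/sentiment_pipeline.joblib"])
print(args.command, args.model)
```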

📝 Notes

⏱️ Cross-validation over ~40k rows can take a few minutes.
📈 Confidence comes from predict_proba when the selected model provides it (e.g., Logistic Regression). For margin-based models (LinearSVC), the GUI shows a sigmoid mapping of the decision score for readability. This is not a calibrated probability, but it is useful as a relative measure of confidence.
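
The sigmoid mapping for margin-based models can be sketched as follows. The function names are hypothetical, and reporting the confidence of the predicted class as max(p, 1 - p) is an assumption about how the percentage is derived. The score stands for the signed distance returned by a model's decision_function, where positive values favor label 1.

```python
import math


def sigmoid(score):
    """Map a real-valued score into (0, 1) without overflowing for large |score|."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def label_and_confidence(score):
    """Turn a decision score into a label and a relative confidence in [0.5, 1)."""
    p_positive = sigmoid(score)
    if score > 0:
        return "Positive", p_positive
    return "Negative", 1.0 - p_positive


for score in (2.0, 0.3, -1.5):
    label, confidence = label_and_confidence(score)
    print(f"score={score:+.1f} -> {label} ({confidence:.1%})")
```

Scores near zero map to values near 50%, and larger margins move toward 100%, so the number ranks predictions by how far they sit from the decision boundary without claiming to be a probability.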

🗂️ Project Structure

IMDB_Sentiment_Analysis_NLP/
  IMDB_Sentiment_Analyser.py      # Training, evaluation (with CM plot), GUI
  datasets/
    movie.csv                     # Dataset (text,label)
  models/
    sentiment_pipeline.joblib     # (Created after training)
    confusion_matrix.png          # (Created after training)
  requirements.txt
  README.md

📜 License

This project is for educational purposes. The dataset's usage terms are set by its source.

CodeTaha/IMDB_Sentiment_Analysis_NLP
