Skip to content

A Flask-based web app that detects spam emails/SMS using Multinomial Naive Bayes and TF-IDF. Built with NLP, Scikit-learn, and NLTK for high-accuracy classification.

License

Notifications You must be signed in to change notification settings

codedbyasim/Email-Spam-Detection-system

Repository files navigation

Email Spam Detection System using Naive Bayes

An intelligent web-based email spam classifier that uses natural language processing and a Multinomial Naive Bayes model to detect spam messages with high accuracy. Built with clean text preprocessing, TF-IDF vectorization, and trained on the well-known spam.csv dataset.


Algorithm Overview

Multinomial Naive Bayes is chosen for its effectiveness on textual data with discrete features like word frequencies and TF-IDF scores. It’s simple, fast, and highly accurate for spam filtering tasks.


Dataset


Key Features

Text preprocessing pipeline:  • Lowercasing, punctuation removal, stopwords removal, and stemming with PorterStemmer

Feature engineering:  • TF-IDF vectorisation

Model training:  • Multinomial Naive Bayes (via Scikit-learn)

Evaluation metrics:  • Accuracy, confusion matrix, classification report

Frontend:  • Clean and professional Flask-based web interface to analyze email text


Technologies Used

Python
Flask
Scikit-learn
NLTK
Pandas, NumPy
HTML, CSS, Jinja2

Model Performance

Metric Value
Accuracy 97%+
Precision High
Recall High
F1-Score Robust

Getting Started

1. Clone the Repository

git clone https://github.com/codedbyasim/Email-Spam-Detection-system.git
cd Email-Spam-Detection-system

2. Install Dependencies

pip install -r requirements.txt

3. Run the Web App

python app.py

Then open your browser and visit: http://localhost:5000


Project Structure

Email-Spam-Detection-system/
│
├── app.py                         # Flask web app
├── vectorizer.pkl                 # TF-IDF Vectorizer
├── spam_classifier_model.pkl      # Trained Naive Bayes model
├── Email_Spam_Detection_system_using_Naive_Bayes.ipynb
├── static/
│   └── style.css                  # Custom frontend styling
├── templates/
│   └── index.html                 # Frontend HTML with Jinja2
├── spam.csv                       # Raw dataset
├── README.md
└── requirements.txt

Author

Muhammad Asim Hanif Software Engineering Student | ML & AI Enthusiast GitHub | LinkedIn


About

A Flask-based web app that detects spam emails/SMS using Multinomial Naive Bayes and TF-IDF. Built with NLP, Scikit-learn, and NLTK for high-accuracy classification.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published