Hybrid Fake News Detection Model

A hybrid deep learning model for fake news detection using BERT and BiLSTM with an attention mechanism.

Project Overview

This project implements a fake news detection system that combines BERT (Bidirectional Encoder Representations from Transformers) with a BiLSTM (Bidirectional Long Short-Term Memory) layer and an attention mechanism. The model identifies fake news articles by analyzing their textual content and linguistic patterns.

Data and Model Files

The project uses the following datasets and model files:

Datasets

Raw and processed datasets are available at: Data Files
- Contains both raw and processed versions of the datasets
- Includes LIAR and Kaggle Fake News datasets
- Preprocessed versions ready for training

Model Files

Trained model checkpoints are available at: Model Files
- Contains saved model weights
- Includes best model checkpoints
- Model evaluation results

Project Structure

.
├── data/
│   ├── raw/           # Raw datasets
│   └── processed/     # Processed data
├── models/
│   ├── saved/        # Saved model checkpoints
│   └── checkpoints/  # Training checkpoints
├── src/
│   ├── config/       # Configuration files
│   ├── data/         # Data processing modules
│   ├── models/       # Model architecture
│   ├── utils/        # Utility functions
│   └── visualization/# Visualization modules
├── tests/            # Unit tests
├── notebooks/        # Jupyter notebooks
└── visualizations/   # Generated plots and graphs

Features

Hybrid architecture combining BERT and BiLSTM
Attention mechanism for better interpretability
Comprehensive text preprocessing pipeline
Support for multiple feature extraction methods
Early stopping and model checkpointing
Detailed evaluation metrics and saved reports (evaluation_report.json)
Reproducible runs via global seeding
Mixed-precision (AMP) training and linear warmup scheduler
Optional TensorBoard logging (runs/hybrid_model)
Interactive visualizations of model performance
Support for multiple datasets (LIAR, Kaggle Fake News)
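
Two of the training features listed above, early stopping and the linear warmup scheduler, can be sketched without any framework. This is a minimal illustration, not the code in src/train.py: the names EarlyStopping and linear_warmup_factor, the patience value, and the sample scores are assumptions made for the example. The schedule shown (linear ramp, then linear decay to zero) is one common shape; the project's scheduler may behave differently after warmup.

```python
class EarlyStopping:
    """Stop training when a monitored score (higher is better) stops improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # smallest change that counts as improvement
        self.best_score = None
        self.bad_epochs = 0

    def step(self, score):
        """Record one epoch's score; return True if training should stop."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def linear_warmup_factor(step, warmup_steps, total_steps):
    """Learning-rate multiplier: linear ramp 0 -> 1, then linear decay 1 -> 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # Made-up validation F1 scores, one per epoch.
    stopper = EarlyStopping(patience=2)
    for epoch, val_f1 in enumerate([0.70, 0.74, 0.73, 0.74, 0.72], start=1):
        if stopper.step(val_f1):
            print(f"stopping after epoch {epoch}, best val F1 = {stopper.best_score}")
            break
```

In PyTorch, a multiplier function of this kind is typically passed as lr_lambda to torch.optim.lr_scheduler.LambdaLR, with the warmup and total step counts bound in advance.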

Installation

Clone the repository:

git clone https://github.com/yourusername/fake-news-detection.git
cd fake-news-detection

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Usage

1. Download the required files:
   - Download datasets from the Data Files link
   - Download pre-trained models from the Model Files link
   - Place the files in their respective directories as shown in the project structure

2. Prepare your dataset:
   - Place your dataset in the data/raw directory
   - The dataset should have at least two columns: 'text' and 'label'
   - Supported formats: CSV, TSV

3. Train the model:

python src/train.py

Outputs:
- Final weights: models/saved/final_model.pt
- Best checkpoint (val F1): models/checkpoints/best_model.pt
- Metrics JSON: evaluation_report.json
- Plots: visualizations/ (history, confusion matrix, comparison, feature importance)
- TensorBoard logs (optional): runs/hybrid_model

4. (Optional) Launch TensorBoard to inspect logs:

tensorboard --logdir runs/hybrid_model
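
The dataset requirements listed under "Prepare your dataset" above ('text' and 'label' columns, CSV or TSV) can be checked before training. The following is a sketch using only the standard library; the function names load_rows and delimiter_for are hypothetical and are not part of this repository, and the sample rows are made up.

```python
import csv
import io

REQUIRED_COLUMNS = {"text", "label"}


def load_rows(handle, delimiter=","):
    """Read a CSV/TSV stream and return its rows as dicts.

    Raises ValueError if the 'text' or 'label' column is missing.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    return list(reader)


def delimiter_for(path):
    """Pick the delimiter from the file extension (.tsv -> tab, else comma)."""
    return "\t" if str(path).lower().endswith(".tsv") else ","


if __name__ == "__main__":
    # In-memory stand-in for a file under data/raw/ (contents are made up).
    sample = io.StringIO("text,label\nSome headline,0\nAnother headline,1\n")
    rows = load_rows(sample, delimiter_for("example.csv"))
    print(len(rows), "rows; first label:", rows[0]["label"])
```

For a real file, open it with open(path, newline="", encoding="utf-8") and pass the handle to load_rows together with delimiter_for(path).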

Model Architecture

The model combines:

BERT for contextual embeddings
BiLSTM for sequence modeling
Attention mechanism for weighting the most informative parts of the text
Classification head for final prediction

Key Components:

BERT Layer: Extracts contextual word embeddings
BiLSTM Layer: Captures sequential patterns
Attention Layer: Identifies important text segments
Classification Head: Makes final prediction
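
To make the attention step concrete, here is a framework-free sketch of attention pooling over a sequence of token vectors, such as the outputs of the BiLSTM layer. It is an illustration under assumptions, not the layer in src/models: this README does not say which attention variant is used, so the example uses simple dot-product scoring against a single query vector, and the function names and toy vectors are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-token vectors into one vector via dot-product attention.

    hidden_states: list of token vectors (e.g. BiLSTM outputs), all same length
    query: a vector of that length (learned in a real model; fixed here)
    Returns (context_vector, attention_weights).
    """
    # One relevance score per token: dot product with the query vector.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the token vectors, one dimension at a time.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


if __name__ == "__main__":
    # Three made-up 2-dimensional token vectors standing in for BiLSTM outputs.
    tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    context, weights = attention_pool(tokens, query=[1.0, 1.0])
    print("weights:", [round(w, 3) for w in weights])
    print("context:", [round(c, 3) for c in context])
```

The weights sum to 1 and can be read as how much each token contributes to the pooled vector, which is what makes an attention layer useful for inspecting which text segments influenced a prediction. In a model like the one described here, the pooled vector would then be passed to the classification head.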

Configuration

Key parameters can be modified in src/config/config.py:

Model hyperparameters
Training parameters
Data processing settings
Feature extraction options
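
As a rough picture of how such a configuration module might be organized, the sketch below groups settings into dataclasses. Every field name and default value here is a hypothetical placeholder chosen for illustration; check src/config/config.py for the actual names and values. Feature extraction options are left out because this README does not describe them.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ModelConfig:
    bert_model_name: str = "bert-base-uncased"  # assumed checkpoint name
    lstm_hidden_size: int = 128                 # placeholder value
    dropout: float = 0.3                        # placeholder value
    num_classes: int = 2                        # assumes binary fake/real labels


@dataclass
class TrainingConfig:
    batch_size: int = 16
    learning_rate: float = 2e-5
    num_epochs: int = 5
    warmup_ratio: float = 0.1          # fraction of steps used for linear warmup
    early_stopping_patience: int = 3
    seed: int = 42                     # global seed for reproducible runs
    use_amp: bool = True               # mixed-precision training


@dataclass
class DataConfig:
    raw_dir: str = "data/raw"
    processed_dir: str = "data/processed"
    max_seq_length: int = 256          # placeholder value
    text_column: str = "text"
    label_column: str = "label"


@dataclass
class Config:
    # default_factory gives every Config its own nested objects.
    model: ModelConfig = field(default_factory=ModelConfig)
    training: TrainingConfig = field(default_factory=TrainingConfig)
    data: DataConfig = field(default_factory=DataConfig)


if __name__ == "__main__":
    cfg = Config()
    cfg.training.batch_size = 32  # override a default for one run
    print(asdict(cfg)["training"])
```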

Performance Metrics

The model is evaluated using:

Accuracy
Precision
Recall
F1 Score
Confusion Matrix
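
For reference, the metrics above can be computed from a list of true labels and a list of predictions as follows. This is a dependency-free sketch for the binary case; treating label 1 as the positive (fake) class is an assumption, the labels in the example are made up, and the project's own evaluation code may compute these values with a library instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and a 2x2 confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Rows are true classes, columns are predicted classes: [negative, positive].
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


if __name__ == "__main__":
    # Made-up labels: 1 = fake, 0 = real.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    for name, value in classification_metrics(y_true, y_pred).items():
        print(name, value)
```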

Future Improvements

Add support for image/video metadata
Implement real-time detection
Add social graph analysis
Improve model interpretability
Add API endpoints for inference
Support for multilingual fake news detection
Integration with fact-checking databases

Acknowledgments

I would like to express my sincere gratitude to Dr. Kirti Kumari for her invaluable guidance and support throughout the development of this project. Her expertise in data mining and machine learning has been instrumental in shaping this work.

Special thanks to:

The open-source community for its excellent tools and libraries
Dataset providers (LIAR, Kaggle)

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For any queries or suggestions, please feel free to reach out to me.
