Binary image classifier for Pakistani financial screenshots, including bank statements, IBFT receipts, JazzCash, Easypaisa, HBL, UBL, Meezan, and MCB-style transaction screenshots. The model predicts REAL or FAKE, reports confidence, and prints module-level signals that explain the result.
This project combines handcrafted forensic signals with deep visual embeddings:
- Error Level Analysis for compression inconsistency.
- EXIF metadata checks for editor/software anomalies.
- MobileNetV2 embeddings reduced with PCA.
- EasyOCR text-region consistency features.
- DCT block matching for copy-move tampering.
- UI layout structure and symmetry features.
- Soft-voted SVM and XGBoost classifiers.
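As an example of the handcrafted side, the ELA signal can be sketched with Pillow: re-save the image as JPEG at a known quality and measure the per-pixel difference, since edited regions often re-compress differently. This is a minimal illustration; the helper names and the quality setting are assumptions, not the project's actual API.

```python
import io
from PIL import Image, ImageChops

def ela_map(image: Image.Image, quality: int = 90) -> Image.Image:
    """Absolute per-pixel difference between `image` and its JPEG re-save."""
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    return ImageChops.difference(image.convert("RGB"), resaved)

def ela_suspicion_score(image: Image.Image) -> float:
    """Collapse the ELA map to a single 0-1 score (mean channel difference / 255)."""
    diff = ela_map(image)
    pixels = list(diff.getdata())
    mean = sum(sum(px) for px in pixels) / (3 * len(pixels))
    return mean / 255.0
```

A flat, unedited image yields a near-zero score; pasted or redrawn regions push the score up.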
Create and activate a Python environment, then install dependencies:
pip install -r requirements.txt
Install the Tesseract binary as well, because pytesseract is only the Python wrapper:
# Ubuntu / Debian
sudo apt-get install tesseract-ocr
# macOS
brew install tesseract
On Windows, install Tesseract from the official installer and make sure the executable is available on PATH.
EasyOCR downloads its language models automatically the first time OCR runs. That first run needs internet access and can take a few minutes.
Keep images in this structure:
dataset/
real/
real_image_001.jpg
fake/
fake_image_001.png
JPEG and PNG files are supported. Mixed mobile resolutions are expected.
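A directory scan over this layout might look like the sketch below. The function name and return format are assumptions, not the project's API; the label convention (0 = REAL, 1 = FAKE) matches the notes at the end of this README.

```python
from pathlib import Path

def scan_dataset(root: str = "dataset"):
    """Return (path, label) pairs: 0 = REAL, 1 = FAKE."""
    samples = []
    for label, sub in ((0, "real"), (1, "fake")):
        for path in sorted(Path(root, sub).glob("*")):
            # Only JPEG and PNG files are supported.
            if path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
                samples.append((path, label))
    return samples
```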
python train.py
Training performs these steps:
- Scans dataset/real and dataset/fake.
- Fits MobileNetV2 PCA and saves models/pca.joblib.
- Extracts 158-dimensional feature vectors.
- Saves outputs/features.npy and outputs/labels.npy.
- Fits and saves models/scaler.joblib.
- Trains SVM and XGBoost models.
- Saves reports and plots in outputs/.
Model files are saved in models/ and can be loaded later without retraining.
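The soft vote that combines the two trained models can be sketched in plain Python: average the per-class probabilities and threshold the result. The names and default weights here are illustrative; the project averages the fitted SVM and XGBoost outputs.

```python
def soft_vote(prob_svm: float, prob_xgb: float, weights=(0.5, 0.5)) -> float:
    """Weighted average of the two models' FAKE probabilities."""
    w1, w2 = weights
    return (w1 * prob_svm + w2 * prob_xgb) / (w1 + w2)

def verdict(p_fake: float, threshold: float = 0.5) -> str:
    """FAKE if the averaged fake-probability crosses the threshold."""
    return "FAKE" if p_fake >= threshold else "REAL"
```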
python predict.py --image path/to/screenshot.jpg
The script prints a readable report:
============================================
PAKISTANI FINANCIAL SCREENSHOT ANALYSIS
============================================
File : screenshot.jpg
Verdict : FAKE
Confidence : 91.3%
Risk Level : LOW (high confidence result)
Module Scores:
ELA Suspicion Score : 0.74 WARN
Metadata Anomaly Score : 0.60 WARN
MobileNet Deviation : 0.81 ALERT
OCR Inconsistency Score : 0.33 OK
Copy-Move Score : 0.12 OK
UI Layout Deviation : 0.55 WARN
The ELA visualization is saved to outputs/ela_<filename>.png.
python gui.py
Use Upload Image to select a screenshot, then Run Detection to compute the REAL/FAKE verdict. The GUI shows the prediction details and previews the generated ELA output image.
python evaluate.py
Evaluation loads cached features and runs 10-fold stratified cross-validation. It reports the mean and standard deviation of accuracy, precision, recall, F1, and AUC-ROC, and saves error-analysis artifacts in outputs/.
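The idea behind the stratified split is that every fold keeps roughly the same REAL/FAKE ratio as the full label set. A pure-Python illustration of the fold assignment (evaluate.py presumably uses scikit-learn's StratifiedKFold rather than this hypothetical helper):

```python
def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, class by class, round-robin."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idxs = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds
```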
With around 180 real and 180 fake images, performance depends heavily on how representative the forged screenshots are. A reasonable first benchmark is:
- Accuracy: 75-90%
- F1: 75-90%
- AUC-ROC: 80-95%
If fake screenshots are generated from very similar templates or augmented copies, scores may look higher than real-world performance. Always validate on fresh screenshots from sources not used during training.
- Labels use 0 = REAL and 1 = FAKE.
- Classifier probabilities represent the probability of the FAKE class.
- All plots are saved for headless/server execution.
- Feature extraction is robust: if one module fails on an image, that module contributes zeros and the pipeline continues.
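The zero-fill fallback can be sketched as follows. This is a hypothetical helper, not the project's actual function: each module returns a fixed-length feature vector, and a failure contributes zeros of that length so the concatenated vector stays 158-dimensional.

```python
def safe_extract(module_fn, n_features, image):
    """Run one feature module; on any failure, return zeros of the right length."""
    try:
        feats = module_fn(image)
        if len(feats) != n_features:
            raise ValueError("unexpected feature length")
        return list(feats)
    except Exception:
        return [0.0] * n_features
```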
- MobileNetV2 weights are loaded through the modern torchvision weights= API.