Sohaibcodecrafter/Pakistani-Financial-Screenshot-Forgery-Detector-

Pakistani Financial Screenshot Forgery Detector

Binary image classifier for Pakistani financial screenshots, including bank statements, IBFT receipts, JazzCash, Easypaisa, HBL, UBL, Meezan, and MCB-style transaction screenshots. The model predicts REAL or FAKE, reports confidence, and prints module-level signals that explain the result.

This project combines handcrafted forensic signals with deep visual embeddings:

  • Error Level Analysis for compression inconsistency.
  • EXIF metadata checks for editor/software anomalies.
  • MobileNetV2 embeddings reduced with PCA.
  • EasyOCR text-region consistency features.
  • DCT block matching for copy-move tampering.
  • UI layout structure and symmetry features.
  • Soft-voted SVM and XGBoost classifiers.
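Soft voting averages the class probabilities from each classifier instead of counting hard votes. The sketch below shows the idea in plain Python. It assumes each model outputs a fake probability in [0, 1]; the function name soft_vote, the equal default weights, and the example probabilities are illustrative and not taken from the project code.

```python
from typing import Optional, Sequence


def soft_vote(fake_probs: Sequence[float],
              weights: Optional[Sequence[float]] = None) -> float:
    """Return the weighted mean of per-model fake probabilities."""
    if not fake_probs:
        raise ValueError("need at least one probability")
    if weights is None:
        weights = [1.0] * len(fake_probs)  # assumption: equal weighting
    if len(weights) != len(fake_probs):
        raise ValueError("weights and probabilities must align")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(fake_probs, weights)) / total


# Hypothetical outputs of the SVM and XGBoost models for one image.
svm_p, xgb_p = 0.80, 0.90
combined = soft_vote([svm_p, xgb_p])
print(f"combined fake probability: {combined:.2f}")
```

Averaging probabilities lets a confident model outweigh an uncertain one, which a simple majority vote between two classifiers cannot do.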

Installation

Create and activate a Python environment, then install dependencies:

pip install -r requirements.txt

Install the Tesseract binary as well: pytesseract is only a Python wrapper and does not include the OCR engine itself.

# Ubuntu / Debian
sudo apt-get install tesseract-ocr

# macOS
brew install tesseract

On Windows, install Tesseract from the official installer and make sure the executable is available on PATH.

EasyOCR downloads its language models automatically the first time OCR runs. That first run needs internet access and can take a few minutes.

Dataset Layout

Keep images in this structure:

dataset/
  real/
    real_image_001.jpg
  fake/
    fake_image_001.png

JPEG and PNG files are supported. Mixed mobile resolutions are expected.
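As a rough illustration of how this layout maps to labelled samples, the sketch below walks the two class folders with pathlib and keeps only JPEG and PNG files. The function name scan_dataset and the exact suffix list are assumptions; train.py may scan the folders differently. The labels follow the 0 = REAL, 1 = FAKE convention used by the project.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}
CLASS_LABELS = {"real": 0, "fake": 1}  # 0 = REAL, 1 = FAKE


def scan_dataset(root: str) -> List[Tuple[Path, int]]:
    """Collect (image_path, label) pairs from root/real and root/fake."""
    samples = []
    for class_name, label in CLASS_LABELS.items():
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing folder: {class_dir}")
        for path in sorted(class_dir.iterdir()):
            # Skip sub-folders and non-image files such as notes or caches.
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


# Demo on a throwaway directory that mirrors the layout above.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("real/real_image_001.jpg", "fake/fake_image_001.png", "fake/notes.txt"):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    demo = [(path.name, label) for path, label in scan_dataset(tmp)]

print(demo)
```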

Training

python train.py

Training performs these steps:

  1. Scans dataset/real and dataset/fake.
  2. Fits PCA on the MobileNetV2 embeddings and saves models/pca.joblib.
  3. Extracts 158-dimensional feature vectors.
  4. Saves outputs/features.npy and outputs/labels.npy.
  5. Fits and saves models/scaler.joblib.
  6. Trains SVM and XGBoost models.
  7. Saves reports and plots in outputs/.

Model files are saved in models/ and can be loaded later without retraining.

Prediction

python predict.py --image path/to/screenshot.jpg

The script prints a readable report:

============================================
PAKISTANI FINANCIAL SCREENSHOT ANALYSIS
============================================
File        : screenshot.jpg
Verdict     : FAKE
Confidence  : 91.3%
Risk Level  : LOW (high confidence result)

Module Scores:
  ELA Suspicion Score       : 0.74  WARN
  Metadata Anomaly Score    : 0.60  WARN
  MobileNet Deviation       : 0.81  ALERT
  OCR Inconsistency Score   : 0.33  OK
  Copy-Move Score           : 0.12  OK
  UI Layout Deviation       : 0.55  WARN

The ELA visualization is saved to outputs/ela_<filename>.png.
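
One way the verdict, confidence, and per-module status flags in such a report could be derived from raw scores is sketched below. The 0.5 decision threshold and the 0.5 / 0.8 cut-offs for WARN and ALERT are assumptions chosen only because they are consistent with the sample report above; the actual thresholds in predict.py are not documented here, and both function names are hypothetical.

```python
from typing import Tuple


def verdict_from_probability(fake_prob: float,
                             threshold: float = 0.5) -> Tuple[str, float]:
    """Map a fake probability to (verdict, confidence in that verdict)."""
    if not 0.0 <= fake_prob <= 1.0:
        raise ValueError("fake_prob must be in [0, 1]")
    if fake_prob >= threshold:
        return "FAKE", fake_prob
    return "REAL", 1.0 - fake_prob


def status_label(score: float, warn_at: float = 0.5, alert_at: float = 0.8) -> str:
    """Bucket a module score into OK / WARN / ALERT (assumed cut-offs)."""
    if score >= alert_at:
        return "ALERT"
    if score >= warn_at:
        return "WARN"
    return "OK"


verdict, confidence = verdict_from_probability(0.913)
print(f"Verdict     : {verdict}")
print(f"Confidence  : {confidence:.1%}")
for name, score in [("ELA Suspicion Score", 0.74), ("Copy-Move Score", 0.12)]:
    print(f"  {name:<26}: {score:.2f}  {status_label(score)}")
```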

GUI App

python gui.py

Use Upload Image to select a screenshot, then Run Detection to calculate the REAL/FAKE verdict. The GUI shows the prediction details and previews the generated ELA output image.

Evaluation

python evaluate.py

Evaluation loads cached features and runs 10-fold stratified cross-validation. It reports mean accuracy, precision, recall, F1, and AUC-ROC with standard deviation. It also saves error-analysis artifacts in outputs/.
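
Stratified k-fold keeps the REAL/FAKE ratio roughly the same in every fold, which matters for a dataset of only a few hundred images. The pure-Python sketch below shows the splitting idea only; it is not the code in evaluate.py, and in practice a library splitter such as scikit-learn's StratifiedKFold would normally be used. The function name stratified_folds and the 180/180 label list are illustrative.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[int], k: int = 10,
                     seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds with near-equal class balance."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Illustrative label vector: 180 REAL (0) followed by 180 FAKE (1).
labels = [0] * 180 + [1] * 180
folds = stratified_folds(labels, k=10)
fold_sizes = [len(fold) for fold in folds]
fake_counts = [sum(labels[i] for i in fold) for fold in folds]
print(fold_sizes)
print(fake_counts)
```

Each fold serves once as the test set while the other nine are used for training, and the per-fold metrics are then averaged.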

Expected Benchmarks

With around 180 real and 180 fake images, performance depends heavily on how representative the forged screenshots are. A reasonable first benchmark is:

  • Accuracy: 75-90%
  • F1: 75-90%
  • AUC-ROC: 80-95%

If fake screenshots are generated from very similar templates or augmented copies, scores may look higher than real-world performance. Always validate on fresh screenshots from sources not used during training.

Notes

  • Labels use 0 = REAL and 1 = FAKE.
  • Classifier probabilities are the probability of class 1 (FAKE).
  • All plots are written to files rather than displayed, so the scripts run on headless servers.
  • Feature extraction is fault-tolerant: if one module fails on an image, that module contributes zeros for its features and the pipeline continues.
  • MobileNetV2 weights are loaded through the modern torchvision weights= API.
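
The zero-fill behaviour for a failing module can be pictured roughly as follows. This is a sketch under stated assumptions: the module names, the per-module feature sizes, and the two stand-in extractors are hypothetical, since the way the 158 dimensions are split across modules is not documented here.

```python
from typing import Callable, Dict, List, Tuple

Extractor = Callable[[str], List[float]]


def extract_features(image_path: str,
                     modules: Dict[str, Tuple[Extractor, int]]) -> List[float]:
    """Concatenate module outputs, substituting zeros when a module fails."""
    vector: List[float] = []
    for name, (extractor, size) in modules.items():
        try:
            values = list(extractor(image_path))
            if len(values) != size:
                raise ValueError(f"{name} returned {len(values)} values, expected {size}")
        except Exception as exc:  # one failing module must not stop the pipeline
            print(f"[warn] {name} failed on {image_path}: {exc}")
            values = [0.0] * size
        vector.extend(values)
    return vector


# Hypothetical stand-ins for two feature modules.
def demo_ela(path: str) -> List[float]:
    return [0.74, 0.10, 0.02]


def broken_exif(path: str) -> List[float]:
    raise OSError("no EXIF block")


features = extract_features(
    "screenshot.jpg",
    {"ela": (demo_ela, 3), "exif": (broken_exif, 2)},
)
print(features)
```

Keeping the vector length fixed means the scaler and classifiers always receive the dimensionality they were trained on, at the cost of treating a failed module as if it had reported all-zero features.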
