Binary image classifier for Pakistani financial screenshots, including bank statements, IBFT receipts, JazzCash, Easypaisa, HBL, UBL, Meezan, and MCB-style transaction screenshots. The model predicts REAL or FAKE, reports confidence, and prints module-level signals that explain the result.
This project combines handcrafted forensic signals with deep visual embeddings:
- Error Level Analysis for compression inconsistency.
- EXIF metadata checks for editor/software anomalies.
- MobileNetV2 embeddings reduced with PCA.
- EasyOCR text-region consistency features.
- DCT block matching for copy-move tampering.
- UI layout structure and symmetry features.
- Soft-voted SVM and XGBoost classifiers.
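As an example of the handcrafted side, the ELA signal can be sketched with Pillow: re-save the image as JPEG at a known quality and measure the per-pixel difference, since edited regions often re-compress differently. This is a minimal illustration; the helper names and the quality setting are assumptions, not the project's actual API.

```python
import io
from PIL import Image, ImageChops

def ela_map(image: Image.Image, quality: int = 90) -> Image.Image:
    """Absolute per-pixel difference between `image` and its JPEG re-save."""
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    return ImageChops.difference(image.convert("RGB"), resaved)

def ela_suspicion_score(image: Image.Image) -> float:
    """Collapse the ELA map to a single 0-1 score (mean channel difference / 255)."""
    diff = ela_map(image)
    pixels = list(diff.getdata())
    mean = sum(sum(px) for px in pixels) / (3 * len(pixels))
    return mean / 255.0
```

A flat, unedited image yields a near-zero score; pasted or redrawn regions push the score up.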
Create and activate a Python environment, then install dependencies:
pip install -r requirements.txt
Install the Tesseract binary as well, because pytesseract is only the Python wrapper:
# Ubuntu / Debian
sudo apt-get install tesseract-ocr
# macOS
brew install tesseract
On Windows, install Tesseract from the official installer and make sure the executable is available on PATH.
EasyOCR downloads its language models automatically the first time OCR runs. That first run needs internet access and can take a few minutes.
Keep images in this structure:
dataset/
real/
real_image_001.jpg
fake/
fake_image_001.png
JPEG and PNG files are supported. Mixed mobile resolutions are expected.
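A directory scan over this layout might look like the sketch below. The function name and return format are assumptions, not the project's API; the label convention (0 = REAL, 1 = FAKE) matches the notes at the end of this README.

```python
from pathlib import Path

def scan_dataset(root: str = "dataset"):
    """Return (path, label) pairs: 0 = REAL, 1 = FAKE."""
    samples = []
    for label, sub in ((0, "real"), (1, "fake")):
        for path in sorted(Path(root, sub).glob("*")):
            # Only JPEG and PNG files are supported.
            if path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
                samples.append((path, label))
    return samples
```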
python train.py
Training performs these steps:
- Scans dataset/real and dataset/fake.
- Fits MobileNetV2 PCA and saves models/pca.joblib.
- Extracts 158-dimensional feature vectors.
- Saves outputs/features.npy and outputs/labels.npy.
- Fits and saves models/scaler.joblib.
- Trains SVM and XGBoost models.
- Saves reports and plots in outputs/.
Model files are saved in models/ and can be loaded later without retraining.
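The soft vote that combines the two trained models can be sketched in plain Python: average the per-class probabilities and threshold the result. The names and default weights here are illustrative; the project averages the fitted SVM and XGBoost outputs.

```python
def soft_vote(prob_svm: float, prob_xgb: float, weights=(0.5, 0.5)) -> float:
    """Weighted average of the two models' FAKE probabilities."""
    w1, w2 = weights
    return (w1 * prob_svm + w2 * prob_xgb) / (w1 + w2)

def verdict(p_fake: float, threshold: float = 0.5) -> str:
    """FAKE if the averaged fake-probability crosses the threshold."""
    return "FAKE" if p_fake >= threshold else "REAL"
```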
python predict.py --image path/to/screenshot.jpg
The script prints a readable report:
============================================
PAKISTANI FINANCIAL SCREENSHOT ANALYSIS
============================================
File : screenshot.jpg
Verdict : FAKE
Confidence : 91.3%
Risk Level : LOW (high confidence result)
Module Scores:
ELA Suspicion Score : 0.74 WARN
Metadata Anomaly Score : 0.60 WARN
MobileNet Deviation : 0.81 ALERT
OCR Inconsistency Score : 0.33 OK
Copy-Move Score : 0.12 OK
UI Layout Deviation : 0.55 WARN
The ELA visualization is saved to outputs/ela_<filename>.png.
python gui.py
Use Upload Image to select a screenshot, then Run Detection to compute the REAL/FAKE verdict. The GUI shows the prediction details and previews the generated ELA output image.
python evaluate.py
Evaluation loads cached features and runs 10-fold stratified cross-validation. It reports the mean and standard deviation of accuracy, precision, recall, F1, and AUC-ROC, and saves error-analysis artifacts in outputs/.
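The idea behind the stratified split is that every fold keeps roughly the same REAL/FAKE ratio as the full label set. A pure-Python illustration of the fold assignment (evaluate.py presumably uses scikit-learn's StratifiedKFold rather than this hypothetical helper):

```python
def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, class by class, round-robin."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idxs = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds
```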
With around 180 real and 180 fake images, performance depends heavily on how representative the forged screenshots are. A reasonable first benchmark is:
- Accuracy: 75-90%
- F1: 75-90%
- AUC-ROC: 80-95%
If fake screenshots are generated from very similar templates or augmented copies, scores may look higher than real-world performance. Always validate on fresh screenshots from sources not used during training.
- Labels use 0 = REAL and 1 = FAKE.
- Classifier probabilities represent the probability of the FAKE class.
- All plots are saved for headless/server execution.
- Feature extraction is robust: if one module fails on an image, that module contributes zeros and the pipeline continues.
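The zero-fill fallback can be sketched as follows. This is a hypothetical helper, not the project's actual function: each module returns a fixed-length feature vector, and a failure contributes zeros of that length so the concatenated vector stays 158-dimensional.

```python
def safe_extract(module_fn, n_features, image):
    """Run one feature module; on any failure, return zeros of the right length."""
    try:
        feats = module_fn(image)
        if len(feats) != n_features:
            raise ValueError("unexpected feature length")
        return list(feats)
    except Exception:
        return [0.0] * n_features
```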
- MobileNetV2 weights are loaded through the modern torchvision weights= API.