Brain tumor detection and segmentation from MRI scans using a two-stage deep learning pipeline: EfficientNetB4 for classification, Attention ResUNet with CBAM + ASPP for segmentation.
⚠️ Research use only. This project is not intended for clinical diagnosis. It has not undergone FDA/CE validation or regulatory review.
- Overview
- What's New in v2.0
- How It Works
- Model Performance
- Dataset
- Project Structure
- Setup & Installation
- Training
- Inference
- API Reference
- Tech Stack
- References
- Contributors
- License
TumorVision processes MRI scans through two sequential stages:
- Classification: EfficientNetB4 determines whether a tumor is present. If no tumor is detected, inference stops here.
- Segmentation: If a tumor is found, Attention ResUNet v2.0 generates a precise pixel-level mask with boundary confidence scoring.
This two-stage design skips the heavier segmentation model on scans classified as tumor-free, which reduces average inference time.
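The control flow of the two stages can be sketched in plain Python. This is only an illustration under assumptions: `two_stage_predict`, the toy models, and the 0.5 threshold are hypothetical stand-ins, not the project's `prediction` helper or its trained networks.

```python
from typing import Callable, List, Optional

Image = List[List[float]]  # toy 2-D grayscale image with values in [0, 1]
Mask = List[List[int]]     # binary mask with the same shape as the image


def two_stage_predict(
    image: Image,
    classify: Callable[[Image], float],
    segment: Callable[[Image], Mask],
    threshold: float = 0.5,  # assumed decision threshold, not taken from the project
) -> Optional[Mask]:
    """Return a tumor mask, or None when stage 1 finds no tumor."""
    tumor_probability = classify(image)
    if tumor_probability < threshold:
        return None  # early exit: the segmentation model is never called
    return segment(image)


# Stand-in models so the sketch runs without TensorFlow.
def toy_classifier(image: Image) -> float:
    """Pretend probability based on the fraction of bright pixels."""
    pixels = [p for row in image for p in row]
    return min(1.0, 4 * sum(p > 0.8 for p in pixels) / len(pixels))


def toy_segmenter(image: Image) -> Mask:
    """Pretend mask: mark bright pixels as tumor."""
    return [[1 if p > 0.8 else 0 for p in row] for row in image]


healthy = [[0.1, 0.2], [0.2, 0.1]]
suspicious = [[0.1, 0.9], [0.9, 0.2]]
print(two_stage_predict(healthy, toy_classifier, toy_segmenter))
print(two_stage_predict(suspicious, toy_classifier, toy_segmenter))
```

The saving comes from the early return: for a scan classified as tumor-free, only the classifier's cost is paid.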
| Component | v1.0 | v2.0 |
|---|---|---|
| Classification backbone | ResNet-50 | EfficientNetB4 + SE Attention |
| Segmentation model | Basic ResUNet | Attention ResUNet + CBAM + ASPP |
| Loss functions | Focal Tversky | Unified Focal + Boundary-Aware |
| Data augmentation | Basic flips/rotations | 15 medical imaging-specific augmentations |
| Inference | Standard | TTA + XLA JIT compilation |
| Classification accuracy | 97.92% | 99%+ |
| Segmentation Dice score | 0.91 | 0.94+ |
| Inference time | ~100ms | ~45ms |
         MRI Input (256×256×3)
                   │
                   ▼
┌─────────────────────────────────────┐
│ Stage 1: Classification             │
│                                     │
│ EfficientNetB4 (ImageNet)           │
│ + Squeeze-and-Excitation Attention  │
│ → Dense(512) → Dense(256)           │
│ → Dense(128) → Dense(64)            │
│ → Softmax(2)                        │
│                                     │
│ ~19M parameters                     │
└─────────────────────────────────────┘
                   │
             Tumor found?
              ┌────┴────┐
              No       Yes
              │         │
             Done       ▼
┌─────────────────────────────────────┐
│ Stage 2: Segmentation               │
│                                     │
│ Attention ResUNet v2.0              │
│ Encoder: 32→64→128→256→512          │
│ + CBAM at each encoder level        │
│ + ASPP in bottleneck                │
│ + Attention-gated skip connections  │
│ + SE blocks in decoder              │
│                                     │
│ ~2.5M parameters                    │
└─────────────────────────────────────┘
                   │
                   ▼
          Tumor mask (256×256)
        with boundary confidence
| Mechanism | Where | What it does |
|---|---|---|
| CBAM | Each encoder/decoder level | Channel + spatial attention |
| SE Block | Classification head | Channel recalibration |
| Attention Gates | Skip connections | Suppresses irrelevant activations |
| ASPP | Bottleneck | Multi-scale context aggregation |
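To make "channel recalibration" concrete, here is a minimal plain-Python sketch of a Squeeze-and-Excitation step (squeeze, excite, scale). It is an illustration only: the project presumably builds this from Keras layers, and the function name and the weight arguments `w1`, `b1`, `w2`, `b2` are hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def squeeze_excite(
    x: FeatureMap,
    w1: List[List[float]],  # shape [hidden][channels] (hypothetical weights)
    b1: List[float],        # shape [hidden]
    w2: List[List[float]],  # shape [channels][hidden]
    b2: List[float],        # shape [channels]
) -> FeatureMap:
    # Squeeze: global average pooling reduces each channel to one number.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excite, part 1: bottleneck dense layer with ReLU.
    hidden = [
        max(0.0, sum(w * v for w, v in zip(ws, z)) + b)
        for ws, b in zip(w1, b1)
    ]
    # Excite, part 2: dense layer with sigmoid gives one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(ws, hidden)) + b)))
        for ws, b in zip(w2, b2)
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for ch, g in zip(x, gates)]


# Two channels, one hidden unit. All-zero weights give sigmoid(0) = 0.5 gates.
features = [[[2.0, 4.0]], [[6.0, 8.0]]]
print(squeeze_excite(features, w1=[[0.0, 0.0]], b1=[0.0], w2=[[0.0], [0.0]], b2=[0.0, 0.0]))
```

With learned weights the gates differ per channel, so informative channels are amplified relative to the others.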
# Segmentation: combined loss
Loss = 0.5 × Focal_Tversky + 0.3 × Dice + 0.2 × BCE

# Focal Tversky: handles foreground/background imbalance
Tversky = (TP + ε) / (TP + α·FN + (1-α)·FP + ε)
Focal_Tversky = (1 - Tversky)^γ
# α = 0.7: penalizes false negatives more heavily (right call for medical imaging)
# γ = 0.75: focuses training on hard examples

# Boundary-aware loss: sharpens edge prediction
Boundary_Loss = BCE × Edge_Weight_Map

# Unified Focal loss: best on imbalanced sets
UFC = δ × Focal_Tversky + (1-δ) × Focal_CE

All augmentations are applied during training only and are tuned for MRI characteristics.
| Augmentation | Probability | Purpose |
|---|---|---|
| Horizontal flip | 0.5 | Left-right invariance |
| Vertical flip | 0.5 | Orientation invariance |
| RandomRotate90 | 0.5 | Rotational invariance |
| ShiftScaleRotate | 0.5 | Position and scale variation |
| Elastic transform | 0.3 | Soft tissue deformation |
| Grid distortion | 0.3 | Shape variation |
| Optical distortion | 0.3 | Lens effect simulation |
| CLAHE | 0.5 | Contrast normalization |
| RandomBrightnessContrast | 0.5 | Intensity variation |
| RandomGamma | 0.5 | Gamma correction |
| Gaussian noise | 0.3 | Scanner noise robustness |
| Gaussian blur | 0.3 | Smoothing artifacts |
| Motion blur | 0.3 | Patient motion artifacts |
| Sharpen | 0.3 | Edge enhancement |
| Coarse dropout | 0.3 | Regularization |
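The table lists each augmentation with an application probability. The sketch below shows how such a probabilistic pipeline can work. It is a plain-Python illustration, not the project's data generator: the transform names in the table resemble Albumentations transforms, but the source does not name the library, and every function here (`augment`, `paired`, `image_only`, `brighten`) is hypothetical. One point it illustrates: geometric transforms must be applied identically to the image and its mask, while intensity transforms touch only the image.

```python
import random
from typing import Callable, List, Tuple

Grid = List[List[float]]
Pair = Tuple[Grid, Grid]  # (image, mask)
Transform = Callable[[Pair], Pair]


def hflip(g: Grid) -> Grid:
    return [row[::-1] for row in g]


def vflip(g: Grid) -> Grid:
    return g[::-1]


def brighten(g: Grid, delta: float = 0.1) -> Grid:
    return [[min(1.0, v + delta) for v in row] for row in g]


def paired(fn: Callable[[Grid], Grid]) -> Transform:
    """Geometric transform: apply the same operation to image and mask."""
    return lambda pair: (fn(pair[0]), fn(pair[1]))


def image_only(fn: Callable[[Grid], Grid]) -> Transform:
    """Intensity transform: change the image, leave the mask untouched."""
    return lambda pair: (fn(pair[0]), pair[1])


def augment(pair: Pair, pipeline: List[Tuple[Transform, float]], rng: random.Random) -> Pair:
    """Apply each transform independently with its own probability."""
    for transform, probability in pipeline:
        if rng.random() < probability:
            pair = transform(pair)
    return pair


# Probabilities for the flips follow the table; brighten is a stand-in
# for the intensity transforms.
PIPELINE = [
    (paired(hflip), 0.5),
    (paired(vflip), 0.5),
    (image_only(brighten), 0.5),
]

image = [[0.1, 0.2], [0.3, 0.4]]
mask = [[0, 1], [0, 0]]
print(augment((image, mask), PIPELINE, random.Random(0)))
```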
| Metric | v1.0 (ResNet-50) | v2.0 (EfficientNetB4) |
|---|---|---|
| Accuracy | 97.92% | 99%+ |
| Precision | 0.98 | 0.99 |
| Recall | 0.98 | 0.99 |
| F1-Score | 0.98 | 0.99 |
| AUC-ROC | 0.98 | 0.995 |
| Inference time | ~100ms | ~45ms |
| Metric | v1.0 (ResUNet) | v2.0 (Attention ResUNet) |
|---|---|---|
| Dice coefficient | 0.91 | 0.94+ |
| IoU (Jaccard) | 0.88 | 0.91+ |
| Tversky index | 0.92 | 0.95+ |
| Sensitivity | 0.93 | 0.96+ |
| Specificity | 0.98 | 0.99 |
| Boundary accuracy | n/a | 95%+ |
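For readers unfamiliar with the metrics in the table, the sketch below computes them from binary masks using their standard definitions. It is a plain-Python illustration, not the metric code in `utilities.py`; the function name is hypothetical, and `alpha=0.7` is borrowed from the Focal Tversky setting stated in this README.

```python
from typing import Dict, List

Mask = List[List[int]]


def segmentation_metrics(pred: Mask, target: Mask, alpha: float = 0.7, eps: float = 1e-7) -> Dict[str, float]:
    """Compute overlap metrics from two binary masks of the same shape."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # tumor pixel correctly predicted
            elif p and not t:
                fp += 1  # background predicted as tumor
            elif t:
                fn += 1  # tumor pixel missed
            else:
                tn += 1  # background correctly predicted
    return {
        "dice": (2 * tp + eps) / (2 * tp + fp + fn + eps),
        "iou": (tp + eps) / (tp + fp + fn + eps),
        "tversky": (tp + eps) / (tp + alpha * fn + (1 - alpha) * fp + eps),
        "sensitivity": (tp + eps) / (tp + fn + eps),
        "specificity": (tn + eps) / (tn + fp + eps),
    }


predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [1, 0]]
for name, value in segmentation_metrics(predicted, ground_truth).items():
    print(f"{name}: {value:.3f}")
```

Dice and IoU measure the same overlap on different scales (Dice is never lower than IoU), which is why the two rows in the table move together.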
| Change | Effect |
|---|---|
| CBAM attention | +3–5% localization accuracy |
| ASPP module | Better detection of small and large tumors |
| Attention gates | Sharper tumor boundaries |
| Boundary-aware loss | Precise edge delineation |
| TTA inference | More stable predictions across scan orientations |
| Mixed precision (FP16) | 2× faster training with no accuracy loss |
| Attribute | Detail |
|---|---|
| Source | TCGA (The Cancer Genome Atlas) |
| Total scans | 3,929 |
| Patients | 110 |
| Format | TIF, 256×256 |
| Split | 70% train / 15% val / 15% test |
| Class balance | ~50% tumor / ~50% healthy |
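The table gives the split ratios but not whether scans are split per image or per patient. The sketch below assumes a patient-level split, which keeps all scans from one patient in the same subset and avoids leakage between train and test. The function name, the seed, and the placeholder patient IDs are hypothetical.

```python
import random
from typing import Dict, Iterable, List


def split_patients(patient_ids: Iterable[str], seed: int = 42) -> Dict[str, List[str]]:
    """Shuffle unique patient IDs and split them 70/15/15."""
    ids = sorted(set(patient_ids))      # sort first so the result is reproducible
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 70 // 100      # integer arithmetic avoids float rounding
    n_val = len(ids) * 15 // 100
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test set
    }


# 110 placeholder IDs standing in for the TCGA_* patient folders.
patients = [f"patient_{i:03d}" for i in range(110)]
subsets = split_patients(patients)
print({name: len(members) for name, members in subsets.items()})
```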
For the complete project setup, troubleshooting, and testing instructions, please refer to SETUP.md.
The dataset and pre-trained model weights are hosted on Google Drive. Pick whichever works for you:
| Option | Link |
|---|---|
| Zipped (single file) | Download ZIP |
| Unzipped (folder) | Open Folder |
The ZIP contains both the MRI scan directories and the trained .keras/.hdf5 weight files.
After downloading and extracting, drop all TCGA_* folders directly into the root of the repository. The expected layout is:
TumorVision-2StageAI/
├── app.py
├── index.ipynb
├── data_mask.csv
├── TCGA_CS_4941_19960909/    ← TCGA folders go here
├── TCGA_CS_4942_19970222/
├── TCGA_CS_4943_20000902/
│   ...                       ← (110 patient folders total)
└── TCGA_HT_A616_19991226/
The training notebook reads scan paths relative to the project root, so the folder names and location need to match exactly.
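Because the layout has to match exactly, a quick sanity check before training can save a failed notebook run. The helper below is a hypothetical sketch, not part of the repository. It only counts `TCGA_*` directories and looks for `data_mask.csv` in the given root.

```python
from pathlib import Path
from typing import Dict, List, Union


def find_patient_folders(root: Union[str, Path]) -> List[str]:
    """Return the names of TCGA_* directories directly under root."""
    return sorted(p.name for p in Path(root).glob("TCGA_*") if p.is_dir())


def check_layout(root: Union[str, Path], expected: int = 110) -> Dict[str, object]:
    """Report whether the dataset appears to be in place."""
    folders = find_patient_folders(root)
    has_csv = (Path(root) / "data_mask.csv").is_file()
    return {
        "patient_folders": len(folders),
        "has_data_mask_csv": has_csv,
        "complete": has_csv and len(folders) == expected,
    }


if __name__ == "__main__":
    # Run from the repository root.
    print(check_layout("."))
```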
TumorVision-2StageAI/
│
├── app.py                              # Flask web app entry point
├── index.ipynb                         # Training notebook (v2.0)
├── utilities.py                        # All model code and helpers
│   ├── Loss functions                  # Focal Tversky, Boundary-Aware, Unified Focal
│   ├── Metrics                         # Dice, IoU, Sensitivity, Specificity
│   ├── Data generators                 # Augmentation pipeline
│   ├── Model architectures             # Attention ResUNet, CBAM, ASPP
│   └── TTA prediction                  # Test-time augmentation helpers
│
├── classifier-enhanced-best.keras      # v2.0 classification weights
├── AttentionResUNet-v2-weights.keras   # v2.0 segmentation weights
├── weights.hdf5                        # v1.0 classification weights
├── weights_seg.hdf5                    # v1.0 segmentation weights
│
├── data_mask.csv                       # Dataset labels
├── test_tumor_detection.py             # Unit tests
├── requirements-web.txt                # Python dependencies
├── .env.example                        # Environment variable template
│
├── templates/                          # Flask HTML templates
├── static/                             # CSS, JS, images
└── TCGA_*/                             # MRI scan directories (one per patient)
Prerequisites: Python 3.8+, pip, git
git clone https://github.com/Brijeshthummar02/TumorVision-2StageAI.git
cd TumorVision-2StageAI

python -m venv venv
# macOS/Linux
source venv/bin/activate
# Windows
venv\Scripts\activate

pip install -r requirements-web.txt

# macOS/Linux
cp .env.example .env
# Windows
copy .env.example .env

Open .env and fill in the following:

FLASK_SECRET_KEY=your-long-random-secret-key
# Cloudinary: used for image upload and storage
# Get these from https://cloudinary.com/
CLOUDINARY_CLOUD_NAME=your-cloud-name
CLOUDINARY_API_KEY=your-api-key
CLOUDINARY_API_SECRET=your-api-secret
# MongoDB Atlas (optional)
# If omitted, the app falls back to local JSON storage
MONGO_URI=mongodb+srv://...

Never commit your .env file. It's already in .gitignore.

python app.py

Open http://localhost:5000 in your browser.
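The environment variables above, including the fallback to local JSON storage when `MONGO_URI` is absent, can be validated at startup with a few lines of Python. This is a sketch under assumptions: `load_settings` is a hypothetical helper (the source does not show how `app.py` reads its configuration), and treating every variable except `MONGO_URI` as required is an assumption based on the template comments.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed to be required: the template marks only MONGO_URI as optional.
REQUIRED = (
    "FLASK_SECRET_KEY",
    "CLOUDINARY_CLOUD_NAME",
    "CLOUDINARY_API_KEY",
    "CLOUDINARY_API_SECRET",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Validate required variables and pick a storage backend."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    settings: Dict[str, Optional[str]] = {key: env[key] for key in REQUIRED}
    mongo_uri = env.get("MONGO_URI") or None
    settings["MONGO_URI"] = mongo_uri
    # No MongoDB URI means falling back to local JSON storage.
    settings["storage"] = "mongodb" if mongo_uri else "local-json"
    return settings


example_env = {
    "FLASK_SECRET_KEY": "dev-only-secret",
    "CLOUDINARY_CLOUD_NAME": "demo",
    "CLOUDINARY_API_KEY": "key",
    "CLOUDINARY_API_SECRET": "secret",
}
print(load_settings(example_env)["storage"])
```

Failing fast on a missing key gives a clearer error than a failed upload later in a request.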
Open index.ipynb in Jupyter and run all cells. The notebook handles both training stages sequentially.
Training runs in two phases: first with the backbone frozen, then with the top layers unfrozen for fine-tuning.
| Parameter | Phase 1 (frozen backbone) | Phase 2 (fine-tuning) |
|---|---|---|
| Backbone | EfficientNetB4 (frozen) | Top 100 layers unfrozen |
| Learning rate | 0.0001 | 0.00001 |
| Epochs | 30–50 | 20–30 |
| Optimizer | Adam | Adam (clipnorm=1.0) |
| Loss | CCE (label_smoothing=0.1) | Same |
| Batch size | 16 | 16 |
| Parameter | Value |
|---|---|
| Encoder filters | 32 → 64 → 128 → 256 → 512 |
| Optimizer | Adam (lr=0.0001) |
| Loss | 0.5×Focal_Tversky + 0.3×Dice + 0.2×BCE |
| Epochs | 80–100 |
| LR schedule | Cosine annealing with warm restarts |
| Early stopping | patience=20, monitor=val_dice |
| Batch size | 16 |
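The combined segmentation loss in the table can be sketched in plain Python on flattened probability lists. This is an illustration, not the Keras implementation in `utilities.py`: the function names are hypothetical, the Dice term is assumed to mean Dice loss (1 − Dice), and `alpha=0.7`, `gamma=0.75` are the values stated earlier in this README.

```python
import math
from typing import List, Tuple


def _soft_counts(pred: List[float], target: List[float]) -> Tuple[float, float, float]:
    """Soft true positives, false negatives, false positives."""
    tp = sum(p * t for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    return tp, fn, fp


def focal_tversky(pred, target, alpha=0.7, gamma=0.75, eps=1e-7) -> float:
    tp, fn, fp = _soft_counts(pred, target)
    tversky = (tp + eps) / (tp + alpha * fn + (1 - alpha) * fp + eps)
    return max(0.0, 1 - tversky) ** gamma


def dice_loss(pred, target, eps=1e-7) -> float:
    tp, fn, fp = _soft_counts(pred, target)
    return 1 - (2 * tp + eps) / (2 * tp + fn + fp + eps)


def bce(pred, target, eps=1e-7) -> float:
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(pred)


def combined_loss(pred, target) -> float:
    """0.5 x Focal Tversky + 0.3 x Dice loss + 0.2 x BCE."""
    return (0.5 * focal_tversky(pred, target)
            + 0.3 * dice_loss(pred, target)
            + 0.2 * bce(pred, target))


target = [1.0, 1.0, 0.0, 0.0]
print(combined_loss([0.9, 0.8, 0.1, 0.2], target))  # close prediction
print(combined_loss([0.2, 0.1, 0.9, 0.8], target))  # poor prediction
```

With `alpha=0.7`, a missed tumor pixel costs more in the Tversky denominator than a false alarm, which matches the stated preference for penalizing false negatives.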
Trained weights are saved to classifier-enhanced-best.keras and AttentionResUNet-v2-weights.keras automatically.
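The "cosine annealing with warm restarts" schedule listed in the segmentation settings follows a simple formula, sketched below in plain Python. Only `lr_max=1e-4` comes from this README. The minimum rate, first cycle length, and cycle multiplier are hypothetical values chosen for illustration, and the function is not the project's callback.

```python
import math


def cosine_warm_restart_lr(
    epoch: int,
    lr_max: float = 1e-4,   # Adam learning rate from the table
    lr_min: float = 1e-6,   # assumed floor
    first_cycle: int = 10,  # assumed length of the first cycle, in epochs
    cycle_mult: int = 2,    # assumed growth factor for later cycles
) -> float:
    """Learning rate for a given epoch under cosine annealing with warm restarts."""
    cycle_len = first_cycle
    t = epoch
    # Walk past completed cycles to find the position inside the current one.
    while t >= cycle_len:
        t -= cycle_len
        cycle_len *= cycle_mult
    # Cosine decay from lr_max (t = 0) toward lr_min (end of cycle).
    cosine = 0.5 * (1 + math.cos(math.pi * t / cycle_len))
    return lr_min + (lr_max - lr_min) * cosine


for epoch in (0, 5, 9, 10, 20, 29, 30):
    print(epoch, f"{cosine_warm_restart_lr(epoch):.2e}")
```

Each restart jumps back to `lr_max`, which can help the optimizer leave a poor local minimum before annealing again over a longer cycle.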
import tensorflow as tf
from utilities import prediction

# Load models
model = tf.keras.models.load_model('classifier-enhanced-best.keras')
model_seg = tf.keras.models.load_model('AttentionResUNet-v2-weights.keras')

# Run prediction with TTA enabled
image_ids, masks, has_mask = prediction(test_df, model, model_seg, use_tta=True)

TTA runs inference over multiple augmented versions of each scan and averages the results. It adds a small overhead but meaningfully improves prediction stability, especially on edge cases.
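The averaging step can be sketched in plain Python. The source does not say which augmentations the project's TTA helper uses, so the flips below are an assumption, and `predict_with_tta` is a hypothetical function rather than the code behind `use_tta=True`. The key detail is that each prediction is mapped back to the original orientation before averaging.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]


def identity(g: Grid) -> Grid:
    return g


def hflip(g: Grid) -> Grid:
    return [row[::-1] for row in g]


def vflip(g: Grid) -> Grid:
    return g[::-1]


# Each entry is (transform, inverse). Flips are their own inverse.
TTA_TRANSFORMS: Sequence[Tuple[Callable[[Grid], Grid], Callable[[Grid], Grid]]] = (
    (identity, identity),
    (hflip, hflip),
    (vflip, vflip),
)


def predict_with_tta(image: Grid, predict: Callable[[Grid], Grid], transforms=TTA_TRANSFORMS) -> Grid:
    """Average probability maps predicted on augmented copies of the image."""
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    for forward, inverse in transforms:
        # Predict on the augmented image, then undo the augmentation on the output.
        prob_map = inverse(predict(forward(image)))
        for r in range(height):
            for c in range(width):
                total[r][c] += prob_map[r][c]
    n = len(transforms)
    return [[value / n for value in row] for row in total]


def biased_model(image: Grid) -> Grid:
    """Toy model with an orientation artifact: it always fires in the top-left corner."""
    return [[1.0, 0.0], [0.0, 0.0]]


print(predict_with_tta([[0.2, 0.8], [0.4, 0.6]], biased_model))
```

In the toy run, the spurious top-left activation is spread across three positions and diluted to about one third each, which is the stabilizing effect the paragraph above describes.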
# Run prediction with TTA disabled (faster)
image_ids, masks, has_mask = prediction(test_df, model, model_seg, use_tta=False)

- Tan & Le (2019). EfficientNet: Rethinking Model Scaling for CNNs
- Woo et al. (2018). CBAM: Convolutional Block Attention Module
- He et al. (2015). Deep Residual Learning for Image Recognition
- Ronneberger et al. (2015). U-Net: CNNs for Biomedical Image Segmentation
- Oktay et al. (2018). Attention U-Net
- Abraham & Khan (2018). Focal Tversky Loss
- Hu et al. (2017). Squeeze-and-Excitation Networks
- Chen et al. (2018). DeepLabV3+ / ASPP
The contributor list updates automatically as new contributors merge pull requests. Want to see your avatar here? Start contributing.
MIT License. See LICENSE for details.
⭐ Star this repo if you found it useful!
Made with ❤️ for advancing medical AI














