The purpose of this project is to explore different machine learning and computer vision methods for detecting brain tumors in MRI scans.
I trained a total of 17 models: 15 models for set1 (5 types of image filtering for each of 3 modalities) and 2 models for set2 (original and color quantization via kMeans clustering).
- modality: t1, flair, segmentation
- dataset_trained_on: set1, set2
- CV_filtering: kMeans, harris, hough, canny, original
- the_models_accuracy: 0.90, 0.86, 0.65
- overfitting_potential: lofp, hofp, hufp
- tumor_true_or_false: True, False
- type_of_tumor: glioma, meningioma, notumor, pituitary
- of = overfitting
- lufp = low underfitting potential
- hufp = high underfitting potential
- lofp = low overfitting potential
- hofp = high overfitting potential
- nofp = no overfitting potential
- True = contains tumor
- False = does not contain tumor
.
├── set1: set1 consists of a dataset of 128 by 256 images containing 1 hemisphere of the brain
│   ├── canny: contains the same images in original except that they are filtered with the canny edge detector
│   │   ├── flairs
│   │   ├── segmentation
│   │   └── t1
│   │
│   ├── harris: contains the same images in original except that they are filtered with the harris corner detector
│   │   ├── flairs
│   │   ├── segmentation
│   │   └── t1
│   │
│   ├── hough: contains the same images in original except that they are filtered with the hough circle detection
│   │   ├── flairs
│   │   ├── segmentation
│   │   └── t1
│   │
│   ├── kMeans: contains the same images in original except that they are filtered with the kMeans clustering into 8 colors
│   │   ├── flairs
│   │   ├── segmentation
│   │   └── t1
│   │
│   └── original: contains the original images
│       ├── flairs
│       ├── segmentation
│       └── t1
│
└── set2: set2 consists of a dataset of 256 by 256 images containing 2 hemispheres of the brain and images taken from many axial planes
    ├── kMeans: contains the kMeans clustering of the images
    │   ├── glioma
    │   ├── meningioma
    │   ├── notumor
    │   └── pituitary
    │
    └── original: contains the original images
        ├── glioma
        ├── meningioma
        ├── notumor
        └── pituitary
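The layout above can be enumerated programmatically, which is a quick way to confirm that set1 has 5 filters times 3 modalities (15 folders, one per set1 model). This is a minimal sketch: the function name `expected_folders` and the constant names are hypothetical, and the folder names are copied from the tree.

```python
from itertools import product
from pathlib import PurePosixPath

# Folder names copied from the directory tree; constant names are illustrative.
SET1_FILTERS = ["canny", "harris", "hough", "kMeans", "original"]
SET1_MODALITIES = ["flairs", "segmentation", "t1"]
SET2_FILTERS = ["kMeans", "original"]
SET2_CLASSES = ["glioma", "meningioma", "notumor", "pituitary"]


def expected_folders(root="."):
    """Return every leaf folder the tree describes, relative to root."""
    base = PurePosixPath(root)
    set1 = [base / "set1" / f / m
            for f, m in product(SET1_FILTERS, SET1_MODALITIES)]
    set2 = [base / "set2" / f / c
            for f, c in product(SET2_FILTERS, SET2_CLASSES)]
    return set1 + set2


if __name__ == "__main__":
    for folder in expected_folders():
        print(folder)
```

Comparing this list against the folders actually on disk (for example with `pathlib.Path.is_dir`) would flag a missing or misnamed folder before training or testing.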
flairs: the flair imaging modality with 2 classes (tumor and non-tumor)
segmentation: the segmentation imaging modality with 2 classes (tumor and non-tumor)
t1: the t1 imaging modality with 2 classes (tumor and non-tumor)
glioma, meningioma, notumor, pituitary: these are the 4 classes of the images in set2. This dataset is more diverse: it contains images from many axial planes, both hemispheres of the brain, and more modalities.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Here is a list of prerequisites for this project
- python 3.10
- pip 22.3.1
- pipenv 2022.11.30 (optional)
If you have pipenv installed, you can run the following commands to install the dependencies:
pipenv shell
pipenv install
If you don't have pipenv installed, you can run the following commands to install the dependencies:
pip install opencv-python
pip install numpy
pip install matplotlib
pip install tensorflow
Or you can use the shortcut command:
pip install -r requirements.txt
If you encounter any dependency issues, please refer to the requirements.txt file for the list of required dependencies.
To activate this project's virtualenv, run pipenv shell. When in the virtualenv, type exit to leave it.
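After installing, it can help to confirm that the four packages are importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this project. Note that the pip package opencv-python is imported as `cv2`.

```python
import importlib.util
import sys

# pip package name -> module name used in "import ..."
REQUIRED_MODULES = {
    "opencv-python": "cv2",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "tensorflow": "tensorflow",
}


def missing_packages():
    """Return the pip names of required packages that cannot be found."""
    return [
        pip_name
        for pip_name, module_name in REQUIRED_MODULES.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    if sys.version_info < (3, 10):
        print("Warning: the prerequisites list python 3.10")
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies were found")
```

This only checks that the modules can be located; it does not check their versions against requirements.txt.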
How commands are formed:
python predict.py [argument 1] [argument 2] [argument 3]
Argument 1 is the dataset the model was trained on, either set1 or set2. These are the 2 different datasets used in this project, so feeding an image from set1 into a model trained on set2 (or vice versa) will give bad results.
- set1: binary classification of tumor and non-tumor
- set2: multi-class classification of glioma, meningioma, notumor, pituitary
Keep [dataset_trained_on] the same in the model file name and the image file name for proper results.
Argument 2 is the model file. This can be any file in the models folder, but it must be a .h5 file.
Model files are named as
- [modality]-[dataset_trained_on]-[CV_filtering]-[the_models_accuracy]-[overfitting_potential].h5
or
- [dataset_trained_on]-[CV_filtering]-[the_models_accuracy]-[overfitting_potential].h5
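The two naming patterns above can be split back into their fields. Below is a minimal sketch of such a parser; `ModelName`, `parse_model_name`, and `FIT_LABELS` are hypothetical names introduced for illustration, and the sketch assumes that no individual field contains a hyphen.

```python
from typing import NamedTuple, Optional

# Abbreviations as listed earlier in this README.
FIT_LABELS = {
    "lufp": "low underfitting potential",
    "hufp": "high underfitting potential",
    "lofp": "low overfitting potential",
    "hofp": "high overfitting potential",
    "nofp": "no overfitting potential",
}


class ModelName(NamedTuple):
    modality: Optional[str]  # None for set2 models, which have no modality field
    dataset: str
    cv_filter: str
    accuracy: float
    fit_label: str


def parse_model_name(filename):
    """Split a model file name into the fields of the naming pattern."""
    if not filename.endswith(".h5"):
        raise ValueError("model file must be a .h5 file")
    parts = filename[: -len(".h5")].split("-")
    if len(parts) == 5:
        modality, dataset, cv_filter, accuracy, fit_label = parts
    elif len(parts) == 4:
        modality = None
        dataset, cv_filter, accuracy, fit_label = parts
    else:
        raise ValueError(f"unexpected model file name: {filename}")
    return ModelName(modality, dataset, cv_filter, float(accuracy), fit_label)


if __name__ == "__main__":
    info = parse_model_name("t1-set1-harris-0.65-hofp.h5")
    print(info.dataset, info.cv_filter, FIT_LABELS[info.fit_label])
```

Returning a named tuple keeps the fields readable at the call site, for example `info.accuracy` rather than a positional index.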
Argument 3 is the image file. This can be any image file in the tests folder, but it must be a .png or .jpg file. For more images to test on, look at the dataset links in the Acknowledgments for Datasets section and add them to the tests folder.
The images in the tests folder can be named anything, but for user understanding, the image files in the tests folder are named as
- [CV_filtering]-[dataset_trained_on]-[modality]-[patient_id]-[tumor_true_or_false].png
or
- [CV_filtering]-[dataset_trained_on]-[type_of_tumor]-[patient_id].jpg
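Because mixing set1 and set2 between the model and the image gives bad results, the three arguments can be sanity-checked before running a prediction. This is a minimal sketch, not the behavior of predict.py: `dataset_in_name` and `check_arguments` are hypothetical helpers, and the dataset check only applies to files that follow the naming patterns above (images with arbitrary names are skipped).

```python
import os

DATASETS = ("set1", "set2")


def dataset_in_name(filename):
    """Return 'set1' or 'set2' if it is a hyphen-separated field, else None."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    for field in stem.split("-"):
        if field in DATASETS:
            return field
    return None


def check_arguments(dataset, model_file, image_file):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if dataset not in DATASETS:
        problems.append(f"argument 1 must be one of {DATASETS}")
    if os.path.splitext(model_file)[1].lower() != ".h5":
        problems.append("model file must be a .h5 file")
    if os.path.splitext(image_file)[1].lower() not in (".png", ".jpg"):
        problems.append("image file must be a .png or .jpg file")
    for label, name in (("model", model_file), ("image", image_file)):
        found = dataset_in_name(name)
        # Files that do not follow the naming convention yield None and are skipped.
        if found is not None and found != dataset:
            problems.append(f"{label} file is for {found}, not {dataset}")
    return problems


if __name__ == "__main__":
    print(check_arguments(
        "set1",
        "set2-kMeans-0.90-lofp.h5",
        "harris-set1-t1-HG0001-85-True.png",
    ))
```

Collecting all problems into a list, rather than raising on the first one, lets the caller report every mismatch at once.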
Example commands to run the program:
python predict.py set1 t1-set1-harris-0.65-hofp.h5 harris-set1-t1-HG0001-85-True.png
python predict.py set1 flair-set1-hough-0.86-hufp.h5 hough-set1-flair-HG0001-57-False.png
python predict.py set2 set2-kMeans-0.90-lofp.h5 kMeans-set2-pituitary-Te-pi_0056.jpg
python predict.py set2 set2-kMeans-0.90-lofp.h5 kMeans-set2-notumor-Te-no_0020.jpg
python predict.py set2 set2-kMeans-0.90-lofp.h5 kMeans-set2-glioma-Te-gl_0025.jpg
If you see an issue or would like to contribute, please open a pull request or ticket with new features or fixes.
- Justin Zhang - Initial work - JustinZhang17
This project is licensed under the MIT License - see the LICENSE.md file for details