Project repository for DTU 02476 - MLOps course in January 2023.
Chuansheng Liu, Xindi Wu, Chongchong Li, Mouadh Sadani
The goal of the project is to use a convolutional neural network-based architecture to classify images.
Because the task is image classification, we are going to use the PyTorch Image Models (timm) framework to achieve our project goal.
From the framework, we will import and modify the model we need. The framework also provides many tools for data processing, tuning, and training; we will use whichever are useful to our project.
We are going to use ImageNet 1000 (mini), a smaller subset of the ImageNet dataset that keeps all 1,000 classes and contains about 38.7k images. The ImageNet dataset is widely used for classification challenges and is useful for developing computer vision and deep learning algorithms.
The model we expect to use is ResNeSt. It is a ResNet variant which stacks several Split-Attention blocks (composed of feature-map group and split-attention operations). It is easy to work with, computationally efficient, and universally improves the learned feature representations, boosting image-classification performance.
Configure Environment:
pip install -r requirements.txt
pip install -r requirements_tests.txt
or
make requirements
Download data and models:
dvc pull
or download data from: https://www.kaggle.com/datasets/ifigotin/imagenetmini-1000
Train model:
python src/models/train_model.py
or
make train
Inference:
python src/models/predict_model.py
or
make predict
Run unit tests with coverage:
coverage run --source=./src -m pytest tests/
or
make tests
Create API (requires SigNoz):
make api
├── LICENSE
│
├── Makefile <- Makefile with commands like `make train`.
│
├── README.md <- The top-level README for developers using this project.
│
├── app                <- A FastAPI application for inference.
│
├── conf
│ ├── data <- Configurations for dataset.
│ └── experiment <- Configurations for training.
│
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── docs <- A default Sphinx project; see sphinx-doc.org for details
│
├── model_store <- Applications for local and cloud deployment.
│
├── models <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
│
├── references <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
│
├── setup.py <- makes project pip installable (pip install -e .) so src can be imported
│
├── src <- Source code for use in this project.
│ ├── __init__.py <- Makes src a Python module
│ │
│ ├── data <- Scripts to download or generate data
│ │ └── make_dataset.py
│ │
│ ├── features <- Scripts to turn raw data into features for modeling
│ │ └── build_features.py
│ │
│ ├── models <- Scripts to train models and then use trained models to make
│ │ │ predictions
│ │ ├── predict_model.py
│ │ └── train_model.py
│ │
│ └── visualization <- Scripts to create exploratory and results oriented visualizations
│ └── visualize.py
│
├── tests              <- Unit test code
│
└── tox.ini <- tox file with settings for running tox; see tox.readthedocs.io
Project based on the cookiecutter data science project template.
Please note that all the lists are exhaustive, meaning that I do not expect you to have completed every point on the checklist for the exam.
- Create a git repository
- Make sure that all team members have write access to the GitHub repository
- Create a dedicated environment for your project to keep track of your packages
- Create the initial file structure using cookiecutter
- Fill out the make_dataset.py file such that it downloads whatever data you need
- Add a model file and a training script and get that running
- Remember to fill out the requirements.txt file with whatever dependencies that you are using
- Remember to comply with good coding practices (pep8) while doing the project
- Do a bit of code typing and remember to document essential parts of your code
- Setup version control for your data or part of your data
- Construct one or multiple docker files for your code
- Build the docker files locally and make sure they work as intended
- Write one or multiple configurations files for your experiments
- Use Hydra to load the configurations and manage your hyperparameters
- When you have something that works somewhat, remember at some point to do some profiling and see if you can optimize your code
- Use Weights & Biases to log training progress and other important metrics/artifacts in your code. Additionally, consider running a hyperparameter optimization sweep.
- Use Pytorch-lightning (if applicable) to reduce the amount of boilerplate in your code
- Write unit tests related to the data part of your code
- Write unit tests related to model construction and or model training
- Calculate the coverage.
- Get some continuous integration running on the GitHub repository
- Create a data storage in GCP Bucket for your data and preferably link this with your data version control setup
- Create a trigger workflow for automatically building your docker images
- Get your model training in GCP using either the Engine or Vertex AI
- Create a FastAPI application that can do inference using your model
- If applicable, consider deploying the model locally using torchserve
- Deploy your model in GCP using either Functions or Run as the backend
- Check how robust your model is towards data drifting
- Setup monitoring for the system telemetry of your deployed model
- Setup monitoring for the performance of your deployed model
- If applicable, play around with distributed data loading
- If applicable, play around with distributed model training
- Play around with quantization, compilation and pruning for your trained models to increase inference speed
- Revisit your initial project description. Did the project turn out as you wanted?
- Make sure all group members have an understanding of all parts of the project
- Upload all your code to GitHub