HemoSight: Adaptive and Efficient Leukocyte Classification

This repository accompanies the paper cited below. The paper presents a self-supervised model for classifying white blood cells in peripheral blood smears, achieving high accuracy (F1: 96.2%) while generalizing to diverse label sets. The lightweight EfficientNetV2-B0-based approach uses active learning to improve label efficiency and is available as the HemoSight web app to streamline clinical workflows.

Data Preparation

Workflow Overview

General flow of the data pipeline:

curate.py: Curate the raw data folders into a data reference CSV.
loader.py: Load the data reference CSV and split it into training, validation, and test sets.
train.py, generator.py: Train the model; the generator supplies the augmented training data.
val.py, predictor.py: Validate the trained model; the predictor performs classification on embeddings.
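The loading and splitting step can be illustrated with a minimal sketch. It is not the code in loader.py: the column names (path, label), the 70/15/15 fractions, and the stratify-by-label strategy are all assumptions chosen for illustration, and the in-memory CSV stands in for the real data reference file.

```python
import csv
import io
import random
from collections import defaultdict


def stratified_split(rows, label_key="label", fractions=(0.7, 0.15, 0.15), seed=0):
    """Split rows into train/val/test lists, keeping class proportions similar.

    fractions gives the train and validation shares; the remainder of each
    class goes to the test set. All names here are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = int(len(group) * fractions[0])
        n_val = int(len(group) * fractions[1])
        splits["train"].extend(group[:n_train])
        splits["val"].extend(group[n_train:n_train + n_val])
        splits["test"].extend(group[n_train + n_val:])
    return splits


# Hypothetical reference CSV: one row per cell image.
reference_csv = io.StringIO(
    "path,label\n"
    + "".join(
        f"img_{i}.png,{'neutrophil' if i % 2 else 'lymphocyte'}\n" for i in range(20)
    )
)
rows = list(csv.DictReader(reference_csv))
splits = stratified_split(rows)
print({name: len(part) for name, part in splits.items()})
```

Splitting per class rather than over the whole table keeps rare cell types represented in every subset, which matters when class frequencies are uneven.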

Repository Organization

src stores Python source code.
frontend stores frontend source code.
data stores raw data.
derived stores results.
mongodb is used by the MongoDB database.

Three folders (data, derived, src) are needed for file I/O and should be mounted into the container.

Configuration files:

Path configuration file (/src/core/settings.json):
This file is loaded by util.GlobalSettings. settings.json is used by default but can be overridden by settings_{systemname}.json, where {systemname} is the computer name, such as ThinkPad. This enables the same repository codebase to be cloned and run in multiple environments.
Job configuration file: Some examples are located at /src/config_*.json. See below for their usage.
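The per-machine override can be sketched as follows. This is an illustration only, not the implementation of util.GlobalSettings: the function name load_settings, the keys data_dir and derived_dir, and the key-by-key merge are assumptions (the repository may instead replace the whole file). The computer name is read with platform.node() from the standard library.

```python
import json
import platform
import tempfile
from pathlib import Path


def load_settings(settings_dir, system_name=None):
    """Load settings.json, then apply settings_{system_name}.json if it exists.

    Hypothetical helper: keys in the machine-specific file override the
    defaults one by one (an assumed merge strategy).
    """
    settings_dir = Path(settings_dir)
    system_name = system_name or platform.node()  # computer name, e.g. ThinkPad

    with open(settings_dir / "settings.json", encoding="utf-8") as f:
        settings = json.load(f)

    override_path = settings_dir / f"settings_{system_name}.json"
    if override_path.exists():
        with open(override_path, encoding="utf-8") as f:
            settings.update(json.load(f))
    return settings


# Demonstration with temporary files and made-up keys.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "settings.json").write_text(
        json.dumps({"data_dir": "/data", "derived_dir": "/derived"})
    )
    (tmp / "settings_ThinkPad.json").write_text(
        json.dumps({"data_dir": "D:/Drive/Data/Hematology"})
    )
    on_thinkpad = load_settings(tmp, system_name="ThinkPad")
    elsewhere = load_settings(tmp, system_name="other-host")

print(on_thinkpad)
print(elsewhere)
```

With this layout, a machine that has no matching override file falls back to the defaults without any code change.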

Environment

Local Docker

Dockerfile.gpu is the GPU version. Dockerfile.cpu is the CPU version.

Build

docker build -t hematology:v1 -f Dockerfile.cpu .

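Run the container with the three folders above mounted. Replace the host data path with your own; the --gpus all flag is only needed when running the GPU image.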

docker run -it --gpus all --rm -v "$(pwd)/src:/src" -v "$(pwd)/derived:/derived" -v "D:/Drive/Data/Hematology:/data" hematology:v1

After the container is running, execute the following inside it:
- Training: python -m model.train --cfg config.json
- Validation: python -m model.val --run 20231208192208
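The run ID in the validation command (20231208192208) looks like a timestamp in year-month-day-hour-minute-second order. That reading is an inference from the example, not something the repository states, and the helper names below are hypothetical. Under that assumption, such an ID can be generated and parsed like this:

```python
from datetime import datetime

# Assumed format, inferred from the example run ID; not confirmed by the source.
RUN_ID_FORMAT = "%Y%m%d%H%M%S"


def make_run_id(now=None):
    """Return a run ID string for the given time (defaults to the current time)."""
    return (now or datetime.now()).strftime(RUN_ID_FORMAT)


def parse_run_id(run_id):
    """Recover the datetime encoded in a run ID."""
    return datetime.strptime(run_id, RUN_ID_FORMAT)


started = parse_run_id("20231208192208")
print(started.isoformat())
```

IDs of this shape sort chronologically as plain strings, which makes it easy to find the most recent run in a results folder.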

Model Deployment

Deploy to Docker Containers

Build and start the services

docker compose -f docker-compose.dev.yaml up --build

Once the services are up, open http://localhost:4002 for the index page.

Disclaimer

This project was developed with the assistance of generative AI. All outputs were reviewed and validated by the developers for correctness and quality.

This software is provided for research use only. It is not intended for clinical use, has not been validated for diagnostic purposes, and is not FDA approved.

Citation

If you find this repository helpful in your research or work, please consider citing our paper:

@INPROCEEDINGS{10913825,
  author={Liu, Zhuohe and Castillo, Simon P. and Han, Xin and Sun, Xiaoping and Hu, Zhihong and Yuan, Yinyin},
  booktitle={2024 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)}, 
  title={Adaptive Self-Supervised Learning of Morphological Landscape for Leukocytes Classification in Peripheral Blood Smears}, 
  year={2024},
  volume={},
  number={},
  pages={1-7},
  keywords={White blood cells;Adaptation models;Reviews;Hematology;Active learning;Self-supervised learning;Manuals;Predictive models;Image classification;Testing;image classification;self-supervised learning;active learning;hematology;peripheral blood smear},
  doi={10.1109/BHI62660.2024.10913825}}

Feel free to contact us for further information or questions related to the paper and this repository.

Yuan Lab @ MD Anderson Cancer Center
