
Run with Minimal Input

To get initial impressions of Auto3DSeg, users can try this quick example. It covers the entire pipeline from start to finish and runs in about two minutes on a single GPU (GPU RAM >= 8 GB).

Here are detailed steps to quickly launch Auto3DSeg for general medical image segmentation.

Step 0. Download public data or prepare internal data in a custom data root. For data from the Medical Segmentation Decathlon (MSD), users can use the following Python script to download it.

```python
import os

from monai.apps import download_and_extract

root = "./"
msd_task = "Task05_Prostate"
resource = "https://msd-for-monai.s3-us-west-2.amazonaws.com/" + msd_task + ".tar"
compressed_file = os.path.join(root, msd_task + ".tar")
# skip the download if the dataset folder has already been extracted
if not os.path.exists(os.path.join(root, msd_task)):
    download_and_extract(resource, compressed_file, root)
```

Step 1. Provide a datalist.json file. See the documentation under the load_decathlon_datalist function in monai.data.decathlon_datalist for details on the file format.

For the AutoRunner, you only need the training field with its list of training files:

```json
{
    "training": [
        {"image": "/path/to/image_1.nii.gz", "label": "/path/to/label_1.nii.gz"},
        {"image": "/path/to/image_2.nii.gz", "label": "/path/to/label_2.nii.gz"},
        ...
    ],
    "testing": [
        "/path/to/test_image_1.nii.gz",
        "/path/to/test_image_2.nii.gz",
        ...
    ]
}
```

In each training item, you can add a "fold" field (an integer starting at 0) to pre-specify the cross-validation folds; otherwise, the AutoRunner generates its own 5-fold split. All trained algorithms use the same generated or pre-specified folds, and the resulting datalist can be found in the "work_dir" folder that the AutoRunner creates. If you have a dedicated validation set, include it under a "validation" key in the same format as the training list; this disables cross-validation. A "testing" list can also be added, which requires only the image files, not the labels. If it is included, the AutoRunner will output predictions on the testing set after training. It is also recommended to add a "name" field and any other metadata that lets you track which version of your dataset the models were trained on.

Save the file to ./datalist.json.
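If you prefer to script this step, the datalist can be generated programmatically. A minimal stdlib-only sketch, where the case IDs and "/path/to/..." locations are illustrative placeholders to be replaced with your own image/label pairs:

```python
import json
import os

# Illustrative case IDs; replace with your own image/label pairs.
cases = ["case_000", "case_001", "case_002", "case_003", "case_004"]

training = [
    {
        "image": os.path.join("/path/to/imagesTr", f"{case}.nii.gz"),
        "label": os.path.join("/path/to/labelsTr", f"{case}.nii.gz"),
        "fold": i % 5,  # optional: pre-assign the 5-fold cross-validation split
    }
    for i, case in enumerate(cases)
]

with open("datalist.json", "w") as f:
    json.dump({"training": training}, f, indent=4)
```

Omit the "fold" field if you want the AutoRunner to generate the folds itself.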

Step 2. Prepare "task.yaml" with the necessary information as follows.

```yaml
modality: CT  # or MRI
datalist: "./datalist.json"
dataroot: "/workspace/data/task"
```
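When the setup is scripted, the same three fields can be written from Python. A minimal stdlib-only sketch; the values are the placeholders from the example above:

```python
# Write task.yaml with the three required fields (stdlib only).
# The values below are the placeholders from the example above.
task = {
    "modality": "CT",  # or MRI
    "datalist": "./datalist.json",
    "dataroot": "/workspace/data/task",
}
with open("task.yaml", "w") as f:
    for key, value in task.items():
        f.write(f"{key}: {value}\n")
```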

Step 3. Run the following bash command to start the pipeline without any further intervention.

```bash
python -m monai.apps.auto3dseg AutoRunner run --input='./task.yaml'
```
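The pipeline can also be driven from Python instead of the command line. A minimal sketch, assuming MONAI is installed; the import is deferred so the function can be defined without MONAI present:

```python
# Sketch of the Python equivalent of the CLI command above.
def run_auto3dseg(input_yaml: str = "./task.yaml") -> None:
    """Run the full Auto3DSeg pipeline on the given task file."""
    from monai.apps.auto3dseg import AutoRunner  # deferred import

    runner = AutoRunner(input=input_yaml)
    runner.run()
```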

Input

A typical example of an input folder structure with all necessary components is shown below. The components can be located anywhere on the machine as long as the paths in "task.yaml" are correct.

```
./Task/
├── Data/
├── task.json
└── task.yaml
```

Output

When the pipeline finishes, all output files are saved in the "./workdir" directory by default. The output folder structure is shown below.

```
./Task/
├── Data/
├── task.json
├── task.yaml
└── workdir/
    ├── datastats.yaml
    ├── algorithm_templates
    │   ├── dints
    │   ├── segresnet
    │   ├── segresnet2d
    │   └── swinunetr
    ├── dints_0
    │   ├── configs
    │   ├── model_fold0
    │   └── scripts
    ├── ...
    ├── segresnet_0
    │   ├── configs
    │   ├── model_fold0
    │   └── scripts
    ├── ...
    ├── segresnet2d_0
    │   ├── configs
    │   ├── model_fold0
    │   └── scripts
    ├── ...
    ├── swinunetr_0
    │   ├── configs
    │   ├── model_fold0
    │   └── scripts
    ├── ...
    └── ensemble_output
```

Several important components are generated along the way.

1. "datastats.yaml" is a summary of the dataset from the data analyzer. The report includes information such as data size, spacing, and intensity distribution for a better understanding of the dataset. An example "datastats.yaml" is shown below.
```yaml
...
stats_summary:
  image_foreground_stats:
    intensity: {max: 1326.0, mean: 353.68545989990236, median: 339.03333333333336,
      min: 0.0, percentile_00_5: 94.70366643269857, percentile_10_0: 210.9, percentile_90_0: 518.3333333333334,
      percentile_99_5: 734.7439453125, stdev: 122.72876790364583}
  image_stats:
    channels:
      max: 2
      mean: 2.0
      median: 2.0
      min: 2
      percentile: [2, 2, 2, 2]
      percentile_00_5: 2
      percentile_10_0: 2
      percentile_90_0: 2
      percentile_99_5: 2
      stdev: 0.0
    intensity: {max: 2965.0, mean: 307.1866872151693, median: 239.9, min: 0.0, percentile_00_5: 1.5333333333333334,
      percentile_10_0: 54.53333333333333, percentile_90_0: 649.3333333333334, percentile_99_5: 1044.0333333333333,
      stdev: 238.39599100748697}
    shape:
      max: [384, 384, 24]
      mean: [317.8666666666667, 317.8666666666667, 18.8]
...
```
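To make the percentile fields concrete: they are plain intensity percentiles over the voxels, e.g. "percentile_00_5" and "percentile_99_5" mark the 0.5% and 99.5% cut-offs, which can be used to clip intensity outliers before normalization. An illustrative stdlib-only computation on stand-in data (the real report is produced by Auto3DSeg's data analyzer, not by this snippet):

```python
import statistics

# Stand-in voxel intensities; in practice these come from the images.
intensities = [float(v) for v in range(1001)]  # 0.0 .. 1000.0

summary = {
    "min": min(intensities),
    "max": max(intensities),
    "mean": statistics.fmean(intensities),
    "median": statistics.median(intensities),
    "stdev": statistics.stdev(intensities),
}

# statistics.quantiles with n=1000 returns the 999 cut points at 0.1% steps,
# so index 4 is the 0.5th percentile and index 994 the 99.5th.
cuts = statistics.quantiles(intensities, n=1000)
summary["percentile_00_5"] = cuts[4]
summary["percentile_99_5"] = cuts[994]
print(summary)
```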

2. "algorithm_templates" contains the algorithm templates that are used to generate the actual algorithm bundle folders, populated with information from the data statistics.

3. "dints_x", "segresnet_x", "segresnet2d_x", and "swinunetr_x" are automatically generated 5-fold MONAI bundles based on established networks and well-tuned training recipes. They are self-contained folders that can be used for model training, inference, and validation by executing the commands in the README document of each bundle folder. More information is available in the [MONAI bundle specification](https://docs.monai.io/en/latest/mb_specification.html). "model_foldx" is where checkpoints are saved after training, together with the training history and TensorBoard event files.

Note: if users would like to run model training in parallel with more computing resources, they can stop the pipeline after the bundle folders are generated and execute model training via the commands in the README document of each bundle folder.

4. "ensemble_output" contains the predictions for the test data (listed under the "testing" key in the datalist) from the model ensemble. By default, we select the best model/algorithm from each fold for the ensemble.