Dev optimize inference v2 #107
Hi @vqdang, thanks for this amazing work. I just found out that you had started working on a HoVer-Net version that doesn't require caching the intermediate network prediction. My implementation uses Dask to manage all the multiprocessing and to lazily load chunks of the input image. I also included a handler to open WSI files stored in the Zarr format; that handler is for files converted from proprietary formats to the OME-NGFF specification. Let me know if you would be interested in continuing with this version, since I'm happy to contribute to it.
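For context, a minimal sketch of the approach described above, assuming a hypothetical `slide.ome.zarr` store and a placeholder `infer_tile` function (neither is from this repository): Dask opens the Zarr-backed WSI lazily and maps inference over its chunks.

```python
# Hedged sketch: lazily load a Zarr-backed WSI with Dask and run tiled
# inference per chunk. File name and tile function are illustrative.
import dask.array as da
import numpy as np

# Open one resolution level of an OME-NGFF (Zarr) image lazily; no pixel
# data is read from disk until a chunk is actually needed. For simplicity
# we assume this level is a 2D (y, x) array.
wsi = da.from_zarr("slide.ome.zarr", component="0")

def infer_tile(tile: np.ndarray) -> np.ndarray:
    # Placeholder for the network forward pass on one tile,
    # e.g. model.predict(tile).
    return tile

# map_overlap adds a halo around each chunk so predictions near tile
# borders see enough context, mirroring overlapping-tile inference.
pred = wsi.map_overlap(infer_tile, depth={0: 64, 1: 64}, boundary="reflect")

# Computation (and chunk loading) only happens here, chunk by chunk.
result = pred.compute()
```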
Hi @fercer, thank you for your interest in our work, and I am sorry for my late reply. I have a non-caching version fully implemented over here. However, it is not based on Dask and is likely not applicable at the cluster level. Other than that, it still has some kinks I have yet to figure out, but it works fine, just an order of magnitude slower. Fundamentally, non-caching requires re-running inference multiple times over the same locations. I would suggest improving the other one if possible. As a caveat, though, non-caching won't easily allow running inference and post-processing in parallel at the whole-slide level, so you should keep that in mind (infer slide 1 => post-proc slide 1 and infer slide 2 => ..., etc.).
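To illustrate that caveat, here is a rough sketch, with hypothetical `run_inference` / `run_postproc` stand-ins (not functions from this repo), of how cached prediction maps let one slide's CPU post-processing overlap with the next slide's GPU inference:

```python
# Hedged sketch of the scheduling point above: with cached prediction maps,
# slide N's post-processing (CPU) can run while slide N+1 is being inferred
# (GPU). Without caching, each slide must finish both stages in sequence.
from concurrent.futures import ProcessPoolExecutor

def run_inference(slide):
    print(f"inferring {slide}")        # GPU-bound; writes a cached map

def run_postproc(slide):
    print(f"post-processing {slide}")  # CPU-bound; reads the cached map

if __name__ == "__main__":
    slides = ["slide_1", "slide_2", "slide_3"]
    with ProcessPoolExecutor(max_workers=1) as pool:
        futures = []
        for slide in slides:
            run_inference(slide)
            # Post-processing runs in a worker process, overlapping with
            # the inference of the next slide in this loop.
            futures.append(pool.submit(run_postproc, slide))
        for f in futures:
            f.result()
```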
This carries on the idea from #104 and resolves some of the problems encountered when using caching. This update adds several things:

1. `super_wsi.py`, which allows running inference without caching. This should resolve the problem of numpy mmap on HPC and on systems with low storage. It relies on the Python multiprocessing context manager for data transfer between parallel processes. Python will pickle the data and keep track of it in `tmp` or `tmpf`, so this may be a concern on some systems.
2. Tile-by-tile post-processing, with the number of tiles controlled by `nr_tiles`. Whenever a tile prediction map becomes available, the main process removes that map and launches a new process to post-process it and aggregate the results (a sketch follows this list). On systems with low memory, setting `nr_postproc_worker` is essential to prevent blowing up the memory.
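A minimal sketch of the scheme in item 2, assuming a hypothetical `postproc_tile` function and dummy tile maps (the actual implementation lives in `super_wsi.py`): a bounded pool of `nr_postproc_worker` processes consumes tile prediction maps as they arrive, and the main process drops each map once dispatched.

```python
# Hedged sketch, not the PR's actual code: tile prediction maps are handed
# to a bounded pool of nr_postproc_worker processes (the data is pickled in
# transit, as noted in item 1) and dropped by the main process afterwards,
# so only a limited number of maps are held in memory at once.
import multiprocessing as mp
import numpy as np

def postproc_tile(tile_idx, pred_map):
    # Placeholder for per-tile instance post-processing.
    return tile_idx, float(pred_map.sum())

if __name__ == "__main__":
    nr_postproc_worker = 4  # keep small on low-memory systems
    # Hypothetical stand-ins for tile prediction maps produced by inference.
    tile_maps = {i: np.zeros((256, 256), dtype=np.float32) for i in range(16)}

    results = {}
    with mp.Pool(processes=nr_postproc_worker) as pool:
        pending = []
        for idx in list(tile_maps):
            pred = tile_maps.pop(idx)  # main process drops the map here
            pending.append(pool.apply_async(postproc_tile, (idx, pred)))
        for task in pending:
            tile_idx, value = task.get()  # aggregate as workers finish
            results[tile_idx] = value
    print(results)
```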
At the moment, the code for (1) (`super_wsi.py`) is fully functional; however, there are some unclear behaviors with respect to the expected runtime performance, which need to be investigated on other systems. The current test case is `TCGA-NJ-A4YI-01Z-00-DX1`, which is about 70k x 70k pixels. Currently, it takes ~1h10min using 3 TITAN XP 12GB GPUs with a batch size of 32, on a Linux system with 128GB RAM, using 8 workers for the forward pass and 16 workers for post-processing (the entire process occupies up to 70GB of memory).