Dev optimize inference v2 #107
Hi @vqdang, thanks for this amazing work. I just found out that you had started working on a HoVer-Net version that doesn't require caching the intermediate network prediction. My implementation uses Dask to manage all the multiprocessing and to lazily load chunks of the input image. I also included a handler to open WSI files stored in the Zarr format; that handler is for files converted from proprietary formats to the OME-NGFF specification. Let me know if you would be interested in continuing with this version, since I'm happy to contribute to it.
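For context, a minimal sketch of the approach described above, assuming a hypothetical `slide.ome.zarr` store and a placeholder `infer_tile` function (neither is from this repository): Dask opens the Zarr-backed WSI lazily and maps inference over its chunks.

```python
# Hedged sketch: lazily load a Zarr-backed WSI with Dask and run tiled
# inference per chunk. File name and tile function are illustrative.
import dask.array as da
import numpy as np

# Open one resolution level of an OME-NGFF (Zarr) image lazily; no pixel
# data is read from disk until a chunk is actually needed. For simplicity
# we assume this level is a 2D (y, x) array.
wsi = da.from_zarr("slide.ome.zarr", component="0")

def infer_tile(tile: np.ndarray) -> np.ndarray:
    # Placeholder for the network forward pass on one tile,
    # e.g. model.predict(tile).
    return tile

# map_overlap adds a halo around each chunk so predictions near tile
# borders see enough context, mirroring overlapping-tile inference.
pred = wsi.map_overlap(infer_tile, depth={0: 64, 1: 64}, boundary="reflect")

# Computation (and chunk loading) only happens here, chunk by chunk.
result = pred.compute()
```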
Hi @fercer, thank you for your interest in our work, and I am sorry for my late reply. I have a non-caching version fully implemented over here. However, it is not based on Dask and is likely not applicable at the cluster level. Other than that, it still has some kinks I have yet to figure out, but it works fine, just an order of magnitude slower. Fundamentally, non-caching requires re-running inference multiple times over the same locations. I would suggest improving the other one if possible. As a caveat, though, non-caching won't easily allow running inference and post-processing in parallel at the whole-slide level, so you should keep that in mind (infer slide 1 => post-proc slide 1 and infer slide 2 => ..., etc.).
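To illustrate that caveat, here is a rough sketch, with hypothetical `run_inference` / `run_postproc` stand-ins (not functions from this repo), of how cached prediction maps let one slide's CPU post-processing overlap with the next slide's GPU inference:

```python
# Hedged sketch of the scheduling point above: with cached prediction maps,
# slide N's post-processing (CPU) can run while slide N+1 is being inferred
# (GPU). Without caching, each slide must finish both stages in sequence.
from concurrent.futures import ProcessPoolExecutor

def run_inference(slide):
    print(f"inferring {slide}")        # GPU-bound; writes a cached map

def run_postproc(slide):
    print(f"post-processing {slide}")  # CPU-bound; reads the cached map

if __name__ == "__main__":
    slides = ["slide_1", "slide_2", "slide_3"]
    with ProcessPoolExecutor(max_workers=1) as pool:
        futures = []
        for slide in slides:
            run_inference(slide)
            # Post-processing runs in a worker process, overlapping with
            # the inference of the next slide in this loop.
            futures.append(pool.submit(run_postproc, slide))
        for f in futures:
            f.result()
```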
This carries on the idea from #104 and resolves some of the problems encountered when using caching. This update adds several things:

1. `super_wsi.py`, which allows running inference without caching. This should resolve the problem of numpy mmap on HPC and on systems with low storage. It relies on the Python multiprocessing context manager for data transfer between parallel processes. Python will pickle the data and keep track of it in `tmp` or `tmpf`, so this may be a concern on some systems.
2. Tile-by-tile post-processing, with the number of tiles controlled by `nr_tiles`. Whenever a tile prediction map becomes available, the main process removes that map and launches a new process to post-process it and aggregate the results (a sketch follows this list). On systems with low memory, setting `nr_postproc_worker` is essential to prevent blowing up the memory.
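A minimal sketch of the scheme in item 2, assuming a hypothetical `postproc_tile` function and dummy tile maps (the actual implementation lives in `super_wsi.py`): a bounded pool of `nr_postproc_worker` processes consumes tile prediction maps as they arrive, and the main process drops each map once dispatched.

```python
# Hedged sketch, not the PR's actual code: tile prediction maps are handed
# to a bounded pool of nr_postproc_worker processes (the data is pickled in
# transit, as noted in item 1) and dropped by the main process afterwards,
# so only a limited number of maps are held in memory at once.
import multiprocessing as mp
import numpy as np

def postproc_tile(tile_idx, pred_map):
    # Placeholder for per-tile instance post-processing.
    return tile_idx, float(pred_map.sum())

if __name__ == "__main__":
    nr_postproc_worker = 4  # keep small on low-memory systems
    # Hypothetical stand-ins for tile prediction maps produced by inference.
    tile_maps = {i: np.zeros((256, 256), dtype=np.float32) for i in range(16)}

    results = {}
    with mp.Pool(processes=nr_postproc_worker) as pool:
        pending = []
        for idx in list(tile_maps):
            pred = tile_maps.pop(idx)  # main process drops the map here
            pending.append(pool.apply_async(postproc_tile, (idx, pred)))
        for task in pending:
            tile_idx, value = task.get()  # aggregate as workers finish
            results[tile_idx] = value
    print(results)
```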
At the moment, the code for (1) (`super_wsi.py`) is fully functional; however, there are some unclear behaviors with respect to the expected runtime performance, which need to be investigated on other systems. The current test case is `TCGA-NJ-A4YI-01Z-00-DX1`, which is about 70k x 70k pixels. Currently, it takes ~1h10min using 3 TITAN XP 12GB GPUs with a batch size of 32, on a Linux system with 128GB RAM, using 8 workers for the forward pass and 16 workers for post-processing (the entire process occupies up to 70GB of memory).