Hi,
Great tool! I ran into an issue when generating a large number of sequences: the outputs slowly accumulate in GPU memory, which eventually causes a CUDA OOM error partway through the run. Does anyone know how to write the predicted results to the output file earlier? Or move them to CPU memory periodically during the run? Is this already an implemented feature?
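To illustrate the kind of behavior I have in mind, here is a rough sketch. It is not the tool's actual API: `generate_and_stream`, `generate_batch`, and `fake_generate` are hypothetical names I made up, and the stand-in generator just uppercases strings. The idea is only that each batch's results get written to the output file as soon as they are produced instead of being held until the end.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def generate_and_stream(
    batches: Iterable[List[str]],
    generate_batch: Callable[[List[str]], List[str]],
    out_path: str,
) -> int:
    """Write each batch's results to disk right after they are produced."""
    written = 0
    with open(out_path, "w") as fh:
        for batch in batches:
            # Hypothetical model call; in a real run this would execute on the GPU.
            results = generate_batch(batch)
            # If the results were PyTorch tensors, this is the point where they
            # could be moved off the GPU, e.g. with tensor.detach().cpu().
            for seq in results:
                fh.write(seq + "\n")
            written += len(results)
            fh.flush()   # push this batch to the file before generating the next
            del results  # drop the reference so nothing accumulates across batches
    return written


def fake_generate(batch: List[str]) -> List[str]:
    # Stand-in for the real generator (assumption for illustration only).
    return [s.upper() for s in batch]


out_path = os.path.join(tempfile.mkdtemp(), "generated.txt")
n_written = generate_and_stream([["acgt", "ttga"], ["ggcc"]], fake_generate, out_path)
with open(out_path) as fh:
    lines = fh.read().splitlines()
```

With something like this, memory use would depend on the batch size rather than on the total number of sequences, but I don't know whether the tool's generation loop is structured in a way that allows it.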
Thanks