I am reopening this discussion and turning it into an issue. It is not ideal that serial processing or a massive reduction in batch size are considered the only solutions. Some thoughts:
- Maybe GGIR should reserve not just one core per input file when running GGIR part 1, but 2 or 3 cores.
- Documentation needs to better cover this challenge.
- In relation to UK Biobank: I am wondering whether it would help if they had the processed output of part 1 available, rather than expecting everyone to run the exact same time-consuming process on the CWA files, wasting time and computing resources. Possibly making a new release once every two years.

Originally posted by @vincentvanhees in #1441 (comment)