All examples in the documentation use --ntasks (or -n) to specify the number of CPUs needed per GPU. This generally works fine, but external tools such as submitit attach their own meaning to ntasks, which can lead to issues. It might be better to use the --cpus-per-gpu Slurm option explicitly in the examples to avoid this. In my tests requesting GPUs on the debug node and on wice, both options behaved identically.
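
To make the suggestion concrete, here is a minimal sketch of what a job header could look like with each option. The partition name and the core count of 18 are placeholders I picked for illustration, not values taken from the documentation, so adjust them to the actual hardware.

```shell
#!/bin/bash
# Illustrative job header only: the partition and the core count are
# assumed placeholders, not values copied from the documentation.
#SBATCH --clusters=wice
#SBATCH --partition=<gpu_partition>
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1

# Current style in the examples: the cores are requested as tasks.
# (The doubled '#' keeps sbatch from reading this directive.)
##SBATCH --ntasks=18

# Proposed style: leave the task count at its default and tie the
# cores to the GPU explicitly.
#SBATCH --cpus-per-gpu=18

# Print what Slurm actually granted, to compare the two variants.
echo "Tasks:        ${SLURM_NTASKS:-unset}"
echo "CPUs per GPU: ${SLURM_CPUS_PER_GPU:-unset}"
```

With the second form the request still covers the same number of cores, but a tool that starts one process per task should see a single task instead of 18. Note that Slurm does not allow --cpus-per-gpu to be combined with --cpus-per-task.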