Question about multimem in NCCL

Hello, authors. Thank you for your wonderful work!
I have a question about multimem on H100 GPUs. In Figure 5 (comparing all-reduce (default) with all-reduce (multimem)), torch.distributed.all_reduce() is treated as the result with multimem turned off. Does this mean that NCCL's multimem feature (NVLS) is turned off by default in vLLM?