Similar to https://github.com/microsoft/Megatron-DeepSpeed/pull/50, we want a Megatron fork based on MCR-DL. This means creating a new branch of https://github.com/OSU-Nowlab/Megatron-LM and replacing all calls to torch.distributed with their MCR-DL equivalents. I'll test it once it's ready.
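
To scope the work, it may help to list every torch.distributed call site in the new branch first. The script below is a rough sketch for that, not part of either repo: it uses only the Python standard library, the helper names `find_dist_calls` and `scan_tree` are made up for illustration, and it assumes the code refers to the module either as `torch.distributed` or through the common alias `dist`. It only reports matches; it does not attempt the MCR-DL replacement itself.

```python
import re
from pathlib import Path

# Heuristic: matches `torch.distributed.<name>` and the common alias `dist.<name>`.
# Assumption: the codebase uses one of these two spellings; other aliases are missed.
PATTERN = re.compile(r"\b(?:torch\.distributed|dist)\.(\w+)")


def find_dist_calls(source: str):
    """Return (line_number, attribute_name) pairs for each match in `source`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root: str):
    """Map each .py file under `root` to its hits; files without hits are skipped."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = find_dist_calls(text)
        if hits:
            report[str(path)] = hits
    return report


if __name__ == "__main__":
    import sys

    # Usage: python scan_dist_calls.py <path-to-checkout>
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for filename, hits in scan_tree(root).items():
        for lineno, name in hits:
            print(f"{filename}:{lineno}: {name}")
```

Running it against the checkout root prints one `file:line: name` entry per match, which can double as a checklist while porting. Because it is a plain regex, it can also flag unrelated attributes named `dist`, so the output should be reviewed by hand.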