Commit 8521e3d

tushar00jain authored and meta-codesync[bot] committed

fix allreduce usage (meta-pytorch#279)

Summary: Pull Request resolved: meta-pytorch#279

Pass the full allreduce options to the process group allreduce to avoid triggering the watchdog abort.

Reviewed By: d4l3k
Differential Revision: D84101243
fbshipit-source-id: b2ecf071d9943d75d3f21091805b92948d6e6617

1 parent 7e1c8d1 commit 8521e3d

File tree

1 file changed: +3 −1 lines changed

torchft/manager.py
Lines changed: 3 additions & 1 deletion

@@ -423,7 +423,9 @@ def allreduce(
                 torch.accelerator.current_stream(),
             )
         else:
-            work = self._pg.allreduce([tensor], reduce_op)
+            opts = AllreduceOptions()
+            opts.reduceOp = reduce_op
+            work = self._pg.allreduce([tensor], opts)

         # schedule grad normalization as a continuation
         # on the Future
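The change above replaces a bare reduce op with a full options object, so the process group sees every collective option (per the commit summary, this avoids triggering the watchdog abort). Below is a minimal, hypothetical sketch of that pattern. `AllreduceOptions` and `FakeProcessGroup` here are stand-ins written for illustration, not the real `torch.distributed` bindings; the point is only that an options struct can carry a timeout alongside the reduce op, whereas a raw op leaves everything else at defaults.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class AllreduceOptions:
    # Stand-in for the c10d options struct: the reduce op plus a
    # per-collective timeout travel together in one object.
    reduceOp: str = "sum"
    timeout: timedelta = timedelta(minutes=30)


class FakeProcessGroup:
    """Illustrative process group; a real one would launch the collective."""

    def allreduce(self, tensors, opts):
        # Report which op and timeout the collective would run with,
        # i.e. what a watchdog would see and enforce.
        return f"allreduce(op={opts.reduceOp}, timeout={opts.timeout})"


pg = FakeProcessGroup()
opts = AllreduceOptions()
opts.reduceOp = "avg"
print(pg.allreduce(["tensor"], opts))
```

With a bare op argument there is no slot for the timeout at all; building the options object first, as the patched `manager.py` line does, keeps all collective settings in one place.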

0 commit comments