It looks like DDP is not being triggered in your case for some reason: if you are not changing the batch_size, the total number of batches shown in the progress bar should be reduced when running DDP on 4 GPUs.

Did you see any logs like this when you called trainer.fit?

Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/4
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/4
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/4
Initializing distributed: GLOBAL_RANK: 3, MEMBER: 4/4
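As a rough illustration of why the progress bar shrinks under DDP: each process gets its own shard of the dataset, so each rank iterates over roughly `dataset_size / world_size` samples. The helper below is a hypothetical sketch (not Lightning code), assuming the default `DistributedSampler` behavior of padding so every rank gets a full shard:

```python
import math

def batches_per_rank(dataset_size: int, batch_size: int, world_size: int) -> int:
    # With DDP, DistributedSampler splits the dataset across ranks,
    # so each process sees roughly dataset_size / world_size samples.
    samples_per_rank = math.ceil(dataset_size / world_size)
    # Each rank then forms its own batches from its shard.
    return math.ceil(samples_per_rank / batch_size)

# Example: 10,000 samples, batch_size 32
print(batches_per_rank(10_000, 32, 1))  # 313 batches on a single GPU
print(batches_per_rank(10_000, 32, 4))  # 79 batches per rank with DDP on 4 GPUs
```

So with DDP active on 4 GPUs, the progress bar (which counts one rank's batches) should show roughly a quarter as many steps per epoch; if it still shows the single-GPU count, DDP likely never started.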

Answer selected by akihironitta