Hi Authors,
Congrats on your paper's acceptance at NeurIPS!!
I have one question:
If I understand correctly, the code intends to accumulate gradients and update the networks every K iterations (K can be set via accum_iter, e.g., accum_iter=2 in your provided script). However, I notice that when .backward() is called, retain_graph=True is not set. Does this mean the gradients are not accumulated, and we end up updating the network using only the gradient from the K-th iteration instead of the accumulated gradients from iterations 1 through K?
Line 259 in 9fa1ef5:

```python
self._scaler.scale(loss).backward(create_graph=create_graph)
```
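
For reference, here is a minimal sketch of the accumulation pattern I have in mind, using plain SGD without the AMP scaler (all names here are illustrative, not the repo's code); my current understanding of PyTorch's semantics is in the comments:

```python
import torch

# Minimal sketch of gradient accumulation: each iteration runs a fresh
# forward pass, so .backward() builds (and frees) a new graph every time.
# retain_graph=True is only needed when backpropagating through the SAME
# graph more than once; it is unrelated to accumulating .grad buffers.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
accum_iter = 2  # K: update the network every K iterations

for step in range(8):
    x = torch.randn(4, 10)
    loss = model(x).pow(2).mean() / accum_iter  # scale so the K-step sum averages
    loss.backward()  # ADDS into each parameter's .grad buffer

    if (step + 1) % accum_iter == 0:
        optimizer.step()       # apply gradients summed over the last K steps
        optimizer.zero_grad()  # reset accumulation for the next window
```

I just want to confirm the same accumulation across all K backward calls happens in your code even with retain_graph left at its default.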
Thanks in advance for your time~