Description
In the `WeightDecay` regularization class, the code replaces the parameter's gradient with the gradient of the regularization term:

```python
param.grad = self.regularize(param)
```

Should it instead add the regularization gradient to the existing parameter gradient? i.e.:

```python
param.grad.add_(self.regularize(param))
```
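For context, here is a minimal runnable sketch of the pattern under discussion. Only the class name `WeightDecay` and the `regularize` method come from the snippet quoted above; everything else (the `apply` method, the `weight_decay` constructor argument, the L2 form of the penalty) is a hypothetical reconstruction, not the library's actual code:

```python
import torch

class WeightDecay:
    """Hypothetical sketch of the regularizer discussed in this issue.

    Only `WeightDecay` and `regularize` are taken from the quoted snippet;
    the rest is an assumed reconstruction for illustration.
    """

    def __init__(self, weight_decay: float):
        self.weight_decay = weight_decay

    def regularize(self, param: torch.Tensor) -> torch.Tensor:
        # Gradient of (weight_decay / 2) * ||param||^2 with respect to param.
        return self.weight_decay * param.detach()

    def apply(self, module: torch.nn.Module) -> None:
        for param in module.parameters():
            if param.grad is None:
                continue
            # Overwriting (the behaviour reported above) would discard the
            # data-loss gradient:
            #   param.grad = self.regularize(param)
            # Accumulating keeps both contributions:
            param.grad.add_(self.regularize(param))
```

With accumulation, the final gradient is the sum of the data-loss gradient and the penalty gradient, which is what standard L2 weight decay computes:

```python
model = torch.nn.Linear(4, 2)
wd = WeightDecay(weight_decay=1e-4)

loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()  # populates param.grad with the data-loss gradient
wd.apply(model)  # adds the weight-decay gradient on top
```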