1 file changed: +13 -5 lines changed
1616"""Redirected ReLu Gradient Overrides
1717
18- When visualizing models we often[0] have to optimize through ReLu activation
19- functions. Where accessing pre-relu tensors is too hard, we use these
20- overrides to allow gradient to flow back through the ReLu—even if it didn't
21- activate ("dead neuron") and thus its derivative is 0.
+ When we visualize ReLU networks, the initial random input we give the model may
+ not cause the neuron we're visualizing to fire at all. For a ReLU neuron, this
+ means that no gradient flows backwards and the visualization never takes off.
+ One solution would be to find the pre-ReLU tensor, but that can be tedious.
+
+ These functions provide a more convenient solution: temporarily override the
+ gradient of ReLUs to allow gradient to flow back through the ReLU -- even if it
+ didn't activate and had a derivative of zero -- allowing the visualization
+ process to get started.

Usage:
```python
from lucid.misc.gradient_override import gradient_override_map
from lucid.misc.redirected_relu_grad import redirected_relu_grad

with gradient_override_map({'Relu': redirected_relu_grad}):
-   model.import_graph(…)
+   model.import_graph(...)
```
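
A gradient override here is just a Python function that takes the op and its
incoming gradient and returns a replacement gradient. The following is only a
rough sketch of the idea, not lucid's actual `redirected_relu_grad`: it is
written against TensorFlow 1.x-style APIs and assumes the optimizer performs
gradient descent on a loss such as a negated visualization objective.

```python
import tensorflow as tf


def redirected_relu_grad_sketch(op, grad):
  """Illustrative redirected ReLU gradient (per-element version)."""
  x = op.inputs[0]
  # Under gradient descent (x <- x - lr * grad), a positive incoming gradient
  # pushes the pre-ReLU input down. For dead units (x < 0) that would only
  # drive them further from activating, so block exactly that case and let
  # every other gradient component pass through.
  blocked = tf.logical_and(x < 0., grad > 0.)
  return tf.where(blocked, tf.zeros_like(grad), grad)
```

Passing such a function through `gradient_override_map({'Relu': ...})`, as in
the usage example above, would apply it to the ReLUs created while importing
the graph.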

Discussion:
…
change this behavior to allow gradient pushing the input into a desired regime
between these points.
+ (This override first checks if the entire gradient would be blocked, and only
+ changes it in that case. It does this check independently for each batch entry.)
+
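
One way such a per-batch-entry check could look, again only as a sketch under
the same TensorFlow 1.x and gradient-descent assumptions as above (the function
name is illustrative): compute the magnitude of the ordinary ReLU gradient for
each batch entry, and fall back to the redirected gradient only for entries
where that magnitude is zero.

```python
import tensorflow as tf


def redirected_relu_grad_batched_sketch(op, grad):
  """Illustrative redirected ReLU gradient with a per-batch-entry check."""
  x = op.inputs[0]

  # Ordinary ReLU gradient: zero wherever the unit did not activate.
  relu_grad = tf.where(x < 0., tf.zeros_like(grad), grad)

  # Redirected gradient: block only dead units that would be pushed further
  # negative; everything else passes through.
  blocked = tf.logical_and(x < 0., grad > 0.)
  redirected_grad = tf.where(blocked, tf.zeros_like(grad), grad)

  # For each batch entry, check whether the ordinary gradient is entirely
  # zero, and use the redirected gradient only for those entries.
  # (In TF1, tf.where with a vector condition selects whole rows.)
  batch = tf.shape(relu_grad)[0]
  grad_mag = tf.norm(tf.reshape(relu_grad, [batch, -1]), axis=1)
  return tf.where(grad_mag > 0., relu_grad, redirected_grad)
```
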
In effect, this replaces the relu gradient with the following:

Regime | Effect