Commit 3c7be9b

fix: jacobian matrix instead of approximation
1 parent 5871657 commit 3c7be9b

1 file changed: +3 -1 lines changed
neuralnetlib/layers.py

Lines changed: 3 additions & 1 deletion
@@ -2693,7 +2693,9 @@ def backward_pass(self, output_error: np.ndarray) -> np.ndarray | tuple[np.ndarr

  d_attention = np.matmul(d_attention_output, np.transpose(self.reshaped_value, (0, 1, 3, 2)))
  attention_probs = self.attention_weights
- d_attention_probs = d_attention * (attention_probs - np.power(attention_probs, 2))
+ dot = np.sum(d_attention * attention_probs, axis=-1, keepdims=True)
+ d_attention_probs = d_attention - dot
+ d_attention_probs = d_attention_probs * attention_probs

  d_values = np.matmul(attention_probs.transpose(0, 1, 3, 2), d_attention_output)
  d_query = np.matmul(d_attention_probs, self.reshaped_key)
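
A minimal sketch of what the change appears to do, assuming `attention_probs` is a softmax over the last axis. For a softmax output p with upstream gradient g, the Jacobian is diag(p) - p pᵀ, so the gradient with respect to the scores is p * (g - sum(g * p)). The removed line, g * (p - p²), keeps only the diagonal of that Jacobian. The helper names, the tensor shape, and the random inputs below are hypothetical and not part of neuralnetlib; the script only compares both formulas against an explicitly built Jacobian.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability.
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)


def softmax_backward_old(grad, probs):
    # Formula removed by the commit: diagonal Jacobian terms only, p_i * (1 - p_i).
    return grad * (probs - np.power(probs, 2))


def softmax_backward_new(grad, probs):
    # Formula added by the commit: full Jacobian-vector product, p * (g - sum(g * p)).
    dot = np.sum(grad * probs, axis=-1, keepdims=True)
    return (grad - dot) * probs


def softmax_backward_explicit(grad, probs):
    # Reference: build J[i, j] = p_i * (delta_ij - p_j) for every row and apply it to grad.
    eye = np.eye(probs.shape[-1])
    jac = probs[..., :, None] * (eye - probs[..., None, :])
    return np.einsum("...ij,...j->...i", jac, grad)


rng = np.random.default_rng(0)
scores = rng.normal(size=(2, 3, 4, 4))  # assumed (batch, heads, query, key) layout
grad = rng.normal(size=scores.shape)    # stands in for d_attention
probs = softmax(scores)                 # stands in for attention_probs

new = softmax_backward_new(grad, probs)
old = softmax_backward_old(grad, probs)
ref = softmax_backward_explicit(grad, probs)

print("new matches explicit Jacobian:", np.allclose(new, ref))
print("old matches explicit Jacobian:", np.allclose(old, ref))
```

The three-line form in the commit avoids materializing the per-row Jacobian: it needs one reduction over the key axis and two elementwise operations, whereas the explicit reference allocates an extra trailing dimension of the same size.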
