
Fix missing parentheses in TF Adam/Yogi update denominator#2691

Open
Chessing234 wants to merge 1 commit into d2l-ai:master from Chessing234:fix/adam-yogi-tf-parentheses

Conversation

@Chessing234

Bug

In chapter_optimization/adam.md, the TensorFlow implementations of adam() and yogi() write the parameter update as:

p[:].assign(p - hyperparams['lr'] * v_bias_corr
            / tf.math.sqrt(s_bias_corr) + eps)

Root cause

Python operator precedence parses a / b + c as (a / b) + c, so this expression is:

p - (lr * v_bias_corr / sqrt(s_bias_corr)) + eps

That is, eps is added directly to the parameter update, instead of being added to the denominator for numerical stability.
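A minimal self-contained sketch (scalar values and an exaggerated `eps` chosen purely for illustration, not the book's hyperparameters) showing how the two parses diverge:

```python
# Illustrative scalar values; eps is exaggerated so the difference is visible.
lr, v_bias_corr, s_bias_corr, eps, p = 0.01, 0.5, 0.04, 1e-3, 1.0

# Buggy parse: (lr * v / sqrt(s)) is computed first, then eps is added
# to the parameter itself.
buggy = p - lr * v_bias_corr / (s_bias_corr ** 0.5) + eps

# Intended parse: eps stabilizes the denominator.
fixed = p - lr * v_bias_corr / ((s_bias_corr ** 0.5) + eps)

print(buggy, fixed)  # the two updates differ
```

With realistic values like `eps = 1e-6` the numerical gap per step is tiny, but adding `eps` to the parameter is still wrong, and the buggy form no longer guards against a zero denominator when `s_bias_corr` is 0.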

The prose in this section and the paired PyTorch implementations a few lines above (for both adam and yogi) correctly use:

p[:] -= hyperparams['lr'] * v_bias_corr / (torch.sqrt(s_bias_corr) + eps)

So this is a simple missing-parentheses bug in the TF code blocks only.

Fix

Wrap the denominator in parentheses so eps is inside the sqrt(...) + eps term, matching the formula and the PyTorch tabs:

p[:].assign(p - hyperparams['lr'] * v_bias_corr
            / (tf.math.sqrt(s_bias_corr) + eps))

Two call sites updated (adam, yogi), 4 lines changed.
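For context, a minimal pure-Python sketch of a single scalar Adam step with the corrected denominator (hyperparameter defaults here are illustrative, not taken from the book's code):

```python
import math

def adam_step(p, grad, v, s, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-6):
    """One scalar Adam update; t is the 1-based time step."""
    v = beta1 * v + (1 - beta1) * grad          # first-moment estimate
    s = beta2 * s + (1 - beta2) * grad ** 2     # second-moment estimate
    v_bias_corr = v / (1 - beta1 ** t)          # bias correction
    s_bias_corr = s / (1 - beta2 ** t)
    # eps belongs inside the denominator: sqrt(s_bias_corr) + eps
    p = p - lr * v_bias_corr / (math.sqrt(s_bias_corr) + eps)
    return p, v, s

p, v, s = 1.0, 0.0, 0.0
p, v, s = adam_step(p, grad=0.5, v=v, s=s, t=1)
```

After bias correction on the first step, `v_bias_corr` recovers the raw gradient scale, so the update is approximately `lr * grad / |grad|`, i.e. close to `lr` in magnitude.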

