Fix: Remove momentum from SGD to show standard optimizer behavior #205
Chapter: 11
Cell: Faster Optimizers [47]
Changed:
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
To:
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)
Why the change was needed:
While working through Chapter 11, I noticed that the SGD optimizer was being initialized with a momentum value.
Since this section compares various optimizers, including plain SGD, keeping momentum here misrepresents how standard SGD behaves on its own. Momentum certainly improves performance, but it is already introduced separately in a later cell.
I removed the momentum parameter so the optimizer now reflects vanilla SGD.
**I’ve attached the optimizer loss plots from before and after the change (see the images below).**
In the original version, the “SGD” curve performed better than expected; after inspecting the code, I realized it was actually using momentum. After the fix, the SGD curve shows the slower convergence we typically expect from plain SGD.
Before:

After:
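
In case it helps reviewers reproduce the comparison, here is a rough sketch of how the two optimizers could be trained side by side and their loss curves plotted. The dataset (Fashion MNIST), model architecture, and epoch count are placeholders chosen for illustration, not the notebook's exact setup.

```python
import matplotlib.pyplot as plt
import tensorflow as tf

# Illustrative data and model only -- the notebook's actual setup may differ.
(X_train, y_train), _ = tf.keras.datasets.fashion_mnist.load_data()
X_train = X_train / 255.0  # scale pixel values to [0, 1]

def build_model():
    # A small dense network, just enough to make the optimizer comparison visible.
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=[28, 28]),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

optimizers = {
    "SGD": tf.keras.optimizers.SGD(learning_rate=0.001),
    "SGD + momentum": tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9),
}

for name, optimizer in optimizers.items():
    model = build_model()
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer=optimizer, metrics=["accuracy"])
    history = model.fit(X_train, y_train, epochs=10, verbose=0)
    plt.plot(history.history["loss"], label=name)

plt.xlabel("Epoch")
plt.ylabel("Training loss")
plt.legend()
plt.show()
```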