
ShindeShivam (Contributor)

Chapter: 11
Cell: Faster Optimizers [47]

Changed:

```python
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
```

To:

```python
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)
```

Why change was needed:

While working through Chapter 11, I noticed that the SGD optimizer was being initialized with a momentum value.
Since this section compares various optimizers, including plain SGD, initializing it with momentum misrepresents how standard SGD behaves. Momentum does improve performance, but it is added to SGD separately in a later cell.
I removed the momentum parameter so the optimizer now reflects vanilla SGD.

**I’ve attached the optimizer loss plots from before and after the change (see images below).**
In the original version, the “SGD” curve seemed to perform better than expected — and after inspecting the code, I realized it was actually using momentum. After the fix, SGD’s curve is now visible and shows the slower convergence we typically expect from it.

Before:

[Screenshot 2025-07-18 at 6 06 50 PM]

After:

[Screenshot 2025-07-18 at 6 07 40 PM]
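The difference behind those two curves can be sketched in plain Python (this is not the notebook's code, just an illustration; the momentum rule `velocity = momentum * velocity - lr * grad; w += velocity` is my assumption of the Keras-style update with `nesterov=False`):

```python
def sgd_step(w, grad, lr=0.001):
    """Vanilla SGD: the step depends only on the current gradient."""
    return w - lr * grad

def sgd_momentum_step(w, velocity, grad, lr=0.001, momentum=0.9):
    """SGD with momentum: past gradients keep contributing via the velocity."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# On a constant gradient, the momentum velocity accumulates, so the weight
# moves much further in the same number of steps. That is why the original
# "SGD" curve converged faster than vanilla SGD should.
w_plain = 1.0
w_mom, v = 1.0, 0.0
for _ in range(10):
    w_plain = sgd_step(w_plain, grad=1.0)
    w_mom, v = sgd_momentum_step(w_mom, v, grad=1.0)

# w_plain has moved 10 * lr = 0.01; w_mom has moved roughly 4x further.
```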

@ShindeShivam (Contributor, Author)

I’ve also opened a couple of other PRs recently (#196 and #202)

@ageron (Owner) commented Aug 10, 2025

Great catch, thanks! Merged. 👍
