
Conversation

@deadlywing
Collaborator

  • Remove policy_sgd for simplicity
  • Support beaver cache optimization for both the inner product and the activation

@gemini-code-assist
Contributor

Summary of Changes

Hello @deadlywing, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the Stochastic Gradient Descent (SGD) implementation by removing the policy_sgd strategy and integrating SPU beaver cache optimization. The primary goal is to enhance the efficiency of secure computation by minimizing communication costs associated with frequently used intermediate values, such as the feature matrix and activation inputs. This change streamlines the SGD codebase and introduces a performance-critical optimization for privacy-preserving machine learning.
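To make the communication saving concrete: in each SGD step the secret-shared feature matrix participates in two multiplications (forward pass and gradient), so the Beaver triples bound to it can in principle be generated once and reused. The plain-JAX sketch below only illustrates that access pattern; the names (sgd_step, x_with_bias) are mine, and the comment about spu.experimental.make_cached_var is an assumption about how the cache is enabled, not something confirmed in this thread.

```python
import jax.numpy as jnp

def sgd_step(x_with_bias, y, w, lr=0.1):
    # Under SPU, x_with_bias would be the cache candidate, e.g. wrapped with
    # something like spu.experimental.make_cached_var (exact API is an
    # assumption); both matmuls below reuse the same secret operand.
    z = jnp.matmul(x_with_bias, w)            # first multiplication: forward pass
    pred = 1.0 / (1.0 + jnp.exp(-z))          # sigmoid activation
    grad = jnp.matmul(x_with_bias.T, pred - y) / y.shape[0]  # second multiplication
    return w - lr * grad
```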

Highlights

  • Removed policy_sgd strategy: The policy_sgd strategy, along with its associated learning rate decay mechanism, has been removed from the SGD implementation for simplification.
  • Introduced SPU Beaver Cache Optimization: The SGD models now support SPU beaver cache optimization, specifically for the inner product calculation (feature matrix X_with_bias) and optionally for activation function inputs, to reduce communication overhead in secure computation.
  • Refactored SGD Training Loop: The internal _fit method in SGDBase has been significantly refactored to pre-split data into batches, manage cached variables, and use jax.lax.fori_loop for performance when early stopping is disabled, or jax.lax.while_loop when it is enabled (see the sketch after this list).
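
Here is a minimal, self-contained JAX sketch of the two loop shapes just mentioned: fori_loop when the epoch count is static, while_loop when early stopping can cut it short. train, update_step, and the pre-split batches_x/batches_y arrays are hypothetical names for illustration, not the PR's _fit implementation.

```python
import jax
import jax.numpy as jnp

def train(batches_x, batches_y, w0, update_step, n_epochs, tol=None):
    """Hypothetical helper: batches_x has shape (n_batches, batch_size, n_features),
    and update_step(w, xb, yb) performs one mini-batch update."""
    n_batches = batches_x.shape[0]

    def run_epoch(w):
        def body(i, w):
            # Traced index into the pre-split batch arrays.
            return update_step(w, batches_x[i], batches_y[i])
        return jax.lax.fori_loop(0, n_batches, body, w)

    if tol is None:
        # No early stopping: the trip count is static, so fori_loop suffices.
        return jax.lax.fori_loop(0, n_epochs, lambda _, w: run_epoch(w), w0)

    # Early stopping: the trip count is data-dependent, so use while_loop.
    def cond(state):
        epoch, w, w_prev = state
        return jnp.logical_and(epoch < n_epochs,
                               jnp.linalg.norm(w - w_prev) >= tol)

    def body(state):
        epoch, w, _ = state
        return epoch + 1, run_epoch(w), w

    # Run one epoch up front so the convergence check has a previous weight.
    _, w_final, _ = jax.lax.while_loop(cond, body, (1, run_epoch(w0), w0))
    return w_final
```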


@rivertalk requested a review from danqi437 on December 19, 2025 03:14

@gemini-code-assist (bot) left a comment


Code Review

This pull request significantly refactors the SGD implementation to simplify the code by removing the policy_sgd strategy and introducing SPU beaver cache optimizations. The new structure is more modular and readable, especially the _fit method and the improved early stopping logic. The tests are also well-updated to cover the new functionality, including a valuable performance profiling test for the cache feature.

My review identifies a critical performance issue: the new batching strategy is not compatible with JAX's JIT compilation for large datasets. I've also pointed out two medium-severity edge cases that could lead to division-by-zero errors. Addressing these points will ensure the implementation is both robust and performant.
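
For context on the JIT point: one common JIT-compatible way to pull a mini-batch is to slice with a static batch size and a traced start index, so compiled shapes stay fixed no matter which batch is requested. The sketch below only illustrates that general pattern; it is not the reviewer's concrete suggestion or the PR's code.

```python
from functools import partial
import jax

@partial(jax.jit, static_argnames=("batch_size",))
def get_batch(x, y, i, batch_size):
    # slice_size (batch_size) is static; the start index may be traced,
    # so the same compiled program serves every batch index.
    start = i * batch_size
    xb = jax.lax.dynamic_slice_in_dim(x, start, batch_size, axis=0)
    yb = jax.lax.dynamic_slice_in_dim(y, start, batch_size, axis=0)
    return xb, yb
```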

@rivertalk requested a review from oeqqwq on December 19, 2025 03:14
@deadlywing
Collaborator (Author)

deadlywing commented Dec 19, 2025

@danqi437 Could you benchmark the performance in sf?

Test setup: 50,000 × 100 dataset, batch size 1024, 10 epochs, with L2 regularization, sig_type t5, no early stopping, no train/test split.

Training + prediction:
SML with cache: 237 MB communication, 15,661 rounds
SML without cache: 604 MB communication, 15,661 rounds

@danqi437 (Collaborator) left a comment


LGTM

@deadlywing merged commit 1cb0573 into main on Dec 26, 2025
6 checks passed
@deadlywing deleted the zjj/add_mul_beaver_cache branch on December 26, 2025 07:55
