@ZiyiTsang ZiyiTsang commented Oct 11, 2025

This pull request expands optimizer support in the codebase, letting users choose among AdamW, SGD, and a new AnyPrecision AdamW optimizer (adam_bf16). It also adds a custom AnyPrecisionAdamW implementation for mixed-precision training and updates the documentation and CLI argument validation accordingly.

Optimizer Support Enhancements:

  • Added support for sgd and adam_bf16 optimizer types in OptimizerConfig, and updated CLI argument validation to reflect these new options.
  • Updated the base HuggingFace engine (areal/engine/base_hf_engine.py) to allow selection between AdamW, SGD, and the new AnyPrecisionAdamW optimizer, invoking the appropriate optimizer based on user configuration.
  • Updated the Megatron engine (areal/experimental/megatron_engine.py) to allow both AdamW and SGD optimizers, passing the selected type to the optimizer configuration.
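The selection logic described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `OptimizerConfigSketch`, `SUPPORTED_OPTIMIZERS`, and `resolve_optimizer` are hypothetical names, the `"adam"` type string is an assumption, and the optimizer classes are represented by their names as plain strings so the example runs without PyTorch.

```python
from dataclasses import dataclass

# Hypothetical mapping from config value to optimizer class name, per backend.
# Per the description above, the Megatron engine accepts only AdamW and SGD.
SUPPORTED_OPTIMIZERS = {
    "fsdp": {"adam": "AdamW", "sgd": "SGD", "adam_bf16": "AnyPrecisionAdamW"},
    "megatron": {"adam": "AdamW", "sgd": "SGD"},
}


@dataclass
class OptimizerConfigSketch:
    """Stand-in for OptimizerConfig; the real class has more fields."""

    type: str = "adam"  # assumed default, for illustration only
    lr: float = 1e-5


def resolve_optimizer(cfg: OptimizerConfigSketch, backend: str) -> str:
    """Return the optimizer class name for this backend, or fail early."""
    choices = SUPPORTED_OPTIMIZERS[backend]
    if cfg.type not in choices:
        raise ValueError(
            f"optimizer type {cfg.type!r} is not supported on {backend}; "
            f"choose one of {sorted(choices)}"
        )
    return choices[cfg.type]


print(resolve_optimizer(OptimizerConfigSketch(type="adam_bf16"), "fsdp"))
```

Validating the type against the backend at configuration time means an unsupported combination fails before any model or optimizer state is allocated.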

New Optimizer Implementation:

  • Added a new AnyPrecisionAdamW optimizer in areal/utils/fsdp/optimizer.py, supporting flexible precision (bfloat16/float32) and optional Kahan summation for improved numerical stability during mixed-precision training.
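To show why Kahan summation matters when weights are stored in bfloat16, here is a small pure-Python sketch. It is not the AnyPrecisionAdamW code: `to_bf16`, `plain_updates`, and `kahan_updates` are hypothetical helpers, bfloat16 is simulated by rounding float32 bit patterns, and a constant update stands in for the real Adam step.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of a float32; round the dropped half.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + bias) & 0xFFFFFFFF) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def plain_updates(w: float, update: float, steps: int) -> float:
    """Apply the update directly to a bfloat16 weight."""
    for _ in range(steps):
        w = to_bf16(w + update)
    return w


def kahan_updates(w: float, update: float, steps: int) -> float:
    """Apply the update through a bfloat16 compensation buffer."""
    comp = 0.0
    for _ in range(steps):
        # Add this step's update to the bits lost in earlier steps.
        comp = to_bf16(comp + update)
        new_w = to_bf16(w + comp)
        # Keep whatever part of comp did not make it into the weight.
        comp = to_bf16(comp - (new_w - w))
        w = new_w
    return w


# Near 1.0 the bfloat16 spacing is about 0.004, so a 0.001 step rounds away.
print("plain:", plain_updates(1.0, -1e-3, 200))
print("kahan:", kahan_updates(1.0, -1e-3, 200))
```

In this setup the plain loop never moves the weight off 1.0, because each individual step is smaller than half the bfloat16 spacing, while the compensated loop ends close to the exact result of 0.8. The price is one extra buffer per parameter.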

The AdamW_bf16 and SGD optimizers may be useful for avoiding OOM, at the cost of reduced training stability.

Note
AnyPrecisionAdamW should not be modified, as it is taken from another repository's implementation.
PR #366 should be merged after this PR.

Optimizer Support

| Optimizer | FSDP | Megatron |
| --- | --- | --- |
| AdamW | Yes | Yes |
| SGD | Yes | Yes |
| AdamW_bf16 | Yes | No |

@ZiyiTsang
Collaborator Author

/gemini review

gemini-code-assist[bot]

This comment was marked as resolved.

@ZiyiTsang
Collaborator Author

PR #366 should be merged after this PR.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@ZiyiTsang ZiyiTsang marked this pull request as ready for review October 12, 2025 09:05
@ZiyiTsang
Collaborator Author

Ready for review.

Collaborator

@garrett4wade garrett4wade left a comment

LGTM, waiting for the other reviewers' comments.

@garrett4wade garrett4wade requested a review from nuzant October 13, 2025 02:01
Collaborator

@nuzant nuzant left a comment

LGTM, only one small issue.

@rchardx rchardx requested a review from Copilot October 13, 2025 03:04
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request expands optimizer support in the codebase, allowing users to select between Adam, SGD, and a new AnyPrecision AdamW optimizer (adam_bf16) for improved mixed-precision training.

  • Adds support for sgd and adam_bf16 optimizer types across CLI validation, FSDP, and Megatron engines
  • Implements a custom AnyPrecisionAdamW optimizer with flexible precision and optional Kahan summation
  • Refactors optimizer creation from base class to engine-specific implementations
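The refactor mentioned in the last bullet, moving optimizer creation out of the base class and into each engine, might look roughly like the sketch below. The class and method names (`BaseEngineSketch`, `FSDPEngineSketch`, `create_optimizer`) are hypothetical and do not reflect the actual signatures in areal/engine/base_hf_engine.py or areal/engine/fsdp_engine.py; optimizer classes are again represented by name only.

```python
from abc import ABC, abstractmethod


class BaseEngineSketch(ABC):
    """Shared engine logic; optimizer construction is left to subclasses."""

    def __init__(self, optimizer_type: str):
        self.optimizer_type = optimizer_type
        self.optimizer = None

    def initialize(self) -> None:
        # The base class decides when the optimizer is built, not how.
        self.optimizer = self.create_optimizer()

    @abstractmethod
    def create_optimizer(self) -> str:
        """Build the optimizer for this engine."""


class FSDPEngineSketch(BaseEngineSketch):
    def create_optimizer(self) -> str:
        # A real engine would instantiate the optimizer with model parameters.
        if self.optimizer_type == "adam":
            return "AdamW"
        if self.optimizer_type == "adam_bf16":
            return "AnyPrecisionAdamW"
        if self.optimizer_type == "sgd":
            return "SGD"
        raise ValueError(f"unsupported optimizer type: {self.optimizer_type!r}")


engine = FSDPEngineSketch("adam_bf16")
engine.initialize()
print(engine.optimizer)
```

With this layout the base class cannot be instantiated on its own, so every concrete engine has to state which optimizers it supports.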

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| docs/cli_reference.md | Updates documentation to reflect new optimizer choices and parameter applicability |
| areal/utils/fsdp/optimizer.py | Adds new AnyPrecisionAdamW optimizer implementation with precision control |
| areal/experimental/megatron_engine.py | Updates Megatron engine to support AdamW and SGD optimizers |
| areal/engine/ppo/actor.py | Minor formatting changes (whitespace additions) |
| areal/engine/fsdp_engine.py | Implements optimizer creation with support for all three optimizer types |
| areal/engine/base_hf_engine.py | Refactors optimizer creation to abstract method pattern |
| areal/api/cli_args.py | Updates CLI argument validation and adds version check logic |

@rchardx
Collaborator

rchardx commented Oct 13, 2025

/gemini review

gemini-code-assist[bot]

This comment was marked as resolved.

@ZiyiTsang
Collaborator Author

/gemini review

gemini-code-assist[bot]

This comment was marked as resolved.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@ZiyiTsang
Collaborator Author

Done. Please review again, @garrett4wade.

Collaborator

@garrett4wade garrett4wade left a comment

LGTM

@garrett4wade garrett4wade merged commit 9ba6dc4 into inclusionAI:main Oct 13, 2025
1 of 4 checks passed
4 participants