@uralik uralik commented Sep 20, 2025

The previous online DPO fix did not account for the Athene reward also mimicking a dummy batch. Although this is likely not needed there, for now we copy the same logic used in the math verify reward.

@uralik uralik requested a review from cbalioglu as a code owner September 20, 2025 02:51
@meta-cla meta-cla bot added the CLA Signed label Sep 20, 2025
