Adds StreamingDataLoader class following the OLMo-core definition, and changes grpo_fast.py to use it.
#1202
Draft: finbarrtimbers wants to merge 32 commits into main from oc-dataloader
…grpo_fast.py to pull a lot of complexity into a dataloader class
…StreamingDataLoader class following the OLMo-core definition, and changes grpo_fast.py to use it.
Moved work_dir, global_batch_size, dp_world_size, and max_possible_score from StreamingDataLoaderConfig fields to build() method parameters. These values are computed at runtime from Args and should not be CLI arguments.
- work_dir comes from args.output_dir
- global_batch_size comes from args.num_unique_prompts_rollout
- dp_world_size comes from the actual world_size (number of PolicyTrainerRayProcess instances)
- max_possible_score is computed in Args.__post_init__
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
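A minimal sketch of the pattern this commit describes, not the PR's actual code: the config dataclass keeps only user-facing fields, and runtime-derived values are passed to build() as keyword arguments. The `StreamingDataLoader` stand-in, the `pack_length` field shown on the config, and all parameter types are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamingDataLoader:
    """Hypothetical stand-in for the real loader; it only records its inputs."""

    config: "StreamingDataLoaderConfig"
    work_dir: str
    global_batch_size: int
    dp_world_size: int
    max_possible_score: float


@dataclass
class StreamingDataLoaderConfig:
    # Only user-configurable options live on the config (illustrative field).
    pack_length: int = 512

    def build(
        self,
        *,
        work_dir: str,
        global_batch_size: int,
        dp_world_size: int,
        max_possible_score: float,
    ) -> StreamingDataLoader:
        # Runtime-derived values arrive here instead of being dataclass fields,
        # so an argument parser built from the config never exposes them as flags.
        return StreamingDataLoader(
            config=self,
            work_dir=work_dir,
            global_batch_size=global_batch_size,
            dp_world_size=dp_world_size,
            max_possible_score=max_possible_score,
        )


# Example call site: the values below are placeholders for what would be
# derived from Args and the actual world size at runtime.
loader = StreamingDataLoaderConfig().build(
    work_dir="/tmp/output",
    global_batch_size=8,
    dp_world_size=2,
    max_possible_score=1.0,
)
```

Keeping these four values out of the dataclass fields means they cannot be set inconsistently from the command line.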
Moved max_prompt_token_length and response_length to StreamingDataLoaderConfig and added __post_init__ to validate pack_length assertion there instead of in Args. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
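A rough sketch of validating in `__post_init__`, assuming the assertion checks that `pack_length` can hold one maximum-length prompt plus one maximum-length response. The exact condition, the default values, and the error message are assumptions; the PR's real check may differ.

```python
from dataclasses import dataclass


@dataclass
class StreamingDataLoaderConfig:
    # Default values are illustrative only.
    max_prompt_token_length: int = 256
    response_length: int = 256
    pack_length: int = 512

    def __post_init__(self) -> None:
        # Assumed invariant: a packed sequence must fit a full prompt and a
        # full response. Checking here keeps the rule next to the fields it uses.
        assert self.pack_length >= self.max_prompt_token_length + self.response_length, (
            "pack_length must be at least max_prompt_token_length + response_length"
        )


# A config that satisfies the assumed invariant constructs normally.
valid_config = StreamingDataLoaderConfig()
```

With the check on the config itself, an invalid combination fails at construction time rather than later inside Args.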
…erConfig
Removed these fields from Args to avoid argparse conflicts and updated all references to use streaming_config instead. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Added async_steps and num_samples_per_prompt_rollout fields to StreamingDataLoaderConfig and moved the validation logic there. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
…oaderConfig
Refactored to access these values directly from config instead of passing them as parameters. Updated function signatures to pass streaming_config where needed. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Move num_samples_per_prompt_rollout validation to StreamingDataLoaderConfig and fix references to use self instead of self.config. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Moved the inference_batch_size default calculation from Args.__post_init__ to setup_runtime_variables where we have access to streaming_config. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Runs: