Conversation

@luciaquirke
Collaborator

@luciaquirke luciaquirke commented Nov 10, 2025

Because of natural variance across batches, a run can go on for some time before it hits an OOM error. This PR is intended to make that error happen immediately. I'm not sure whether the check belongs in collect_gradients or in the build/query scripts, but it can be more concise if it lives here.

Closes #56
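
For readers skimming the thread, the idea described above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in this PR: `check_worst_case_batch`, `step_fn`, and `max_seq_len` are hypothetical names, and it assumes `token_batch_size` means the maximum number of tokens per batch. With PyTorch on a GPU, the error to catch would be `torch.cuda.OutOfMemoryError` rather than `MemoryError`.

```python
from typing import Callable


def check_worst_case_batch(
    step_fn: Callable[[int, int], None],
    token_batch_size: int,
    max_seq_len: int,
) -> None:
    """Run one worst-case batch up front so memory errors surface immediately.

    step_fn is a hypothetical callable that runs a forward/backward pass on a
    dummy batch of shape (num_sequences, seq_len).
    """
    if max_seq_len > token_batch_size:
        raise ValueError(
            f"max_seq_len={max_seq_len} exceeds token_batch_size={token_batch_size}"
        )

    # Assumption: the largest batch packs as many max-length sequences as fit
    # in the token budget.
    num_sequences = token_batch_size // max_seq_len

    try:
        step_fn(num_sequences, max_seq_len)
    except MemoryError as exc:  # stand-in for a GPU out-of-memory error
        raise RuntimeError(
            f"token_batch_size={token_batch_size} does not fit in memory; "
            "try a smaller value."
        ) from exc
```

Calling this once before the main loop trades a few seconds of startup time for a failure that appears at the start of the run instead of partway through it.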


def validate_batch_size(
    model: PreTrainedModel,
    token_batch_size: int | None,
Collaborator Author

swap arg order

@baberabb

baberabb commented Nov 18, 2025

accelerate has an auto batch utility you could adapt, if you wanted to extend this
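
To illustrate what such an extension might look like, here is a rough sketch of the retry-on-OOM idea. It is not accelerate's API (the utility referred to is presumably `find_executable_batch_size` in `accelerate.utils`); it is a pure-Python stand-in, and `run_with_shrinking_batch` is a hypothetical name. It assumes the wrapped function raises `MemoryError` when the batch does not fit; with PyTorch on a GPU that would be `torch.cuda.OutOfMemoryError`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_shrinking_batch(
    fn: Callable[[int], T],
    starting_batch_size: int,
) -> tuple[int, T]:
    """Call fn(batch_size), halving batch_size after each out-of-memory failure.

    Returns the batch size that succeeded together with fn's result.
    """
    batch_size = starting_batch_size
    while batch_size > 0:
        try:
            return batch_size, fn(batch_size)
        except MemoryError:  # stand-in for a GPU out-of-memory error
            batch_size //= 2
    raise RuntimeError("No batch size fit in memory, not even 1.")
```

Unlike the fail-fast check in this PR, this approach recovers automatically, at the cost of the user no longer getting exactly the batch size they asked for.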


Development

Successfully merging this pull request may close these issues.

Feature: run the maximum sequence length through the model before kicking off the run

3 participants