Skip to content

Conversation

baonudesifeizhai
Copy link

@baonudesifeizhai baonudesifeizhai commented Aug 24, 2025

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ETOgaosion
Copy link

Thanks for your efforts. Thinking it's better for now.

@baonudesifeizhai
Copy link
Author

Thanks for your efforts. Thinking it's better for now.

Thanks for your efforts. Thinking it's better for now.

does the cicd test fail really matter? i change my code status to main andtest it locally ,but still failed...

@vasqu
Copy link
Contributor

vasqu commented Aug 25, 2025

See #40399 (comment), #40424 will introduce a proper fix as _prepare_from_posids is only for training, we no longer need these checks on the seq_lens

@@ -409,6 +409,22 @@ def _prepare_from_posids(query, key, value, position_ids, query_length):
(max_seqlen_in_batch_q, max_seqlen_in_batch_k) (`tuple[int]`):
Maximum sequence length in batch (`max_seqlen_in_batch_q` for the target sequence i.e. query, `max_seqlen_in_batch_k` for the source sequence i.e. key/value).
"""
# Validate query_length only when not compiling to avoid graph breaks
try:
if not torch._dynamo.is_compiling():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to use

from .utils.import_utils import is_torchdynamo_compiling

@FightingZhen
Copy link
Contributor

We also find similar NaN mistake on Ascend NPU without this PR, can we push it to be merged?

@vasqu
Copy link
Contributor

vasqu commented Aug 28, 2025

@FightingZhen can you test #40424 instead? We shouldn't enter this path at all anymore

@vasqu vasqu closed this in #40424 Aug 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants