[DeepSeek v3.2] Make top-k work for any logit values. #27568
Signed-off-by: Daniel Campora <[email protected]>
Code Review
This pull request refactors the top-k kernel to correctly handle logit values that differ only in their least significant bits, by using a multi-pass histogram approach on the full 32-bit float representation. The changes are extensive and introduce a more complex but precise algorithm.
My review has identified several critical issues and areas for improvement:
- There are critical bugs related to memory access and incorrect output indices when rowStart is not zero. The tests should be expanded to cover this case.
- The logic for selecting the sorting algorithm in the host-side launcher functions appears to be flawed, potentially leading to significant performance degradation.
- A fallback sorting algorithm within the kernel is misleadingly commented and has quadratic complexity, which can be a performance bottleneck.
- There is some code duplication that should be addressed to improve maintainability.
I have provided specific comments with code suggestions to address these points. Overall, the direction is good, but these issues need to be fixed before merging.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Purpose
This PR makes top_k_per_row work for any values in logits. Even if the logits differ only in their least significant bytes, top-k is now guaranteed to give the correct answer.
Solves #26554
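To illustrate why looking at the full 32-bit representation matters, here is a minimal CPU-side sketch of a multi-pass histogram selection in plain Python. It is not the kernel from this PR: the function names, the 8-bit digit width, and the tie handling are assumptions chosen for illustration, and NaN inputs are not handled. Each float32 is mapped to an unsigned key whose integer order matches the float order, then the threshold bin is refined one digit at a time until all 32 bits have been examined.

```python
import struct


def float_to_ordered_key(x: float) -> int:
    """Map a float32 value to a uint32 whose unsigned order matches float order."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Negative floats: flip every bit. Non-negative floats: set the sign bit.
    return bits ^ 0xFFFFFFFF if bits & 0x80000000 else bits | 0x80000000


def top_k_indices(logits, k, bits_per_pass=8):
    """Return indices of the k largest logits (hypothetical helper, unordered)."""
    k = min(k, len(logits))
    keys = [float_to_ordered_key(v) for v in logits]
    mask = (1 << bits_per_pass) - 1
    candidates = list(range(len(keys)))
    selected = []
    # Walk the key from the most significant digit to the least significant one.
    for shift in range(32 - bits_per_pass, -1, -bits_per_pass):
        need = k - len(selected)
        if need == 0:
            break
        hist = [0] * (mask + 1)
        for i in candidates:
            hist[(keys[i] >> shift) & mask] += 1
        # Find the bin that contains the need-th largest candidate.
        above, threshold = 0, mask
        while above + hist[threshold] < need:
            above += hist[threshold]
            threshold -= 1
        # Everything in a higher bin is certainly in the top k.
        selected += [i for i in candidates
                     if ((keys[i] >> shift) & mask) > threshold]
        # Only the threshold bin is still ambiguous; refine it in the next pass.
        candidates = [i for i in candidates
                      if ((keys[i] >> shift) & mask) == threshold]
    # Remaining candidates have identical keys, so any of them may fill the rest.
    selected += candidates[: k - len(selected)]
    return selected
```

Because the last pass inspects the lowest 8 bits of the key, two logits that differ by a single unit in the last place still land in different bins, which a selection that stops after the high-order bits cannot guarantee.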
Test Plan
The test test_top_k_per_row has been extended to cover these cases too. The new cases failed before this change and pass now.
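As a rough illustration of the kind of input such a test needs, the sketch below builds logits that differ only in their low mantissa bits and compares a top-k implementation against a full-sort reference. It does not reproduce test_top_k_per_row: `adversarial_logits`, `reference_top_k`, `check_top_k`, and the `top_k_fn(logits, k)` callback signature are hypothetical names introduced here, not part of the PR.

```python
import random
import struct


def adversarial_logits(n, base=1.0, spread_bits=8, seed=0):
    """Return n distinct float32 values that differ only in their low mantissa bits."""
    assert n <= (1 << spread_bits), "not enough distinct low-bit patterns"
    rng = random.Random(seed)
    base_bits = struct.unpack("<I", struct.pack("<f", base))[0]
    base_bits &= ~((1 << spread_bits) - 1)  # clear the low bits of the base value
    offsets = rng.sample(range(1 << spread_bits), n)  # distinct low-bit patterns
    return [struct.unpack("<f", struct.pack("<I", base_bits | o))[0]
            for o in offsets]


def reference_top_k(logits, k):
    """Reference answer: indices of the k largest values via a full sort."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return set(order[:k])


def check_top_k(top_k_fn, n=200, k=16, seed=0):
    """Return True if top_k_fn(logits, k) selects the same index set as the reference."""
    logits = adversarial_logits(n, seed=seed)
    return set(top_k_fn(logits, k)) == reference_top_k(logits, k)
```

Since all generated values are distinct, the expected index set is unique, so comparing sets is enough and the check does not depend on how an implementation breaks ties.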
Test Result