Grouped gemm cutlass #22
Merged: jikunshang merged 50 commits into vllm-project:main from Liangliang-Ma:grouped_gemm_cutlass on Sep 25, 2025
Commits
d41cf57 add flash attention interface (jikunshang)
ce9f31d update interface (jikunshang)
fb6784f add cutlass deps (#1) (jikunshang)
ce27fa2 add chunk_prefill step<1> (YizhouZ)
ed0f846 fix register (YizhouZ)
b02a5a8 fix cmake (YizhouZ)
a4a76ee debug msg (YizhouZ)
ee1b719 functional ready (YizhouZ)
4ef938f dev base (Liangliang-Ma)
480c72f base of grouped_gemm_fp8 (Liangliang-Ma)
24709b8 update func (Liangliang-Ma)
f5757a9 add test (Liangliang-Ma)
435e6df update functor (Liangliang-Ma)
f76fb97 update grouped_gemm (Liangliang-Ma)
9408e94 build ready (Liangliang-Ma)
439cf3c base integration done (Liangliang-Ma)
48abd9f grouped gemm base ready (Liangliang-Ma)
67eeb47 gemm2 use cutlass grouped_mm (Liangliang-Ma)
a62752f gemm1 use cutlass group_mm (Liangliang-Ma)
cfb724b rm flash_attn in this pr (Liangliang-Ma)
f7518e0 rebase CMakeLists (Liangliang-Ma)
083bde5 use main Cmakes (Liangliang-Ma)
48a4808 use main setup (Liangliang-Ma)
22d1ade mv utils (Liangliang-Ma)
c0e70c4 Merge branch 'main' into grouped_gemm_cutlass (Liangliang-Ma)
1c7f46d finish rebase (Liangliang-Ma)
df0b915 add profile and change to col-maj (Liangliang-Ma)
76fe4bc dont not reserve block_C (Liangliang-Ma)
ad0fdd6 remove redundant allocation (Liangliang-Ma)
54e64a7 e2e debug (Liangliang-Ma)
3c40008 add release func (Liangliang-Ma)
985004d gemm args allocate once (Liangliang-Ma)
9c18092 hidden_states copy (Liangliang-Ma)
a47ecef output bf16 (Liangliang-Ma)
1a2d655 use static tensor buffer (Liangliang-Ma)
f7dee65 remove ptr_C (Liangliang-Ma)
ad2dc48 fix device lost (Liangliang-Ma)
56cb570 acc and oom fixed (Liangliang-Ma)
81555ab Fix acc and oom issue (Liangliang-Ma)
d1edf17 base (Liangliang-Ma)
55f36a8 update CMakeLists (Liangliang-Ma)
54e7219 Merge branch 'main' into grouped_gemm_cutlass (Liangliang-Ma)
513377a refactor csrc of cutlass (Liangliang-Ma)
534c7c3 put src in vllm (Liangliang-Ma)
1fc6959 add adapter src (Liangliang-Ma)
db6b292 clean up (Liangliang-Ma)
d651d9d add test (Liangliang-Ma)
a29cfa6 clean up (Liangliang-Ma)
c66f152 fix format (Liangliang-Ma)
a681e73 fix format f841 (Liangliang-Ma)
Why use a private fork of cutlass-sycl?
I will rebase to cutlass-sycl/main.