forked from NVIDIA/cutlass
Pull requests: intel/sycl-tla
Support nullptr value of argument ptr_C for xe_array_epilogue
#541
opened Sep 29, 2025 by
sanchitintel
Use newer version of mma_atom and copy_atom in 00_bmg_gemm
#540
opened Sep 29, 2025 by
anamikac-intel
Add dimension check to prevent out-of-bounds access in example 05_bmg_gemm_with_epilogue_splitk
#529
opened Sep 23, 2025 by
ClarkChin08
[WIP] Added support for Rotary Embedding in flash_attention
#523
opened Sep 19, 2025 by
pralay-das
•
Draft
Add a new tile scheduler for varlen prefill to avoid launching empty work groups
#516
opened Sep 18, 2025 by
carsonwang
Also use column-major B matrix in the example 00_bmg_gemm.cpp
#510
opened Sep 13, 2025 by
sanchitintel
Remove redundant code from GroupGEMM implementation
#508
opened Sep 12, 2025 by
sanchitintel
Example of FP32 -> BF16 conversion in epilogue of GEMM
#506
opened Sep 12, 2025 by
sanchitintel
•
Draft
1 task
Support FP32 -> BF16 conversion in epilogue of GroupedGEMM
#505
opened Sep 12, 2025 by
sanchitintel
•
Draft
Support fp32 accumulation for bf16 gemm and grouped gemm
#482
opened Aug 27, 2025 by
wuxun-zhang
[WIP] FP8 scaledMM with DeepSeek-style dequantization
#453
opened Jul 2, 2025 by
sanchitintel
•
Draft
4 tasks