Commit 6a7e5d8

renganxu authored and facebook-github-bot committed
Update early prune to support any N (#4734)
Summary:
Pull Request resolved: #4734

X-link: facebookresearch/FBGEMM#1754

When N is small and not a multiple of BLOCK_N, e.g. N=221 with M=121, the early prune removes every config and an empty config list is returned. This leads to the error "ValueError: min() arg is an empty sequence". This diff fixes the issue.

Reviewed By: sunfish2010

Differential Revision: D80542648

fbshipit-source-id: 8e29d90ac87978b1bb4c69002b3583d10ba5f8a3
1 parent e8778c7 commit 6a7e5d8
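
For context, the reported error is what Python raises when min() is called on an empty collection. The sketch below reproduces only that failure mode; pick_best and the timings dict are hypothetical stand-ins for whatever step selects the fastest surviving config, not the actual autotuner code.

```python
def pick_best(timings):
    # Hypothetical stand-in: choose the config with the smallest measured time.
    return min(timings, key=timings.get)


# Assumption: every candidate config was pruned, so nothing was benchmarked.
timings = {}

try:
    pick_best(timings)
    error_message = None
except ValueError as exc:
    # The exact wording of the message varies between Python versions.
    error_message = str(exc)

print(error_message)
```

A pruning function that can return an empty list therefore fails far from the real cause, which is why the fix targets the tile count rather than the selection step.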

1 file changed, +1 -1 lines changed

fbgemm_gpu/experimental/gemm/triton_gemm/grouped_gemm.py

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ def early_config_prune(configs, named_args, dtsize=None, dtype=None, **kwargs):
         num_sm = driver.active.utils.get_device_properties(device)[
             "multiprocessor_count"
         ]
-        N_TILES = N // BLOCK_N
+        N_TILES = (N + BLOCK_N - 1) // BLOCK_N
         MIN_N_TILES = 32 if torch.version.hip else 64
         # 4. make sure we don't load N tiles that are too big
         if (
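
The one-line change swaps floor division for ceiling division when counting tiles along N. A minimal sketch of the arithmetic follows, using N=221 from the commit message; the BLOCK_N values (64, 128, 256) and the helper names are assumptions chosen for illustration, not taken from the config list in the file.

```python
def n_tiles_floor(n, block_n):
    # Old behaviour: a partial trailing tile is not counted.
    return n // block_n


def n_tiles_ceil(n, block_n):
    # New behaviour: round up so a partial trailing tile counts as one tile.
    return (n + block_n - 1) // block_n


N = 221  # value quoted in the commit message
for block_n in (64, 128, 256):  # assumed block sizes, for illustration only
    print(block_n, n_tiles_floor(N, block_n), n_tiles_ceil(N, block_n))
```

With floor division, any BLOCK_N larger than N yields zero tiles, so a heuristic that compares a product involving N_TILES against the SM count sees a value of zero. Ceiling division reports at least one tile for any positive N.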
