Commit df1c7e4

karturov and Copilot committed
gpu: fix (avoid) jit matmul post-ops with layout not matching dst
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent f0c6428 commit df1c7e4

1 file changed: +19 -0 lines changed

src/gpu/intel/gemm/jit/gen_kernel.cpp

Lines changed: 19 additions & 0 deletions
@@ -401,9 +401,28 @@ status_t gen_desc_t::transfer_post_ops(
 
         bool trans = is_multi_row && !src_rmd.inner_dim.is_innermost();
 
+        // Reject col-only binary post-ops with non-unit stride in the
+        // col direction. Non-unit stride requires Scattered access with
+        // correct multi-block offsets, which is not yet supported.
+        {
+            int rmd_ndims = src_rmd.ndims();
+            bool col_only = is_multi_col && !is_multi_row;
+            if (col_only
+                    && !src_rmd.is_inner_dim(rmd_ndims - 2, rmd_ndims))
+                return status::unimplemented;
+        }
+
         if (swap_ab) {
             trans = !trans;
             std::swap(is_multi_row, is_multi_col);
+
+            // After swap, original row-only becomes col-only. Col
+            // direction is now ndims-1 instead of ndims-2.
+            int rmd_ndims = src_rmd.ndims();
+            bool col_only = is_multi_col && !is_multi_row;
+            if (col_only
+                    && !src_rmd.is_inner_dim(rmd_ndims - 1, rmd_ndims))
+                return status::unimplemented;
         }
 
         problem_.Tbinary.push_back(T);
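
The two guards added in this hunk can be read in isolation with a small standalone sketch. It is not oneDNN code: `toy_md_t`, `check_col_only`, and the local `status` enum are hypothetical names, and the sketch assumes that `is_inner_dim` means "this dimension is walked with unit stride", which the diff itself does not confirm.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the descriptor used in the diff (not the oneDNN type).
struct toy_md_t {
    std::vector<long> strides; // element stride of each dimension

    int ndims() const { return static_cast<int>(strides.size()); }

    // Assumption: a dimension counts as "inner" when its stride is 1.
    bool is_inner_dim(int idx) const {
        return idx >= 0 && idx < ndims() && strides[idx] == 1;
    }
};

enum class status { success, unimplemented };

// Mirrors the order of the two checks in the hunk: one before the A/B swap
// (col direction taken as ndims-2) and one after it (col direction ndims-1).
status check_col_only(const toy_md_t &md, bool is_multi_row,
        bool is_multi_col, bool swap_ab) {
    const int nd = md.ndims();

    // Pre-swap guard: col-only broadcast needs unit stride along ndims-2.
    if (is_multi_col && !is_multi_row && !md.is_inner_dim(nd - 2))
        return status::unimplemented;

    if (swap_ab) {
        // The flags are local copies, so swapping them does not leak out.
        std::swap(is_multi_row, is_multi_col);

        // Post-swap guard: a former row-only post-op is now col-only.
        if (is_multi_col && !is_multi_row && !md.is_inner_dim(nd - 1))
            return status::unimplemented;
    }
    return status::success;
}
```

With a two-dimensional descriptor whose strides are `{4, 1}`, a col-only post-op without swap is rejected under these assumptions, while a row-only post-op with `swap_ab` set passes because the last dimension has unit stride.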
