Skip to content

Work around a jit post-op issue with layout not matching dst#4864

Open
karturov wants to merge 1 commit intorls-v3.11from
karturov/MFDNN-14794
Open

Work around a jit post-op issue with layout not matching dst#4864
karturov wants to merge 1 commit intorls-v3.11from
karturov/MFDNN-14794

Conversation

@karturov
Copy link
Contributor

@karturov karturov commented Mar 20, 2026

Fixes MFDNN-14794. Backport to v3.11 (to allow ref path in Graph SDPA training work).

@karturov karturov requested review from a team as code owners March 20, 2026 06:39
@github-actions github-actions bot added documentation A request to change/fix/improve the documentation. Codeowner: @oneapi-src/onednn-doc platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel component:graph-api Codeowner: @oneapi-src/onednn-graph component:tests Codeowner: @oneapi-src/onednn-arch component:build component:common third_party labels Mar 20, 2026
@karturov karturov changed the base branch from main to rls-v3.11 March 20, 2026 06:41
@jondea jondea removed the request for review from a team March 20, 2026 09:28
@Sqvid Sqvid removed documentation A request to change/fix/improve the documentation. Codeowner: @oneapi-src/onednn-doc platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 component:graph-api Codeowner: @oneapi-src/onednn-graph component:tests Codeowner: @oneapi-src/onednn-arch component:build component:common third_party labels Mar 20, 2026
@Sqvid Sqvid removed request for a team March 20, 2026 11:18
@Sqvid Sqvid removed the request for review from a team March 20, 2026 11:18
Comment on lines +494 to +497
else if (col) {
emul(1, tempQ0, j0, ld, strategy, state);
eadd(1, offset, offset, tempQ0.reinterpret(0, offset.getType()), strategy, state);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes don't look right. The intent is that we pass state.inputs.binaryLDs only when we have 2D post-ops. For either row or column (1D) post-ops, we assume packed data, and here we scale by the data type size to get the stride.

Comment on lines +415 to +424
// For col-only binary, check whether the col direction has unit stride.
// Col direction corresponds to dim[ndims-2] (no swap) or dim[ndims-1] (swap_ab).
bool col_only = is_multi_col && !is_multi_row;
int rmd_ndims = src_rmd.ndims();
bool col_unit = !col_only
|| (swap_ab ? src_rmd.is_inner_dim(rmd_ndims - 1, rmd_ndims)
: src_rmd.is_inner_dim(rmd_ndims - 2, rmd_ndims));
// Non-unit stride requires Scattered access with correct multi-block offsets,
// which is not yet supported. Reject at PD time to avoid silent wrong results.
if (!col_unit) return status::unimplemented;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix may work, but only this block of code is needed. Everything else looks wrong. Please also move this to before the if (swap_ab) block above, and assume swap_ab = false, then inside of the if (swap_ab) block do any necessary "fixing up". Eventually this logic will be moved somewhere else, so we need to keep it separate from the base logic.

Copy link
Contributor

@atkassen atkassen Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we force 2d in these cases? E.g.

        if (!is_multi_row && is_multi_col && !src_rmd.is_inner_dim(rmd_ndims - 2, rmd_ndims))
            is_multi_row = true;

edit: nevermind, I thought the problem shape had N=1.

@karturov karturov force-pushed the karturov/MFDNN-14794 branch from 34180a5 to df1c7e4 Compare March 20, 2026 21:07
@vpirogov vpirogov marked this pull request as draft March 20, 2026 21:21
@karturov karturov changed the title [DRAFT] Work around a jit post-op issue with layout not matching dst Work around a jit post-op issue with layout not matching dst Mar 20, 2026
@karturov karturov marked this pull request as ready for review March 20, 2026 23:07
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Simonsays095 Simonsays095 force-pushed the karturov/MFDNN-14794 branch from df1c7e4 to e4543be Compare March 20, 2026 23:25
@Simonsays095
Copy link
Contributor

@rjoursler Please comment on the logic related to relative_md_t. I'm not that familiar with the semantics of is_innermost and relative mds, so I want to make sure you're on board with the changes.

@karturov
Copy link
Contributor Author

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_matmul
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants