[LAYOUTS] Implement toLinearLayout for TensorMemoryEncodingAttr #7748
This `toLinearLayout` is a bit different from the way we construct the layouts for SharedEncoding. In particular, here we map the tensor into the memory, and not the other way around. This is to be able to model the `packed=True` version of the layout, where we map two different elements to the same `M/N` location. Representing it this way is not an issue in practice, as we always use these layouts by composing their inverse with a distributed layout, so this way we simply have the inverse already at hand.
Different approach after discussing it with @ThomasRaoux: we now implement it as a map from hardware to the tensor, like all the other layouts, which makes everything simpler. Updated the OP to reflect this.
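To make the hardware-to-tensor direction concrete, here is a small Python sketch (Triton's actual `LinearLayout` lives in C++ with a richer API; the bases below are hypothetical). A linear layout is linear over GF(2): the tensor coordinate of a hardware index is the XOR of the basis vectors selected by the set bits of that index.

```python
# Illustrative sketch only, not Triton's real LinearLayout API.
# A layout over GF(2) maps a hardware index to a tensor coordinate by
# XOR-ing together the basis vectors selected by the index's set bits.

def apply_layout(bases, hw_index):
    """Map a hardware index to a flattened tensor coordinate."""
    out = 0
    for bit, basis in enumerate(bases):
        if (hw_index >> bit) & 1:
            out ^= basis
    return out

# Hypothetical bases: the identity map on a 4-element dimension.
identity_bases = [0b01, 0b10]
print([apply_layout(identity_bases, i) for i in range(4)])  # [0, 1, 2, 3]

# Hypothetical "swapped" bases: bit 0 of the hardware index now selects
# the high coordinate bit, permuting the elements.
swapped_bases = [0b10, 0b01]
print([apply_layout(swapped_bases, i) for i in range(4)])  # [0, 2, 1, 3]
```

Because the map goes hardware → tensor, composing it with (the inverse of) a distributed layout follows the same convention as every other layout, which is what makes the unified direction simpler.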
LGTM, one comment about the linear representation that is probably going to be important
// We model packed layouts as having the rows/cols dimensions of bitwidth=16.
// This means that a layout with unpacked=True is the same as one with
// unpacked=False.
I think we will want to track this at byte granularity. For scales we do have 8-bit data in the tensor memory, so I think that would help when handling this.
Let's revisit this once we do the scales, but it sounds like a reasonable ask.
We do so by modelling M/N as describing elements and not the hardware 32-bit registers. This allows us to avoid the issue of having two elements point to the same register when `unpacked=False`. We also tighten the `MemDescType` verifier and the `TensorMemoryEncodingAttr` verifier to be consistent with the definition we are using. Doing this forced us to update a ton of lit tests that were silently wrong...
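A small arithmetic sketch of the element-vs-register distinction (the constants and names here are illustrative assumptions, not taken from the implementation): with 16-bit elements packed into 32-bit tensor-memory registers, indexing by register sends two elements to the same coordinate, while indexing by element keeps the map injective.

```python
# Illustrative only; values and names are assumptions, not Triton's code.
REG_BITS = 32    # width of a tensor-memory register
ELEM_BITS = 16   # element bitwidth in the unpacked=False (packed) case
elems_per_reg = REG_BITS // ELEM_BITS  # two elements share one register

# Register-indexed N coordinate: two elements collapse onto one register,
# so element -> coordinate is no longer injective.
reg_coords = [n // elems_per_reg for n in range(4)]

# Element-indexed N coordinate: every element keeps a distinct location,
# which is the modelling this PR adopts.
elem_coords = [n for n in range(4)]

print(reg_coords)   # [0, 0, 1, 1]
print(elem_coords)  # [0, 1, 2, 3]
```

Since the layouts are consumed by composing with distributed layouts, keeping the map one-to-one at the element level avoids any special-casing for `unpacked=False`.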