[mlir][linalg] Add getCollapsedVecType and update vectorization of linalg.unpack #151503

Merged
banach-space merged 2 commits into main from users/banach-space/update-vec-unpack on Aug 1, 2025

Conversation

banach-space
Contributor

This patch introduces a new helper, getCollapsedVecType, and updates
vectorizeAsTensorUnpackOp to use it. The motivation stems from improving
how vector.shape_cast operations are generated when vectorizing
linalg.unpack.

Previously, the vectorizer relied on

  • tensor::CollapseShapeOp::inferCollapsedType

to compute the collapsed vector type. This approach is suboptimal
because:

  • inferCollapsedType lacks awareness of scalable vector flags.
  • Linalg vectorization should not depend on Tensor dialect utilities.

Instead of relocating inferCollapsedType, we introduce
getCollapsedVecType — a lightweight, specialized hook that:

  • Assumes no dynamic sizes.
  • Handles scalable flags alongside shape dimensions.

This change also reduces temporary variables in
vectorizeAsTensorUnpackOp and paves the way for a cleaner update in
#149293.
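
As a worked illustration (the types below are hypothetical, chosen for this description rather than taken from the patch), collapsing multiplies the static sizes within each re-association group and ORs their scalable flags:

  input type:     vector<2x[4]x8xf32>
  reassociation:  [[0, 1], [2]]
  result type:    vector<[8]x8xf32>   (2 x [4] collapses to [8], which stays scalable)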

@llvmbot
Member

llvmbot commented Jul 31, 2025

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Andrzej Warzyński (banach-space)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/151503.diff

1 file affected:

  • (modified) mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp (+45-11)
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index ea68b1ad572c3..a82f31d988f76 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1831,6 +1831,46 @@ vectorizeAsTensorPackOp(RewriterBase &rewriter, linalg::PackOp packOp,
   return success();
 }
 
+/// Given the re-associations, "collapses" the input Vector type.
+///
+/// This is similar to CollapseShapeOp::inferCollapsedType with two notable
+/// differences:
+///   * We can safely assume that there are no dynamic sizes.
+///   * Scalable flags are updated alongside regular dims.
+///
+/// When collapsing scalable flags, this conservatively rejects cases with two
+/// or more scalable dims. We could revisit this in the future.
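+///
+/// Example (illustrative types, not an excerpt from the patch):
+///   input:          vector<2x[4]x8xf32>
+///   reassociation:  [[0, 1], [2]]
+///   result:         vector<[8]x8xf32>
+/// Sizes within each group are multiplied and their scalable flags are OR-ed,
+/// hence [8] remains scalable.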
+static VectorType getCollapsedVecType(VectorType type,
+                                      ArrayRef<AffineMap> reassociation) {
+  assert(type.getNumScalableDims() < 2 &&
+         "Collapsing more than 1 scalable dim is not supported ATM");
+
+  // Use the fact that reassociation is valid to simplify the logic: only use
+  // each map's rank.
+  assert(isReassociationValid(reassociation) && "invalid reassociation");
+
+  auto shape = type.getShape();
+  auto scalableFlags = type.getScalableDims();
+  SmallVector<int64_t> newShape;
+  SmallVector<bool> newScalableFlags;
+
+  unsigned currentDim = 0;
+  for (AffineMap m : reassociation) {
+    unsigned dim = m.getNumResults();
+    int64_t size = 1;
+    bool flag = false;
+    for (unsigned d = 0; d < dim; ++d) {
+      size *= shape[currentDim + d];
+      flag |= scalableFlags[currentDim + d];
+    }
+    newShape.push_back(size);
+    newScalableFlags.push_back(flag);
+    currentDim += dim;
+  }
+
+  return VectorType::get(newShape, type.getElementType(), newScalableFlags);
+}
+
 /// Vectorize a `linalg::UnPackOp` to these 4 Ops:
///   vector::TransferReadOp - Reads a vector from the source tensor
///   vector::TransposeOp - Transposes the source tensor
@@ -1928,23 +1968,17 @@ vectorizeAsTensorUnpackOp(RewriterBase &rewriter, linalg::UnPackOp unpackOp,
   PackingMetadata packMetadata;
   SmallVector<int64_t> lastDimToInsertPosPerm =
       getUnPackInverseSrcPerm(unpackOp, packMetadata);
-  ShapedType maskedOpShapedType = cast<ShapedType>(readResult.getType());
-  SmallVector<int64_t> stripMineShape(maskedOpShapedType.getShape());
-  mlir::Type stripMineElemType = maskedOpShapedType.getElementType();
-  applyPermutationToVector(stripMineShape, lastDimToInsertPosPerm);
-  RankedTensorType stripMineTensorType =
-      RankedTensorType::get(stripMineShape, stripMineElemType);
   // Transpose the appropriate rows to match output.
   vector::TransposeOp transposeOp = vector::TransposeOp::create(
       rewriter, loc, readResult, lastDimToInsertPosPerm);
 
   // Collapse the vector to the size required by result.
-  RankedTensorType collapsedType = tensor::CollapseShapeOp::inferCollapsedType(
-      stripMineTensorType, packMetadata.reassociations);
-  mlir::VectorType vecCollapsedType =
-      VectorType::get(collapsedType.getShape(), collapsedType.getElementType());
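+  // Convert the re-association indices into a list of affine maps, which is
+  // the form that getCollapsedVecType expects.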
+  VectorType collapsedVecType = getCollapsedVecType(
+      transposeOp.getType(),
+      getSymbolLessAffineMaps(convertReassociationIndicesToExprs(
+          rewriter.getContext(), packMetadata.reassociations)));
   vector::ShapeCastOp shapeCastOp = vector::ShapeCastOp::create(
-      rewriter, loc, vecCollapsedType, transposeOp->getResult(0));
+      rewriter, loc, collapsedVecType, transposeOp->getResult(0));
 
   Operation *write = createWriteOrMaskedWrite(
       rewriter, loc, shapeCastOp.getResult(), unpackOp.getDest(),

@egebeysel egebeysel left a comment
Contributor

Thanks! Looks much cleaner to me. I'm not sure if we unit-test this kind of thing, but if we don't, then LGTM 😃

Comment on lines +1862 to +1872
for (unsigned d = 0; d < dim; ++d) {
  size *= shape[currentDim + d];
  flag |= scalableFlags[currentDim + d];
}
Contributor

nit: maybe we could add a small example here for clarity; that way future readers wouldn't have to go check inferCollapsedType.

Contributor Author

Great suggestion!

@banach-space
Contributor Author

I'm not sure if we unit-test this kind of thing

Not really, otherwise we would end up with many unit tests 😅. From https://llvm.org/docs/TestingGuide.html#unit-tests

In general unit tests are reserved for targeting the support library and other generic data structure, we prefer relying on regression tests for testing transformations and analysis on the IR.

Admittedly, this leaves room for interpretation. From my experience, we don't really test such helper hooks in isolation. Instead, we rely on IR tests for the transformations in which such hooks are used. Hopefully this makes sense :)

@banach-space force-pushed the users/banach-space/update-vec-unpack branch from 7fc78d2 to 4d6cb14 on August 1, 2025 at 10:15
@banach-space merged commit 77363fb into main on Aug 1, 2025
9 checks passed
@banach-space deleted the users/banach-space/update-vec-unpack branch on August 1, 2025 at 10:26
hanhanW pushed a commit to iree-org/llvm-project that referenced this pull request Aug 1, 2025
…nalg.unpack (llvm#151503)
