This is an archive of the discontinued LLVM Phabricator instance.

[mlir][MemRefToLLVM] Remove the code for lowering collaspe/expand_shape
ClosedPublic

Authored by qcolombet on Oct 21 2022, 12:37 PM.

Details

Summary

collapse_shape/expand_shape are supposed to be expanded before we hit the lowering code.
The expansion is done by the pass called expand-strided-metadata.
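For instance (a hypothetical input, not taken from this patch's tests), an op like the one below is what that pass is expected to rewrite in terms of memref.extract_strided_metadata and memref.reinterpret_cast before the MemRefToLLVM patterns run:

// Hypothetical example: after expand-strided-metadata, this collapse is expressed
// with extract_strided_metadata + reinterpret_cast, so the dedicated
// MemRefToLLVM pattern removed here never sees it.
%collapsed = memref.collapse_shape %src [[0, 1]]
    : memref<4x5xf32> into memref<20xf32>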

This patch is NFC in spirit but not in practice, because expand-strided-metadata won't try to accommodate "invalid" strides for dynamic sizes that are 1 at runtime.

The previous code was broken in that respect too, but differently: it handled only the case of row-major layouts.
That whole part is being reworked separately.

Diff Detail

Event Timeline

qcolombet created this revision.Oct 21 2022, 12:37 PM
qcolombet requested review of this revision.Oct 21 2022, 12:37 PM

Could you comment (potentially in the commit description) on why the length of the generated IR in the tests increases significantly after what is said to be a simplification? Were the previous tests just checking a subset of the IR operations that were actually generated?

Hi,

Could you comment (potentially in the commit description) on why the length of the generated IR in the tests increases significantly after what is said to be a simplification? Were the previous tests just checking a subset of the IR operations that were actually generated?

Good point; I didn't repeat the message explaining that from https://reviews.llvm.org/D136377.

Here is the relevant part:

This patch is NFC in spirit but not in practice because subview [here expand/collapse_shape] gets lowered into reinterpret_cast(extract_strided_metadata, <some math>), which lowers into two memref descriptors (one for reinterpret_cast and one for extract_strided_metadata). This creates some noise of the form extractvalue(unrealized_cast(extractvalue[0]))[0] that is currently not simplified within MLIR, but that is really just a noop in that case.

Note: This patch builds on top of https://reviews.llvm.org/D136377, so it suffers the same problem as that one, i.e., the affine-to-std and arith-to-llvm dependencies.

As far as the term "simplification" goes, I should really say "expansion". I'll fix that.

Cheers,
-Quentin

Oh, and I forgot.
Another reason for having more IR is that we kept each affine expression independent of the others, so some terms may be repeated between two expressions.
E.g.,

newSize = oldSize1 * oldSize2
finalOffset = oldOffset * oldSize1 * oldSize2

Here oldSize1 * oldSize2 could be reused (or CSE'd) but is currently expanded twice (once for each affine.apply expression).
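A minimal sketch of what that looks like at the affine level (hypothetical SSA names, not taken from the patch): each affine.apply carries its own map, so the oldSize1 * oldSize2 product is materialized once per map when the applies are lowered:

// Hypothetical values; the product of %oldSize1 and %oldSize2 is expanded
// separately inside each map instead of being shared between the two applies.
%newSize = affine.apply affine_map<()[s0, s1] -> (s0 * s1)>()[%oldSize1, %oldSize2]
%finalOffset = affine.apply affine_map<()[s0, s1, s2] -> (s0 * s1 * s2)>()[%oldOffset, %oldSize1, %oldSize2]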

So if I understand correctly, we are now emitting more arithmetic that we are unable to simplify at the MLIR level. This sounds concerning and seems to defeat the purpose of using affine maps, which should be easy-to-compose closed-form expressions. Do we expect LLVM's CSE to simplify this? Is there any indication of this actually happening or not? (In one of my previous projects, we've seen a performance improvement attributable to better/simpler address generation via memrefs, which we may now undo...)

nicolasvasilache accepted this revision.Oct 27 2022, 8:29 AM

This is similar to https://reviews.llvm.org/D136377, feel free to just land this with a similar solution once the first one is agreed on and landed.

This revision is now accepted and ready to land.Oct 27 2022, 8:29 AM

So if I understand correctly, we are now emitting more arithmetic that we are unable to simplify at the MLIR level.

That's correct.

This sounds concerning and seems to defeat the purpose of using affine maps, which should be easy-to-compose closed-form expressions.

When I first wrote this lowering (see https://reviews.llvm.org/D133166 for the details), we started by composing the maps, but I found that it made the resulting IR hard to reason about. Now that I am more familiar with MLIR, and affine maps in particular, maybe this is not as bad. I.e., we could revert this decision if it causes any problems.

Do we expect LLVM's CSE to simplify this?

Yes, I would expect LLVM's CSE to pick it up since we are dealing with simple math operations.

Is there any indication of this actually happening or not?

I haven't actually checked. I'll do that.
How do you go from the llvm dialect to actual LLVM IR?

That said, this is an interesting issue. Do we expect MLIR to produce the most optimized/concise code possible, or do/can we rely on the lower layers to do the cleanups?

If it is the former, then, for instance, why are we lowering dead code to begin with?

(In one of my previous projects, we've seen a performance improvement attributable to better/simpler address generation via memrefs, which we may now undo...)

Let's double check if the CSE happens right now or not.

Reporting on this:

Do we expect LLVM's CSE to simplify this? Is there any indication of this actually happening or not?

Yes, I confirmed that CSE is happening just fine.

Here is what I did:
Old mlir-opt:

mlir-opt -convert-memref-to-llvm -lower-affine -convert-arith-to-llvm  -convert-func-to-llvm -reconcile-unrealized-casts <input>.mlir -o  <output>.mlir
mlir-translate -mlir-to-llvmir  <output>.mlir -o - | opt -S -early-cse -o old-static.ll

New mlir-opt, i.e., with this patch:

# Run the expand pass first (right now it is called simplify-extract-strided-metadata)
mlir-opt -simplify-extract-strided-metadata -convert-memref-to-llvm -lower-affine -convert-arith-to-llvm  -convert-func-to-llvm -reconcile-unrealized-casts <input>.mlir -o  <output>.mlir
mlir-translate -mlir-to-llvmir  <output>.mlir -o - | opt -S -early-cse -o new-static.ll

Result: the IR is semantically equivalent and as performant in both cases. The only difference is the extract_strided_metadata descriptor that stays around if we don't run DCE.
E.g., with the function collapse_shape_static from memref-to-llvm.mlir:

--- old-static-cse.ll   2022-11-05 01:35:30.604898681 +0000
+++ new-static-cse.ll   2022-11-05 01:35:24.384293356 +0000
@@ -1,36 +1,38 @@
 ; ModuleID = '<stdin>'
 source_filename = "LLVMDialectModule"
 
 declare ptr @malloc(i64)
 
 declare void @free(ptr)
 
 define { ptr, ptr, i64, [3 x i64], [3 x i64] } @collapse_shape_static(ptr %0, ptr %1, i64 %2, i64 %3, i64 %4, i64 %5, i64 %6, i64 %7, i64 %8, i64 %9, i64 %10, i64 %11, i64 %12) {
   %14 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } undef, ptr %0, 0
   %15 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %14, ptr %1, 1
   %16 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %15, i64 %2, 2
   %17 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %16, i64 %3, 3, 0
   %18 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %17, i64 %8, 4, 0
   %19 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %18, i64 %4, 3, 1
   %20 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %19, i64 %9, 4, 1
   %21 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %20, i64 %5, 3, 2
   %22 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %21, i64 %10, 4, 2
   %23 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %22, i64 %6, 3, 3
   %24 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %23, i64 %11, 4, 3
   %25 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %24, i64 %7, 3, 4
   %26 = insertvalue { ptr, ptr, i64, [5 x i64], [5 x i64] } %25, i64 %12, 4, 4
-  %27 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } undef, ptr %0, 0
-  %28 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %27, ptr %1, 1
-  %29 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %28, i64 %2, 2
-  %30 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %29, i64 3, 3, 0
-  %31 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %30, i64 4, 3, 1
-  %32 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %31, i64 5, 3, 2
+  %27 = insertvalue { ptr, ptr, i64 } undef, ptr %0, 0
+  %28 = insertvalue { ptr, ptr, i64 } %27, ptr %1, 1
+  %29 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } undef, ptr %0, 0
+  %30 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %29, ptr %1, 1
+  %31 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %30, i64 0, 2
+  %32 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %31, i64 3, 3, 0
   %33 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %32, i64 20, 4, 0
-  %34 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %33, i64 5, 4, 1
-  %35 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %34, i64 1, 4, 2
-  ret { ptr, ptr, i64, [3 x i64], [3 x i64] } %35
+  %34 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %33, i64 4, 3, 1
+  %35 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %34, i64 5, 4, 1
+  %36 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %35, i64 5, 3, 2
+  %37 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %36, i64 1, 4, 2
+  ret { ptr, ptr, i64, [3 x i64], [3 x i64] } %37
 }
 
 !llvm.module.flags = !{!0}
 
 !0 = !{i32 2, !"Debug Info Version", i32 3}

And it becomes even clearer that things are the same if you run instcombine instead of cse:

--- old-static-instcombine.ll   2022-11-05 02:07:09.937594426 +0000
+++ new-static-instcombine.ll   2022-11-05 02:07:17.966374824 +0000
@@ -1,23 +1,23 @@
 ; ModuleID = '<stdin>'
 source_filename = "LLVMDialectModule"
 
 declare ptr @malloc(i64)
 
 declare void @free(ptr)
 
 define { ptr, ptr, i64, [3 x i64], [3 x i64] } @collapse_shape_static(ptr %0, ptr %1, i64 %2, i64 %3, i64 %4, i64 %5, i64 %6, i64 %7, i64 %8, i64 %9, i64 %10, i64 %11, i64 %12) {
   %14 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } undef, ptr %0, 0
   %15 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %14, ptr %1, 1
-  %16 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %15, i64 %2, 2
+  %16 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %15, i64 0, 2
   %17 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %16, i64 3, 3, 0
-  %18 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %17, i64 4, 3, 1
-  %19 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %18, i64 5, 3, 2
-  %20 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %19, i64 20, 4, 0
-  %21 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %20, i64 5, 4, 1
+  %18 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %17, i64 20, 4, 0
+  %19 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %18, i64 4, 3, 1
+  %20 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %19, i64 5, 4, 1
+  %21 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %20, i64 5, 3, 2
   %22 = insertvalue { ptr, ptr, i64, [3 x i64], [3 x i64] } %21, i64 1, 4, 2
   ret { ptr, ptr, i64, [3 x i64], [3 x i64] } %22
 }
 
 !llvm.module.flags = !{!0}
 
 !0 = !{i32 2, !"Debug Info Version", i32 3}

BTW, regarding the dynamic case, I noticed that the new code and the old code differ.

Here is the IR for collapse_shape_dynamic_with_non_identity_layout, after renaming the variables and reordering them for a clearer diff:

--- old.ll      2022-11-05 06:19:12.140322531 +0000
+++ new.ll      2022-11-05 06:22:09.989617895 +0000
@@ -1,31 +1,23 @@
 ; ModuleID = '<stdin>'
 source_filename = "LLVMDialectModule"
 
 declare ptr @malloc(i64)
 
 declare void @free(ptr)
 
 define { ptr, ptr, i64, [2 x i64], [2 x i64] } @collapse_shape_dynamic_with_non_identity_layout(ptr %arg, ptr %arg1, i64 %arg2, i64 %arg3, i64 %arg4, i64 %arg5, i64 %arg6, i64 %arg7, i64 %arg8) {
 bb:
-  %test.not = icmp eq i64 %arg5, 1
-  br i1 %test.not, label %bb34, label %bb36
-
-bb34:                                             ; preds = %bb
-  br label %bb36
-
-bb36:                                             ; preds = %bb34, %bb
-  %stride1 = phi i64 [ %arg7, %bb34 ], [ %arg8, %bb ]
   %desc = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } undef, ptr %arg, 0
   %desc0 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc, ptr %arg1, 1
   %desc1 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc0, i64 %arg2, 2
   %desc2 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc1, i64 4, 3, 0
   %size1 = mul i64 %arg4, %arg5
   %desc3 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc2, i64 %size1, 3, 1
   %desc4 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc3, i64 %arg6, 4, 0
-  %desc5 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc4, i64 %stride1, 4, 1
+  %desc5 = insertvalue { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc4, i64 1, 4, 1
   ret { ptr, ptr, i64, [2 x i64], [2 x i64] } %desc5
 }
 
 !llvm.module.flags = !{!0}
 
 !0 = !{i32 2, !"Debug Info Version", i32 3}

The new code ditches the check for dimensions of size one.
That's interesting because the old code was trying to find a "better" stride for the collapsed dimensions, but I believe this is not generally correct, and if I understand the "specifications" of collapse_shape, this shouldn't be required at all:

Collapsing non-contiguous dimensions is undefined behavior.

Put differently, if we collapse dimensions that are not contiguous, unless I am missing something, our stride would have to skip over gaps within the same dimension, which we don't support right now. Hence, I don't think the old code was doing the right thing here, but it's late, it's Friday, and I may not be thinking clearly anymore :).

That code with the check for stride == 1 comes from https://reviews.llvm.org/D124001.

@cathyzhyi, @springerm, what do you think?
CC'ing @ftynse and @nicolasvasilache for a better "letter of the law" reading of collapse_shape's semantics :).

Collapsing non-contiguous dimensions is undefined behavior.

Non-contiguous dimensions of size 1 can be collapsed. But for dims of size 1 it doesn't really make sense to have a stride in the first place; so such a stride could be simplified, e.g., to just 1. Is that what's happening here?

(Collapsing non-contiguous dimensions of size >1 should crash at runtime, but we don't generate the assert at the moment.)

@springerm thanks for your answer.

But for dims of size 1 it doesn't really make sense to have a stride in the first place; so such a stride could be simplified, e.g., to just 1. Is that what's happening here?

With the new code, what happens when we collapse dimensions is that we take the stride of the innermost dimension as the stride of the whole collapsed group, since the underlying memory is supposed to be contiguous.
e.g.,

collapse_shape <5x?x?x?x?xi16, strided<[?, ?, ?, ?, ?]>>, [[0, 1], [2, 3, 4]] -> <?x?xi16>

dim(0) == 5 x orig_shape.dim(1)
dim(1) == orig_shape.dim(2) x orig_shape.dim(3) x orig_shape.dim(4)
----
stride(0) == orig_shape.stride(1)
stride(1) == orig_shape.stride(4)

Now, this doesn't match what we were doing before this patch, where in codegen we would explicitly skip dimensions of size 1.
I.e., the old version would generate:

dim(0) == 5 x orig_shape.dim(1)
dim(1) == orig_shape.dim(2) x orig_shape.dim(3) x orig_shape.dim(4)
----
stride(0) == orig_shape.dim(1) != 1? orig_shape.stride(1) : orig_shape.stride(0)
stride(1) == orig_shape.dim(4) != 1?
    orig_shape.stride(4) :
    (orig_shape.dim(3) != 1?
        orig_shape.stride(3):
        orig_shape.stride(2))

I understand that strides of dimensions of size 1 don't really make sense, but I found that the old generated code papers over something that is underspecified. Put differently, having to ignore strides for dimensions of size 1 actively harms codegen, and I was wondering whether that is intended.

On my first reading of the spec, I was expecting strides to be contiguous even for size-1 dimensions.

To take a concrete example from @collapse_shape_dynamic_with_non_identity_layout:

%0 = memref.collapse_shape %arg0 [[0], [1, 2]]:
  memref<4x?x?xf32, strided<[?, 4, 1], offset: ?>> into
  memref<4x?xf32, strided<[?, ?], offset: ?>>

Here I was expecting that it is okay to infer that the innermost stride (stride(1)) is 1, since we collapse dimensions with strides 4 and 1, respectively. And if that's not true, we are in undefined-behavior territory.
However, with what you're saying, the stride could be either 1 or 4, depending on whether orig_shape.dim(2) == 1 or not.

Cheers,
-Quentin

A problematic case would be:

memref<3x1xi16, strided<[8, 1]>>

This memref can be collapsed to memref<3xi16, strided<[8]>>. Using stride 1 (as your new code computes) would be incorrect here; we have to skip the stride of the size-1 dim.
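Spelled out as an op (a sketch of the case described above; %m is a hypothetical value):

// Collapsing sizes [3, 1] with strides [8, 1]: the stride of the size-1 dim is
// meaningless and has to be skipped, so the collapsed stride must be 8, not 1.
%collapsed = memref.collapse_shape %m [[0, 1]]
    : memref<3x1xi16, strided<[8, 1]>> into memref<3xi16, strided<[8]>>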

Note: the code that you are looking at is for the "dynamic" case, where strides etc. must be computed at runtime from data in the memref descriptor. We also have a "static" case, where we infer/verify the static layout map when there are no ? strides and/or dims. (If there are a few ?, only part of the layout map can be inferred/verified.) That code path mirrors the one you are looking at and is probably easier to understand and experiment with than the dynamic case, but it implements (or at least should implement) the same logic.

In particular, there is this comment in computeCollapsedLayoutMap (MemRefOps.cpp):

The result stride of a reassociation group is the stride of the last entry
of the reassociation. (...) Dimensions of size 1 should be skipped, because
their strides are meaningless and could have any arbitrary value.

I agree that many things related to strides, layout maps, etc. do not have good documentation and are maybe even underspecified. I fixed multiple bugs in expand_shape/collapse_shape due to this about a year ago. It is probably still not as good as it could be, so any improvements are appreciated!

qcolombet retitled this revision from [mlir][MemRefToLLVM] Reuse existing lowering for collaspe/expand_shape to [mlir][MemRefToLLVM] Remove the code for lowering collaspe/expand_shape.
qcolombet edited the summary of this revision. (Show Details)
  • Rebase
  • Move the expand/collapse_shape tests into expand-then-convert-to-llvm.mlir, since they now require running the expansion pass beforehand
  • Update PR description
  • Run clang-format

Now that we have confirmed that memref.reinterpret_cast does what we need (see https://github.com/llvm/llvm-project/issues/59896), I feel comfortable moving forward with this patch again.

@ftynse what do you think?

ftynse accepted this revision.Jan 20 2023, 2:17 AM