This is an archive of the discontinued LLVM Phabricator instance.


Differential D128986

Fold memref.expand_shape and memref.collapse_shape ops
ClosedPublic

Authored by arnab-oss on Jul 1 2022, 5:52 AM.

Details

Reviewers

bondhugula
dcaballe
nicolasvasilache
mehdi_amini
mravishankar
aartbik

Commits

rG1b002d276835: Fold memref.expand_shape and memref.collapse_shape ops

Summary

Fold memref.expand_shape and memref.collapse_shape ops into
their memref/affine load/store ops.
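
To make the summary concrete, here is a minimal sketch of the index arithmetic behind the collapse_shape fold, written as plain C++ rather than MLIR. The function name `delinearizeIndex` and the use of `std::vector` are assumptions made for illustration only; the patch itself emits `affine.apply` ops with floordiv and mod expressions instead of computing values directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the patch): splits one index into a
// collapsed dimension back into indices for the source dimensions that were
// merged into it. `groupSizes` holds the source dimension sizes of one
// reassociation group, outermost first, e.g. {2, 6} when memref<2x6x42xf32>
// is collapsed into memref<12x42xf32>.
std::vector<int64_t> delinearizeIndex(int64_t index,
                                      const std::vector<int64_t> &groupSizes) {
  assert(!groupSizes.empty() && "reassociation group cannot be empty");
  std::vector<int64_t> result(groupSizes.size());
  // Peel off the innermost dimension first; the outermost takes the rest.
  for (std::size_t i = groupSizes.size(); i-- > 1;) {
    result[i] = index % groupSizes[i];
    index /= groupSizes[i];
  }
  result[0] = index;
  return result;
}
```

For the group sizes {2, 6}, index 9 in the collapsed dimension maps to (1, 3) in the source, i.e. 9 / 6 and 9 % 6. A group with a single dimension passes the index through unchanged.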

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

arnab-oss created this revision. Jul 1 2022, 5:52 AM

Herald added a project: Restricted Project. Jul 1 2022, 5:52 AM

Herald added subscribers: anlunx, bzcheeseman, sdasgup3 and 22 others.

arnab-oss requested review of this revision. Jul 1 2022, 5:52 AM

Herald added a project: Restricted Project. Jul 1 2022, 5:52 AM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

arnab-oss added reviewers: bondhugula, dcaballe, nicolasvasilache, mehdi_amini. Jul 1 2022, 5:54 AM

Harbormaster completed remote builds in B173224: Diff 441670. Jul 1 2022, 5:57 AM

Discarding unnecessary changes.

Harbormaster completed remote builds in B173225: Diff 441671. Jul 1 2022, 6:19 AM

Thanks for improving this part of the codebase @arnab-oss!
Here is a first batch of comments.

nicolasvasilache added inline comments. Jul 8 2022, 1:35 AM

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
30	nit: typo: "of a load/store" here and below
55	Please generalize this a bit and add as a helper in include/mlir/Dialect/Utils/IndexingUtils.h, this will be useful in other places too. I am surprised there isn't already something like this, I only see utilities to linearize / delinearize atm.
66	This feels like a linearization similar to the one in IndexingUtils.h but more generally applicable. Could you try generalizing the AffineExpr building part into a new IndexingUtils.h and compose that with the extra reorderings? The more it looks like `AffineExpr srcIndexExpr = linearAffineExpression(suffixProduct); SmallVector<Value> dynamicIndices = gather(indices, groups); sourceIndices.push_back(rewriter.create<AffineApplyOp>(...));` the better the composability and overall code idiom power.
107	Same comments as above but for a delinearized form.
177	Have not yet looked at this deeply but please consider a similar type of refactoring / reuse as proposed above if possible. I'll come back to it once the first batch of comments is resolved. (I see you are merely moving code here but if there are opportunities to improve I'd take them).
244	Looking at the amount of code below I am wondering if you are going overboard with templates. It seems it would be quite a bit less code to have 3 separate `LoadOpOfXXXFolder` patterns and inline the impl at the right place. What you save in not typing the pattern decl and the matchAndRewrite decl you pay at least 4-5x the price in the extra indirection and templating logic.
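
The three-step composition suggested in the comments on lines 55 and 66 (suffix product, gather, linear combination) can be sketched in plain C++ as follows. All names here (`suffixProduct`, `gather`, `linearize`, `resolveExpandedIndices`) are hypothetical and operate on integers; they are not the helpers that were eventually added to IndexingUtils.h, and the real code builds an `AffineExpr` and an `affine.apply` op rather than evaluating the sum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: strides[i] is the product of sizes[i+1..end), so the
// innermost stride is 1.
std::vector<int64_t> suffixProduct(const std::vector<int64_t> &sizes) {
  assert(!sizes.empty() && "group cannot be empty");
  std::vector<int64_t> strides(sizes.size(), 1);
  for (std::size_t i = sizes.size() - 1; i > 0; --i)
    strides[i - 1] = strides[i] * sizes[i];
  return strides;
}

// Hypothetical helper: selects the entries of `values` at positions `group`.
std::vector<int64_t> gather(const std::vector<int64_t> &values,
                            const std::vector<std::size_t> &group) {
  std::vector<int64_t> picked;
  picked.reserve(group.size());
  for (std::size_t pos : group)
    picked.push_back(values.at(pos));
  return picked;
}

// Hypothetical helper: sum of indices[i] * strides[i].
int64_t linearize(const std::vector<int64_t> &indices,
                  const std::vector<int64_t> &strides) {
  assert(indices.size() == strides.size());
  int64_t linear = 0;
  for (std::size_t i = 0; i < indices.size(); ++i)
    linear += indices[i] * strides[i];
  return linear;
}

// Maps indices into an expand_shape result back to indices into its source,
// producing one source index per reassociation group.
std::vector<int64_t>
resolveExpandedIndices(const std::vector<int64_t> &resultShape,
                       const std::vector<std::vector<std::size_t>> &groups,
                       const std::vector<int64_t> &indices) {
  std::vector<int64_t> sourceIndices;
  for (const auto &group : groups)
    sourceIndices.push_back(linearize(
        gather(indices, group), suffixProduct(gather(resultShape, group))));
  return sourceIndices;
}
```

With result shape 2x6x42, groups [[0, 1], [2]] and indices (1, 3, 5), the first group linearizes to 6 * 1 + 3 = 9 and the second passes 5 through. Keeping the three steps separate is what makes the same pieces reusable for the delinearized (collapse_shape) direction.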

bondhugula added a reviewer: mravishankar. Jul 10 2022, 12:43 AM

This would be a great enhancement for code generation! -- to be able to de-abstract these ops out if/when needed. Would have been great to have these from the start!

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
1	Update summary line: memref.subview -> memref aliasing ops.

In D128986#3640925, @bondhugula wrote:

This would be a great enhancement for code generation! -- to be able to de-abstract these ops out if/when needed. Would have been great to have these from the start!

Feel free to contribute when you need something ! :)

Addressed review comments.

Herald added a subscriber: arphaman. Jul 13 2022, 8:27 AM

Harbormaster completed remote builds in B175125: Diff 444272. Jul 13 2022, 9:11 AM

Addressed review comments.

Harbormaster completed remote builds in B175165: Diff 444323. Jul 13 2022, 12:19 PM

ping @nicolasvasilache can you please re-review?

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
177	I feel this part of the code usage is specific to this case, and is not beneficial to expose it as a utility function.

It looks like the commit summary has got into the title, and the former is missing!

arnab-oss retitled this revision from "Fold memref.expand_shape and memref.collapse_shape ops, when they are the parent op of source value of memref/affine load/store ops." to "Fold memref.expand_shape and memref.collapse_shape ops.". Jul 18 2022, 1:03 AM

nicolasvasilache added inline comments. Jul 19 2022, 8:23 AM

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
123	This seems super artificial, I would rather duplicate this affine builder `void AffineLoadOp::build(OpBuilder &builder, OperationState &result, Value memref, ValueRange indices)` to allow taking `ArrayRef<OpFoldResult>` and just fill your sourceIndices with `b.getIndexAttr(0)`.
177	Fair enough, thanks for looking into it.
270	This decl, its def can be folded in all its uses now I think. This would be the part that reduces the code size.
496	I think you can now: `llvm::TypeSwitch<Operation *, void>(loadOp) .Case<affine::LoadOp, memref::LoadOp>([&](auto op) { rewriter.replaceOpWithNewOp<decltype(op)>( loadOp, expandShapeOp.getViewSource(), sourceIndices); }) .Default([](Operation *) { ;});` And drop all the overloads and decls?

nicolasvasilache added inline comments. Jul 20 2022, 1:48 AM

mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
278	Please also add some tests for `memref.load/store` and `vector.transfer`.

@nicolasvasilache can you please take a look at my comments?

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
123	Hi, I'm sorry I do not get your comment. Can you please elaborate?
270	I think I cannot unify all the cases involving different varieties of load/store ops, as their build functions expect different kinds and numbers of arguments.
496	I think this is not possible. Please take a look at my comment above.

Sorry was OOO the last 2 weeks, replied inline.

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp

496

This is precisely why I used a specific TypeSwitch + auto + decltype to resolve the type inside the Case lambda. To spell it more, you should be able to:

llvm::TypeSwitch<Operation *, void>(loadOp)
        .Case<affine::LoadOp, memref::LoadOp>([&](auto op) {
          rewriter.replaceOpWithNewOp<decltype(op)>(
            loadOp, expandShapeOp.getViewSource(), sourceIndices);
        })
        .Case<vector::TransferReadOp>([&](auto transferReadOp) {
          if (transferReadOp.getTransferRank() == 0) {
            // TODO: Propagate the error.
            return;
          }
          rewriter.replaceOpWithNewOp<vector::TransferReadOp>(
              transferReadOp, transferReadOp.getVectorType(), subViewOp.source(),
              sourceIndices,
              getPermutationMapAttr(rewriter.getContext(), subViewOp,
                                    transferReadOp.getPermutationMap()),
              transferReadOp.getPadding(),
              /*mask=*/Value(), transferReadOp.getInBoundsAttr());
        })
        .Default([](Operation *) { ;});

Did you try and hit compilation errors? If so, could you paste the error message ?

I'm getting the following error:

/home2/arnab/llvm-project/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp:447:30: error: ‘class mlir::vector::TransferReadOp’ has no member named ‘getValue’
  447 |             storeOp, storeOp.getValue(), subViewOp.source(), sourceIndices);

AffineStoreOp and memref::StoreOp have the getValue(), but vector::TransferReadOp does not.
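
The error above appears consistent with a single generic lambda being instantiated for an op type that lacks the accessor it calls: every type routed to one generic `Case` body must support every member used in that body. The sketch below illustrates the same constraint with the standard library's `std::variant` and `std::visit` as a stand-in for `llvm::TypeSwitch`; the types `StoreLikeOp` and `TransferLikeOp` and the function `storedValue` are hypothetical and only mimic ops with inconsistent accessor names.

```cpp
#include <cassert>
#include <variant>

// Hypothetical stand-ins for op classes whose accessors are named differently.
struct StoreLikeOp {
  int value;
  int getValue() const { return value; }
};
struct TransferLikeOp {
  int vector;
  int getVector() const { return vector; }
};

using AnyOp = std::variant<StoreLikeOp, TransferLikeOp>;

// Small overload-set helper so each alternative gets its own lambda.
template <class... Ts> struct Overloaded : Ts... {
  using Ts::operator()...;
};
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// A single generic lambda such as
//   [](const auto &op) { return op.getValue(); }
// would be instantiated for TransferLikeOp as well and fail to compile,
// because TransferLikeOp has no getValue(). One lambda per accessor shape
// avoids that.
int storedValue(const AnyOp &op) {
  return std::visit(
      Overloaded{[](const StoreLikeOp &s) { return s.getValue(); },
                 [](const TransferLikeOp &t) { return t.getVector(); }},
      op);
}
```

The two ways out are the ones raised in this thread: give the odd type its own case, or make the accessor names uniform so one generic body compiles for all listed types.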

In D128986#3718717, @arnab-oss wrote:
I'm getting the following error:
/home2/arnab/llvm-project/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp:447:30: error: ‘class mlir::vector::TransferReadOp’ has no member named ‘getValue’
  447 |             storeOp, storeOp.getValue(), subViewOp.source(), sourceIndices);
AffineStoreOp and memref::StoreOp have the getValue(), but vector::TransferReadOp does not.

Sorry I missed this.. This seems like the type of errors that the TypeSwitch solution I suggested should easily handle .. Can you post another WIP PR which shows how you are trying to do this?
In any case, adding a getValue to TransferReadOp or renaming the result vector to value is also an easy normalization of APIs.

If you find this really difficult, please just add more tests as requested and feel free to land as is.
I can pick up the slack in a followup commit.

In D128986#3718717, @arnab-oss wrote:
I'm getting the following error:
/home2/arnab/llvm-project/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp:447:30: error: ‘class mlir::vector::TransferReadOp’ has no member named ‘getValue’
  447 |             storeOp, storeOp.getValue(), subViewOp.source(), sourceIndices);
AffineStoreOp and memref::StoreOp have the getValue(), but vector::TransferReadOp does not.

You can just make the APIs (accessor names here) consistent.

This revision is now accepted and ready to land. Aug 19 2022, 6:51 PM

bondhugula added inline comments. Aug 19 2022, 6:52 PM

mlir/include/mlir/Dialect/MemRef/Transforms/Passes.h
42	This comment is outdated.
96	This one as well. Please update the doc comments to reflect the generalization.

There are still several style and efficiency issues (unnecessary copies).

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
50–51	This would lead to unnecessary copies - please don't use `SmallVector's` by copy this way; `groups` is only read below. Normally const ref, but `ArrayRef` here.
97	Likewise.

This revision now requires changes to proceed. Aug 19 2022, 6:55 PM

bondhugula accepted this revision. Aug 19 2022, 6:59 PM

bondhugula added inline comments.

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
50–51	Sorry, looks like I didn't see the RHS here. This is a SmallVector returned from a function. So this is correct as is.
97	Nothing to change.

This revision is now accepted and ready to land. Aug 19 2022, 6:59 PM

Addressed comments

Herald added a reviewer: aartbik. Aug 22 2022, 3:39 AM

Herald added a subscriber: ThomasRaoux.

@nicolasvasilache I've made the required changes in the API. Also added tests featuring memref.load/store. However I've not handled folding related to vector transfer ops, so no tests involving those ops.

Harbormaster completed remote builds in B182550: Diff 454442. Aug 22 2022, 4:00 AM

bondhugula accepted this revision. Aug 22 2022, 5:28 AM

bondhugula added inline comments.

mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
1488–1490 ↗	(On Diff #454442)	Doc comment here. You can mention that this is added for uniformity with other similar ops.
1488–1490 ↗	(On Diff #454442)	Nit: This will fit in a single line.

bondhugula added inline comments. Aug 22 2022, 5:31 AM

mlir/include/mlir/Dialect/Utils/IndexingUtils.h
18 ↗	(On Diff #454442)	You don't need this include. Please use fwd decl.

Addressed comments.

Harbormaster completed remote builds in B182572: Diff 454473. Aug 22 2022, 6:58 AM

nicolasvasilache added inline comments. Aug 22 2022, 7:51 AM

utils/arcanist/clang-format.sh
54 ↗	(On Diff #454473)	please revert this change

Removed wrong changes in clang-format.sh

Harbormaster completed remote builds in B182741: Diff 454703. Aug 22 2022, 11:29 PM

@arnab-oss thanks for refactoring to a nicer outcome.

Please address the last 2 points I raised and let's ship this!

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
344	Please add an `llvm_unreachable` here and all other places below, we don't want to silently fail if/when new ops get added in the future.
mlir/tools/mlir-vulkan-runner/mlir-vulkan-runner.cpp
51	Seems unrelated to this PR. Even if it has no effects on the logic, I'd rather keep separate pieces separate.

Addressed comments by nicolas.

Minor changes.

Harbormaster completed remote builds in B182819: Diff 454794. Aug 23 2022, 5:29 AM

A few minor things.

Please add a commit summary; it's currently blank.
Drop the full stop at the end of the commit title.
There is a whitespace issue in the test case file: $ git diff --check HEAD~.

mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
418–419	Indent by two to be consistent.

Minor NFC changes.

arnab-oss edited the summary of this revision. Aug 26 2022, 1:29 AM

Harbormaster completed remote builds in B183539: Diff 455821. Aug 26 2022, 1:47 AM

arnab-oss edited the summary of this revision. Aug 26 2022, 11:16 PM

bondhugula retitled this revision from "Fold memref.expand_shape and memref.collapse_shape ops." to "Fold memref.expand_shape and memref.collapse_shape ops". Aug 27 2022, 6:27 PM

bondhugula edited the summary of this revision.

Closed by commit rG1b002d276835: Fold memref.expand_shape and memref.collapse_shape ops (authored by arnab-oss, committed by bondhugula). Aug 27 2022, 6:32 PM

This revision was automatically updated to reflect the committed changes.

bondhugula added a commit: rG1b002d276835: Fold memref.expand_shape and memref.collapse_shape ops.

nicolasvasilache mentioned this in D132697: [mlir][MemRef] Expose utility to invert subview index mapping. Aug 29 2022, 3:20 AM

nicolasvasilache mentioned this in D129699: [mlir][Tensor] Add rewrites to extract slices through `tensor.collape_shape`. Aug 29 2022, 4:02 AM

nicolasvasilache mentioned this in D133166: [mlir][MemRef] Canonicalize extract_strided_metadata(subview). Sep 6 2022, 1:01 AM

nicolasvasilache mentioned this in D133625: [mlir][MemRef] Simplify extract_strided_metadata(expand_shape). Sep 14 2022, 9:57 AM

Revision Contents

Path	Size
mlir/include/mlir/Dialect/MemRef/Transforms/Passes.h	4 lines
mlir/include/mlir/Dialect/MemRef/Transforms/Passes.td	8 lines
mlir/lib/Dialect/MemRef/Transforms/CMakeLists.txt	2 lines
mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp	538 lines
mlir/lib/Dialect/MemRef/Transforms/FoldSubViewOps.cpp
mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir (listed together with fold-subview-ops.mlir)	152 lines
mlir/test/Dialect/MemRef/fold-subview-ops.mlir
mlir/tools/mlir-vulkan-runner/mlir-vulkan-runner.cpp	2 lines

Diff 441671

mlir/include/mlir/Dialect/MemRef/Transforms/Passes.h

	Show All 33 Lines
 class AllocOp;
 //===----------------------------------------------------------------------===//
 // Patterns
 //===----------------------------------------------------------------------===//

 /// Collects a set of patterns to rewrite ops within the memref dialect.
 void populateExpandOpsPatterns(RewritePatternSet &patterns);

 /// Appends patterns for folding memref.subview ops into consumer load/store ops
[Inline comment, bondhugula] This comment is outdated.
 /// into `patterns`.
-void populateFoldSubViewOpPatterns(RewritePatternSet &patterns);
+void populateFoldMemRefAliasOpPatterns(RewritePatternSet &patterns);

 /// Appends patterns that resolve `memref.dim` operations with values that are
 /// defined by operations that implement the
 /// `ReifyRankedShapeTypeShapeOpInterface`, in terms of shapes of its input
 /// operands.
 void populateResolveRankedShapeTypeResultDimsPatterns(
     RewritePatternSet &patterns);

	Show All 35 Lines

 /// Creates an instance of the ExpandOps pass that legalizes memref dialect ops
 /// to be convertible to LLVM. For example, `memref.reshape` gets converted to
 /// `memref_reinterpret_cast`.
 std::unique_ptr<Pass> createExpandOpsPass();

 /// Creates an operation pass to fold memref.subview ops into consumer
 /// load/store ops into `patterns`.
-std::unique_ptr<Pass> createFoldSubViewOpsPass();
+std::unique_ptr<Pass> createFoldMemRefAliasOpsPass();
[Inline comment, bondhugula] This one as well. Please update the doc comments to reflect the generalization.

 /// Creates an interprocedural pass to normalize memrefs to have a trivial
 /// (identity) layout map.
 std::unique_ptr<OperationPass<ModuleOp>> createNormalizeMemRefsPass();

 /// Creates an operation pass to resolve `memref.dim` operations with values
 /// that are defined by operations that implement the
 /// `ReifyRankedShapeTypeShapeOpInterface`, in terms of shapes of its input
	Show All 20 Lines

mlir/include/mlir/Dialect/MemRef/Transforms/Passes.td

	Show All 10 Lines

 include "mlir/Pass/PassBase.td"

 def ExpandOps : Pass<"memref-expand"> {
   let summary = "Legalize memref operations to be convertible to LLVM.";
   let constructor = "mlir::memref::createExpandOpsPass()";
 }

-def FoldSubViewOps : Pass<"fold-memref-subview-ops"> {
+def FoldMemRefAliasOps : Pass<"fold-memref-alias-ops"> {
-  let summary = "Fold memref.subview ops into consumer load/store ops";
+  let summary = "Fold memref alias ops into consumer load/store ops";
   let description = [{
-    The pass folds loading/storing from/to subview ops to loading/storing
+    The pass folds loading/storing from/to memref aliasing ops to loading/storing
     from/to the original memref.
   }];
-  let constructor = "mlir::memref::createFoldSubViewOpsPass()";
+  let constructor = "mlir::memref::createFoldMemRefAliasOpsPass()";
   let dependentDialects = [
     "AffineDialect", "memref::MemRefDialect", "vector::VectorDialect"
   ];
 }

 def NormalizeMemRefs : Pass<"normalize-memrefs", "ModuleOp"> {
   let summary = "Normalize memrefs";
   let description = [{
	▲ Show 20 Lines • Show All 144 Lines • Show Last 20 Lines

mlir/lib/Dialect/MemRef/Transforms/CMakeLists.txt

 add_mlir_dialect_library(MLIRMemRefTransforms
   ComposeSubView.cpp
   ExpandOps.cpp
-  FoldSubViewOps.cpp
+  FoldMemRefAliasOps.cpp
   MultiBuffer.cpp
   NormalizeMemRefs.cpp
   ResolveShapedTypeResultDims.cpp

   ADDITIONAL_HEADER_DIRS
   ${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/MemRef

   DEPENDS
	Show All 16 Lines

mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp

This file was added.

				//===- FoldMemRefAliasOps.cpp - Fold memref.subview ops -----===//
				[Inline comment, bondhugula] Update summary line: memref.subview -> memref aliasing ops.
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This transformation pass folds loading/storing from/to subview ops into
				// loading/storing from/to the original memref.
				//
				//===----------------------------------------------------------------------===//

				#include "PassDetail.h"
				#include "mlir/Dialect/Affine/IR/AffineOps.h"
				#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
				#include "mlir/Dialect/MemRef/IR/MemRef.h"
				#include "mlir/Dialect/MemRef/Transforms/Passes.h"
				#include "mlir/Dialect/Vector/IR/VectorOps.h"
				#include "mlir/IR/BuiltinTypes.h"
				#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
				#include "llvm/ADT/SmallBitVector.h"

				using namespace mlir;

				//===----------------------------------------------------------------------===//
				// Utility functions
				//===----------------------------------------------------------------------===//

				/// Given the 'indices' of an load/store operation where the memref is a result
				[Inline comment, nicolasvasilache] nit: typo: "of a load/store" here and below
				/// of a expand_shape op, returns the indices w.r.t to the source memref of the
				/// expand_shape op. For example
				///
				/// %0 = ... : memref<12x42xf32>
				/// %1 = memref.expand_shape %0 [[0, 1], [2]]
				/// : memref<12x42xf32> into memref<2x6x42xf32>
				/// %2 = load %1[%i1, %i2, %i3] : memref<2x6x42xf32
				///
				/// could be folded into
				///
				/// %2 = load %0[6 * i1 + i2, %i3] :
				/// memref<12x42xf32>
				static LogicalResult
				resolveSourceIndicesExpandShape(Location loc, PatternRewriter &rewriter,
				memref::ExpandShapeOp expandShapeOp,
				ValueRange indices,
				SmallVectorImpl<Value> &sourceIndices) {
				for (SmallVector<int64_t, 2> groups :
				expandShapeOp.getReassociationIndices()) {
				assert(!groups.empty() && "association indices groups cannot be empty");
				unsigned groupSize = groups.size();
				[Inline comment, bondhugula] This would lead to unnecessary copies - please don't use `SmallVector's` by copy this way; `groups` is only read below. Normally const ref, but `ArrayRef` here.
				[Inline comment, bondhugula] Sorry, looks like I didn't see the RHS here. This is a SmallVector returned from a function. So this is correct as is.
				SmallVector<int64_t> suffixProduct(groupSize);
				// Calculate suffix product of dimension sizes for all dimensions of expand
				// shape op result.
				suffixProduct[groupSize - 1] = 1;
				[Inline comment, nicolasvasilache] Please generalize this a bit and add as a helper in include/mlir/Dialect/Utils/IndexingUtils.h, this will be useful in other places too. I am surprised there isn't already something like this, I only see utilities to linearize / delinearize atm.
				for (unsigned i = groupSize - 1; i > 0; i--)
				suffixProduct[i - 1] =
				suffixProduct[i] *
				expandShapeOp.getType().cast<MemRefType>().getDimSize(groups[i]);
				// Construct the expression for the index value w.r.t to expand shape op
				// source corresponding the indices wrt to expand shape op result.
				AffineExpr srcIndexExpr = rewriter.getAffineDimExpr(0);
				srcIndexExpr = srcIndexExpr * suffixProduct[0];
				SmallVector<Value> dynamicIndices(groupSize);
				dynamicIndices[0] = indices[groups[0]];
				for (unsigned i = 1; i < groupSize; i++) {
				[Inline comment, nicolasvasilache] This feels like a linearization similar to the one in IndexingUtils.h but more generally applicable. Could you try generalizing the AffineExpr building part into a new IndexingUtils.h and compose that with the extra reorderings? The more it looks like `AffineExpr srcIndexExpr = linearAffineExpression(suffixProduct); SmallVector<Value> dynamicIndices = gather(indices, groups); sourceIndices.push_back(rewriter.create<AffineApplyOp>(...));` the better the composability and overall code idiom power.
				srcIndexExpr =
				srcIndexExpr + rewriter.getAffineDimExpr(i) * suffixProduct[i];
				dynamicIndices[i] = indices[groups[i]];
				}
				sourceIndices.push_back(rewriter.create<AffineApplyOp>(
				loc,
				AffineMap::get(/*numDims=*/groupSize, /*numSymbols=*/0, srcIndexExpr),
				dynamicIndices));
				}
				return success();
				}

				/// Given the 'indices' of an load/store operation where the memref is a result
				/// of a collapse_shape op, returns the indices w.r.t to the source memref of
				/// the collapse_shape op. For example
				///
				/// %0 = ... : memref<2x6x42xf32>
				/// %1 = memref.collapse_shape %0 [[0, 1], [2]]
				/// : memref<2x6x42xf32> into memref<12x42xf32>
				/// %2 = load %1[%i1, %i2] : memref<12x42xf32>
				///
				/// could be folded into
				///
				/// %2 = load %0[%i1 / 6, %i1 % 6, %i2] :
				/// memref<2x6x42xf32>
				static LogicalResult
				resolveSourceIndicesCollapseShape(Location loc, PatternRewriter &rewriter,
				memref::CollapseShapeOp collapseShapeOp,
				ValueRange indices,
				SmallVectorImpl<Value> &sourceIndices) {
				unsigned cnt = 0;
				[Inline comment, bondhugula] Likewise.
				[Inline comment, bondhugula] Nothing to change.
				SmallVector<Value> tmp(indices.size());
				SmallVector<Value> dynamicIndices;
				for (SmallVector<int64_t, 2> groups :
				collapseShapeOp.getReassociationIndices()) {
				assert(!groups.empty() && "association indices groups cannot be empty");
				dynamicIndices.push_back(indices[cnt++]);
				unsigned groupSize = groups.size();
				SmallVector<int64_t> suffixProduct(groupSize);
				// Calculate suffix product for all collapse op source dimension sizes.
				suffixProduct[groupSize - 1] = 1;
				[Inline comment, nicolasvasilache] Same comments as above but for a delinearized form.
				for (unsigned i = groupSize - 1; i > 0; i--)
				suffixProduct[i - 1] =
				suffixProduct[i] * collapseShapeOp.getSrcType().getDimSize(groups[i]);
				// Derive the index values along all dimensions of the source corresponding
				// to the index wrt to collapsed shape op output.
				AffineExpr srcIndexExpr = rewriter.getAffineDimExpr(0);
				auto modifiedIndices = rewriter.create<AffineApplyOp>(
				loc,
				AffineMap::get(/*numDims=*/1, /*numSymbols=*/0,
				srcIndexExpr.floorDiv(suffixProduct[0])),
				dynamicIndices);
				sourceIndices.push_back(modifiedIndices);
				srcIndexExpr = srcIndexExpr % suffixProduct[0];
				for (unsigned i = 1; i < groupSize; i++) {
				sourceIndices.push_back(rewriter.create<AffineApplyOp>(
				loc,
				[Inline comment, nicolasvasilache] This seems super artificial, I would rather duplicate this affine builder `void AffineLoadOp::build(OpBuilder &builder, OperationState &result, Value memref, ValueRange indices)` to allow taking `ArrayRef<OpFoldResult>` and just fill your sourceIndices with `b.getIndexAttr(0)`.
				[Inline comment, arnab-oss (author)] Hi, I'm sorry I do not get your comment. Can you please elaborate?
				AffineMap::get(/*numDims=*/1, /*numSymbols=*/0,
				srcIndexExpr.floorDiv(suffixProduct[i])),
				dynamicIndices));
				srcIndexExpr = srcIndexExpr % suffixProduct[i];
				}
				dynamicIndices.clear();
				}
				if (collapseShapeOp.getReassociationIndices().empty()) {
				auto zeroAffineMap = rewriter.getConstantAffineMap(0);
				unsigned srcRank =
				collapseShapeOp.getViewSource().getType().cast<MemRefType>().getRank();
				for (unsigned i = 0; i < srcRank; i++)
				sourceIndices.push_back(
				rewriter.create<AffineApplyOp>(loc, zeroAffineMap, dynamicIndices));
				}
				return success();
				}

				/// Given the 'indices' of an load/store operation where the memref is a result
				/// of a subview op, returns the indices w.r.t to the source memref of the
				/// subview op. For example
				///
				/// %0 = ... : memref<12x42xf32>
				/// %1 = subview %0[%arg0, %arg1][][%stride1, %stride2] : memref<12x42xf32> to
				/// memref<4x4xf32, offset=?, strides=[?, ?]>
				/// %2 = load %1[%i1, %i2] : memref<4x4xf32, offset=?, strides=[?, ?]>
				///
				/// could be folded into
				///
				/// %2 = load %0[%arg0 + %i1 * %stride1][%arg1 + %i2 * %stride2] :
				/// memref<12x42xf32>
				static LogicalResult
				resolveSourceIndicesSubView(Location loc, PatternRewriter &rewriter,
				memref::SubViewOp subViewOp, ValueRange indices,
				SmallVectorImpl<Value> &sourceIndices) {
				SmallVector<OpFoldResult> mixedOffsets = subViewOp.getMixedOffsets();
				SmallVector<OpFoldResult> mixedSizes = subViewOp.getMixedSizes();
				SmallVector<OpFoldResult> mixedStrides = subViewOp.getMixedStrides();

				SmallVector<Value> useIndices;
				// Check if this is rank-reducing case. Then for every unit-dim size add a
				// zero to the indices.
				unsigned resultDim = 0;
				llvm::SmallBitVector unusedDims = subViewOp.getDroppedDims();
				for (auto dim : llvm::seq<unsigned>(0, subViewOp.getSourceType().getRank())) {
				if (unusedDims.test(dim))
				useIndices.push_back(rewriter.create<arith::ConstantIndexOp>(loc, 0));
				else
				useIndices.push_back(indices[resultDim++]);
				}
				if (useIndices.size() != mixedOffsets.size())
				return failure();
				sourceIndices.resize(useIndices.size());
				for (auto index : llvm::seq<size_t>(0, mixedOffsets.size())) {
				[Inline comment, nicolasvasilache] Have not yet looked at this deeply but please consider a similar type of refactoring / reuse as proposed above if possible. I'll come back to it once the first batch of comments is resolved. (I see you are merely moving code here but if there are opportunities to improve I'd take them).
				[Inline comment, arnab-oss (author)] I feel this part of the code usage is specific to this case, and is not beneficial to expose it as a utility function.
				[Inline comment, nicolasvasilache] Fair enough, thanks for looking into it.
				SmallVector<Value> dynamicOperands;
				AffineExpr expr = rewriter.getAffineDimExpr(0);
				unsigned numSymbols = 0;
				dynamicOperands.push_back(useIndices[index]);

				// Multiply the stride;
				if (auto attr = mixedStrides[index].dyn_cast<Attribute>()) {
				expr = expr * attr.cast<IntegerAttr>().getInt();
				} else {
				dynamicOperands.push_back(mixedStrides[index].get<Value>());
				expr = expr * rewriter.getAffineSymbolExpr(numSymbols++);
				}

				// Add the offset.
				if (auto attr = mixedOffsets[index].dyn_cast<Attribute>()) {
				expr = expr + attr.cast<IntegerAttr>().getInt();
				} else {
				dynamicOperands.push_back(mixedOffsets[index].get<Value>());
				expr = expr + rewriter.getAffineSymbolExpr(numSymbols++);
				}
				Location loc = subViewOp.getLoc();
				sourceIndices[index] = rewriter.create<AffineApplyOp>(
				loc, AffineMap::get(1, numSymbols, expr), dynamicOperands);
				}
				return success();
				}
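
The index arithmetic performed by `resolveSourceIndicesSubView` can be modelled outside MLIR. The following is a minimal sketch only, assuming static integer offsets and strides; the `SubViewDim` struct and the free function `resolveSourceIndices` are hypothetical names used for illustration and are not part of the patch. Each source index is `offset + index * stride`, and a dimension dropped by rank reduction is accessed at index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone model of one subview dimension: its offset and
// stride in the source, and whether rank reduction dropped it from the result.
struct SubViewDim {
  int64_t offset;
  int64_t stride;
  bool dropped;
};

// Maps indices into the subview result onto indices into the source memref:
// source[d] = offset[d] + index[d] * stride[d], using index 0 for dropped dims.
std::vector<int64_t> resolveSourceIndices(const std::vector<SubViewDim> &dims,
                                          const std::vector<int64_t> &indices) {
  std::vector<int64_t> source;
  source.reserve(dims.size());
  std::size_t resultDim = 0;
  for (const SubViewDim &dim : dims) {
    int64_t index = 0;
    if (!dim.dropped) {
      assert(resultDim < indices.size() && "too few indices");
      index = indices[resultDim++];
    }
    source.push_back(dim.offset + index * dim.stride);
  }
  assert(resultDim == indices.size() && "too many indices");
  return source;
}
```

With offsets (1, 5), strides (2, 3) and result indices (3, 4), this yields source indices (7, 17). The real function builds one `affine.apply` per dimension instead, so that dynamic offsets and strides can be passed as symbols.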

				/// Helpers to access the memref operand for each op.
				template <typename LoadOrStoreOpTy>
				static Value getMemRefOperand(LoadOrStoreOpTy op) {
				return op.getMemref();
				}

				static Value getMemRefOperand(vector::TransferReadOp op) {
				return op.getSource();
				}

				static Value getMemRefOperand(vector::TransferWriteOp op) {
				return op.getSource();
				}

				/// Given the permutation map of the original
				/// `vector.transfer_read`/`vector.transfer_write` operations compute the
				/// permutation map to use after the subview is folded with it.
				static AffineMapAttr getPermutationMapAttr(MLIRContext *context,
				memref::SubViewOp subViewOp,
				AffineMap currPermutationMap) {
				llvm::SmallBitVector unusedDims = subViewOp.getDroppedDims();
				SmallVector<AffineExpr> exprs;
				int64_t sourceRank = subViewOp.getSourceType().getRank();
				for (auto dim : llvm::seq<int64_t>(0, sourceRank)) {
				if (unusedDims.test(dim))
				continue;
				exprs.push_back(getAffineDimExpr(dim, context));
				}
				auto resultDimToSourceDimMap = AffineMap::get(sourceRank, 0, exprs, context);
				return AffineMapAttr::get(
				currPermutationMap.compose(resultDimToSourceDimMap));
				}

				//===----------------------------------------------------------------------===//
				// Patterns
				//===----------------------------------------------------------------------===//

				namespace {
				/// Merges subview operation with load/transferRead operation.
				template <typename OpTy>
				nicolasvasilache (Done): Looking at the amount of code below, I am wondering if you are going overboard with templates. It seems it would be considerably less code to have 3 separate `LoadOpOfXXXFolder` patterns and inline the impl at the right place. What you save in not typing the pattern decl and the matchAndRewrite decl, you pay at least 4-5x over in the extra indirection and templating logic.
				class LoadOpOfAliasFolder final : public OpRewritePattern<OpTy> {
				public:
				using OpRewritePattern<OpTy>::OpRewritePattern;

				LogicalResult matchAndRewrite(OpTy loadOp,
				PatternRewriter &rewriter) const override;

				private:
				void replaceSubViewOp(OpTy loadOp, memref::SubViewOp subViewOp,
				ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const;
				void replaceExpandShapeOp(OpTy loadOp, memref::ExpandShapeOp expandShapeOp,
				ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const;
				void replaceCollapseShapeOp(OpTy loadOp,
				memref::CollapseShapeOp collapseShapeOp,
				ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const;
				};

				/// Merges subview operation with store/transferWriteOp operation.
				template <typename OpTy>
				class StoreOpOfAliasFolder final : public OpRewritePattern<OpTy> {
				public:
				using OpRewritePattern<OpTy>::OpRewritePattern;

				nicolasvasilache (Done): I think this decl and its def can now be folded into all their uses. This would be the part that reduces the code size.
				arnab-oss (Author, Done): I think I cannot unify all the cases involving different varieties of load/store ops, as their build functions expect different kinds and numbers of arguments.
				LogicalResult matchAndRewrite(OpTy storeOp,
				PatternRewriter &rewriter) const override;

				private:
				void replaceSubViewOp(OpTy loadOp, memref::SubViewOp subViewOp,
				ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const;
				void replaceExpandShapeOp(OpTy loadOp, memref::ExpandShapeOp expandShapeOp,
				ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const;
				void replaceCollapseShapeOp(OpTy loadOp,
				memref::CollapseShapeOp collapseShapeOp,
				ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const;
				};

				template <typename LoadOpTy>
				void LoadOpOfAliasFolder<LoadOpTy>::replaceSubViewOp(
				LoadOpTy loadOp, memref::SubViewOp subViewOp, ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const {
				rewriter.replaceOpWithNewOp<LoadOpTy>(loadOp, subViewOp.source(),
				sourceIndices);
				}

				template <typename LoadOpTy>
				void LoadOpOfAliasFolder<LoadOpTy>::replaceExpandShapeOp(
				LoadOpTy loadOp, memref::ExpandShapeOp expandShapeOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				rewriter.replaceOpWithNewOp<LoadOpTy>(loadOp, expandShapeOp.getViewSource(),
				sourceIndices);
				}

				template <typename LoadOpTy>
				void LoadOpOfAliasFolder<LoadOpTy>::replaceCollapseShapeOp(
				LoadOpTy loadOp, memref::CollapseShapeOp collapseShapeOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				rewriter.replaceOpWithNewOp<LoadOpTy>(loadOp, collapseShapeOp.getViewSource(),
				sourceIndices);
				}

				template <>
				void LoadOpOfAliasFolder<vector::TransferReadOp>::replaceSubViewOp(
				vector::TransferReadOp transferReadOp, memref::SubViewOp subViewOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				// TODO: support 0-d corner case.
				if (transferReadOp.getTransferRank() == 0)
				return;
				rewriter.replaceOpWithNewOp<vector::TransferReadOp>(
				transferReadOp, transferReadOp.getVectorType(), subViewOp.source(),
				sourceIndices,
				getPermutationMapAttr(rewriter.getContext(), subViewOp,
				transferReadOp.getPermutationMap()),
				transferReadOp.getPadding(),
				/*mask=*/Value(), transferReadOp.getInBoundsAttr());
				}

				template <>
				void LoadOpOfAliasFolder<vector::TransferReadOp>::replaceExpandShapeOp(
				vector::TransferReadOp transferReadOp, memref::ExpandShapeOp expandShapeOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				return;
				}

				template <>
				void LoadOpOfAliasFolder<vector::TransferReadOp>::replaceCollapseShapeOp(
				vector::TransferReadOp transferReadOp,
				memref::CollapseShapeOp collapseShapeOp, ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const {
				return;
				}

				template <typename StoreOpTy>
				void StoreOpOfAliasFolder<StoreOpTy>::replaceSubViewOp(
				StoreOpTy storeOp, memref::SubViewOp subViewOp,
				nicolasvasilache (Not Done): Please add an `llvm_unreachable` here and in all other places below; we don't want to silently fail if/when new ops get added in the future.
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				rewriter.replaceOpWithNewOp<StoreOpTy>(storeOp, storeOp.getValue(),
				subViewOp.source(), sourceIndices);
				}

				template <typename StoreOpTy>
				void StoreOpOfAliasFolder<StoreOpTy>::replaceExpandShapeOp(
				StoreOpTy storeOp, memref::ExpandShapeOp expandShapeOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				rewriter.replaceOpWithNewOp<StoreOpTy>(
				storeOp, storeOp.value(), expandShapeOp.getViewSource(), sourceIndices);
				}

				template <typename StoreOpTy>
				void StoreOpOfAliasFolder<StoreOpTy>::replaceCollapseShapeOp(
				StoreOpTy storeOp, memref::CollapseShapeOp collapseShapeOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				rewriter.replaceOpWithNewOp<StoreOpTy>(
				storeOp, storeOp.value(), collapseShapeOp.getViewSource(), sourceIndices);
				}

				template <>
				void StoreOpOfAliasFolder<vector::TransferWriteOp>::replaceSubViewOp(
				vector::TransferWriteOp transferWriteOp, memref::SubViewOp subViewOp,
				ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
				// TODO: support 0-d corner case.
				if (transferWriteOp.getTransferRank() == 0)
				return;
				rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
				transferWriteOp, transferWriteOp.getVector(), subViewOp.source(),
				sourceIndices,
				getPermutationMapAttr(rewriter.getContext(), subViewOp,
				transferWriteOp.getPermutationMap()),
				transferWriteOp.getInBoundsAttr());
				}

				template <>
				void StoreOpOfAliasFolder<vector::TransferWriteOp>::replaceExpandShapeOp(
				vector::TransferWriteOp transferWriteOp,
				memref::ExpandShapeOp expandShapeOp, ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const {
				return;
				}

				template <>
				void StoreOpOfAliasFolder<vector::TransferWriteOp>::replaceCollapseShapeOp(
				vector::TransferWriteOp transferWriteOp,
				memref::CollapseShapeOp collapseShapeOp, ArrayRef<Value> sourceIndices,
				PatternRewriter &rewriter) const {
				return;
				}
				} // namespace

				static SmallVector<Value>
				calculateExpandedAccessIndices(AffineMap affineMap, SmallVector<Value> indices,
				Location loc, PatternRewriter &rewriter) {
				SmallVector<Value> expandedIndices;
				for (unsigned i = 0, e = affineMap.getNumResults(); i < e; i++)
				expandedIndices.push_back(
				rewriter.create<AffineApplyOp>(loc, affineMap.getSubMap({i}), indices));
				return expandedIndices;
				}

				template <typename OpTy>
				LogicalResult
				LoadOpOfAliasFolder<OpTy>::matchAndRewrite(OpTy loadOp,
				PatternRewriter &rewriter) const {
				auto subViewOp =
				getMemRefOperand(loadOp).template getDefiningOp<memref::SubViewOp>();
				auto expandShapeOp =
				getMemRefOperand(loadOp).template getDefiningOp<memref::ExpandShapeOp>();
				auto collapseShapeOp = getMemRefOperand(loadOp)
				.template getDefiningOp<memref::CollapseShapeOp>();

				SmallVector<Value> indices(loadOp.indices().begin(), loadOp.indices().end());
				// For affine ops, we need to apply the map to get the operands to get the
				// "actual" indices.
				if (auto affineLoadOp = dyn_cast<AffineLoadOp>(loadOp.getOperation())) {
				AffineMap affineMap = affineLoadOp.getAffineMap();
				auto expandedIndices = calculateExpandedAccessIndices(
				affineMap, indices, loadOp.getLoc(), rewriter);
				indices.assign(expandedIndices.begin(), expandedIndices.end());
				}
				SmallVector<Value, 4> sourceIndices;
				if (subViewOp) {
				if (failed(resolveSourceIndicesSubView(loadOp.getLoc(), rewriter, subViewOp,
				indices, sourceIndices)))
				return failure();
				replaceSubViewOp(loadOp, subViewOp, sourceIndices, rewriter);
				return success();
				}
				if (expandShapeOp) {
				if (failed(resolveSourceIndicesExpandShape(
				loadOp.getLoc(), rewriter, expandShapeOp, indices, sourceIndices)))
				return failure();
				replaceExpandShapeOp(loadOp, expandShapeOp, sourceIndices, rewriter);
				return success();
				}
				if (collapseShapeOp) {
				if (failed(resolveSourceIndicesCollapseShape(loadOp.getLoc(), rewriter,
				collapseShapeOp, indices,
				sourceIndices)))
				return failure();
				replaceCollapseShapeOp(loadOp, collapseShapeOp, sourceIndices, rewriter);
				return success();
				}
				return failure();
				}

				template <typename OpTy>
				LogicalResult
				StoreOpOfAliasFolder<OpTy>::matchAndRewrite(OpTy storeOp,
				PatternRewriter &rewriter) const {
				auto subViewOp =
				getMemRefOperand(storeOp).template getDefiningOp<memref::SubViewOp>();
				auto expandShapeOp =
				getMemRefOperand(storeOp).template getDefiningOp<memref::ExpandShapeOp>();
				auto collapseShapeOp = getMemRefOperand(storeOp)
				.template getDefiningOp<memref::CollapseShapeOp>();

				SmallVector<Value> indices(storeOp.indices().begin(),
				storeOp.indices().end());
				// For affine ops, we need to apply the map to get the operands to get the
				// "actual" indices.
				if (auto affineStoreOp = dyn_cast<AffineStoreOp>(storeOp.getOperation())) {
				AffineMap affineMap = affineStoreOp.getAffineMap();
				auto expandedIndices = calculateExpandedAccessIndices(
				affineMap, indices, storeOp.getLoc(), rewriter);
				indices.assign(expandedIndices.begin(), expandedIndices.end());
				}
				if (subViewOp) {
				SmallVector<Value, 4> sourceIndices;
				if (failed(resolveSourceIndicesSubView(storeOp.getLoc(), rewriter,
				subViewOp, indices, sourceIndices)))
				return failure();
				replaceSubViewOp(storeOp, subViewOp, sourceIndices, rewriter);
				return success();
				}
				if (expandShapeOp) {
				SmallVector<Value, 4> sourceIndices;
				if (failed(resolveSourceIndicesExpandShape(
				storeOp.getLoc(), rewriter, expandShapeOp, indices, sourceIndices)))
				return failure();
				replaceExpandShapeOp(storeOp, expandShapeOp, sourceIndices, rewriter);
				return success();
				}
				if (collapseShapeOp) {
				SmallVector<Value, 4> sourceIndices;
				if (failed(resolveSourceIndicesCollapseShape(storeOp.getLoc(), rewriter,
				collapseShapeOp, indices,
				sourceIndices)))
				return failure();
				nicolasvasilache (Done): I think you can now:
				    llvm::TypeSwitch<Operation *, void>(loadOp)
				        .Case<affine::LoadOp, memref::LoadOp>([&](auto op) {
				          rewriter.replaceOpWithNewOp<decltype(op)>(
				              loadOp, expandShapeOp.getViewSource(), sourceIndices);
				        })
				        .Default([](Operation *) { ; });
				And drop all the overloads and decls?
				arnab-oss (Author, Done): I think this is not possible. Please take a look at my comment above.
				nicolasvasilache (Done): This is precisely why I used a specific TypeSwitch + auto + decltype to resolve the type inside the Case lambda. To spell it out more, you should be able to:
				    llvm::TypeSwitch<Operation *, void>(loadOp)
				        .Case<affine::LoadOp, memref::LoadOp>([&](auto op) {
				          rewriter.replaceOpWithNewOp<decltype(op)>(
				              loadOp, expandShapeOp.getViewSource(), sourceIndices);
				        })
				        .Case<vector::TransferReadOp>([&](auto transferReadOp) {
				          if (transferReadOp.getTransferRank() == 0) {
				            // TODO: Propagate the error.
				            return;
				          }
				          rewriter.replaceOpWithNewOp<vector::TransferReadOp>(
				              transferReadOp, transferReadOp.getVectorType(),
				              subViewOp.source(), sourceIndices,
				              getPermutationMapAttr(rewriter.getContext(), subViewOp,
				                                    transferReadOp.getPermutationMap()),
				              transferReadOp.getPadding(),
				              /*mask=*/Value(), transferReadOp.getInBoundsAttr());
				        })
				        .Default([](Operation *) { ; });
				Did you try it and hit compilation errors? If so, could you paste the error message?
				replaceCollapseShapeOp(storeOp, collapseShapeOp, sourceIndices, rewriter);
				return success();
				}
				return failure();
				}

				void memref::populateFoldMemRefAliasOpPatterns(RewritePatternSet &patterns) {
				patterns.add<LoadOpOfAliasFolder<AffineLoadOp>,
				LoadOpOfAliasFolder<memref::LoadOp>,
				LoadOpOfAliasFolder<vector::TransferReadOp>,
				StoreOpOfAliasFolder<AffineStoreOp>,
				StoreOpOfAliasFolder<memref::StoreOp>,
				StoreOpOfAliasFolder<vector::TransferWriteOp>>(
				patterns.getContext());
				}

				//===----------------------------------------------------------------------===//
				// Pass registration
				//===----------------------------------------------------------------------===//

				namespace {

				#define GEN_PASS_CLASSES
				#include "mlir/Dialect/MemRef/Transforms/Passes.h.inc"

				struct FoldMemRefAliasOpsPass final
				: public FoldMemRefAliasOpsBase<FoldMemRefAliasOpsPass> {
				void runOnOperation() override;
				};

				} // namespace

				void FoldMemRefAliasOpsPass::runOnOperation() {
				RewritePatternSet patterns(&getContext());
				memref::populateFoldMemRefAliasOpPatterns(patterns);
				(void)applyPatternsAndFoldGreedily(getOperation()->getRegions(),
				std::move(patterns));
				}

				std::unique_ptr<Pass> memref::createFoldMemRefAliasOpsPass() {
				return std::make_unique<FoldMemRefAliasOpsPass>();
				}
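
The review thread in this file turns on whether a single generic lambda can build differently shaped replacement ops by recovering the concrete op type with `decltype`. The sketch below illustrates that idea with standard C++17 only: it uses `std::variant` and `std::visit` as a stand-in for `llvm::TypeSwitch`, which is a different mechanism, and the three op structs and `describeReplacement` are hypothetical names, not MLIR types or anything from the patch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-ins for the three load-like ops; these are not MLIR types.
struct AffineLoad {};
struct MemRefLoad {};
struct TransferRead {
  int transferRank;
};

using AnyLoad = std::variant<AffineLoad, MemRefLoad, TransferRead>;

// Returns a label for the replacement that would be built, or an empty string
// when the case is unsupported (mirroring the 0-d transfer TODO in the patch).
std::string describeReplacement(const AnyLoad &op) {
  return std::visit(
      [](const auto &concrete) -> std::string {
        // decltype recovers the concrete alternative inside the generic lambda.
        using OpTy = std::decay_t<decltype(concrete)>;
        if constexpr (std::is_same_v<OpTy, TransferRead>) {
          assert(concrete.transferRank >= 0 && "rank must be non-negative");
          if (concrete.transferRank == 0)
            return "";
          return "transfer_read(source, indices, permutation_map)";
        } else {
          // Affine and memref loads share one builder shape.
          return "load(source, indices)";
        }
      },
      op);
}
```

The two plain load kinds share one branch while the transfer read gets its own argument list, which is the split the reviewer proposes with separate `Case` lambdas.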

mlir/lib/Dialect/MemRef/Transforms/FoldSubViewOps.cpp

This file was deleted.

	//===- FoldSubViewOps.cpp - Fold memref.subview ops -----------------------===//
	//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//
	//===----------------------------------------------------------------------===//
	//
	// This transformation pass folds loading/storing from/to subview ops into
	// loading/storing from/to the original memref.
	//
	//===----------------------------------------------------------------------===//

	#include "PassDetail.h"
	#include "mlir/Dialect/Affine/IR/AffineOps.h"
	#include "mlir/Dialect/Arithmetic/IR/Arithmetic.h"
	#include "mlir/Dialect/MemRef/IR/MemRef.h"
	#include "mlir/Dialect/MemRef/Transforms/Passes.h"
	#include "mlir/Dialect/Vector/IR/VectorOps.h"
	#include "mlir/IR/BuiltinTypes.h"
	#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
	#include "llvm/ADT/SmallBitVector.h"

	using namespace mlir;

	//===----------------------------------------------------------------------===//
	// Utility functions
	//===----------------------------------------------------------------------===//

	/// Given the 'indices' of an load/store operation where the memref is a result
	/// of a subview op, returns the indices w.r.t to the source memref of the
	/// subview op. For example
	///
	/// %0 = ... : memref<12x42xf32>
	/// %1 = subview %0[%arg0, %arg1][][%stride1, %stride2] : memref<12x42xf32> to
	/// memref<4x4xf32, offset=?, strides=[?, ?]>
	/// %2 = load %1[%i1, %i2] : memref<4x4xf32, offset=?, strides=[?, ?]>
	///
	/// could be folded into
	///
	/// %2 = load %0[%arg0 + %i1 * %stride1][%arg1 + %i2 * %stride2] :
	/// memref<12x42xf32>
	static LogicalResult
	resolveSourceIndices(Location loc, PatternRewriter &rewriter,
	memref::SubViewOp subViewOp, ValueRange indices,
	SmallVectorImpl<Value> &sourceIndices) {
	SmallVector<OpFoldResult> mixedOffsets = subViewOp.getMixedOffsets();
	SmallVector<OpFoldResult> mixedSizes = subViewOp.getMixedSizes();
	SmallVector<OpFoldResult> mixedStrides = subViewOp.getMixedStrides();

	SmallVector<Value> useIndices;
	// Check if this is rank-reducing case. Then for every unit-dim size add a
	// zero to the indices.
	unsigned resultDim = 0;
	llvm::SmallBitVector unusedDims = subViewOp.getDroppedDims();
	for (auto dim : llvm::seq<unsigned>(0, subViewOp.getSourceType().getRank())) {
	if (unusedDims.test(dim))
	useIndices.push_back(rewriter.create<arith::ConstantIndexOp>(loc, 0));
	else
	useIndices.push_back(indices[resultDim++]);
	}
	if (useIndices.size() != mixedOffsets.size())
	return failure();
	sourceIndices.resize(useIndices.size());
	for (auto index : llvm::seq<size_t>(0, mixedOffsets.size())) {
	SmallVector<Value> dynamicOperands;
	AffineExpr expr = rewriter.getAffineDimExpr(0);
	unsigned numSymbols = 0;
	dynamicOperands.push_back(useIndices[index]);

	// Multiply the stride;
	if (auto attr = mixedStrides[index].dyn_cast<Attribute>()) {
	expr = expr * attr.cast<IntegerAttr>().getInt();
	} else {
	dynamicOperands.push_back(mixedStrides[index].get<Value>());
	expr = expr * rewriter.getAffineSymbolExpr(numSymbols++);
	}

	// Add the offset.
	if (auto attr = mixedOffsets[index].dyn_cast<Attribute>()) {
	expr = expr + attr.cast<IntegerAttr>().getInt();
	} else {
	dynamicOperands.push_back(mixedOffsets[index].get<Value>());
	expr = expr + rewriter.getAffineSymbolExpr(numSymbols++);
	}
	Location loc = subViewOp.getLoc();
	sourceIndices[index] = rewriter.create<AffineApplyOp>(
	loc, AffineMap::get(1, numSymbols, expr), dynamicOperands);
	}
	return success();
	}

	/// Helpers to access the memref operand for each op.
	template <typename LoadOrStoreOpTy>
	static Value getMemRefOperand(LoadOrStoreOpTy op) {
	return op.getMemref();
	}

	static Value getMemRefOperand(vector::TransferReadOp op) {
	return op.getSource();
	}

	static Value getMemRefOperand(vector::TransferWriteOp op) {
	return op.getSource();
	}

	/// Given the permutation map of the original
	/// `vector.transfer_read`/`vector.transfer_write` operations compute the
	/// permutation map to use after the subview is folded with it.
	static AffineMapAttr getPermutationMapAttr(MLIRContext *context,
	memref::SubViewOp subViewOp,
	AffineMap currPermutationMap) {
	llvm::SmallBitVector unusedDims = subViewOp.getDroppedDims();
	SmallVector<AffineExpr> exprs;
	int64_t sourceRank = subViewOp.getSourceType().getRank();
	for (auto dim : llvm::seq<int64_t>(0, sourceRank)) {
	if (unusedDims.test(dim))
	continue;
	exprs.push_back(getAffineDimExpr(dim, context));
	}
	auto resultDimToSourceDimMap = AffineMap::get(sourceRank, 0, exprs, context);
	return AffineMapAttr::get(
	currPermutationMap.compose(resultDimToSourceDimMap));
	}

	//===----------------------------------------------------------------------===//
	// Patterns
	//===----------------------------------------------------------------------===//

	namespace {
	/// Merges subview operation with load/transferRead operation.
	template <typename OpTy>
	class LoadOpOfSubViewFolder final : public OpRewritePattern<OpTy> {
	public:
	using OpRewritePattern<OpTy>::OpRewritePattern;

	LogicalResult matchAndRewrite(OpTy loadOp,
	PatternRewriter &rewriter) const override;

	private:
	void replaceOp(OpTy loadOp, memref::SubViewOp subViewOp,
	ArrayRef<Value> sourceIndices,
	PatternRewriter &rewriter) const;
	};

	/// Merges subview operation with store/transferWriteOp operation.
	template <typename OpTy>
	class StoreOpOfSubViewFolder final : public OpRewritePattern<OpTy> {
	public:
	using OpRewritePattern<OpTy>::OpRewritePattern;

	LogicalResult matchAndRewrite(OpTy storeOp,
	PatternRewriter &rewriter) const override;

	private:
	void replaceOp(OpTy storeOp, memref::SubViewOp subViewOp,
	ArrayRef<Value> sourceIndices,
	PatternRewriter &rewriter) const;
	};

	template <typename LoadOpTy>
	void LoadOpOfSubViewFolder<LoadOpTy>::replaceOp(
	LoadOpTy loadOp, memref::SubViewOp subViewOp, ArrayRef<Value> sourceIndices,
	PatternRewriter &rewriter) const {
	rewriter.replaceOpWithNewOp<LoadOpTy>(loadOp, subViewOp.source(),
	sourceIndices);
	}

	template <>
	void LoadOpOfSubViewFolder<vector::TransferReadOp>::replaceOp(
	vector::TransferReadOp transferReadOp, memref::SubViewOp subViewOp,
	ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
	// TODO: support 0-d corner case.
	if (transferReadOp.getTransferRank() == 0)
	return;
	rewriter.replaceOpWithNewOp<vector::TransferReadOp>(
	transferReadOp, transferReadOp.getVectorType(), subViewOp.source(),
	sourceIndices,
	getPermutationMapAttr(rewriter.getContext(), subViewOp,
	transferReadOp.getPermutationMap()),
	transferReadOp.getPadding(),
	/*mask=*/Value(), transferReadOp.getInBoundsAttr());
	}

	template <typename StoreOpTy>
	void StoreOpOfSubViewFolder<StoreOpTy>::replaceOp(
	StoreOpTy storeOp, memref::SubViewOp subViewOp,
	ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
	rewriter.replaceOpWithNewOp<StoreOpTy>(storeOp, storeOp.getValue(),
	subViewOp.source(), sourceIndices);
	}

	template <>
	void StoreOpOfSubViewFolder<vector::TransferWriteOp>::replaceOp(
	vector::TransferWriteOp transferWriteOp, memref::SubViewOp subViewOp,
	ArrayRef<Value> sourceIndices, PatternRewriter &rewriter) const {
	// TODO: support 0-d corner case.
	if (transferWriteOp.getTransferRank() == 0)
	return;
	rewriter.replaceOpWithNewOp<vector::TransferWriteOp>(
	transferWriteOp, transferWriteOp.getVector(), subViewOp.source(),
	sourceIndices,
	getPermutationMapAttr(rewriter.getContext(), subViewOp,
	transferWriteOp.getPermutationMap()),
	transferWriteOp.getInBoundsAttr());
	}
	} // namespace

	template <typename OpTy>
	LogicalResult
	LoadOpOfSubViewFolder<OpTy>::matchAndRewrite(OpTy loadOp,
	PatternRewriter &rewriter) const {
	auto subViewOp =
	getMemRefOperand(loadOp).template getDefiningOp<memref::SubViewOp>();
	if (!subViewOp)
	return failure();

	SmallVector<Value, 4> sourceIndices;
	if (failed(resolveSourceIndices(loadOp.getLoc(), rewriter, subViewOp,
	loadOp.getIndices(), sourceIndices)))
	return failure();

	replaceOp(loadOp, subViewOp, sourceIndices, rewriter);
	return success();
	}

	template <typename OpTy>
	LogicalResult
	StoreOpOfSubViewFolder<OpTy>::matchAndRewrite(OpTy storeOp,
	PatternRewriter &rewriter) const {
	auto subViewOp =
	getMemRefOperand(storeOp).template getDefiningOp<memref::SubViewOp>();
	if (!subViewOp)
	return failure();

	SmallVector<Value, 4> sourceIndices;
	if (failed(resolveSourceIndices(storeOp.getLoc(), rewriter, subViewOp,
	storeOp.getIndices(), sourceIndices)))
	return failure();

	replaceOp(storeOp, subViewOp, sourceIndices, rewriter);
	return success();
	}

	void memref::populateFoldSubViewOpPatterns(RewritePatternSet &patterns) {
	patterns.add<LoadOpOfSubViewFolder<AffineLoadOp>,
	LoadOpOfSubViewFolder<memref::LoadOp>,
	LoadOpOfSubViewFolder<vector::TransferReadOp>,
	StoreOpOfSubViewFolder<AffineStoreOp>,
	StoreOpOfSubViewFolder<memref::StoreOp>,
	StoreOpOfSubViewFolder<vector::TransferWriteOp>>(
	patterns.getContext());
	}

	//===----------------------------------------------------------------------===//
	// Pass registration
	//===----------------------------------------------------------------------===//

	namespace {

	struct FoldSubViewOpsPass final
	: public FoldSubViewOpsBase<FoldSubViewOpsPass> {
	void runOnOperation() override;
	};

	} // namespace

	void FoldSubViewOpsPass::runOnOperation() {
	RewritePatternSet patterns(&getContext());
	memref::populateFoldSubViewOpPatterns(patterns);
	(void)applyPatternsAndFoldGreedily(getOperation(), std::move(patterns));
	}

	std::unique_ptr<Pass> memref::createFoldSubViewOpsPass() {
	return std::make_unique<FoldSubViewOpsPass>();
	}

mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir

This file was moved from mlir/test/Dialect/MemRef/fold-subview-ops.mlir.

// RUN: mlir-opt -fold-memref-subview-ops -split-input-file %s -o - | FileCheck %s		// RUN: mlir-opt -fold-memref-alias-ops -split-input-file %s -o - | FileCheck %s

func.func @fold_static_stride_subview_with_load(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4 : index) -> f32 {		func.func @fold_static_stride_subview_with_load(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4 : index) -> f32 {
%0 = memref.subview %arg0[%arg1, %arg2][4, 4][2, 3] : memref<12x32xf32> to memref<4x4xf32, offset:?, strides: [64, 3]>		%0 = memref.subview %arg0[%arg1, %arg2][4, 4][2, 3] : memref<12x32xf32> to memref<4x4xf32, offset:?, strides: [64, 3]>
%1 = memref.load %0[%arg3, %arg4] : memref<4x4xf32, offset:?, strides: [64, 3]>		%1 = memref.load %0[%arg3, %arg4] : memref<4x4xf32, offset:?, strides: [64, 3]>
return %1 : f32		return %1 : f32
}		}
// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0)[s0] -> (d0 * 2 + s0)>		// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0)[s0] -> (d0 * 2 + s0)>
// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0)[s0] -> (d0 * 3 + s0)>		// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0)[s0] -> (d0 * 3 + s0)>
[257 lines not shown]	func.func @fold_static_stride_subview_with_affine_load_store(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4 : index) -> f32 {
// CHECK-NEXT: affine.load		// CHECK-NEXT: affine.load
affine.store %1, %0[%arg3, %arg4] : memref<4x4xf32, offset:?, strides: [64, 3]>		affine.store %1, %0[%arg3, %arg4] : memref<4x4xf32, offset:?, strides: [64, 3]>
// CHECK-NEXT: affine.apply		// CHECK-NEXT: affine.apply
// CHECK-NEXT: affine.apply		// CHECK-NEXT: affine.apply
// CHECK-NEXT: affine.store		// CHECK-NEXT: affine.store
// CHECK-NEXT: return		// CHECK-NEXT: return
return %1 : f32		return %1 : f32
}		}

		// -----

		// CHECK-DAG: #[[$MAP:.*]] = affine_map<(d0, d1) -> (d0 * 6 + d1)>
		nicolasvasilache (Done): Please also add some tests for `memref.load/store` and `vector.transfer`.
		// CHECK-LABEL: fold_static_stride_subview_with_affine_load_store_expand_shape
		// CHECK-SAME: (%[[ARG0:.*]]: memref<12x32xf32>, %[[ARG1:.*]]: index, %[[ARG2:.*]]: index, %[[ARG3:.*]]: index) -> f32 {
		func @fold_static_stride_subview_with_affine_load_store_expand_shape(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index) -> f32 {
		%0 = memref.expand_shape %arg0 [[0, 1], [2]] : memref<12x32xf32> into memref<2x6x32xf32>
		%1 = affine.load %0[%arg1, %arg2, %arg3] : memref<2x6x32xf32>
		return %1 : f32
		}
		// CHECK: %[[INDEX:.*]] = affine.apply #[[$MAP]](%[[ARG1]], %[[ARG2]])
		// CHECK-NEXT: %[[RESULT:.*]] = affine.load %[[ARG0]][%[[INDEX]], %[[ARG3]]] : memref<12x32xf32>
		// CHECK-NEXT: return %[[RESULT]] : f32

		// -----

		// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0) -> (d0 floordiv 6)>
		// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0) -> (d0 mod 6)>
		// CHECK-LABEL: @fold_static_stride_subview_with_affine_load_store_collapse_shape
		// CHECK-SAME: (%[[ARG0:.*]]: memref<2x6x32xf32>, %[[ARG1:.*]]: index, %[[ARG2:.*]]: index)
		func @fold_static_stride_subview_with_affine_load_store_collapse_shape(%arg0 : memref<2x6x32xf32>, %arg1 : index, %arg2 : index) -> f32 {
		%0 = memref.collapse_shape %arg0 [[0, 1], [2]] : memref<2x6x32xf32> into memref<12x32xf32>
		%1 = affine.load %0[%arg1, %arg2] : memref<12x32xf32>
		return %1 : f32
		}
		// CHECK-NEXT: %[[MODIFIED_INDEX0:.*]] = affine.apply #[[$MAP0]](%[[ARG1]])
		// CHECK-NEXT: %[[MODIFIED_INDEX1:.*]] = affine.apply #[[$MAP1]](%[[ARG1]])
		// CHECK-NEXT: %[[RESULT:.*]] = affine.load %[[ARG0]][%[[MODIFIED_INDEX0]], %[[MODIFIED_INDEX1]], %[[ARG2]]] : memref<2x6x32xf32>
		// CHECK-NEXT: return %[[RESULT]] : f32
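
The `floordiv`/`mod` maps checked in this test split one index of the collapsed memref back into the indices of the source dimensions it was collapsed from. A minimal standalone sketch of that delinearization follows, assuming static group sizes and a non-negative index (C++ `/` and `%` match affine `floordiv` and `mod` only for non-negative operands); the function name `delinearize` is hypothetical and does not appear in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits a row-major linear index over a group of collapsed dimensions back
// into one index per source dimension. Inner dimensions use the remainder;
// the outermost dimension receives whatever quotient is left.
std::vector<int64_t> delinearize(int64_t linear,
                                 const std::vector<int64_t> &sizes) {
  assert(linear >= 0 && "sketch assumes a non-negative index");
  assert(!sizes.empty() && "need at least one dimension");
  std::vector<int64_t> indices(sizes.size());
  for (std::size_t i = sizes.size(); i-- > 1;) {
    assert(sizes[i] > 0 && "sizes must be positive");
    indices[i] = linear % sizes[i];
    linear /= sizes[i];
  }
  indices[0] = linear;
  return indices;
}
```

For the `2x6` group in this test, index 9 of the collapsed dimension maps to `(9 floordiv 6, 9 mod 6) = (1, 3)`.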

		// -----

		// CHECK-DAG: #[[$MAP:.*]] = affine_map<(d0, d1, d2) -> (d0 * 6 + d1 * 3 + d2)>
		// CHECK-LABEL: fold_static_stride_subview_with_affine_load_store_expand_shape_3d
		// CHECK-SAME: (%[[ARG0:.*]]: memref<12x32xf32>, %[[ARG1:.*]]: index, %[[ARG2:.*]]: index, %[[ARG3:.*]]: index, %[[ARG4:.*]]: index) -> f32 {
		func @fold_static_stride_subview_with_affine_load_store_expand_shape_3d(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4: index) -> f32 {
		%0 = memref.expand_shape %arg0 [[0, 1, 2], [3]] : memref<12x32xf32> into memref<2x2x3x32xf32>
		%1 = affine.load %0[%arg1, %arg2, %arg3, %arg4] : memref<2x2x3x32xf32>
		return %1 : f32
		}
		// CHECK: %[[INDEX:.*]] = affine.apply #[[$MAP]](%[[ARG1]], %[[ARG2]], %[[ARG3]])
		// CHECK-NEXT: %[[RESULT:.*]] = affine.load %[[ARG0]][%[[INDEX]], %[[ARG4]]] : memref<12x32xf32>
		// CHECK-NEXT: return %[[RESULT]] : f32
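
The map `d0 * 6 + d1 * 3 + d2` checked in this test is the row-major linearization of three indices over the expanded `2x2x3` group. A minimal standalone sketch, assuming static sizes; the function name `linearize` is hypothetical and is not the helper used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Folds one index per expanded dimension into a single row-major index over
// the source dimension, using Horner's scheme:
// ((i0 * s1) + i1) * s2 + i2 ... for sizes (s0, s1, s2, ...).
int64_t linearize(const std::vector<int64_t> &indices,
                  const std::vector<int64_t> &sizes) {
  assert(indices.size() == sizes.size() && "one index per dimension");
  int64_t linear = 0;
  for (std::size_t i = 0; i < sizes.size(); ++i)
    linear = linear * sizes[i] + indices[i];
  return linear;
}
```

With sizes `(2, 2, 3)` the strides are `(6, 3, 1)`, so indices `(1, 1, 2)` map to `1 * 6 + 1 * 3 + 2 = 11`, matching the affine map above.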

		// -----

		// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0 * 1024 + d1)>
		// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>
		// CHECK-LABEL: fold_static_stride_subview_with_affine_load_store_expand_shape
		// CHECK-SAME: (%[[ARG0:.*]]: memref<1024x1024xf32>, %[[ARG1:.*]]: memref<1xf32>, %[[ARG2:.*]]: index)
		func @fold_static_stride_subview_with_affine_load_store_expand_shape(%arg0: memref<1024x1024xf32>, %arg1: memref<1xf32>, %arg2: index) -> f32 {
		%0 = memref.expand_shape %arg0 [[0, 1], [2, 3]] : memref<1024x1024xf32> into memref<1x1024x1024x1xf32>
		affine.for %arg3 = 0 to 1 {
		affine.for %arg4 = 0 to 1024 {
		affine.for %arg5 = 0 to 1020 {
		affine.for %arg6 = 0 to 1 {
		%1 = affine.load %0[%arg3, %arg4, %arg5, %arg6] : memref<1x1024x1024x1xf32>
		affine.store %1, %arg1[%arg2] : memref<1xf32>
		}
		}
		}
		}
		%2 = affine.load %arg1[%arg2] : memref<1xf32>
		return %2 : f32
		}
		// CHECK-NEXT: affine.for %[[ARG3:.*]] = 0 to 1 {
		// CHECK-NEXT: affine.for %[[ARG4:.*]] = 0 to 1024 {
		// CHECK-NEXT: affine.for %[[ARG5:.*]] = 0 to 1020 {
		// CHECK-NEXT: affine.for %[[ARG6:.*]] = 0 to 1 {
		// CHECK-NEXT: %[[IDX1:.*]] = affine.apply #[[$MAP0]](%[[ARG3]], %[[ARG4]])
		// CHECK-NEXT: %[[IDX2:.*]] = affine.apply #[[$MAP1]](%[[ARG5]], %[[ARG6]])
		// CHECK-NEXT: affine.load %[[ARG0]][%[[IDX1]], %[[IDX2]]] : memref<1024x1024xf32>

		// -----

		// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1, d2, d3) -> (d1 + d0)>
		// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0 * 1024 + d1)>
		// CHECK-DAG: #[[$MAP2:.*]] = affine_map<(d0, d1) -> (d0 + d1)>
		// CHECK-LABEL: fold_static_stride_subview_with_affine_load_store_expand_shape_when_access_index_is_an_expression
		// CHECK-SAME: (%[[ARG0:.*]]: memref<1024x1024xf32>, %[[ARG1:.*]]: memref<1xf32>, %[[ARG2:.*]]: index)
		func @fold_static_stride_subview_with_affine_load_store_expand_shape_when_access_index_is_an_expression(%arg0: memref<1024x1024xf32>, %arg1: memref<1xf32>, %arg2: index) -> f32 {
		%0 = memref.expand_shape %arg0 [[0, 1], [2, 3]] : memref<1024x1024xf32> into memref<1x1024x1024x1xf32>
		affine.for %arg3 = 0 to 1 {
		affine.for %arg4 = 0 to 1024 {
		affine.for %arg5 = 0 to 1020 {
		affine.for %arg6 = 0 to 1 {
		%1 = affine.load %0[%arg3, %arg4 + %arg3, %arg5, %arg6] : memref<1x1024x1024x1xf32>
		affine.store %1, %arg1[%arg2] : memref<1xf32>
		}
		}
		}
		}
		%2 = affine.load %arg1[%arg2] : memref<1xf32>
		return %2 : f32
		}
		// CHECK-NEXT: affine.for %[[ARG3:.*]] = 0 to 1 {
		// CHECK-NEXT: affine.for %[[ARG4:.*]] = 0 to 1024 {
		// CHECK-NEXT: affine.for %[[ARG5:.*]] = 0 to 1020 {
		// CHECK-NEXT: affine.for %[[ARG6:.*]] = 0 to 1 {
		// CHECK-NEXT: %[[TMP1:.*]] = affine.apply #[[$MAP0]](%[[ARG3]], %[[ARG4]], %[[ARG5]], %[[ARG6]])
		// CHECK-NEXT: %[[TMP2:.*]] = affine.apply #[[$MAP1]](%[[ARG3]], %[[TMP1]])
		// CHECK-NEXT: %[[TMP3:.*]] = affine.apply #map2(%[[ARG5]], %[[ARG6]])
		// CHECK-NEXT: affine.load %[[ARG0]][%[[TMP2]], %[[TMP3]]] : memref<1024x1024xf32>

		// -----

		// CHECK-DAG: #[[$MAP0:.*]] = affine_map<(d0, d1) -> (d0 * 1024 + d1)>
		// CHECK-DAG: #[[$MAP1:.*]] = affine_map<(d0, d1) -> (d0 + d1)>
		// CHECK-LABEL: fold_static_stride_subview_with_affine_load_store_expand_shape_with_constant_access_index
		// CHECK-SAME: (%[[ARG0:.*]]: memref<1024x1024xf32>, %[[ARG1:.*]]: memref<1xf32>, %[[ARG2:.*]]: index)
		func @fold_static_stride_subview_with_affine_load_store_expand_shape_with_constant_access_index(%arg0: memref<1024x1024xf32>, %arg1: memref<1xf32>, %arg2: index) -> f32 {
		%0 = memref.expand_shape %arg0 [[0, 1], [2, 3]] : memref<1024x1024xf32> into memref<1x1024x1024x1xf32>
		affine.for %arg3 = 0 to 1 {
		affine.for %arg4 = 0 to 1024 {
		affine.for %arg5 = 0 to 1020 {
		affine.for %arg6 = 0 to 1 {
		%1 = affine.load %0[%arg3, 0, %arg5, %arg6] : memref<1x1024x1024x1xf32>
		affine.store %1, %arg1[%arg2] : memref<1xf32>
		}
		}
		}
		}
		%2 = affine.load %arg1[%arg2] : memref<1xf32>
		return %2 : f32
		}
		// CHECK-NEXT: %[[ZERO:.*]] = arith.constant 0 : index
		// CHECK-NEXT: affine.for %[[ARG3:.*]] = 0 to 1 {
		// CHECK-NEXT: affine.for %[[ARG4:.*]] = 0 to 1024 {
		// CHECK-NEXT: affine.for %[[ARG5:.*]] = 0 to 1020 {
		// CHECK-NEXT: affine.for %[[ARG6:.*]] = 0 to 1 {
		// CHECK-NEXT: %[[TMP1:.*]] = affine.apply #[[$MAP0]](%[[ARG3]], %[[ZERO]])
		// CHECK-NEXT: %[[TMP2:.*]] = affine.apply #[[$MAP1]](%[[ARG5]], %[[ARG6]])
		// CHECK-NEXT: affine.load %[[ARG0]][%[[TMP1]], %[[TMP2]]] : memref<1024x1024xf32>

		// -----

		// CHECK-LABEL: fold_static_stride_subview_with_affine_load_store_collapse_shape_with_0d_result
		// CHECK-SAME: (%[[ARG0:.*]]: memref<1xf32>, %[[ARG1:.*]]: memref<1xf32>)
		func @fold_static_stride_subview_with_affine_load_store_collapse_shape_with_0d_result(%arg0: memref<1xf32>, %arg1: memref<1xf32>) -> memref<1xf32> {
		%0 = memref.collapse_shape %arg0 [] : memref<1xf32> into memref<f32>
		affine.for %arg2 = 0 to 3 {
		%1 = affine.load %0[] : memref<f32>
		affine.store %1, %arg1[0] : memref<1xf32>
		}
		bondhugula (inline comment, Not Done): Indent by two to be consistent.
		return %arg1 : memref<1xf32>
		}
		// CHECK-NEXT: %[[ZERO:.*]] = arith.constant 0 : index
		// CHECK-NEXT: affine.for %{{.*}} = 0 to 3 {
		// CHECK-NEXT: affine.load %[[ARG0]][%[[ZERO]]] : memref<1xf32>
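
The affine maps checked in the tests above encode row-major index arithmetic over each reassociation group: folding memref.expand_shape linearizes the group's indices (for example d0 * 6 + d1 * 3 + d2 for the 2x2x3 group), and folding memref.collapse_shape delinearizes a single index (d0 floordiv 6 and d0 mod 6 for the 2x6 group). The following is a minimal standalone sketch of that arithmetic, not the code from this patch. The names linearize and delinearize are hypothetical, and the sketch assumes static group sizes and non-negative, in-bounds indices.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: combine the indices of one reassociation group into a
// single row-major index, as when a load on an expanded memref is rewritten
// to index the source memref.
int64_t linearize(const std::vector<int64_t> &indices,
                  const std::vector<int64_t> &sizes) {
  int64_t linear = 0;
  for (std::size_t i = 0; i < indices.size(); ++i)
    linear = linear * sizes[i] + indices[i];
  return linear;
}

// Hypothetical helper: split one index into per-dimension indices for a
// reassociation group, as when a load on a collapsed memref is rewritten to
// index the source memref. The outermost result is the remaining quotient,
// matching the "d0 floordiv 6" form used in the test above.
std::vector<int64_t> delinearize(int64_t linear,
                                 const std::vector<int64_t> &sizes) {
  std::vector<int64_t> indices(sizes.size());
  if (sizes.empty())
    return indices;
  for (std::size_t i = sizes.size() - 1; i > 0; --i) {
    indices[i] = linear % sizes[i];
    linear /= sizes[i];
  }
  indices[0] = linear;
  return indices;
}
```

With group sizes {2, 2, 3}, linearize reduces to d0 * 6 + d1 * 3 + d2. With sizes {1, 1024} and {1024, 1} it reduces to d0 * 1024 + d1 and d0 + d1, the two maps used in the 1x1024x1024x1 tests.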

mlir/test/Dialect/MemRef/fold-subview-ops.mlir

This file was moved to mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir.

mlir/tools/mlir-vulkan-runner/mlir-vulkan-runner.cpp


	using namespace mlir;

	static LogicalResult runMLIRPasses(ModuleOp module) {
	PassManager passManager(module.getContext());
	applyPassManagerCLOptions(passManager);

	passManager.addPass(createGpuKernelOutliningPass());
-	passManager.addPass(memref::createFoldSubViewOpsPass());
+	passManager.addPass(memref::createFoldMemRefAliasOpsPass());
	passManager.addPass(createConvertGPUToSPIRVPass());
	OpPassManager &modulePM = passManager.nest<spirv::ModuleOp>();
	nicolasvasilache (inline comment, Done): Seems unrelated to this PR. Even if it has no effects on the logic, I'd rather keep separate pieces separate.
	modulePM.addPass(spirv::createLowerABIAttributesPass());
	modulePM.addPass(spirv::createUpdateVersionCapabilityExtensionPass());
	passManager.addPass(createConvertGpuLaunchFuncToVulkanLaunchFuncPass());
	LowerToLLVMOptions llvmOptions(module.getContext(), DataLayout(module));
	passManager.addPass(createMemRefToLLVMPass());
	passManager.nest<func::FuncOp>().addPass(LLVM::createRequestCWrappersPass());
	passManager.addPass(createConvertFuncToLLVMPass(llvmOptions));
	passManager.addPass(createReconcileUnrealizedCastsPass());
