This is an archive of the discontinued LLVM Phabricator instance.

Differential D143910

[mlir][tensor] Add transform to make tensor.pad/empty loop-independent
ClosedPublic

Authored by springerm on Feb 13 2023, 6:21 AM.

Details

Reviewers

dcaballe
nicolasvasilache
bondhugula

Commits

rG77124386feb6: [mlir][tensor] Add transform to make tensor.pad loop-independent

Summary

Add a transform to make tensor.pad and tensor.empty ops independent of SCF loop IVs. Such ops can then be hoisted.

E.g.:

scf.for %iv = %lb to %ub step %step {
  %high = affine.apply affine_map<(d0)[s0] -> (s0 - d0)> (%iv)[%ub]
  %p = tensor.pad %t low[5] high[%high] ...
  ...
}

Is transformed to:

%high_new = affine.apply affine_map<()[s0, s1] -> (-s0 + s1)> ()[%lb, %ub]
%p_hoistable = tensor.pad %t low[5] high[%high_new]
%dim = tensor.dim %t, %c0
%size = affine.apply affine_map<(d0)[s0, s1] -> (-d0 + s0 + s1 + 5)>(%iv)[%ub, %dim]
%slice = tensor.extract_slice %p_hoistable [0] [%size] [1]
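
The arithmetic behind this rewrite can be checked with a small standalone sketch. It is not part of the patch: plain integers stand in for the MLIR index values, and the helper names (`highPad`, `highPadBound`, `sliceSize`, `checkRewrite`) are hypothetical. It assumes `lb < ub`, `step > 0`, and that `dim` is the size of `%t` along the padded dimension.

```cpp
#include <cassert>
#include <cstdint>

// Original, loop-dependent high padding: s0 - d0 with s0 = ub, d0 = iv.
int64_t highPad(int64_t iv, int64_t ub) { return ub - iv; }

// Loop-independent upper bound: -s0 + s1 with s0 = lb, s1 = ub.
// It is reached at the first iteration (iv == lb).
int64_t highPadBound(int64_t lb, int64_t ub) { return ub - lb; }

// Size of the extracted slice: -d0 + s0 + s1 + 5
// with d0 = iv, s0 = ub, s1 = dim.
int64_t sliceSize(int64_t iv, int64_t ub, int64_t dim) {
  return -iv + ub + dim + 5;
}

// Returns true if, for every iteration, the slice has exactly the size of the
// originally padded tensor and fits into the hoistable padded tensor.
bool checkRewrite(int64_t lb, int64_t ub, int64_t step, int64_t dim) {
  const int64_t low = 5;
  const int64_t hoistableSize = low + dim + highPadBound(lb, ub);
  for (int64_t iv = lb; iv < ub; iv += step) {
    const int64_t originalSize = low + dim + highPad(iv, ub);
    if (sliceSize(iv, ub, dim) != originalSize)
      return false;
    if (originalSize > hoistableSize)
      return false;
  }
  return true;
}
```

In this model the slice always starts at offset 0, so it covers the low padding, the original data, and the first `ub - iv` elements of the over-approximated high padding.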

Depends On: D146524

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

springerm created this revision. Feb 13 2023, 6:21 AM

Herald added a reviewer: bondhugula. Feb 13 2023, 6:21 AM

Herald added a project: Restricted Project.

Herald added subscribers: hanchung, Moerafaat, bzcheeseman and 22 others.

springerm requested review of this revision. Feb 13 2023, 6:21 AM

Herald added a project: Restricted Project. Feb 13 2023, 6:21 AM

Herald added a subscriber: stephenneuendorffer.

@dcaballe The same infrastructure can be used for computing upper bounds for memref allocations.

Harbormaster completed remote builds in B213412: Diff 496952. Feb 13 2023, 6:21 AM

Herald added a subscriber: jsetoain. Feb 13 2023, 6:21 AM

add op documentation

Harbormaster completed remote builds in B213587: Diff 497222. Feb 14 2023, 12:09 AM

springerm added a parent revision: D143909: [mlir][SCF][Utils][NFC] Make some utils public for better reuse. Feb 22 2023, 1:10 AM

nicolasvasilache added inline comments. Feb 23 2023, 9:17 AM

mlir/include/mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h
17	Isn't the best practice here to just forward-declare ? I am unclear why you reverted to a mix of forward-declaration for OpBuilder and include for AffineMap/Value/ValueRange ?
mlir/include/mlir/Dialect/Tensor/TransformOps/CMakeLists.txt
6	This fails to build for me with: CMake Error at /usr/local/google/home/ntv/github/llvm-project/mlir/include/mlir/Dialect/Tensor/TransformOps/CMakeLists.txt:6 (add_mlir_doc): add_mlir_doc Function invoked with incorrect arguments for function named: add_mlir_doc
mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
26	Please explain a little more what this entails because it involves increasing the tensor size/dimensionality.
mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
36	The footprint in this example can be too high: you'd want to take the ceildiv by %step.
mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp
87	Can we use `step 1` instead here and mention that this is conservative but less precise in the context of this method?

springerm marked 4 inline comments as done. Feb 27 2023, 1:24 AM

springerm added inline comments.

mlir/include/mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h
17	I thought I needed the definition so that it can be used in `FailureOr<...>`. But it actually works without.
mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
36	I tried to do that but `FlatAffineConstraints` was unable to compute an upper bound in that case. It's unclear why this is happening, I'm looking into this. Maybe the system of inequalities is getting too complex (with various "semi-affine exprs"...). It looks like `FlatAffineConstraints` must be extended.

address comments

Harbormaster completed remote builds in B216159: Diff 500698. Feb 27 2023, 1:54 AM

springerm added inline comments. Feb 27 2023, 2:16 AM

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
36	This is indeed due to a shortcoming in `FlatAffineValueConstraints`: "TODO: Whenever there are local variables in the dependence constraints, we'll conservatively over-approximate, since we don't always explicitly compute them above (in the while loop)." Is it worth fixing this? It will probably take a while to understand and rewrite a 200 LOC function.

nicolasvasilache added inline comments. Feb 27 2023, 6:15 AM

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
36	Yes, this is the same rationale as using `step 1` below. Can we update the doc to make this explicit? I.e. divide by %step and mention that in the case of a symbol, we over-approximate with setting `step` to `1` to circumvent the limitation of the analysis.
mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp
87	if we use step 1, this is not an optional anymore and there is a bit of simplification in your code down the line
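
The "step 1" over-approximation discussed in these comments can be illustrated with plain integers. The sketch below is not the analysis performed by `FlatAffineValueConstraints`; the function names are hypothetical, and it assumes `lb < ub` and `step > 0`. A loop with a larger step visits a subset of the values visited with step 1, so any upper bound derived for step 1 stays valid, although it may be larger than necessary.

```cpp
#include <cassert>
#include <cstdint>

// Largest value taken by iv in `for (iv = lb; iv < ub; iv += step)`.
// Assumes lb < ub and step > 0.
int64_t maxIvExact(int64_t lb, int64_t ub, int64_t step) {
  return lb + ((ub - lb - 1) / step) * step;
}

// Conservative bound obtained by pretending that the step is 1.
// The lower bound does not matter here, so it is left unnamed.
int64_t maxIvStepOne(int64_t /*lb*/, int64_t ub) { return ub - 1; }
```

For example, with `lb = 0`, `ub = 10` and `step = 4`, the loop visits 0, 4 and 8, so the exact maximum is 8 while the step-1 bound is 9.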

nicolasvasilache added inline comments. Feb 28 2023, 5:41 AM

mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
23	Can we rename this transform to `transform.loop_privatize` or something similar and have it take the op to privatize and the number of enclosing loops above which we want to hoist ? This should be kept in sync and compose with the recently landed `hoist_pad`. We can later evolve the syntax from num_loops to something better, informed by how we want these things to compose.
mlir/lib/Dialect/Tensor/Transforms/LoopHoisting.cpp
22	The filename is misleading here: this is not performing hoisting but privatization that will later enable hoisting. Can we move this functionality to a `LoopPrivatization.cpp` file? As a followup, we should integrate the usage of privatization into `HoistPadding.cpp`. We also now have `SubsetHoisting.cpp` for mechanical parts related to actual hoisting of loop-independent quantities that will also come in handy.

springerm marked 4 inline comments as done. Mar 1 2023, 1:58 AM

springerm added inline comments.

mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
23	This does not privatize the tensor/op though. It simply changes the size of the tensor. Added the number of loops to the op.
mlir/lib/Dialect/Tensor/Transforms/LoopHoisting.cpp
22	How about `LoopTransforms.cpp`? This transformation does not do any privatization. (Assuming "privatization" = "making a private copy of a tensor for each loop iteration".)

address comments

Harbormaster completed remote builds in B216670: Diff 501427. Mar 1 2023, 2:26 AM

update

Harbormaster completed remote builds in B216682: Diff 501448. Mar 1 2023, 3:48 AM

reimplement with ValueBoundsOpInterface

springerm retitled this revision from [mlir][tensor] Add transform to make tensor.pad loop-independent to [mlir][tensor] Add transform to make tensor.pad/empty loop-independent. Mar 21 2023, 6:36 AM

springerm edited the summary of this revision.

Herald added a subscriber: Groverkss. Mar 21 2023, 6:36 AM

springerm edited parent revisions, added: D146524: [mlir][Arith] ValueBoundsOpInterface: Reify with Arith ops; removed: D143909: [mlir][SCF][Utils][NFC] Make some utils public for better reuse. Mar 21 2023, 6:36 AM

Harbormaster completed remote builds in B220709: Diff 506957. Mar 21 2023, 7:45 AM

Does this update change the way this is supposed to interact with HoistPadding ?
Or in other words, how do you see this interacting with HoistPadding ?

In D143910#4212287, @nicolasvasilache wrote:

Does this update change the way this is supposed to interact with HoistPadding ?
Or in other words, how do you see this interacting with HoistPadding ?

HoistPadding does not require this functionality, it just clones the entire loop nest. So no interaction with HoistPadding.

This revision is already a month old and was never landed, but it is useful for Diego as an example of how to make ops (memref.alloca in his example) hoistable. So I reimplemented it to take advantage of ValueBoundsOpInterface.

springerm mentioned this in D146870: [mlir][Interfaces] ValueBoundsOpInterface: Support IntegerTypes. Mar 25 2023, 4:35 AM

springerm added a child revision: D146870: [mlir][Interfaces] ValueBoundsOpInterface: Support IntegerTypes. Mar 25 2023, 4:35 AM

springerm mentioned this in D145681: [mlir][Interfaces] Add ValueBoundsOpInterface and tensor dialect op impl. Mar 30 2023, 9:22 AM

rebase

Harbormaster completed remote builds in B222970: Diff 509973. Mar 31 2023, 6:27 AM

Looking at the ValueBounds part only for now.

mlir/include/mlir/Dialect/Affine/Transforms/Transforms.h
89 ↗	(On Diff #509973)	I can't follow from the description... This is converting an Affine-based value bound into an Arith-based value bound?
mlir/include/mlir/Interfaces/ValueBoundsOpInterface.h
119 ↗	(On Diff #509973)	Typos. Could you please elaborate a bit more on what "independent of the values in independencies" means?
mlir/lib/Interfaces/ValueBoundsOpInterface.cpp
368 ↗	(On Diff #509973)	Something like this in the header doc is what I was hoping for :)

Herald added a subscriber: bviyer. Apr 6 2023, 4:40 PM

address comments

springerm added inline comments. Apr 20 2023, 6:42 PM

mlir/include/mlir/Dialect/Affine/Transforms/Transforms.h
89 ↗	(On Diff #509973)	The name was confusing. I renamed the function and added some more documentation.

Harbormaster completed remote builds in B227038: Diff 515554. Apr 20 2023, 6:48 PM

rebase

springerm mentioned this in D149316: [mlir][memref] Add transform to make alloca ops loop-independent. Apr 26 2023, 6:23 PM

springerm added a child revision: D149316: [mlir][memref] Add transform to make alloca ops loop-independent. Apr 26 2023, 6:24 PM

Harbormaster completed remote builds in B228469: Diff 517420. Apr 26 2023, 7:15 PM

Thanks!

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
36	What happened with this? Was it addressed?
mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp
67	nit: ub to var
mlir/lib/Dialect/Tensor/Transforms/IndependenceTransforms.cpp
79 ↗	(On Diff #517420)	nit: ub to var

This revision is now accepted and ready to land. Apr 26 2023, 10:06 PM

This revision was landed with ongoing or failed builds. Apr 27 2023, 7:47 PM

Closed by commit rG77124386feb6: [mlir][tensor] Add transform to make tensor.pad loop-independent (authored by springerm).

This revision was automatically updated to reflect the committed changes.

springerm marked 3 inline comments as done.

springerm added a commit: rG77124386feb6: [mlir][tensor] Add transform to make tensor.pad loop-independent.

springerm added inline comments. Apr 27 2023, 7:52 PM

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h
36	There is a TODO in `SCF/IR/ValueBoundsOpInterfaceImpl.cpp`. We can't do any better at the moment.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h	15 lines
mlir/include/mlir/Dialect/Tensor/CMakeLists.txt	1 line
mlir/include/mlir/Dialect/Tensor/TransformOps/CMakeLists.txt	6 lines
mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.h	30 lines
mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td	62 lines
mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h	29 lines
mlir/include/mlir/InitAllDialects.h	2 lines
mlir/lib/Dialect/Affine/Analysis/AffineStructures.cpp	2 lines
mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp	115 lines
mlir/lib/Dialect/Tensor/CMakeLists.txt	1 line
mlir/lib/Dialect/Tensor/TransformOps/CMakeLists.txt	16 lines
mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp	69 lines
mlir/lib/Dialect/Tensor/Transforms/CMakeLists.txt	2 lines
mlir/lib/Dialect/Tensor/Transforms/LoopHoisting.cpp	106 lines
mlir/test/Dialect/Tensor/transform-op-make-loop-independent.mlir	73 lines
utils/bazel/llvm-project-overlay/mlir/BUILD.bazel	51 lines

Diff 497222

mlir/include/mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h

	//===- AffineCanonicalizationUtils.h ----------------------------- C++ --===//			//===- AffineCanonicalizationUtils.h ----------------------------- C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This header file defines utility functions to canonicalize affine ops			// This header file defines utility functions to canonicalize affine ops
	// within SCF op regions.			// within SCF op regions.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef MLIR_DIALECT_SCF_UTILS_AFFINECANONICALIZATIONUTILS_H_			#ifndef MLIR_DIALECT_SCF_UTILS_AFFINECANONICALIZATIONUTILS_H_
	#define MLIR_DIALECT_SCF_UTILS_AFFINECANONICALIZATIONUTILS_H_			#define MLIR_DIALECT_SCF_UTILS_AFFINECANONICALIZATIONUTILS_H_

				#include "mlir/IR/Value.h"
	#include "mlir/Support/LLVM.h"			#include "mlir/Support/LLVM.h"
	#include "mlir/Support/LogicalResult.h"			#include "mlir/Support/LogicalResult.h"

	namespace mlir {			namespace mlir {
	class AffineApplyOp;			class AffineApplyOp;
	class AffineMap;
	class FlatAffineValueConstraints;			class FlatAffineValueConstraints;
	struct LogicalResult;			struct LogicalResult;
				class OpBuilder;
	class Operation;			class Operation;
	class OpFoldResult;			class OpFoldResult;
	class RewriterBase;			class RewriterBase;
	class Value;
	class ValueRange;

	namespace scf {			namespace scf {
	class IfOp;			class IfOp;

	/// Match "for loop"-like operations: If the first parameter is an iteration			/// Match "for loop"-like operations: If the first parameter is an iteration
	/// variable, return lower/upper bounds via the second/third parameter and the			/// variable, return lower/upper bounds via the second/third parameter and the
	/// step size via the last parameter. The function should return `success` in			/// step size via the last parameter. The function should return `success` in
	/// that case. If the first parameter is not an iteration variable, return			/// that case. If the first parameter is not an iteration variable, return
	/// `failure`.			/// `failure`.
	using LoopMatcherFn = function_ref<LogicalResult(			using LoopMatcherFn = function_ref<LogicalResult(
	Value, OpFoldResult &, OpFoldResult &, OpFoldResult &)>;			Value, OpFoldResult &, OpFoldResult &, OpFoldResult &)>;

	/// Match "for loop"-like operations from the SCF dialect.			/// Match "for loop"-like operations from the SCF dialect.
	LogicalResult matchForLikeLoop(Value iv, OpFoldResult &lb, OpFoldResult &ub,			LogicalResult matchForLikeLoop(Value iv, OpFoldResult &lb, OpFoldResult &ub,
	OpFoldResult &step);			OpFoldResult &step);

	/// Populate the given constraint set with induction variable constraints of a			/// Populate the given constraint set with induction variable constraints of a
	/// "for" loop with the given range and step.			/// "for" loop with the given range and step. The step is optional.
	LogicalResult addLoopRangeConstraints(FlatAffineValueConstraints &cstr,			LogicalResult addLoopRangeConstraints(FlatAffineValueConstraints &cstr,
	Value iv, OpFoldResult lb,			Value iv, OpFoldResult lb,
	OpFoldResult ub, OpFoldResult step);			OpFoldResult ub, OpFoldResult step);

				/// Build a value that computes an upper bound of the result of the given
				/// AffineApplyOp without any loop induction variables.
				///
				/// Return failure if no upper bound could be determined. `changed` is set to
				/// true if the returned bound differs from the given affine.apply op result.
				FailureOr<Value> buildInductionVarIndependentUpperBound(
				OpBuilder &b, Location loc, AffineApplyOp applyOp, bool *changed = nullptr);

	/// Try to canonicalize the given affine.min/max operation in the context of			/// Try to canonicalize the given affine.min/max operation in the context of
	/// for `loops` with a known range.			/// for `loops` with a known range.
	///			///
	/// `loopMatcher` is used to retrieve loop bounds and the step size for a given			/// `loopMatcher` is used to retrieve loop bounds and the step size for a given
	/// iteration variable.			/// iteration variable.
	///			///
	/// Note: `loopMatcher` allows this function to be used with any "for loop"-like			/// Note: `loopMatcher` allows this function to be used with any "for loop"-like
	/// operation (scf.for, scf.parallel and even ops defined in other dialects).			/// operation (scf.for, scf.parallel and even ops defined in other dialects).
	(24 lines not shown)

mlir/include/mlir/Dialect/Tensor/CMakeLists.txt

	add_subdirectory(IR)			add_subdirectory(IR)
	add_subdirectory(Transforms)			add_subdirectory(Transforms)
				add_subdirectory(TransformOps)

mlir/include/mlir/Dialect/Tensor/TransformOps/CMakeLists.txt

This file was added.

				set(LLVM_TARGET_DEFINITIONS TensorTransformOps.td)
				mlir_tablegen(TensorTransformOps.h.inc -gen-op-decls)
				mlir_tablegen(TensorTransformOps.cpp.inc -gen-op-defs)
				add_public_tablegen_target(MLIRTensorTransformOpsIncGen)

				add_mlir_doc(TensorTransformOps Dialects/ -gen-op-doc)

mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.h

This file was added.

				//===- TensorTransformOps.h - Tensor transformation ops ---------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef MLIR_DIALECT_TENSOR_TRANSFORMOPS_TENSORTRANSFORMOPS_H
				#define MLIR_DIALECT_TENSOR_TRANSFORMOPS_TENSORTRANSFORMOPS_H

				#include "mlir/Dialect/PDL/IR/PDLTypes.h"
				#include "mlir/Dialect/Transform/IR/TransformInterfaces.h"
				#include "mlir/Dialect/Transform/IR/TransformTypes.h"
				#include "mlir/IR/OpImplementation.h"

				namespace mlir {
				class DialectRegistry;

				namespace tensor {
				class PadOp;

				void registerTransformDialectExtension(DialectRegistry &registry);
				} // namespace tensor
				} // namespace mlir

				#define GET_OP_CLASSES
				#include "mlir/Dialect/Tensor/TransformOps/TensorTransformOps.h.inc"

				#endif // MLIR_DIALECT_TENSOR_TRANSFORMOPS_TENSORTRANSFORMOPS_H

mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td

This file was added.

				//===- TensorTransformOps.td - Tensor transformation ops ---- tablegen --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef TENSOR_TRANSFORM_OPS
				#define TENSOR_TRANSFORM_OPS

				include "mlir/Dialect/PDL/IR/PDLTypes.td"
				include "mlir/Dialect/Transform/IR/TransformDialect.td"
				include "mlir/Dialect/Transform/IR/TransformInterfaces.td"
				include "mlir/Dialect/Transform/IR/TransformTypes.td"
				include "mlir/Interfaces/SideEffectInterfaces.td"
				include "mlir/IR/OpBase.td"

				def Transform_TensorPadOp : Transform_ConcreteOpType<"tensor.pad">;

				def MakeLoopIndependentOp
				: Op<Transform_Dialect, "tensor.make_loop_independent",
				[FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface,
				TransformOpInterface, TransformEachOpTrait]> {
				let description = [{
				Rewrite the targeted ops such that their index-typed operands no longer
				depend on any loop induction variable.

				Currently supported operations are:
				- tensor.pad: Replaced by an upper bound padding, followed by a
				tensor.extract_slice.

				Note: Only index-typed operands that are affine.apply ops are taken into
				account at the moment. Furthermore, only direct uses of SCF induction
				variables are eliminated.

				#### Return modes

				This operation fails if at least one induction variable could not be
				eliminated. In case the targeted op is already independent of induction
				variables, this transform succeeds and returns the unmodified target op.

				Otherwise, the returned handle points to a subset of the produced ops:
				- tensor.pad: The returned handle points to the tensor.extract_slice op.

				This transform op consumes the target handle and produces a result handle.
				}];

				let arguments = (ins PDL_Operation:$target);
				let results = (outs PDL_Operation:$transformed);
				let assemblyFormat = "$target attr-dict";

				let extraClassDeclaration = [{
				::mlir::DiagnosedSilenceableFailure applyToOne(
				::mlir::tensor::PadOp target,
				::mlir::transform::ApplyToEachResultList &results,
				::mlir::transform::TransformState &state);
				}];
				}

				#endif // TENSOR_TRANSFORM_OPS

mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h

	(9 lines not shown)
	#define MLIR_DIALECT_TENSOR_TRANSFORMS_TRANSFORMS_H			#define MLIR_DIALECT_TENSOR_TRANSFORMS_TRANSFORMS_H

	#include "mlir/Dialect/Tensor/IR/Tensor.h"			#include "mlir/Dialect/Tensor/IR/Tensor.h"
	#include "mlir/IR/PatternMatch.h"			#include "mlir/IR/PatternMatch.h"

	namespace mlir {			namespace mlir {
	namespace tensor {			namespace tensor {

				/// Build a new tensor::PadOp with low/high padding that is independent of any
				/// SCF loop induction variables. If the op is already independent of loop IVs,
				/// the same PadOp result is returned.
				///
				/// Failure indicates the no suitable upper bound for low/high padding could be
				/// found.
				///
				/// Note: This function takes into account only low/high padding values that
				/// are affine.apply ops that directly use a loop's IV.
				///
				/// Example:
				/// scf.for %iv = %lb to %ub step %step {
				/// %high = affine.apply affine_map<(d0)[s0] -> (s0 - d0)> (%i)[%ub]
				/// %p = tensor.pad %t low[5] high[%high] ...
				/// ...
				/// }
				///
				/// The function builds IR such as:
				/// %high_new = affine.apply affine_map<()[s0, s1] -> (-s0 + s1)> ()[%lb, %ub]
				/// %p_hoistable = tensor.pad %t low[5] high[%high_new]
				/// %dim = tensor.dim %t, %c0
				/// %size = affine.apply affine_map<(d0)[s0, s1] -> (-d0 + s0 + s1 + 5)>
				/// (%iv)[%ub, %dim]
				/// %slice = tensor.extract_slice %p_hoistable [0] [%size] [1]
				///
				/// The slice is returned.
				FailureOr<Value> buildInductionVarIndependentOp(OpBuilder &b,
				tensor::PadOp padOp);

	/// Populates `patterns` with patterns to wrap a tensor.pad op with an scf.if op			/// Populates `patterns` with patterns to wrap a tensor.pad op with an scf.if op
	/// to separate the cases where we don't need padding (all pad sizes are			/// to separate the cases where we don't need padding (all pad sizes are
	/// actually zeros) and where we indeed need padding.			/// actually zeros) and where we indeed need padding.
	void populateSplitPaddingPatterns(RewritePatternSet &patterns,			void populateSplitPaddingPatterns(RewritePatternSet &patterns,
	PatternBenefit baseBenefit = 1);			PatternBenefit baseBenefit = 1);

	/// Pattern to swap an `tensor.extract_slice` with its producer when the			/// Pattern to swap an `tensor.extract_slice` with its producer when the
	/// producer implements the `TilingInterface`. The pattern itself does not			/// producer implements the `TilingInterface`. The pattern itself does not
	(30 lines not shown)

mlir/include/mlir/InitAllDialects.h

(57 lines not shown)
#include "mlir/Dialect/SPIRV/IR/SPIRVDialect.h"		#include "mlir/Dialect/SPIRV/IR/SPIRVDialect.h"
#include "mlir/Dialect/Shape/IR/Shape.h"		#include "mlir/Dialect/Shape/IR/Shape.h"
#include "mlir/Dialect/Shape/Transforms/BufferizableOpInterfaceImpl.h"		#include "mlir/Dialect/Shape/Transforms/BufferizableOpInterfaceImpl.h"
#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"		#include "mlir/Dialect/SparseTensor/IR/SparseTensor.h"
#include "mlir/Dialect/SparseTensor/Transforms/BufferizableOpInterfaceImpl.h"		#include "mlir/Dialect/SparseTensor/Transforms/BufferizableOpInterfaceImpl.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"		#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/Dialect/Tensor/IR/TensorInferTypeOpInterfaceImpl.h"		#include "mlir/Dialect/Tensor/IR/TensorInferTypeOpInterfaceImpl.h"
#include "mlir/Dialect/Tensor/IR/TensorTilingInterfaceImpl.h"		#include "mlir/Dialect/Tensor/IR/TensorTilingInterfaceImpl.h"
		#include "mlir/Dialect/Tensor/TransformOps/TensorTransformOps.h"
#include "mlir/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.h"		#include "mlir/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.h"
#include "mlir/Dialect/Tosa/IR/TosaOps.h"		#include "mlir/Dialect/Tosa/IR/TosaOps.h"
#include "mlir/Dialect/Transform/IR/TransformDialect.h"		#include "mlir/Dialect/Transform/IR/TransformDialect.h"
#include "mlir/Dialect/Vector/IR/VectorOps.h"		#include "mlir/Dialect/Vector/IR/VectorOps.h"
#include "mlir/Dialect/Vector/TransformOps/VectorTransformOps.h"		#include "mlir/Dialect/Vector/TransformOps/VectorTransformOps.h"
#include "mlir/Dialect/Vector/Transforms/BufferizableOpInterfaceImpl.h"		#include "mlir/Dialect/Vector/Transforms/BufferizableOpInterfaceImpl.h"
#include "mlir/Dialect/X86Vector/X86VectorDialect.h"		#include "mlir/Dialect/X86Vector/X86VectorDialect.h"
#include "mlir/IR/Dialect.h"		#include "mlir/IR/Dialect.h"
(44 lines not shown)
inline void registerAllDialects(DialectRegistry &registry) {

// Register all dialect extensions.		// Register all dialect extensions.
affine::registerTransformDialectExtension(registry);		affine::registerTransformDialectExtension(registry);
bufferization::registerTransformDialectExtension(registry);		bufferization::registerTransformDialectExtension(registry);
gpu::registerTransformDialectExtension(registry);		gpu::registerTransformDialectExtension(registry);
linalg::registerTransformDialectExtension(registry);		linalg::registerTransformDialectExtension(registry);
memref::registerTransformDialectExtension(registry);		memref::registerTransformDialectExtension(registry);
scf::registerTransformDialectExtension(registry);		scf::registerTransformDialectExtension(registry);
		tensor::registerTransformDialectExtension(registry);
vector::registerTransformDialectExtension(registry);		vector::registerTransformDialectExtension(registry);

// Register all external models.		// Register all external models.
arith::registerBufferizableOpInterfaceExternalModels(registry);		arith::registerBufferizableOpInterfaceExternalModels(registry);
bufferization::func_ext::registerBufferizableOpInterfaceExternalModels(		bufferization::func_ext::registerBufferizableOpInterfaceExternalModels(
registry);		registry);
linalg::registerBufferizableOpInterfaceExternalModels(registry);		linalg::registerBufferizableOpInterfaceExternalModels(registry);
linalg::registerTilingInterfaceExternalModels(registry);		linalg::registerTilingInterfaceExternalModels(registry);
[...]

mlir/lib/Dialect/Affine/Analysis/AffineStructures.cpp

	[...]
	/// variables (starting at 'offset') as affine maps of the remaining			/// variables (starting at 'offset') as affine maps of the remaining
	/// variables (dimensional and symbolic variables). Local variables are			/// variables (dimensional and symbolic variables). Local variables are
	/// themselves explicitly computed as affine functions of other variables in			/// themselves explicitly computed as affine functions of other variables in
	/// this process if needed.			/// this process if needed.
	void FlatAffineValueConstraints::getSliceBounds(			void FlatAffineValueConstraints::getSliceBounds(
	unsigned offset, unsigned num, MLIRContext *context,			unsigned offset, unsigned num, MLIRContext *context,
	SmallVectorImpl<AffineMap> lbMaps, SmallVectorImpl<AffineMap> ubMaps,			SmallVectorImpl<AffineMap> lbMaps, SmallVectorImpl<AffineMap> ubMaps,
	bool getClosedUB) {			bool getClosedUB) {
	assert(num < getNumDimVars() && "invalid range");			assert(offset + num <= getNumDimVars() && "invalid range");

	// Basic simplification.			// Basic simplification.
	normalizeConstraintsByGCD();			normalizeConstraintsByGCD();

	LLVM_DEBUG(llvm::dbgs() << "getSliceBounds for first " << num			LLVM_DEBUG(llvm::dbgs() << "getSliceBounds for first " << num
	<< " variables\n");			<< " variables\n");
	LLVM_DEBUG(dump());			LLVM_DEBUG(dump());

	[...]
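
The assertion change in `getSliceBounds` above replaces a check on `num` alone with a check on the whole range `[offset, offset + num)`. The following is a standalone sketch, not MLIR code; `oldRangeCheck` and `newRangeCheck` are hypothetical names that restate the two conditions as plain predicates so they can be compared on concrete values.

```cpp
#include <cassert>

// Old condition: compares only `num` against the number of dimension
// variables and ignores where the range starts.
bool oldRangeCheck(unsigned /*offset*/, unsigned num, unsigned numDimVars) {
  return num < numDimVars;
}

// New condition: the half-open range [offset, offset + num) must lie within
// the dimension variables.
bool newRangeCheck(unsigned offset, unsigned num, unsigned numDimVars) {
  return offset + num <= numDimVars;
}
```

With three dimension variables, `oldRangeCheck(0, 3, 3)` is false although the range covers exactly the dimension variables, and `oldRangeCheck(2, 2, 3)` is true although the range runs past them. `newRangeCheck` gives the opposite answer in both cases. A query for one variable on a system with a single dimension variable, which is presumably what `getSliceBounds(applyOpDim, 1, ...)` amounts to once the IVs are projected out, would trip the old condition.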

mlir/lib/Dialect/SCF/Utils/AffineCanonicalizationUtils.cpp

[...]	return rewriter.replaceOpWithNewOp<AffineApplyOp>(
op, simplified->getAffineMap(), simplified->getOperands());		op, simplified->getAffineMap(), simplified->getOperands());
}		}

LogicalResult scf::addLoopRangeConstraints(FlatAffineValueConstraints &cstr,		LogicalResult scf::addLoopRangeConstraints(FlatAffineValueConstraints &cstr,
Value iv, OpFoldResult lb,		Value iv, OpFoldResult lb,
OpFoldResult ub, OpFoldResult step) {		OpFoldResult ub, OpFoldResult step) {
Builder b(iv.getContext());		Builder b(iv.getContext());

// IntegerPolyhedron does not support semi-affine expressions.		// Note: IntegerPolyhedron does not support semi-affine expressions.
// Therefore, only constant step values are supported.		// Therefore, only constant step values are supported. In case of non-const
auto stepInt = getConstantIntValue(step);		// step sizes, the step is not taken into account.
if (!stepInt)		auto stepInt = step ? getConstantIntValue(step) : std::nullopt;
		nicolasvasilache: Can we use `step 1` instead here and mention that this is conservative but less precise in the context of this method?
		nicolasvasilache: If we use step 1, this is not an optional anymore and there is a bit of simplification in your code down the line.
return failure();

unsigned dimIv = cstr.appendDimVar(iv);		unsigned dimIv = cstr.appendDimVar(iv);
auto lbv = lb.dyn_cast<Value>();		auto lbv = lb.dyn_cast<Value>();
unsigned symLb =		unsigned symLb =
lbv ? cstr.appendSymbolVar(lbv) : cstr.appendSymbolVar(/*num=*/1);		lbv ? cstr.appendSymbolVar(lbv) : cstr.appendSymbolVar(/*num=*/1);
auto ubv = ub.dyn_cast<Value>();		auto ubv = ub.dyn_cast<Value>();
unsigned symUb =		unsigned symUb =
ubv ? cstr.appendSymbolVar(ubv) : cstr.appendSymbolVar(/*num=*/1);		ubv ? cstr.appendSymbolVar(ubv) : cstr.appendSymbolVar(/*num=*/1);
[...]	LogicalResult scf::addLoopRangeConstraints(FlatAffineValueConstraints &cstr,
// Lower bound: iv >= lb (equiv.: iv - lb >= 0)		// Lower bound: iv >= lb (equiv.: iv - lb >= 0)
SmallVector<int64_t> ineqLb(cstr.getNumCols(), 0);		SmallVector<int64_t> ineqLb(cstr.getNumCols(), 0);
ineqLb[dimIv] = 1;		ineqLb[dimIv] = 1;
ineqLb[symLb] = -1;		ineqLb[symLb] = -1;
cstr.addInequality(ineqLb);		cstr.addInequality(ineqLb);

// Upper bound		// Upper bound
AffineExpr ivUb;		AffineExpr ivUb;
if (lbInt && ubInt && (*lbInt + *stepInt >= *ubInt)) {		AffineExpr exprLb = lbInt
		? b.getAffineConstantExpr(*lbInt)
		: b.getAffineSymbolExpr(symLb - cstr.getNumDimVars());
		AffineExpr exprUb = ubInt
		? b.getAffineConstantExpr(*ubInt)
		: b.getAffineSymbolExpr(symUb - cstr.getNumDimVars());
		if (!stepInt) {
		// iv < ub
		ivUb = exprUb;
		} else if (lbInt && ubInt && (*lbInt + *stepInt >= *ubInt)) {
// The loop has at most one iteration.		// The loop has at most one iteration.
// iv < lb + 1		// iv < lb + 1
// TODO: Try to derive this constraint by simplifying the expression in		// TODO: Try to derive this constraint by simplifying the expression in
// the else-branch.		// the else-branch.
ivUb = b.getAffineSymbolExpr(symLb - cstr.getNumDimVars()) + 1;		ivUb = b.getAffineSymbolExpr(symLb - cstr.getNumDimVars()) + 1;
} else {		} else {
// The loop may have more than one iteration.		// The loop may have more than one iteration.
// iv < lb + step * ((ub - lb - 1) floorDiv step) + 1		// iv < lb + step * ((ub - lb - 1) floorDiv step) + 1
AffineExpr exprLb =
lbInt ? b.getAffineConstantExpr(*lbInt)
: b.getAffineSymbolExpr(symLb - cstr.getNumDimVars());
AffineExpr exprUb =
ubInt ? b.getAffineConstantExpr(*ubInt)
: b.getAffineSymbolExpr(symUb - cstr.getNumDimVars());
ivUb = exprLb + 1 + (*stepInt * ((exprUb - exprLb - 1).floorDiv(*stepInt)));		ivUb = exprLb + 1 + (*stepInt * ((exprUb - exprLb - 1).floorDiv(*stepInt)));
}		}
auto map = AffineMap::get(		auto map = AffineMap::get(
/*dimCount=*/cstr.getNumDimVars(),		/*dimCount=*/cstr.getNumDimVars(),
/*symbolCount=*/cstr.getNumSymbolVars(), /*result=*/ivUb);		/*symbolCount=*/cstr.getNumSymbolVars(), /*result=*/ivUb);

return cstr.addBound(IntegerPolyhedron::UB, dimIv, map);		return cstr.addBound(IntegerPolyhedron::UB, dimIv, map);
}		}
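
The upper-bound constraint added by `addLoopRangeConstraints` above, `iv < lb + step * ((ub - lb - 1) floorDiv step) + 1`, can be checked against a brute-force walk of the loop. This is a standalone sketch in plain integers rather than MLIR affine expressions; `floorDiv`, `exclusiveIvBound` and `lastVisitedIv` are hypothetical helper names, and a positive step is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Floor division for a positive divisor (C++ `/` truncates toward zero).
int64_t floorDiv(int64_t a, int64_t b) {
  assert(b > 0 && "expected positive divisor");
  int64_t q = a / b;
  return (a % b != 0 && a < 0) ? q - 1 : q;
}

// Exclusive upper bound on the IV of `for (iv = lb; iv < ub; iv += step)`.
// Without a constant step, fall back to the conservative bound `iv < ub`.
int64_t exclusiveIvBound(int64_t lb, int64_t ub, std::optional<int64_t> step) {
  if (!step)
    return ub;
  return lb + *step * floorDiv(ub - lb - 1, *step) + 1;
}

// Largest IV value the loop actually visits, found by brute force.
// Returns std::nullopt for a loop with zero iterations.
std::optional<int64_t> lastVisitedIv(int64_t lb, int64_t ub, int64_t step) {
  assert(step > 0 && "expected positive step");
  std::optional<int64_t> last;
  for (int64_t iv = lb; iv < ub; iv += step)
    last = iv;
  return last;
}
```

For `lb = 0`, `ub = 10`, `step = 4` the loop visits 0, 4 and 8, and the formula yields the bound 9, which is tighter than `ub`. Dropping the step gives the bound 10, which is still valid but less precise; that is the trade-off discussed in the inline comments.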

		static void unpackOptionalValues(ArrayRef<std::optional<Value>> source,
		SmallVector<Value> &target) {
		target =
		llvm::to_vector<4>(llvm::map_range(source, [](std::optional<Value> val) {
		return val.has_value() ? *val : Value();
		}));
		}

		/// Bound an identifier `pos` in a given FlatAffineValueConstraints with
		/// constraints drawn from an affine map. Before adding the constraint, the
		/// dimensions/symbols of the affine map are aligned with `constraints`.
		/// `operands` are the SSA Value operands used with the affine map.
		/// Note: This function adds a new symbol column to the `constraints` for each
		/// dimension/symbol that exists in the affine map but not in `constraints`.
		static LogicalResult alignAndAddBound(FlatAffineValueConstraints &constraints,
		IntegerPolyhedron::BoundType type,
		unsigned pos, AffineMap map,
		ValueRange operands) {
		SmallVector<Value> dims, syms, newSyms;
		unpackOptionalValues(constraints.getMaybeValues(VarKind::SetDim), dims);
		unpackOptionalValues(constraints.getMaybeValues(VarKind::Symbol), syms);

		AffineMap alignedMap =
		alignAffineMapWithValues(map, operands, dims, syms, &newSyms);
		for (unsigned i = syms.size(); i < newSyms.size(); ++i)
		constraints.appendSymbolVar(newSyms[i]);
		return constraints.addBound(type, pos, alignedMap);
		}

		FailureOr<Value> scf::buildInductionVarIndependentUpperBound(
		OpBuilder &b, Location loc, AffineApplyOp applyOp, bool *changed) {
		if (changed)
		*changed = false;

		// Build constraint set for the loop
		FlatAffineValueConstraints cstr;
		unsigned applyOpDim = cstr.appendDimVar();

		SmallVector<Value> allIvs;
		// Find all iteration variables among the operands and constrain them.
		for (Value operand : applyOp->getOperands()) {
		// Skip duplicate ivs.
		if (llvm::is_contained(allIvs, operand))
		continue;

		// If `operand` is an iteration variable: Find corresponding loop
		// bounds and step.
		Value iv = operand;
		OpFoldResult lb, ub, step;
		if (failed(matchForLikeLoop(operand, lb, ub, step)))
		continue;
		allIvs.push_back(iv);
		if (failed(addLoopRangeConstraints(cstr, iv, lb, ub,
		/*step=*/OpFoldResult())))
		return failure();
		}
		if (allIvs.empty())
		return success();

		// Add the affine map of the affine.apply op.
		if (failed(alignAndAddBound(
		cstr, presburger::IntegerPolyhedron::BoundType::EQ, applyOpDim,
		applyOp.getAffineMap(), applyOp->getOperands())))
		return failure();

		// Project out all iteration variables.
		for (Value iv : allIvs)
		cstr.projectOut(iv);

		// Compute an upper bound for the affine.apply op.
		SmallVector<AffineMap> opLb(1), opUb(1);
		cstr.getSliceBounds(applyOpDim, 1, applyOp->getContext(), &opLb, &opUb);
		if (opUb.empty() || !opUb[0])
		return failure();
		assert(opUb[0].getNumResults() == 1 && "expected single result");

		// Create new AffineApplyOp.
		if (changed)
		*changed = true;
		// Turn open bound into closed bound.
		AffineMap newMap = AffineMap::get(
		opUb[0].getNumDims(), opUb[0].getNumSymbols(), opUb[0].getResult(0) - 1);
		SmallVector<Value> newOperands;
		for (auto maybeValue : cstr.getMaybeValues().drop_front())
		newOperands.push_back(*maybeValue);
		mlir::canonicalizeMapAndOperands(&newMap, &newOperands);
		return b.create<AffineApplyOp>(loc, newMap, newOperands).getResult();
		}
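
`buildInductionVarIndependentUpperBound` above obtains its result by projecting the IVs out of a constraint system. For the simple case of an expression that is linear in a single IV, the same bound can be worked out by hand, as in the standalone sketch below. It does not use the MLIR constraint machinery; `LinearInIv` and `ivIndependentUpperBound` are hypothetical names, and a non-empty IV range is assumed.

```cpp
#include <cassert>
#include <cstdint>

// An expression `ivCoeff * iv + rest`, where `rest` does not depend on the IV.
struct LinearInIv {
  int64_t ivCoeff;
  int64_t rest;
};

// Closed (inclusive) upper bound of `expr` over all iv with lb <= iv < ub.
// As in the transform, the step is ignored, so every integer in [lb, ub) is
// treated as a possible IV value.
int64_t ivIndependentUpperBound(LinearInIv expr, int64_t lb, int64_t ub) {
  assert(lb < ub && "expected a non-empty IV range");
  // A positive coefficient is maximized at the largest IV, ub - 1; a
  // non-positive one at the smallest IV, lb.
  int64_t extremeIv = expr.ivCoeff > 0 ? ub - 1 : lb;
  return expr.ivCoeff * extremeIv + expr.rest;
}
```

For the padding `ub - iv` from the summary, with `lb = 2` and `ub = 10`, the coefficient of the IV is -1 and the bound is `10 - 2 = 8`, i.e. `ub - lb`, which agrees with the map `()[s0, s1] -> (-s0 + s1)` applied to `[%lb, %ub]`.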

/// Canonicalize min/max operations in the context of for loops with a known		/// Canonicalize min/max operations in the context of for loops with a known
/// range. Call `canonicalizeMinMaxOp` and add the following constraints to		/// range. Call `canonicalizeMinMaxOp` and add the following constraints to
/// the constraint system (along with the missing dimensions):		/// the constraint system (along with the missing dimensions):
///		///
/// * iv >= lb		/// * iv >= lb
/// * iv < lb + step * ((ub - lb - 1) floorDiv step) + 1		/// * iv < lb + step * ((ub - lb - 1) floorDiv step) + 1
///		///
/// Note: Due to limitations of IntegerPolyhedron, only constant step sizes		/// Note: Due to limitations of IntegerPolyhedron, only constant step sizes
[...]

mlir/lib/Dialect/Tensor/CMakeLists.txt

	add_subdirectory(IR)			add_subdirectory(IR)
	add_subdirectory(Transforms)			add_subdirectory(Transforms)
				add_subdirectory(TransformOps)
	add_subdirectory(Utils)			add_subdirectory(Utils)

mlir/lib/Dialect/Tensor/TransformOps/CMakeLists.txt

This file was added.

				add_mlir_dialect_library(MLIRTensorTransformOps
				TensorTransformOps.cpp

				ADDITIONAL_HEADER_DIRS
				${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/Tensor/TransformOps

				DEPENDS
				MLIRTensorTransformOpsIncGen

				LINK_LIBS PUBLIC
				MLIRAffineDialect
				MLIRIR
				MLIRPDLDialect
				MLIRTensorTransforms
				MLIRTransformDialect
				)

mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp

This file was added.

				//===- TensorTransformOps.cpp - Implementation of tensor transform ops ----===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/Dialect/Tensor/TransformOps/TensorTransformOps.h"

				#include "mlir/Dialect/Affine/IR/AffineOps.h"
				#include "mlir/Dialect/Tensor/Transforms/Transforms.h"
				#include "mlir/Dialect/Transform/IR/TransformDialect.h"
				#include "mlir/Dialect/Transform/IR/TransformInterfaces.h"
				#include "mlir/Dialect/Transform/IR/TransformUtils.h"

				using namespace mlir;

				//===----------------------------------------------------------------------===//
				// MakeLoopIndependentOp
				//===----------------------------------------------------------------------===//

				DiagnosedSilenceableFailure transform::MakeLoopIndependentOp::applyToOne(
				tensor::PadOp target, transform::ApplyToEachResultList &results,
				transform::TransformState &state) {
				IRRewriter rewriter(target->getContext());
				FailureOr<Value> replacement =
				tensor::buildInductionVarIndependentOp(rewriter, target);
				if (failed(replacement)) {
				DiagnosedSilenceableFailure diag =
				emitSilenceableError() << "could not make target op loop-independent";
				diag.attachNote(target->getLoc()) << "target op";
				return diag;
				}
				rewriter.replaceOp(target, *replacement);
				results.push_back(replacement->getDefiningOp());
				return DiagnosedSilenceableFailure::success();
				}

				//===----------------------------------------------------------------------===//
				// Transform op registration
				//===----------------------------------------------------------------------===//

				namespace {
				class TensorTransformDialectExtension
				: public transform::TransformDialectExtension<
				TensorTransformDialectExtension> {
				public:
				using Base::Base;

				void init() {
				declareGeneratedDialect<AffineDialect>();
				declareGeneratedDialect<tensor::TensorDialect>();

				registerTransformOps<
				#define GET_OP_LIST
				#include "mlir/Dialect/Tensor/TransformOps/TensorTransformOps.cpp.inc"
				>();
				}
				};
				} // namespace

				#define GET_OP_CLASSES
				#include "mlir/Dialect/Tensor/TransformOps/TensorTransformOps.cpp.inc"

				void mlir::tensor::registerTransformDialectExtension(
				DialectRegistry &registry) {
				dcaballe: nit: ub to var
				registry.addExtensions<TensorTransformDialectExtension>();
				}

mlir/lib/Dialect/Tensor/Transforms/CMakeLists.txt

	add_mlir_dialect_library(MLIRTensorTransforms			add_mlir_dialect_library(MLIRTensorTransforms
	BufferizableOpInterfaceImpl.cpp			BufferizableOpInterfaceImpl.cpp
	Bufferize.cpp			Bufferize.cpp
	EmptyOpPatterns.cpp			EmptyOpPatterns.cpp
	ExtractSliceFromReshapeUtils.cpp			ExtractSliceFromReshapeUtils.cpp
	FoldIntoPackAndUnpackPatterns.cpp			FoldIntoPackAndUnpackPatterns.cpp
				LoopHoisting.cpp
	MergeConsecutiveInsertExtractSlicePatterns.cpp			MergeConsecutiveInsertExtractSlicePatterns.cpp
	ReshapePatterns.cpp			ReshapePatterns.cpp
	SplitPaddingPatterns.cpp			SplitPaddingPatterns.cpp
	SwapExtractSliceWithProducerPatterns.cpp			SwapExtractSliceWithProducerPatterns.cpp

	ADDITIONAL_HEADER_DIRS			ADDITIONAL_HEADER_DIRS
	${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/Tensor/Transforms			${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/Tensor/Transforms

	DEPENDS			DEPENDS
	MLIRTensorTransformsIncGen			MLIRTensorTransformsIncGen

	LINK_LIBS PUBLIC			LINK_LIBS PUBLIC
	MLIRAffineDialect			MLIRAffineDialect
	MLIRAffineUtils			MLIRAffineUtils
	MLIRArithDialect			MLIRArithDialect
	MLIRBufferizationDialect			MLIRBufferizationDialect
	MLIRBufferizationTransforms			MLIRBufferizationTransforms
	MLIRIR			MLIRIR
	MLIRLinalgDialect			MLIRLinalgDialect
	MLIRMemRefDialect			MLIRMemRefDialect
	MLIRPass			MLIRPass
	MLIRSCFDialect			MLIRSCFDialect
				MLIRSCFUtils
	MLIRTensorDialect			MLIRTensorDialect
	MLIRTilingInterface			MLIRTilingInterface
	MLIRTransforms			MLIRTransforms
	)			)

mlir/lib/Dialect/Tensor/Transforms/LoopHoisting.cpp

This file was added.

				//===- LoopHoisting.cpp - Hoisting ops from loops -------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/Dialect/Tensor/Transforms/Transforms.h"

				#include "mlir/Dialect/Affine/IR/AffineOps.h"
				#include "mlir/Dialect/SCF/Utils/AffineCanonicalizationUtils.h"
				#include "mlir/Dialect/Tensor/IR/Tensor.h"
				#include "mlir/Dialect/Utils/StaticValueUtils.h"

				using namespace mlir;
				using namespace mlir::tensor;

				// Compute upper bounds for low/high padding such that they are independent of
				// any SCF loop induction variables.
				FailureOr<Value> tensor::buildInductionVarIndependentOp(OpBuilder &b,
				tensor::PadOp padOp) {
				nicolasvasilache: The filename is misleading here; this is not performing hoisting but privatization that will later enable hoisting. Can we move this functionality to a `LoopPrivatization.cpp` file? As a follow-up, we should integrate the usage of privatization into `HoistPadding.cpp`. We also now have `SubsetHoisting.cpp` for mechanical parts related to actual hoisting of loop-independent quantities that will also come in handy.
				springerm (author): How about `LoopTransforms.cpp`? This transformation does not do any privatization. (Assuming "privatization" = "making a private copy of a tensor for each loop iteration".)
				OpBuilder::InsertionGuard g(b);
				b.setInsertionPoint(padOp);
				Location loc = padOp.getLoc();

				// Non-constant padding not supported.
				Value constantPadding = padOp.getConstantPaddingValue();
				if (!constantPadding)
				return failure();

				// Try to compute upper bounds for the given values if they are affine.apply ops.
				// If they are not affine.apply ops or if the affine.apply ops do not directly
				// depend on loop IVs, simply store them in `result`.
				bool foundUb = false;
				auto computeUpperBounds = [&](ValueRange values, SmallVector<Value> &result) {
				for (Value v : values) {
				auto applyOp = v.getDefiningOp<AffineApplyOp>();
				if (!applyOp) {
				result.push_back(v);
				continue;
				}
				bool changed;
				auto ub = scf::buildInductionVarIndependentUpperBound(b, loc, applyOp,
				&changed);
				if (failed(ub) || !changed) {
				result.push_back(v);
				continue;
				}
				result.push_back(*ub);
				foundUb = true;
				}
				};

				// Compute new low/high padding.
				SmallVector<Value> newLow, newHigh;
				computeUpperBounds(padOp.getLow(), newLow);
				computeUpperBounds(padOp.getHigh(), newHigh);
				// Return failure if no upper bound was computed. (This function would be a
				// no-op.)
				if (!foundUb)
				return failure();
				SmallVector<OpFoldResult> newMixedLow =
				getMixedValues(padOp.getStaticLow(), newLow, b);
				SmallVector<OpFoldResult> newMixedHigh =
				getMixedValues(padOp.getStaticHigh(), newHigh, b);

				// Create a new tensor::PadOp.
				auto newPadOp = b.create<PadOp>(
				loc, padOp.getResultType(), padOp.getSource(), newMixedLow, newMixedHigh,
				constantPadding, padOp.getNofold(), /*attrs=*/ArrayRef<NamedAttribute>{});

				// Create a tensor::ExtractSliceOp.
				// Reify the result sizes of the old tensor::PadOp.
				ReifiedRankedShapedTypeDims reifiedSizes;
				ReifyRankedShapedTypeOpInterface reifyShapedTypeInterface =
				dyn_cast<ReifyRankedShapedTypeOpInterface>(padOp.getOperation());
				if (failed(reifyShapedTypeInterface.reifyResultShapes(b, reifiedSizes)))
				return failure();
				SmallVector<OpFoldResult> offsets, sizes, strides;
				for (int64_t i = 0; i < padOp.getResultType().getRank(); ++i) {
				// offset = ub(low_padding) - low_padding
				OpFoldResult prevLow = padOp.getMixedLowPad()[i];
				if (prevLow.is<Attribute>()) {
				offsets.push_back(b.getIndexAttr(0));
				} else {
				offsets.push_back(
				b.create<AffineApplyOp>(
				loc, b.getAffineDimExpr(0) - b.getAffineDimExpr(1),
				std::initializer_list<Value>{newMixedLow[i].get<Value>(),
				prevLow.get<Value>()})
				.getResult());
				}
				// size = reified result size
				if (!padOp.getResultType().isDynamicDim(i)) {
				sizes.push_back(b.getIndexAttr(padOp.getResultType().getDimSize(i)));
				} else {
				sizes.push_back(reifiedSizes[0][i]);
				}
				// stride = 1
				strides.push_back(b.getIndexAttr(1));
				}

				return b.create<ExtractSliceOp>(loc, newPadOp, offsets, sizes, strides)
				.getResult();
				}
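
The rewrite in `buildInductionVarIndependentOp` above relies on one identity: padding with larger, loop-independent amounts and then slicing at `offset = ub(low_padding) - low_padding` with the original result size gives the same values as the original pad. The sketch below checks that identity on 1-D `std::vector<float>` data. It is a plain C++ analogue rather than MLIR code; `pad1d`, `slice1d` and `padViaUpperBounds` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 1-D analogue of tensor.pad with a constant padding value.
std::vector<float> pad1d(const std::vector<float> &src, std::size_t low,
                         std::size_t high, float value) {
  std::vector<float> out(low, value);
  out.insert(out.end(), src.begin(), src.end());
  out.insert(out.end(), high, value);
  return out;
}

// 1-D analogue of tensor.extract_slice with stride 1.
std::vector<float> slice1d(const std::vector<float> &src, std::size_t offset,
                           std::size_t size) {
  assert(offset + size <= src.size() && "slice out of bounds");
  return std::vector<float>(src.begin() + offset, src.begin() + offset + size);
}

// Pads with the loop-independent bounds `lowUb >= low` and `highUb >= high`,
// then slices out the window that the original pad would have produced.
std::vector<float> padViaUpperBounds(const std::vector<float> &src,
                                     std::size_t low, std::size_t high,
                                     std::size_t lowUb, std::size_t highUb,
                                     float value) {
  assert(low <= lowUb && high <= highUb && "bounds must dominate the padding");
  std::vector<float> padded = pad1d(src, lowUb, highUb, value);
  std::size_t offset = lowUb - low;           // ub(low_padding) - low_padding
  std::size_t size = low + src.size() + high; // original result size
  return slice1d(padded, offset, size);
}
```

Only `offset` and `size` still depend on the per-iteration padding, so in the loop setting the `pad1d` call is the part that no longer depends on the IV and can be hoisted.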

mlir/test/Dialect/Tensor/transform-op-make-loop-independent.mlir

This file was added.

				// RUN: mlir-opt %s -allow-unregistered-dialect \
				// RUN: -test-transform-dialect-interpreter -canonicalize \
				// RUN: -split-input-file | FileCheck %s

				// This is a test case where "high" padding depends on the IV.

				// CHECK: #[[$map:.*]] = affine_map<()[s0, s1] -> (-s0 + s1)>
				// CHECK: #[[$map1:.*]] = affine_map<(d0)[s0, s1] -> (-d0 + s0 + s1 + 5)>
				// CHECK-LABEL: func @make_pad_loop_independent_1(
				// CHECK-SAME: %[[lb:.*]]: index, %[[ub:.*]]: index, %[[step:.*]]: index,
				// CHECK-SAME: %[[t:.*]]: tensor<?xf32>
				func.func @make_pad_loop_independent_1(%lb: index, %ub: index, %step: index,
				%t: tensor<?xf32>, %f: f32) {
				// CHECK: scf.for %[[iv:.*]] = %[[lb]] to %[[ub]]
				scf.for %i = %lb to %ub step %step {
				// CHECK: %[[high:.*]] = affine.apply #[[$map]]()[%[[lb]], %[[ub]]]
				// CHECK: %[[padded:.*]] = tensor.pad %[[t]] low[5] high[%[[high]]]
				// CHECK: %[[dim:.*]] = tensor.dim %[[t]]
				// CHECK: %[[size:.*]] = affine.apply #[[$map1]](%[[iv]])[%[[ub]], %[[dim]]]
				// CHECK: %[[replacement:.*]] = tensor.extract_slice %[[padded]][0] [%[[size]]] [1]
				%high = affine.apply affine_map<(d0)[s0] -> (s0 - d0)> (%i)[%ub]
				%p = tensor.pad %t low[5] high[%high] {
				^bb0(%arg1: index):
				tensor.yield %f : f32
				} : tensor<?xf32> to tensor<?xf32>
				// CHECK: "dummy.some_use"(%[[replacement]])
				"dummy.some_use"(%p) : (tensor<?xf32>) -> ()
				}
				return
				}

				transform.sequence failures(propagate) {
				^bb1(%arg1: !pdl.operation):
				%0 = transform.structured.match ops{["tensor.pad"]} in %arg1 : (!pdl.operation) -> !pdl.operation
				%1 = transform.tensor.make_loop_independent %0
				}

				// -----

				// This is a test case where "low" padding depends on the IV.

				// CHECK: #[[$map:.*]] = affine_map<()[s0, s1] -> (-s0 + s1)>
				// CHECK: #[[$map1:.*]] = affine_map<(d0)[s0, s1] -> (-d0 + s0 + s1 + 5)>
				// CHECK: #[[$map2:.*]] = affine_map<(d0)[s0] -> (d0 - s0)>
				// CHECK-LABEL: func @make_pad_loop_independent_1(
				// CHECK-SAME: %[[lb:.*]]: index, %[[ub:.*]]: index, %[[step:.*]]: index,
				// CHECK-SAME: %[[t:.*]]: tensor<?xf32>
				func.func @make_pad_loop_independent_1(%lb: index, %ub: index, %step: index,
				%t: tensor<?xf32>, %f: f32) {
				// CHECK: scf.for %[[iv:.*]] = %[[lb]] to %[[ub]]
				scf.for %i = %lb to %ub step %step {
				// CHECK: %[[low:.*]] = affine.apply #[[$map]]()[%[[lb]], %[[ub]]]
				// CHECK: %[[padded:.*]] = tensor.pad %[[t]] low[%[[low]]] high[5]
				// CHECK: %[[dim:.*]] = tensor.dim %[[t]]
				// CHECK: %[[size:.*]] = affine.apply #[[$map1]](%[[iv]])[%[[ub]], %[[dim]]]
				// CHECK: %[[offset:.*]] = affine.apply #[[$map2]](%[[iv]])[%[[lb]]]
				// CHECK: %[[replacement:.*]] = tensor.extract_slice %[[padded]][%[[offset]]] [%[[size]]] [1]
				%low = affine.apply affine_map<(d0)[s0] -> (s0 - d0)> (%i)[%ub]
				%p = tensor.pad %t low[%low] high[5] {
				^bb0(%arg1: index):
				tensor.yield %f : f32
				} : tensor<?xf32> to tensor<?xf32>
				// CHECK: "dummy.some_use"(%[[replacement]])
				"dummy.some_use"(%p) : (tensor<?xf32>) -> ()
				}
				return
				}

				transform.sequence failures(propagate) {
				^bb1(%arg1: !pdl.operation):
				%0 = transform.structured.match ops{["tensor.pad"]} in %arg1 : (!pdl.operation) -> !pdl.operation
				%1 = transform.tensor.make_loop_independent %0
				}

utils/bazel/llvm-project-overlay/mlir/BUILD.bazel

[...]	deps = [
":BufferizationTransforms",		":BufferizationTransforms",
":DialectUtils",		":DialectUtils",
":FuncDialect",		":FuncDialect",
":IR",		":IR",
":LinalgDialect",		":LinalgDialect",
":MemRefDialect",		":MemRefDialect",
":Pass",		":Pass",
":SCFDialect",		":SCFDialect",
		":SCFUtils",
":TensorDialect",		":TensorDialect",
":TensorPassIncGen",		":TensorPassIncGen",
":TilingInterface",		":TilingInterface",
":Transforms",		":Transforms",
"//llvm:Support",		"//llvm:Support",
],		],
)		)

		td_library(
		name = "TensorTransformOpsTdFiles",
		srcs = [
		"include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td",
		],
		includes = ["include"],
		deps = [
		":PDLDialect",
		":TransformDialectTdFiles",
		],
		)

		gentbl_cc_library(
		name = "TensorTransformOpsIncGen",
		strip_include_prefix = "include",
		tbl_outs = [
		(
		["-gen-op-decls"],
		"include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.h.inc",
		),
		(
		["-gen-op-defs"],
		"include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.cpp.inc",
		),
		],
		tblgen = ":mlir-tblgen",
		td_file = "include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td",
		deps = [
		":TensorTransformOpsTdFiles",
		],
		)

		cc_library(
		name = "TensorTransformOps",
		srcs = glob(["lib/Dialect/Tensor/TransformOps/*.cpp"]),
		hdrs = glob(["include/mlir/Dialect/Tensor/TransformOps/*.h"]),
		includes = ["include"],
		deps = [
		":AffineDialect",
		":IR",
		":PDLDialect",
		":TensorDialect",
		":TensorTransformOpsIncGen",
		":TensorTransforms",
		":TransformDialect",
		"//llvm:Support",
		],
		)

cc_library(		cc_library(
name = "Rewrite",		name = "Rewrite",
srcs = glob([		srcs = glob([
"lib/Rewrite/*.cpp",		"lib/Rewrite/*.cpp",
"lib/Rewrite/*.h",		"lib/Rewrite/*.h",
]),		]),
hdrs = glob(["include/mlir/Rewrite/*.h"]),		hdrs = glob(["include/mlir/Rewrite/*.h"]),
includes = ["include"],		includes = ["include"],
[...]	deps = [
":ShapeTransforms",		":ShapeTransforms",
":ShapeTransformsPassIncGen",		":ShapeTransformsPassIncGen",
":SparseTensorDialect",		":SparseTensorDialect",
":SparseTensorPipelines",		":SparseTensorPipelines",
":SparseTensorTransforms",		":SparseTensorTransforms",
":TensorDialect",		":TensorDialect",
":TensorInferTypeOpInterfaceImpl",		":TensorInferTypeOpInterfaceImpl",
":TensorTilingInterfaceImpl",		":TensorTilingInterfaceImpl",
		":TensorTransformOps",
":TensorTransforms",		":TensorTransforms",
":TosaDialect",		":TosaDialect",
":TosaToLinalg",		":TosaToLinalg",
":TransformDialect",		":TransformDialect",
":TransformDialectTransforms",		":TransformDialectTransforms",
":Transforms",		":Transforms",
":TransformsPassIncGen",		":TransformsPassIncGen",
":VectorDialect",		":VectorDialect",
[...]

Revision Contents

Diff 497222