This is an archive of the discontinued LLVM Phabricator instance.

Differential D90071

[mlir] Add BufferResultsToOutParams pass.
ClosedPublic

Authored by silvas on Oct 23 2020, 12:53 PM.

Details

Reviewers

herhut
dfki-mako

Commits

rGb8665742462d: [mlir] Add BufferResultsToOutParams pass.

Summary

This pass allows removing getResultConversionKind from
BufferizeTypeConverter. This pass replaces the AppendToArgumentsList
functionality. As far as I could tell, the only use of this functionality
is to perform the transformation that is implemented in this pass.

Future patches will remove the getResultConversionKind machinery from
BufferizeTypeConverter, but sending this patch for individual review for
clarity.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

silvas created this revision. Oct 23 2020, 12:53 PM

Herald added a project: Restricted Project. Oct 23 2020, 12:53 PM

Herald added subscribers: rdzhabarov, tatianashp, msifontes and 15 others.

silvas requested review of this revision. Oct 23 2020, 12:53 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache. Oct 23 2020, 12:53 PM

Harbormaster completed remote builds in B76237: Diff 300387. Oct 23 2020, 1:22 PM

dfki-mako added inline comments. Oct 27 2020, 4:41 PM

mlir/include/mlir/Transforms/Passes.td
220	Just for my understanding: This pass basically simplifies the use of `populateWithBufferizeOpConversionPatterns` via a custom Rewriter instance for an input dialect, right? Do you plan to remove the conversion patterns in favor of this pass in the future?

My question is here: Why should this pass be executed after the entire bufferization process? Isn't it useful to apply this pass before applying the `BufferDeallocation` pass, which inserts all the necessary deallocations? Or do you refer to the first part of the "overall" bufferization process, which ports the input program into the "buffer world" (without taking deallocations into account)?

Regarding the Linalg dialect dependency: We are going to discuss possible solutions to remove the Std- and Linalg dialect dependencies from the `BufferDeallocation` pass in the public Discourse-Forums these days. I suspect that the result of this discussion can also be applied to this pass (in the future).

silvas added inline comments. Oct 27 2020, 5:00 PM

mlir/include/mlir/Transforms/Passes.td
220	This pass eliminates the need for the AppendToArgumentList conversion kind, which allows removing that complexity from BufferizeTypeConverter. (and together with removing the tuple stuff, BufferizeTypeConverter can just be a regular type converter; everything gets quite a bit simpler).

Regarding BufferDeallocation, I think we agree but just got confused by terminology. "bufferization" (as defined in Bufferize.h and being used here) is about applying conversions. So this would be applied right after applying the conversions, but before any buffer optimizations.

I guess this approach introduces a limitation with respect to applicability to arbitrary dialects. We are now able to match ReturnOp and CallOps only while the previous version supported a generic customization to arbitrary dialects, right?
I would recommend that we should try to match CallOpInterface and ReturnLike implementations to make this code more generic in favor of standard CallOp and ReturnOp-based matchers from the Std dialect.

mlir/include/mlir/Transforms/Passes.td
220	> AppendToArgumentList conversion kind, which allows removing that complexity from BufferizeTypeConverter

Sounds reasonable, since it simplifies the type converter to become "just a type converter" (as you said).

> Regarding BufferDeallocation, I think we agree but just got confused by terminology. "bufferization"

Sounds pretty good to me, too. In this context, this pass should be the last one to be executed (as you mentioned) to "fix" all return ops and call sites (at least for statically shaped memrefs).
mlir/lib/Transforms/BufferResultsToOutParams.cpp
38	`functionType` (above) vs `func.getType()` (2 uses)
44	I am not sure about the braces here since it is a single statement. However, they improve readability in this case.
53	Suggestion: maybe we could change the function to receive an additional vector by reference and push the results into this vector instead of returning an empty vector in this case.
59	nit: maybe a tiny comment would be nice in this case something like: `// Updates all ReturnOps in the scope of the given FuncOp by either keeping them as return values or copying the associated buffer contents into the given out params.`
71	Suggestion: maybe change `t` to `outParamEntry`. Regarding the braces: same comment as above.
80	nit: maybe another tiny comment would be nice in this case, as well something like: // Updates all CallOps in the scope of the given ModuleOp by allocating temporary buffers for newly introduced out params
94	Maybe I am missing the point but can't we just check this in the first loop to avoid this additional cast?
112	nit: maybe change `t` to something more meaningful.
127	Suggestion: If we change the signature of the `updateFuncOp` function, we can avoid this variable. (see above)
mlir/test/Transforms/buffer-results-to-out-params.mlir
98	As you are updating the tests anyway, it would be really nice to break after 80 chars.

dfki-mako requested changes to this revision. Oct 28 2020, 5:42 AM

This revision now requires changes to proceed. Oct 28 2020, 5:42 AM

Address @dfki-mako comments.

Thanks Marcel for the very thorough review!

In D90071#2358875, @dfki-mako wrote:

I guess this approach introduces a limitation with respect to applicability to arbitrary dialects. We are now able to match ReturnOp and CallOps only while the previous version supported a generic customization to arbitrary dialects, right?
I would recommend that we should try to match CallOpInterface and ReturnLike implementations to make this code more generic in favor of standard CallOp and ReturnOp-based matchers from the Std dialect.

Good point! I really would like to write this pass as being generic as well. I tried to write this pass against CallOpInterface and ReturnLike and FunctionLike, but they currently don't have enough functionality for this transformation, in particular the parts where we need to mutate/recreate the ops.

For example, FunctionLike doesn't guarantee that the op has a FunctionType, so all the code here where we modify the function type can't be written generically (see this thread for a similar issue:
https://llvm.discourse.group/t/moving-erasearguments-from-funcop-to-functionlike/2016/14). In particular, even after the patches in that thread, we cannot append new result types. (also, I don't think that traits like FunctionLike / ReturnLike.

Also, note that FunctionLike is a trait, not an interface, so it cannot be accessed dynamically. One must already have an object of the concrete type to use it.
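
As a loose C++ analogy for the trait-versus-interface point above (a minimal sketch, not MLIR's actual machinery), a CRTP mixin can stand in for a trait and an abstract base class for an interface. All names here (`Op`, `CallableInterface`, `FunctionLikeTrait`, `MyFuncOp`, `MyCallOp`, `describe`, `countArgs`) are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a generic operation handle; not MLIR's Operation.
struct Op {
  virtual ~Op() = default;
};

// Interface style: virtual methods that can be discovered from a generic Op
// at run time.
struct CallableInterface {
  virtual ~CallableInterface() = default;
  virtual std::string calleeName() const = 0;
};

// Trait style: a CRTP mixin. Its helpers are only reachable through the
// concrete type that mixes it in.
template <typename ConcreteOp>
struct FunctionLikeTrait {
  unsigned numArguments() const {
    const auto *self = static_cast<const ConcreteOp *>(this);
    return static_cast<unsigned>(self->argTypes.size());
  }
};

struct MyFuncOp : Op, FunctionLikeTrait<MyFuncOp> {
  std::vector<std::string> argTypes;
};

struct MyCallOp : Op, CallableInterface {
  std::string callee;
  std::string calleeName() const override { return callee; }
};

// Works for any op implementing the interface, without naming its type.
std::string describe(const Op &op) {
  if (const auto *iface = dynamic_cast<const CallableInterface *>(&op))
    return "calls " + iface->calleeName();
  return "not callable";
}

// The trait helper is only usable once the concrete type is spelled out.
unsigned countArgs(const Op &op) {
  if (const auto *func = dynamic_cast<const MyFuncOp *>(&op))
    return func->numArguments();
  return 0;
}
```

In this sketch `describe` never mentions `MyCallOp`, whereas `countArgs` has to name `MyFuncOp` explicitly, which is the kind of limitation that makes a fully generic rewrite hard to express.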

mlir/lib/Transforms/BufferResultsToOutParams.cpp
94	I conceptually associate this error with the creation of the AllocOp -- we don't care about this limitation elsewhere in the entire pass, and one can imagine ways to avoid this, which would involve changing this code, rather than the code above (which merely determines which results we need to convert to an out param -- not how we do it). For example, we could require annotating functions with an attribute that somehow describes how to compute their out param sizes e.g. `func @callee(...) -> (...) attributes {shape_transfer_function = @compute_callee_out_param_shapes}`, or do so via some analysis, etc. In that circumstance, the static shape case is but one of multiple ways that we would know how to allocate the results.

Also this pass only runs once in the pipeline, so performance isn't that critical. For example, we are walking every op in the module just to find call ops, and we effectively do the same for return. If we really cared about performance, we would only do a single walk of the whole module, but that would make the code a lot more confusing.
112	Same comment as above, I think that succinctness here makes the code clearer, as the minimum meaningful name is just too long and distracting. It's too bad that C++ doesn't let us do something like this:

    for (auto [replaceMe, newValue] : llvm::zip(replaceWithNewCallResults, newCall.getResults()))
      replaceMe.replaceAllUsesWith(newValue)

I think that C++17 has a feature for this, but I don't know if we allow using it yet: https://en.cppreference.com/w/cpp/language/structured_binding
mlir/test/Transforms/buffer-results-to-out-params.mlir
98	I typically don't break in test cases so that it reflects the typical printed IR structure. The theory is that our brains are mostly tuned to parse non-line-broken IR since that is what we usually read (unlike in C++), so line-breaking it tends to be more confusing since there aren't clear style preferences and mental heuristics for parsing it.

Also, to answer your specific question, the previous version did not support a generic customization to arbitrary dialects. The FuncOp / CallOp patterns in the previous code are hard-coded (which as I described, cannot be removed with the current set of traits/interfaces). Only BufferizeReturnOpConverter converter is templatized, but without being able to convert generic function-like ops, that doesn't seem to be very useful (since converting the function-like ops is the hardest part).
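
For reference, the C++17 structured-binding loop mentioned in the inline reply on line 112 can be sketched in standalone form. This is only an illustration under stated assumptions: `zipPairs` is a hypothetical, simplified stand-in for `llvm::zip` (it copies elements into pairs), and values are modeled as plain strings rather than MLIR values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pairs up elements of two vectors, stopping at the
// shorter one. A simplified stand-in for llvm::zip, not its implementation.
template <typename A, typename B>
std::vector<std::pair<A, B>> zipPairs(const std::vector<A> &as,
                                      const std::vector<B> &bs) {
  std::vector<std::pair<A, B>> result;
  const std::size_t n = as.size() < bs.size() ? as.size() : bs.size();
  for (std::size_t i = 0; i < n; ++i)
    result.emplace_back(as[i], bs[i]);
  return result;
}

// Builds "old -> new" replacement records. The structured binding gives both
// halves of each pair a name, instead of std::get<0>(t) / std::get<1>(t).
std::vector<std::string>
planReplacements(const std::vector<std::string> &oldValues,
                 const std::vector<std::string> &newValues) {
  std::vector<std::string> plan;
  for (const auto &[replaceMe, newValue] : zipPairs(oldValues, newValues))
    plan.push_back(replaceMe + " -> " + newValue);
  return plan;
}
```

The sketch needs a C++17 compiler; as the thread notes, that was not an option for the code base at the time of this review.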

Harbormaster completed remote builds in B76819: Diff 301444. Oct 28 2020, 5:15 PM

> For example, FunctionLike doesn't guarantee that the op has a FunctionType, so all the code here where we modify the function type can't be written generically (see this thread for a similar issue:

Unfortunately, we cannot use this trait for the reasons you mentioned.

> Also, to answer your specific question, the previous version did not support a generic customization to arbitrary dialects.

Yeah, you are absolutely right. Consequently, we will not lose any genericity by merging your CL, which is really great +1
Future extensions in the MLIR world might enable us to make this pass more generic with respect to FuncOps.

dfki-mako added inline comments. Oct 29 2020, 3:55 PM

mlir/lib/Transforms/BufferResultsToOutParams.cpp
94	Sounds reasonable. I guess we can safely keep it this way +1.
112	Yes, unfortunately that is also true. As far as I know, we shouldn't use this extension at the moment.
mlir/test/Transforms/buffer-results-to-out-params.mlir
98	+1

Awesome work +1. I guess this should be ready to go after replacing some autos with their underlying types.

mlir/lib/Transforms/BufferResultsToOutParams.cpp
66	nit: `auto` -> `OpOperand`
88	nit: `auto` -> `OpResult`
96	nit: `auto` -> `Value`

address comments

Harbormaster completed remote builds in B77003: Diff 301792. Oct 29 2020, 5:28 PM

dfki-mako accepted this revision. Oct 29 2020, 5:38 PM

This revision is now accepted and ready to land. Oct 29 2020, 5:38 PM

This revision was landed with ongoing or failed builds. Oct 30 2020, 2:08 PM

Closed by commit rGb8665742462d: [mlir] Add BufferResultsToOutParams pass. (authored by silvas).

This revision was automatically updated to reflect the committed changes.

silvas added a commit: rGb8665742462d: [mlir] Add BufferResultsToOutParams pass..

silvas mentioned this in D90778: [mlir] Remove AppendToArgumentsList functionality from BufferizeTypeConverter. Nov 4 2020, 11:00 AM

silvas mentioned this in rGf7bc56826616: [mlir] Remove AppendToArgumentsList functionality from BufferizeTypeConverter. Nov 5 2020, 11:22 AM

Revision Contents

Path	Size
mlir/include/mlir/Transforms/Passes.h	3 lines
mlir/include/mlir/Transforms/Passes.td	25 lines
mlir/lib/Transforms/BufferResultsToOutParams.cpp	143 lines
mlir/lib/Transforms/CMakeLists.txt	1 line
mlir/test/Transforms/buffer-results-to-out-params.mlir	113 lines

Diff 302013

mlir/include/mlir/Transforms/Passes.h

[38 lines not shown]
  /// Creates a pass that moves allocations upwards out of loops. This avoids
  /// reallocations inside of loops.
  std::unique_ptr<Pass> createBufferLoopHoistingPass();

  /// Creates a pass that promotes heap-based allocations to stack-based ones.
  std::unique_ptr<Pass>
  createPromoteBuffersToStackPass(unsigned maxAllocSizeInBytes = 1024);

+ /// Creates a pass that converts memref function results to out-params.
+ std::unique_ptr<Pass> createBufferResultsToOutParamsPass();
+
  /// Creates an instance of the Canonicalizer pass.
  std::unique_ptr<Pass> createCanonicalizerPass();

  /// Create a pass that removes unnecessary Copy operations.
  std::unique_ptr<Pass> createCopyRemovalPass();

  /// Creates a pass to perform common sub expression elimination.
  std::unique_ptr<Pass> createCSEPass();
[68 lines not shown]

mlir/include/mlir/Transforms/Passes.td

[211 lines not shown; enclosing definition: def PromoteBuffersToStack : FunctionPass<"promote-buffers-to-stack"> {]
    let constructor = "mlir::createPromoteBuffersToStackPass()";
    let options = [
      Option<"maxAllocSizeInBytes", "max-alloc-size-in-bytes", "unsigned",
             /*default=*/"1024",
             "Define the maximum size in bytes to promote allocations to stack.">,
    ];
  }

+ def BufferResultsToOutParams : Pass<"buffer-results-to-out-params", "ModuleOp"> {
+   let summary = "Converts memref-typed function results to out-params";
+   let description = [{
+     Some calling conventions prefer to pass output memrefs as "out params". The
+     conversion to this calling convention must be done as an atomic
+     transformation of the entire program (hence this is a module pass).
+
+     For example, if a call is rewritten, the callee needs to be rewritten
+     otherwise the IR will end up invalid. Thus, this transformation
+     require an atomic change to the entire program (e.g. the whole module).
+
+     This pass is expected to run immediately after bufferization is finished.
+     At that point, tensor-typed results will have been converted to memref-typed
+     results, and can be consistently converted to out params.
+
+     All memref-typed results are appended to the function argument list.
+
+     The main issue with this pass (and the out-param calling convention) is that
+     buffers for results need to be allocated in the caller. This currently only
+     works for static shaped memrefs.
+   }];
+   let constructor = "mlir::createBufferResultsToOutParamsPass()";
+   let dependentDialects = ["linalg::LinalgDialect"];
+ }
+
  def Canonicalizer : Pass<"canonicalize"> {
    let summary = "Canonicalize operations";
    let description = [{
      This pass performs various types of canonicalizations over a set of
      operations. See [Operation Canonicalization](Canonicalization.md) for more
      details.
    }];
    let constructor = "mlir::createCanonicalizerPass()";
[349 lines not shown]
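
The signature change described in the pass documentation ("All memref-typed results are appended to the function argument list") can be modeled outside of MLIR. The following is a minimal sketch under stated assumptions: types are plain strings, a type counts as a memref if its spelling starts with `memref<`, and `Signature`, `isMemRef`, and `resultsToOutParams` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a function signature; types are plain strings.
struct Signature {
  std::vector<std::string> inputs;
  std::vector<std::string> results;
};

// Assumption for this sketch: a type is a memref if it starts with "memref<".
bool isMemRef(const std::string &type) {
  return type.rfind("memref<", 0) == 0;
}

// Moves every memref-typed result to the end of the input list, preserving
// relative order. `movedResultIndices` receives the original result index of
// each appended argument, so that per-result attributes could be transferred
// to input index (old input count + i).
Signature resultsToOutParams(const Signature &sig,
                             std::vector<std::size_t> &movedResultIndices) {
  Signature out;
  out.inputs = sig.inputs;
  for (std::size_t i = 0; i < sig.results.size(); ++i) {
    if (isMemRef(sig.results[i])) {
      movedResultIndices.push_back(i);
      out.inputs.push_back(sig.results[i]);
    } else {
      out.results.push_back(sig.results[i]);
    }
  }
  return out;
}
```

For a signature `() -> (i1, memref<f32>, i32)` this yields `(memref<f32>) -> (i1, i32)`, which matches the shape of the `@non_memref_types` test case in this revision.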

mlir/lib/Transforms/BufferResultsToOutParams.cpp

This file was added.

//===- BufferResultsToOutParams.cpp - Calling convention conversion -------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "PassDetail.h"
#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/Dialect/StandardOps/IR/Ops.h"
#include "mlir/IR/Operation.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Transforms/Passes.h"

using namespace mlir;

// Updates the func op and entry block.
//
// Any args appended to the entry block are added to `appendedEntryArgs`.
static void updateFuncOp(FuncOp func,
                         SmallVectorImpl<BlockArgument> &appendedEntryArgs) {
  auto functionType = func.getType();

  // Collect information about the results will become appended arguments.
  SmallVector<Type, 6> erasedResultTypes;
  SmallVector<unsigned, 6> erasedResultIndices;
  for (auto resultType : llvm::enumerate(functionType.getResults())) {
    if (resultType.value().isa<BaseMemRefType>()) {
      erasedResultIndices.push_back(resultType.index());
      erasedResultTypes.push_back(resultType.value());
    }
  }

  // Add the new arguments to the function type.
  auto newArgTypes = llvm::to_vector<6>(
      llvm::concat<const Type>(functionType.getInputs(), erasedResultTypes));
  auto newFunctionType = FunctionType::get(
      newArgTypes, functionType.getResults(), func.getContext());
  func.setType(newFunctionType);

  // Transfer the result attributes to arg attributes.
  for (int i = 0, e = erasedResultTypes.size(); i < e; i++)
    func.setArgAttrs(functionType.getNumInputs() + i,
                     func.getResultAttrs(erasedResultIndices[i]));

  // Erase the results.
  func.eraseResults(erasedResultIndices);

  // Add the new arguments to the entry block if the function is not external.
  if (func.isExternal())
    return;
  auto newArgs = func.front().addArguments(erasedResultTypes);
  appendedEntryArgs.append(newArgs.begin(), newArgs.end());
}

// Updates all ReturnOps in the scope of the given FuncOp by either keeping them
// as return values or copying the associated buffer contents into the given
// out-params.
static void updateReturnOps(FuncOp func,
                            ArrayRef<BlockArgument> appendedEntryArgs) {
  func.walk([&](ReturnOp op) {
    SmallVector<Value, 6> copyIntoOutParams;
    SmallVector<Value, 6> keepAsReturnOperands;
    for (Value operand : op.getOperands()) {
      if (operand.getType().isa<BaseMemRefType>())
        copyIntoOutParams.push_back(operand);
      else
        keepAsReturnOperands.push_back(operand);
    }
    OpBuilder builder(op);
    for (auto t : llvm::zip(copyIntoOutParams, appendedEntryArgs))
      builder.create<linalg::CopyOp>(op.getLoc(), std::get<0>(t),
                                     std::get<1>(t));
    builder.create<ReturnOp>(op.getLoc(), keepAsReturnOperands);
    op.erase();
  });
}

// Updates all CallOps in the scope of the given ModuleOp by allocating
// temporary buffers for newly introduced out params.
static LogicalResult updateCalls(ModuleOp module) {
  bool didFail = false;
  module.walk([&](CallOp op) {
    SmallVector<Value, 6> replaceWithNewCallResults;
    SmallVector<Value, 6> replaceWithOutParams;
    for (OpResult result : op.getResults()) {
      if (result.getType().isa<BaseMemRefType>())
        replaceWithOutParams.push_back(result);
      else
        replaceWithNewCallResults.push_back(result);
    }
    SmallVector<Value, 6> outParams;
    OpBuilder builder(op);
    for (Value memref : replaceWithOutParams) {
      if (!memref.getType().cast<BaseMemRefType>().hasStaticShape()) {
        op.emitError()
            << "cannot create out param for dynamically shaped result";
        didFail = true;
        return;
      }
      Value outParam = builder.create<AllocOp>(
          op.getLoc(), memref.getType().cast<MemRefType>());
      memref.replaceAllUsesWith(outParam);
      outParams.push_back(outParam);
    }

    auto newOperands = llvm::to_vector<6>(op.getOperands());
    newOperands.append(outParams.begin(), outParams.end());
    auto newResultTypes = llvm::to_vector<6>(llvm::map_range(
        replaceWithNewCallResults, [](Value v) { return v.getType(); }));
    auto newCall = builder.create<CallOp>(op.getLoc(), op.calleeAttr(),
                                          newResultTypes, newOperands);
    for (auto t : llvm::zip(replaceWithNewCallResults, newCall.getResults()))
      std::get<0>(t).replaceAllUsesWith(std::get<1>(t));
    op.erase();
  });

  return failure(didFail);
}

namespace {
struct BufferResultsToOutParamsPass
    : BufferResultsToOutParamsBase<BufferResultsToOutParamsPass> {
  void runOnOperation() override {
    ModuleOp module = getOperation();

    for (auto func : module.getOps<FuncOp>()) {
      SmallVector<BlockArgument, 6> appendedEntryArgs;
      updateFuncOp(func, appendedEntryArgs);
      if (func.isExternal())
        continue;
      updateReturnOps(func, appendedEntryArgs);
    }
    if (failed(updateCalls(module)))
      return signalPassFailure();
  }
};
} // end anonymous namespace

std::unique_ptr<Pass> mlir::createBufferResultsToOutParamsPass() {
  return std::make_unique<BufferResultsToOutParamsPass>();
}
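
The call-site logic in `updateCalls` can be approximated without MLIR to show the two-step structure discussed in the review: results are first partitioned, and the static-shape restriction is only checked at the point where a buffer would be allocated. This is a minimal sketch under stated assumptions: `ResultType`, `CallPlan`, and `planCall` are hypothetical names, and a dimension of `-1` is used here to stand for a dynamic size (`?`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type: `isMemRef` marks buffers; a dimension of -1
// stands for a dynamic size in this sketch.
struct ResultType {
  bool isMemRef;
  std::vector<std::int64_t> shape;
};

bool hasStaticShape(const ResultType &type) {
  for (std::int64_t dim : type.shape)
    if (dim < 0)
      return false;
  return true;
}

// What the caller has to do for one call.
struct CallPlan {
  std::vector<std::size_t> allocatedOutParams; // results turned into buffers
  std::vector<std::size_t> keptResults;        // results still returned
};

// Returns std::nullopt when a memref result has a dynamic shape, mirroring
// the "cannot create out param for dynamically shaped result" diagnostic.
std::optional<CallPlan> planCall(const std::vector<ResultType> &results) {
  // Step 1: decide which results become out params (no shape check yet).
  std::vector<std::size_t> toOutParams;
  CallPlan plan;
  for (std::size_t i = 0; i < results.size(); ++i) {
    if (results[i].isMemRef)
      toOutParams.push_back(i);
    else
      plan.keptResults.push_back(i);
  }
  // Step 2: "allocate" each out param; only here does the shape matter.
  for (std::size_t index : toOutParams) {
    if (!hasStaticShape(results[index]))
      return std::nullopt;
    plan.allocatedOutParams.push_back(index);
  }
  return plan;
}
```

Keeping the shape check in the second step means that a different allocation strategy would only need to touch that loop, which is the argument made in the reply to the inline comment on line 94.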

mlir/lib/Transforms/CMakeLists.txt

  add_subdirectory(Utils)

  add_mlir_library(MLIRTransforms
    BufferDeallocation.cpp
    BufferOptimizations.cpp
+   BufferResultsToOutParams.cpp
    Bufferize.cpp
    Canonicalizer.cpp
    CopyRemoval.cpp
    CSE.cpp
    Inliner.cpp
    LocationSnapshot.cpp
    LoopCoalescing.cpp
    LoopFusion.cpp
[30 lines not shown]

mlir/test/Transforms/buffer-results-to-out-params.mlir

This file was added.

// RUN: mlir-opt -buffer-results-to-out-params -split-input-file -verify-diagnostics %s | FileCheck %s

// CHECK-LABEL: func @basic(
// CHECK-SAME: %[[ARG:.*]]: memref<f32>) {
// CHECK: %[[RESULT:.*]] = "test.source"() : () -> memref<f32>
// CHECK: linalg.copy(%[[RESULT]], %[[ARG]]) : memref<f32>, memref<f32>
// CHECK: return
// CHECK: }
func @basic() -> (memref<f32>) {
  %0 = "test.source"() : () -> (memref<f32>)
  return %0 : memref<f32>
}

// CHECK-LABEL: func @presence_of_existing_arguments(
// CHECK-SAME: %[[ARG0:.*]]: memref<1xf32>,
// CHECK-SAME: %[[ARG1:.*]]: memref<2xf32>) {
// CHECK: %[[RESULT:.*]] = "test.source"() : () -> memref<2xf32>
// CHECK: linalg.copy(%[[RESULT]], %[[ARG1]]) : memref<2xf32>, memref<2xf32>
// CHECK: return
// CHECK: }
func @presence_of_existing_arguments(%arg0: memref<1xf32>) -> (memref<2xf32>) {
  %0 = "test.source"() : () -> (memref<2xf32>)
  return %0 : memref<2xf32>
}

// CHECK-LABEL: func @multiple_results(
// CHECK-SAME: %[[ARG0:.*]]: memref<1xf32>,
// CHECK-SAME: %[[ARG1:.*]]: memref<2xf32>) {
// CHECK: %[[RESULTS:.*]]:2 = "test.source"() : () -> (memref<1xf32>, memref<2xf32>)
// CHECK: linalg.copy(%[[RESULTS]]#0, %[[ARG0]]) : memref<1xf32>, memref<1xf32>
// CHECK: linalg.copy(%[[RESULTS]]#1, %[[ARG1]]) : memref<2xf32>, memref<2xf32>
// CHECK: return
// CHECK: }
func @multiple_results() -> (memref<1xf32>, memref<2xf32>) {
  %0, %1 = "test.source"() : () -> (memref<1xf32>, memref<2xf32>)
  return %0, %1 : memref<1xf32>, memref<2xf32>
}

// CHECK-LABEL: func @non_memref_types(
// CHECK-SAME: %[[OUTPARAM:.*]]: memref<f32>) -> (i1, i32) {
// CHECK: %[[RESULT1:.*]]:3 = "test.source"() : () -> (i1, memref<f32>, i32)
// CHECK: linalg.copy(%[[RESULT1]]#1, %[[OUTPARAM]]) : memref<f32>, memref<f32>
// CHECK: return %[[RESULT1]]#0, %[[RESULT1]]#2 : i1, i32
// CHECK: }
func @non_memref_types() -> (i1, memref<f32>, i32) {
  %0, %1, %2 = "test.source"() : () -> (i1, memref<f32>, i32)
  return %0, %1, %2 : i1, memref<f32>, i32
}

// CHECK: func @external_function(memref<f32>)
func @external_function() -> (memref<f32>)
// CHECK: func @result_attrs(memref<f32> {test.some_attr})
func @result_attrs() -> (memref<f32> {test.some_attr})
// CHECK: func @mixed_result_attrs(memref<1xf32>, memref<2xf32> {test.some_attr}, memref<3xf32>)
func @mixed_result_attrs() -> (memref<1xf32>, memref<2xf32> {test.some_attr}, memref<3xf32>)

// -----

// CHECK-LABEL: func @callee(memref<1xf32>)
func @callee() -> memref<1xf32>

// CHECK-LABEL: func @call_basic() {
// CHECK: %[[OUTPARAM:.*]] = alloc() : memref<1xf32>
// CHECK: call @callee(%[[OUTPARAM]]) : (memref<1xf32>) -> ()
// CHECK: "test.sink"(%[[OUTPARAM]]) : (memref<1xf32>) -> ()
// CHECK: return
// CHECK: }
func @call_basic() {
  %0 = call @callee() : () -> memref<1xf32>
  "test.sink"(%0) : (memref<1xf32>) -> ()
  return
}

// -----

// CHECK-LABEL: func @callee(memref<1xf32>, memref<2xf32>)
func @callee() -> (memref<1xf32>, memref<2xf32>)

// CHECK-LABEL: func @call_multiple_result() {
// CHECK: %[[RESULT0:.*]] = alloc() : memref<1xf32>
// CHECK: %[[RESULT1:.*]] = alloc() : memref<2xf32>
// CHECK: call @callee(%[[RESULT0]], %[[RESULT1]]) : (memref<1xf32>, memref<2xf32>) -> ()
// CHECK: "test.sink"(%[[RESULT0]], %[[RESULT1]]) : (memref<1xf32>, memref<2xf32>) -> ()
// CHECK: }
func @call_multiple_result() {
  %0, %1 = call @callee() : () -> (memref<1xf32>, memref<2xf32>)
  "test.sink"(%0, %1) : (memref<1xf32>, memref<2xf32>) -> ()
}

// -----

// CHECK-LABEL: func @callee(memref<1xf32>) -> (i1, i32)
func @callee() -> (i1, memref<1xf32>, i32)

// CHECK-LABEL: func @call_non_memref_result() {
// CHECK: %[[RESULT0:.*]] = alloc() : memref<1xf32>
// CHECK: %[[NON_MEMREF_RESULTS:.*]]:2 = call @callee(%[[RESULT0]]) : (memref<1xf32>) -> (i1, i32)
// CHECK: "test.sink"(%[[NON_MEMREF_RESULTS]]#0, %[[RESULT0]], %[[NON_MEMREF_RESULTS]]#1) : (i1, memref<1xf32>, i32) -> ()
// CHECK: }
func @call_non_memref_result() {
  %0, %1, %2 = call @callee() : () -> (i1, memref<1xf32>, i32)
  "test.sink"(%0, %1, %2) : (i1, memref<1xf32>, i32) -> ()
}

// -----

func @callee() -> (memref<?xf32>)

func @call_non_memref_result() {
  // expected-error @+1 {{cannot create out param for dynamically shaped result}}
  %0 = call @callee() : () -> (memref<?xf32>)
  "test.sink"(%0) : (memref<?xf32>) -> ()
}
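
The return-side behavior that the first group of tests checks (memref operands are copied into the appended arguments, everything else stays on the `return`) can also be modeled in plain C++. This is a minimal sketch under stated assumptions: `Operand`, `Copy`, `RewrittenReturn`, and `rewriteReturn` are hypothetical names, and values are represented by their SSA names as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a return operand: an SSA name plus a memref flag.
struct Operand {
  std::string name;
  bool isMemRef;
};

// One copy from a returned buffer into an out-param argument.
struct Copy {
  std::string source;
  std::string target;
};

struct RewrittenReturn {
  std::vector<Copy> copies;          // emitted before the new return
  std::vector<std::string> returned; // operands left on the return
};

// Pairs the i-th memref operand with the i-th out param, in order. Unlike the
// zip-based loop in the patch, this sketch asserts that enough out params
// were supplied instead of silently stopping at the shorter list.
RewrittenReturn rewriteReturn(const std::vector<Operand> &operands,
                              const std::vector<std::string> &outParams) {
  RewrittenReturn out;
  std::size_t nextOutParam = 0;
  for (const Operand &operand : operands) {
    if (operand.isMemRef) {
      assert(nextOutParam < outParams.size() && "missing out param");
      out.copies.push_back({operand.name, outParams[nextOutParam++]});
    } else {
      out.returned.push_back(operand.name);
    }
  }
  return out;
}
```

With operands `%0 : i1`, `%1 : memref<f32>`, `%2 : i32` and a single out param, the model produces one copy from `%1` and keeps `%0` and `%2` on the return, as in `@non_memref_types`.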
