This is an archive of the discontinued LLVM Phabricator instance.


Differential D81926

[mlir] Extended BufferPlacement to support nested region control flow.
ClosedPublic

Authored by dfki-mako on Jun 16 2020, 4:11 AM.


Details

Reviewers

pifon2a
herhut

Commits

rG6f5da84f7bb3: [mlir] Extended BufferPlacement to support nested region control flow.

Summary

The current BufferPlacement implementation does not support nested region control flow. This CL adds support for nested regions via the RegionBranchOpInterface and the detection of branch-like (ReturnLike) terminators inside nested regions.
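The alias bookkeeping that this change builds on can be illustrated with a small standalone sketch. It is not the MLIR implementation: `ValueId` and `ToyAliasMap` are hypothetical stand-ins (plain integers instead of `mlir::Value`), and the code only mirrors the idea of pairwise alias registration followed by transitive resolution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for mlir::Value: a plain integer id.
using ValueId = int;
using ValueSet = std::set<ValueId>;

// Toy alias map: each value maps to the set of its immediate aliases.
struct ToyAliasMap {
  std::map<ValueId, ValueSet> aliases;

  // Registers aliasValues[i] as an immediate alias of values[i].
  // Surplus elements on the longer side are ignored in this sketch.
  void registerAliases(const std::vector<ValueId> &values,
                       const std::vector<ValueId> &aliasValues) {
    const std::size_t n = std::min(values.size(), aliasValues.size());
    for (std::size_t i = 0; i < n; ++i)
      aliases[values[i]].insert(aliasValues[i]);
  }

  // Returns the value itself plus all direct and indirect aliases.
  ValueSet resolve(ValueId value) const {
    ValueSet result;
    resolveRecursive(value, result);
    return result;
  }

private:
  void resolveRecursive(ValueId value, ValueSet &result) const {
    if (!result.insert(value).second)
      return; // Already visited; this also guards against cycles.
    auto it = aliases.find(value);
    if (it == aliases.end())
      return;
    for (ValueId alias : it->second)
      resolveRecursive(alias, result);
  }
};
```

In this toy model, a terminator operand registered against a successor input, which in turn is registered against an op result, resolves to all three values, which is the property the placement of deallocs depends on.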

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dfki-mako created this revision. Jun 16 2020, 4:11 AM

Herald added a project: Restricted Project. Jun 16 2020, 4:11 AM

Herald added subscribers: msifontes, jurahul, Kayjukh and 15 others.

dfki-mako added reviewers: pifon2a, herhut. Jun 16 2020, 4:12 AM

Can you also add a test with deeper nesting?

mlir/lib/Transforms/BufferPlacement.cpp
81	Another way to describe `exitParentRegion` would be to only consider immediately nested regions. You can use the approach you have now, or simply use a loop over regions/blocks and get their terminators in the true case. Less traversal work.
87	would `terminator->getParentRegion()->getParentOp() == operation` do the same?
136–195	`detailed information` is not very helpful in a comment. What does it query?
150	Maybe `llvm::for_each`? Drop the `{`,`}`.
207	So these are the entry regions for this operation. Maybe write that in the comment.
211	This ties the operands of the op with a region to the block arguments of the target region,
212	`Block &successorBlock = regionSuccessor.getSuccessor()->front();` ?
213	`for_each` or drop the brackets.
217	Now that you have queried the flow-in, you can also query the flow within the op. For this, you can call `regionInterface.getSuccessorRegions` once for each region of the op. The alias registration is the same as with the initial flow into the op, except when `regionSuccessor.getSuccessor()` is `nullptr`. That signals that the terminator of the region will exit the region. So you have to tie the `regionSuccessor.getSuccessorInputs()` to `parentOp->getResults()` in that case.
221	`/*exitParentRegion=*/false`
221	I don't think this is needed with the above implemented.
296	`front()`?
443	This copies a region result, so maybe reflect that in the name.
451	This also needs to use the `RegionBranchOpInterface` to find all the terminators in the operation and their successor inputs (they might be in a different order than the op results). Querying the terminators directly makes assumptions about ordering that do not exist.
mlir/test/Transforms/buffer-placement.mlir
779	There should be a `dealloc` here, right?
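The exit case described in the comment on line 217 can be sketched in isolation. This is a minimal model, not the MLIR API: `ToyRegion`, `ToyOp`, `ToyRegionSuccessor`, `successorInputs` and `wireTerminator` are hypothetical names, and a null successor is assumed to mean that control leaves the op, so the terminator operands are tied to the parent op's results instead of to block arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for mlir::Value.
using ValueId = int;

// Toy region: only the arguments of its entry block matter here.
struct ToyRegion {
  std::vector<ValueId> entryBlockArguments;
};

// Toy op with regions: only its results matter here.
struct ToyOp {
  std::vector<ValueId> results;
};

// Toy successor: a null region means control leaves the parent op.
struct ToyRegionSuccessor {
  const ToyRegion *successor;
};

// Inputs of the successor: entry block arguments of the target region,
// or the parent op's results when the region is exited.
const std::vector<ValueId> &successorInputs(const ToyOp &parent,
                                            const ToyRegionSuccessor &s) {
  return s.successor ? s.successor->entryBlockArguments : parent.results;
}

// Pairs each terminator operand with the successor input at the same
// position (this assumes a 1:1 positional mapping).
std::vector<std::pair<ValueId, ValueId>>
wireTerminator(const std::vector<ValueId> &terminatorOperands,
               const ToyOp &parent, const ToyRegionSuccessor &s) {
  const std::vector<ValueId> &inputs = successorInputs(parent, s);
  std::vector<std::pair<ValueId, ValueId>> links;
  const std::size_t n = std::min(terminatorOperands.size(), inputs.size());
  for (std::size_t i = 0; i < n; ++i)
    links.emplace_back(terminatorOperands[i], inputs[i]);
  return links;
}
```

The positional pairing is an assumption of the sketch; a real interface may order successor inputs differently from terminator operands.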

This revision now requires changes to proceed. Jun 16 2020, 5:04 AM

Harbormaster completed remote builds in B60453: Diff 271037. Jun 16 2020, 5:31 AM

dfki-mako retitled this revision from "[mlir] Extended BufferPlacement to support nested region control flow." to "[mlir] WIP: Extended BufferPlacement to support nested region control flow." Jun 17 2020, 4:16 AM

Added extended support for the RegionOpInterface to query successor bindings for successor regions

Harbormaster failed remote builds in B60999: Diff 272028! Jun 19 2020, 6:27 AM

rriddle added inline comments. Jun 19 2020, 11:05 AM

mlir/lib/Transforms/BufferPlacement.cpp
81	Please move static functions to the global namespace. anonymous namespace should only really be used by things like classes.
139	nit: Use a lambda instead.
154	nit: I don't think this really saves anything over a normal for loop, if anything it is much less efficient.
163	nit: Use the full name for the type.
188	nit: Use parameter names when passing constants, i.e., `/*someName=*/`
422	Is this guaranteed to be a RegionBranchOpInterface?
434	You are completely disregarding the fact that getMutableSuccessorOperands can return None, which would cause this to crash. Where are you checking that this is valid?
440	llvm::find?

herhut added inline comments. Jun 22 2020, 1:02 AM

mlir/lib/Transforms/BufferPlacement.cpp
163	Why is this called `operands`? These are the results of the overall operation, right?
165	This region does not have a valid successor block if it terminates the parent operation. In that case, we wire its successor operands with the parent operation's results. This could be more obvious in the code if you passed the parent operation. Please at least rename `operands` to `results`.
170	The loop and the zip could be part of the helper function `registerAliasFunc`.
175	Isn't this the wrong way round? The results alias the successor inputs?
201	This only follows one level of branching. So if the entry region goes to region 2, and that one goes to region 3, the second link will not be seen. It should be good enough to loop over all regions of an operation and then do this linking to successor regions.
434	The underlying assumption is that if the successor has block arguments, then the branch in the predecessor needs to have operands for those. Can we rely on this?
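The point raised on line 201 (following only the entry regions misses region-to-region links) can be shown with a toy control-flow model. Everything here is hypothetical and independent of MLIR: regions are numbered from 0, `kParentOp` marks the parent op itself, and the two functions contrast collecting only the entry edges with looping over every region.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical convention: -1 denotes the parent op itself.
constexpr int kParentOp = -1;
using Edge = std::pair<int, int>; // (from, to)

// Toy op with regions and a fixed successor relation.
struct ToyRegionBranchOp {
  // Regions that can be entered directly from the op.
  std::vector<int> entrySuccessors;
  // regionSuccessors[i]: where region i can branch to; kParentOp = exit.
  std::vector<std::vector<int>> regionSuccessors;
};

// Follows only one level: the edges from the op into its entry regions.
std::set<Edge> collectEntryEdges(const ToyRegionBranchOp &op) {
  std::set<Edge> edges;
  for (int region : op.entrySuccessors)
    edges.insert(Edge(kParentOp, region));
  return edges;
}

// Additionally loops over all regions and records their successors,
// so chains such as region 0 -> region 1 -> region 2 are all seen.
std::set<Edge> collectAllEdges(const ToyRegionBranchOp &op) {
  std::set<Edge> edges = collectEntryEdges(op);
  for (std::size_t i = 0; i < op.regionSuccessors.size(); ++i)
    for (int successor : op.regionSuccessors[i])
      edges.insert(Edge(static_cast<int>(i), successor));
  return edges;
}
```

For an op whose entry region 0 branches to region 1, which branches to region 2, which exits, the entry-only variant records a single edge while the full loop records all four.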

rriddle added inline comments. Jun 22 2020, 1:46 AM

mlir/lib/Transforms/BufferPlacement.cpp
434	Branching operations are not required to have a Value already materialized for a block argument. There are certain classes of operations that internally generate the value that is passed to the branch. For example, operations like LLVM Callbr and certain SIL switches.

Refactored implementation and simplified the iteration over all successor regions.

dfki-mako marked 6 inline comments as done. Jun 23 2020, 6:06 AM

dfki-mako added inline comments.

mlir/lib/Transforms/BufferPlacement.cpp
422	The value should either be a `BlockArgument` or a value resulting from a `RegionBranchOpInterface` operation.
434	We have changed the code to emit an error message. We prefer the error message over silently ignoring this case as the retrieved alias information can be invalid if an operation passes values implicitly to a block argument.
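The trade-off discussed for line 434 (report an error instead of silently skipping a branch whose successor operands are not materialized) might look roughly like the following standalone sketch. It assumes C++17 for `std::optional`; `linkSuccessorOperands` and its parameters are hypothetical names and do not correspond to any function in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for mlir::Value.
using ValueId = int;
// An empty optional models a branch that produces the passed values
// internally, so no operand list is available to the analysis.
using OperandList = std::optional<std::vector<ValueId>>;

// Links each successor operand to the block argument at the same index.
// Returns false and fills `error` when the operands are unavailable or
// their count does not match; `links` is left untouched in that case.
bool linkSuccessorOperands(const OperandList &successorOperands,
                           const std::vector<ValueId> &blockArguments,
                           std::vector<std::pair<ValueId, ValueId>> &links,
                           std::string &error) {
  if (!successorOperands) {
    error = "successor operands are not available; alias information "
            "would be incomplete";
    return false;
  }
  if (successorOperands->size() != blockArguments.size()) {
    error = "operand count does not match block argument count";
    return false;
  }
  for (std::size_t i = 0; i < blockArguments.size(); ++i)
    links.emplace_back((*successorOperands)[i], blockArguments[i]);
  return true;
}
```

Failing loudly keeps a caller from acting on an alias map that is missing entries, which is the concern the author's reply describes.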

Harbormaster failed remote builds in B61382: Diff 272697! Jun 23 2020, 6:53 AM

dfki-mako retitled this revision from "[mlir] WIP: Extended BufferPlacement to support nested region control flow." to "[mlir] Extended BufferPlacement to support nested region control flow." Jun 24 2020, 3:51 AM

herhut added inline comments. Jun 24 2020, 4:00 AM

mlir/lib/Transforms/BufferPlacement.cpp
150	nit: interface
177	`entryRegion.getSuccessorInputs` returns the inputs of the target region, typically the block arguments of the first block. So this maps them with itself. Instead, this should map the result from `regionInterface().getSuccessorEntryOperands` to the `entryRegion.getSuccessorInputs`.
193	The terminator should also implement the `BranchOpInterface`, right? So, one should query the inputs to the successor using `BranchOpInterface.getSuccessorOperands`, correct? Here, `successorRegion.getSuccessorInputs` returns the input values to the region, which normally would be the block arguments or, in case this leaves the operation, the results.

Added support for region-region control flow within operations that implement the RegionBranchOpInterface.
Added new test operations to verify more advanced region-region control-flow scenarios.

Harbormaster failed remote builds in B61689: Diff 273290! Jun 25 2020, 4:17 AM

Cool stuff. Great to see this work with complex region interfaces, as well!

So, this now assumes that there is a 1:1 mapping between return-like op operands and successor inputs and we likely want an interface that makes this configurable. Maybe the returnlike trait could provide a function that enables this. (We discussed this, just writing it down.)

Also, we need a clean-up pass to remove unneeded alloc+copy+dealloc triples. This is underway.

With the nits addressed and some of the comments cleaned up, this is good to go.

mlir/lib/Transforms/BufferPlacement.cpp
178	Mega-nit: `entryRegion->getSuccessor` reads weird. It reads like getting the successor of the entry region but actually it gets the entry region itself :) Maybe rename this `entrySuccessor`?
182	Maybe `Wire flow between regions and from region exits.`?
188	The length of `operandAttributes` might be wrong here, as it was built for the entry successors.
190	I think this comment is now off. It always wires the terminator operands with the successor inputs. The latter can be block arguments of a region's entry block or the result values, if the terminator exits the op.
295	Should this apply to the case above, as well?
397	Where is this conversion? Or is this comment off?
402	Nit: Doesn't this create a new alloc and a copy? If so, please fix the comment.
433	So is it OK to pass an empty ArrayRef here? If so, why can we not do this in the other cases?
mlir/test/lib/Dialect/Test/TestOps.td
1344 ↗	(On Diff #273290)	Please use `let description = [{ ... }];` instead of a comment.

This revision now requires changes to proceed. Jun 25 2020, 12:14 PM

With my comments addressed, this is good to go.

This revision is now accepted and ready to land. Jun 26 2020, 4:03 AM

Addressed reviewer comments.

mlir/lib/Transforms/BufferPlacement.cpp
177	Yeah makes sense to me; we should wire `getSuccessorEntryOperands` with `getSuccessorInputs`.
188	According to our interpretation of the comment "`operands` is a set of optional attributes that correspond to a constant value for each operand", this should be an array containing values for each operand of the defining operation, although it might make sense to include an attribute for each block argument. However, we should keep this for now, as it is fully compatible with the current implementation of all instantiations of the `RegionBranchOpInterface`.
193	I guess querying a `BranchOpInterface` might not make sense in the case of a `ReturnLike` op, since it does not branch to any block in general (although this can still be expressed via the `BranchOpInterface`). It feels like a `ReturnLike` terminator should provide a method to access all operands, similar to the `BranchOpInterface`.

Harbormaster completed remote builds in B61906: Diff 273669. Jun 26 2020, 5:24 AM

Closed by commit rG6f5da84f7bb3: [mlir] Extended BufferPlacement to support nested region control flow. (authored by dfki-mako, committed by herhut). Jun 30 2020, 3:14 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
mlir/lib/Transforms/BufferPlacement.cpp	297 lines
mlir/test/Transforms/buffer-placement.mlir	118 lines

Diff 272697

mlir/lib/Transforms/BufferPlacement.cpp

... 59 lines not shown ...
#include "mlir/Dialect/Linalg/IR/LinalgOps.h"		#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/IR/Operation.h"		#include "mlir/IR/Operation.h"
#include "mlir/Pass/Pass.h"		#include "mlir/Pass/Pass.h"
#include "mlir/Transforms/Passes.h"		#include "mlir/Transforms/Passes.h"
#include "llvm/ADT/SetOperations.h"		#include "llvm/ADT/SetOperations.h"

using namespace mlir;		using namespace mlir;

namespace {		/// Walks over all immediate return-like terminators in the given region.
		template <typename FuncT>
		static void walkReturnOperations(Region *region, const FuncT &func) {
		for (Block &block : *region)
		for (Operation &operation : block) {
		// Skip non-return-like terminators.
		if (operation.hasTrait<OpTrait::ReturnLike>())
		func(&operation);
		}
		}

		namespace {
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// BufferPlacementAliasAnalysis		// BufferPlacementAliasAnalysis
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// A straight-forward alias analysis which ensures that all aliases of all		/// A straight-forward alias analysis which ensures that all aliases of all
/// values will be determined. This is a requirement for the BufferPlacement		/// values will be determined. This is a requirement for the BufferPlacement
/// class since you need to determine safe positions to place alloc and		/// class since you need to determine safe positions to place alloc and
/// deallocs.		/// deallocs.
class BufferPlacementAliasAnalysis {		class BufferPlacementAliasAnalysis {
public:		public:
using ValueSetT = SmallPtrSet<Value, 16>;		using ValueSetT = SmallPtrSet<Value, 16>;
using ValueMapT = llvm::DenseMap<Value, ValueSetT>;		using ValueMapT = llvm::DenseMap<Value, ValueSetT>;

public:		public:
/// Constructs a new alias analysis using the op provided.		/// Constructs a new alias analysis using the op provided.
BufferPlacementAliasAnalysis(Operation *op) { build(op->getRegions()); }		BufferPlacementAliasAnalysis(Operation *op) { build(op); }

/// Find all immediate aliases this value could potentially have.		/// Find all immediate aliases this value could potentially have.
ValueMapT::const_iterator find(Value value) const {		ValueMapT::const_iterator find(Value value) const {
return aliases.find(value);		return aliases.find(value);
}		}

/// Returns the end iterator that can be used in combination with find.		/// Returns the end iterator that can be used in combination with find.
ValueMapT::const_iterator end() const { return aliases.end(); }		ValueMapT::const_iterator end() const { return aliases.end(); }

/// Find all immediate and indirect aliases this value could potentially		/// Find all immediate and indirect aliases this value could potentially
/// have. Note that the resulting set will also contain the value provided as		/// have. Note that the resulting set will also contain the value provided as
/// it is an alias of itself.		/// it is an alias of itself.
ValueSetT resolve(Value value) const {		ValueSetT resolve(Value value) const {
ValueSetT result;		ValueSetT result;
resolveRecursive(value, result);		resolveRecursive(value, result);
return result;		return result;
}		}

/// Removes the given values from all alias sets.		/// Removes the given values from all alias sets.
void remove(const SmallPtrSetImpl<BlockArgument> &aliasValues) {		void remove(const SmallPtrSetImpl<Value> &aliasValues) {
for (auto &entry : aliases)		for (auto &entry : aliases)
llvm::set_subtract(entry.second, aliasValues);		llvm::set_subtract(entry.second, aliasValues);
}		}

private:		private:
/// Recursively determines alias information for the given value. It stores		/// Recursively determines alias information for the given value. It stores
/// all newly found potential aliases in the given result set.		/// all newly found potential aliases in the given result set.
void resolveRecursive(Value value, ValueSetT &result) const {		void resolveRecursive(Value value, ValueSetT &result) const {
if (!result.insert(value).second)		if (!result.insert(value).second)
return;		return;
auto it = aliases.find(value);		auto it = aliases.find(value);
if (it == aliases.end())		if (it == aliases.end())
return;		return;
for (Value alias : it->second)		for (Value alias : it->second)
resolveRecursive(alias, result);		resolveRecursive(alias, result);
}		}

/// This function constructs a mapping from values to its immediate aliases.		/// This function constructs a mapping from values to its immediate aliases.
/// It iterates over all blocks, gets their predecessors, determines the		/// It iterates over all blocks, gets their predecessors, determines the
/// values that will be passed to the corresponding block arguments and		/// values that will be passed to the corresponding block arguments and
/// inserts them into the underlying map.		/// inserts them into the underlying map. Furthermore, it wires successor
void build(MutableArrayRef<Region> regions) {		/// regions and branch-like return operations from nested regions.
for (Region &region : regions) {		void build(Operation *op) {
for (Block &block : region) {		// Registers all aliases of the given values.
// Iterate over all predecessor and get the mapped values to their		auto registerAliases = [&](auto values, auto aliases) {
// corresponding block arguments values.		for (auto entry : llvm::zip(values, aliases))
for (auto it = block.pred_begin(), e = block.pred_end(); it != e;		this->aliases[std::get<0>(entry)].insert(std::get<1>(entry));
++it) {		};
unsigned successorIndex = it.getSuccessorIndex();
// Get the terminator and the values that will be passed to our block.		// Query all branch interfaces to link block argument aliases.
auto branchInterface =		op->walk([&](BranchOpInterface branchInterface) {
dyn_cast<BranchOpInterface>((*it)->getTerminator());		Block *parentBlock = branchInterface.getOperation()->getBlock();
if (!branchInterface)		for (auto it = parentBlock->succ_begin(), e = parentBlock->succ_end();
continue;		it != e; ++it) {
// Query the branch op interace to get the successor operands.		// Query the branch op interace to get the successor operands.
auto successorOperands =		auto successorOperands =
branchInterface.getSuccessorOperands(successorIndex);		branchInterface.getSuccessorOperands(it.getIndex());
if (successorOperands.hasValue()) {		if (!successorOperands.hasValue())
		continue;
// Build the actual mapping of values to their immediate aliases.		// Build the actual mapping of values to their immediate aliases.
for (auto argPair : llvm::zip(block.getArguments(),		registerAliases(successorOperands.getValue(), (*it)->getArguments());
successorOperands.getValue())) {
aliases[std::get<1>(argPair)].insert(std::get<0>(argPair));
}
}
}		}
		});

		// Query the RegionBranchOpInterface to find potential successor regions.
		op->walk([&](RegionBranchOpInterface regionInterface) {
		// Create an empty attribute for each operand to comply with the
		// `getSuccessorRegions` interface definition that requires a single
		// attribute per operand.
		SmallVector<Attribute, 2> operandAttributes(
		regionInterface.getOperation()->getNumOperands());

		// Extract all entry regions and wire all initial entry successor inputs.
		SmallVector<RegionSuccessor, 2> entryRegions;
		regionInterface.getSuccessorRegions(/*index=*/llvm::None,
		operandAttributes, entryRegions);
		for (RegionSuccessor &entryRegion : entryRegions) {
		// Wire the entry region's successor arguments with the initial
		// successor inputs.
		assert(entryRegion.getSuccessor() &&
		"Invalid entry region without an attached successor region");
		registerAliases(entryRegion.getSuccessorInputs(),
		entryRegion.getSuccessor()->front().getArguments());
		}

		// Extract all region successors.
		for (Region &region : regionInterface.getOperation()->getRegions()) {
		// Iterate over all successor region entries that are reachable from the
		// current region.
		SmallVector<RegionSuccessor, 2> successorRegions;
		regionInterface.getSuccessorRegions(
		region.getRegionNumber(), operandAttributes, successorRegions);
		for (RegionSuccessor &successorRegion : successorRegions) {
		// Iterate over all immediate terminator operations and wire the
		// successor inputs with either the corresponding block arguments or
		// the operands of each terminator.
		walkReturnOperations(&region, [&](Operation *terminator) {
		registerAliases(terminator->getOperands(),
		successorRegion.getSuccessorInputs());
		});
}		}
}		}
		});
}		}

/// Maps values to all immediate aliases this value can have.		/// Maps values to all immediate aliases this value can have.
ValueMapT aliases;		ValueMapT aliases;
};		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// BufferPlacement		// BufferPlacement
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// The main buffer placement analysis used to place allocs, copies and deallocs.		// The main buffer placement analysis used to place allocs, copies and deallocs.
class BufferPlacement {		class BufferPlacement {
public:		public:
using ValueSetT = BufferPlacementAliasAnalysis::ValueSetT;		using ValueSetT = BufferPlacementAliasAnalysis::ValueSetT;

/// An intermediate representation of a single allocation node.		/// An intermediate representation of a single allocation node.
struct AllocEntry {		struct AllocEntry {
/// A reference to the associated allocation node.		/// A reference to the associated allocation node.
Value allocValue;		Value allocValue;

/// The associated placement block in which the allocation should be		/// The associated placement block in which the allocation should be
/// performed.		/// performed.
Block *placementBlock;		Block *placementBlock;

/// The associated dealloc operation (if any).		/// The associated dealloc operation (if any).
Operation *deallocOperation;		Operation *deallocOperation;
};		};

using AllocEntryList = SmallVector<AllocEntry, 8>;		using AllocEntryList = SmallVector<AllocEntry, 8>;

public:		public:
... 46 lines not shown (context: `void initBlockMapping() {`) ...
});		});
}		}

/// Computes a valid allocation position in a dominator (if possible) for the		/// Computes a valid allocation position in a dominator (if possible) for the
/// given allocation result.		/// given allocation result.
Block *getInitialAllocBlock(OpResult result) {		Block *getInitialAllocBlock(OpResult result) {
// Get all allocation operands as these operands are important for the		// Get all allocation operands as these operands are important for the
// allocation operation.		// allocation operation.
auto operands = result.getOwner()->getOperands();		Operation *owner = result.getOwner();
		auto operands = owner->getOperands();
if (operands.size() < 1)		if (operands.size() < 1)
return findCommonDominator(result, aliases.resolve(result), dominators);		return findCommonDominator(result, aliases.resolve(result), dominators);

// If this node has dependencies, check all dependent nodes with respect		// If this node has dependencies, check all dependent nodes with respect
// to a common post dominator in which all values are available.		// to a common post dominator in which all values are available.
ValueSetT dependencies(++operands.begin(), operands.end());		ValueSetT dependencies(++operands.begin(), operands.end());
return findCommonDominator(*operands.begin(), dependencies, postDominators);		Block *dominator =
		findCommonDominator(*operands.begin(), dependencies, postDominators);
		// Do not move allocs out of their parent regions to keep them local.
		if (dominator->getParent() != owner->getParentRegion())
		return &owner->getParentRegion()->front();
		return dominator;
}		}

/// Finds correct alloc positions according to the algorithm described at		/// Finds correct alloc positions according to the algorithm described at
/// the top of the file for all alloc nodes that can be handled by this		/// the top of the file for all alloc nodes that can be handled by this
/// analysis.		/// analysis.
void placeAllocs() const {		void placeAllocs() const {
for (auto &entry : allocs) {		for (auto &entry : allocs) {
Value alloc = entry.allocValue;		Value alloc = entry.allocValue;
... (14 unchanged lines not shown, inside `for (auto &entry : allocs) {`) ...
// Move the alloc in front of the start operation.		// Move the alloc in front of the start operation.
Operation *allocOperation = alloc.getDefiningOp();		Operation *allocOperation = alloc.getDefiningOp();
allocOperation->moveBefore(startOperation);		allocOperation->moveBefore(startOperation);
}		}
}		}

/// Introduces required allocs and copy operations to avoid memory leaks.		/// Introduces required allocs and copy operations to avoid memory leaks.
void introduceCopies() {		void introduceCopies() {
// Initialize the set of block arguments that require a dedicated memory		// Initialize the set of values that require a dedicated memory free
// free operation since their arguments cannot be safely deallocated in a		// operation since their operands cannot be safely deallocated in a post
// post dominator.		// dominator.
SmallPtrSet<BlockArgument, 8> blockArgsToFree;		SmallPtrSet<Value, 8> valuesToFree;
llvm::SmallDenseSet<std::tuple<BlockArgument, Block *>> visitedBlockArgs;		llvm::SmallDenseSet<std::tuple<Value, Block *>> visitedValues;
SmallVector<std::tuple<BlockArgument, Block *>, 8> toProcess;		SmallVector<std::tuple<Value, Block *>, 8> toProcess;

// Check dominance relation for proper dominance properties. If the given		// Check dominance relation for proper dominance properties. If the given
// value node does not dominate an alias, we will have to create a copy in		// value node does not dominate an alias, we will have to create a copy in
// order to free all buffers that can potentially leak into a post		// order to free all buffers that can potentially leak into a post
// dominator.		// dominator.
auto findUnsafeValues = [&](Value source, Block *definingBlock) {		auto findUnsafeValues = [&](Value source, Block *definingBlock) {
auto it = aliases.find(source);		auto it = aliases.find(source);
if (it == aliases.end())		if (it == aliases.end())
return;		return;
for (Value value : it->second) {		for (Value value : it->second) {
auto blockArg = value.cast<BlockArgument>();		if (valuesToFree.count(value) > 0)
if (blockArgsToFree.count(blockArg) > 0)
continue;		continue;
// Check whether we have to free this particular block argument.		// Check whether we have to free this particular block argument.
if (!dominators.dominates(definingBlock, blockArg.getOwner())) {		if (!dominators.dominates(definingBlock, value.getParentBlock())) {
toProcess.emplace_back(blockArg, blockArg.getParentBlock());		toProcess.emplace_back(value, value.getParentBlock());
blockArgsToFree.insert(blockArg);		valuesToFree.insert(value);
} else if (visitedBlockArgs		} else if (visitedValues.insert(std::make_tuple(value, definingBlock))
.insert(std::make_tuple(blockArg, definingBlock))
.second)		.second)
toProcess.emplace_back(blockArg, definingBlock);		toProcess.emplace_back(value, definingBlock);
}		}
};		};

// Detect possibly unsafe aliases starting from all allocations.		// Detect possibly unsafe aliases starting from all allocations.
for (auto &entry : allocs)		for (auto &entry : allocs)
findUnsafeValues(entry.allocValue, entry.placementBlock);		findUnsafeValues(entry.allocValue, entry.placementBlock);

// Try to find block arguments that require an explicit free operation		// Try to find block arguments that require an explicit free operation
// until we reach a fix point.		// until we reach a fix point.
while (!toProcess.empty()) {		while (!toProcess.empty()) {
auto current = toProcess.pop_back_val();		auto current = toProcess.pop_back_val();
findUnsafeValues(std::get<0>(current), std::get<1>(current));		findUnsafeValues(std::get<0>(current), std::get<1>(current));
}		}

// Update buffer aliases to ensure that we free all buffers and block		// Update buffer aliases to ensure that we free all buffers and block
// arguments at the correct locations.		// arguments at the correct locations.
aliases.remove(blockArgsToFree);		aliases.remove(valuesToFree);

// Add new allocs and additional copy operations.		// Add new allocs and additional copy operations.
for (BlockArgument blockArg : blockArgsToFree) {		for (Value value : valuesToFree) {
Block *block = blockArg.getOwner();		if (auto blockArg = value.dyn_cast<BlockArgument>())
		introduceBlockArgCopy(blockArg);
		else
		introduceValueCopyForRegionResult(value);

		// Register the value to require a final dealloc. Note that we do not have
		// to assign a block here since we do not want to move the allocation node
		// to another location.
		allocs.push_back({value, nullptr, nullptr});
		}
		}

		/// Introduces temporary allocs in all predecessors and copies the source
		/// values into the newly allocated buffers.
		void introduceBlockArgCopy(BlockArgument blockArg) {
// Allocate a buffer for the current block argument in the block of		// Allocate a buffer for the current block argument in the block of
// the associated value (which will be a predecessor block by		// the associated value (which will be a predecessor block by
// definition).		// definition).
for (auto it = block->pred_begin(), e = block->pred_end(); it != e;		Block *block = blockArg.getOwner();
++it) {		for (auto it = block->pred_begin(), e = block->pred_end(); it != e; ++it) {
// Get the terminator and the value that will be passed to our		// Get the terminator and the value that will be passed to our
// argument.		// argument.
Operation *terminator = (*it)->getTerminator();		Operation *terminator = (*it)->getTerminator();
auto branchInterface = cast<BranchOpInterface>(terminator);		auto branchInterface = cast<BranchOpInterface>(terminator);
// Convert the mutable operand range to an immutable range and query the		// Convert the mutable operand range to an immutable range and query the
		herhut: Where is this conversion? Or is this comment off?
// associated source value.		// associated source value.
Value sourceValue =		Value sourceValue =
branchInterface.getSuccessorOperands(it.getSuccessorIndex())		branchInterface.getSuccessorOperands(it.getSuccessorIndex())
.getValue()[blockArg.getArgNumber()];		.getValue()[blockArg.getArgNumber()];
// Create a new alloc at the current location of the terminator.		// Create a new alloc at the current location of the terminator.
		herhut: Nit: Doesn't this create a new alloc and a copy? If so, please fix the comment.
		Value alloc = introduceBufferCopy(sourceValue, terminator);
		// Wire new alloc and successor operand.
		auto mutableOperands =
		branchInterface.getMutableSuccessorOperands(it.getSuccessorIndex());
		if (!mutableOperands.hasValue())
		terminator->emitError() << "terminators with immutable successor "
		"operands are not supported";
		else
		mutableOperands.getValue()
		.slice(blockArg.getArgNumber(), 1)
		.assign(alloc);
		}
		}

		/// Introduces temporary allocs in front of all associated nested-region
		/// terminators and copies the source values into the newly allocated buffers.
		void introduceValueCopyForRegionResult(Value value) {
		// Get the actual result index in the scope of the parent terminator.
		Operation *operation = value.getDefiningOp();
		auto regionInterface = cast<RegionBranchOpInterface>(operation);
		rriddle: Is this guaranteed to be a RegionBranchOpInterface?
		dfki-mako (author): The value should either be a `BlockArgument` or a value resulting from a `RegionBranchOpInterface` operation.

		// Iterate over all immediate regions to adjust their terminators.
		for (Region &region : operation->getRegions()) {
		// Determine whether this region has a successor entry that leaves this
		// region by returning to its parent operation.
		SmallVector<RegionSuccessor, 2> successorRegions;
		regionInterface.getSuccessorRegions(
		region.getRegionNumber(), ArrayRef<Attribute>(), successorRegions);
		auto regionSuccessor = llvm::find_if(
		Lint: Pre-merge checks: clang-tidy: warning: 'auto regionSuccessor' can be declared as 'auto *regionSuccessor' [llvm-qualified-auto]
		successorRegions, [&](RegionSuccessor &successorRegion) {
		return !successorRegion.getSuccessor();
		herhut: So is it OK to pass an empty ArrayRef here? If so, why can we not do this in the other cases?
		});
		rriddle: You are completely disregarding the fact that getMutableSuccessorOperands can return None, which would cause this to crash. Where are you checking that this is valid?
		herhut: The underlying assumption is that if the successor has block arguments, then the branch in the predecessor needs to have operands for those. Can we rely on this?
		dfki-mako (author): We have changed the code to emit an error message. We prefer the error message over silently ignoring this case as the retrieved alias information can be invalid if an operation passes values implicitly to a block argument.
		rriddle: Branching operations are not required to have a Value already materialized for a block argument. There are certain classes of operations that internally generate the value that is passed to the branch. For example, operations like LLVM Callbr and certain SIL switches.
		if (regionSuccessor == successorRegions.end())
		continue;
		// Get the result index in the context of the current successor input
		// bindings.
		auto resultIndex =
		llvm::find(regionSuccessor->getSuccessorInputs(), value).getIndex();
		rriddle: llvm::find?

		// Iterate over all immediate terminator operations to introduce
		// new buffer allocations. Thereby, the appropriate terminator operand
		herhut: This copies a region result, so maybe reflect that in the name.
		// will be adjusted to point to the newly allocated buffer instead.
		walkReturnOperations(&region, [&](Operation *terminator) {
		// Extract the source value from the current terminator.
		Value sourceValue = terminator->getOperand(resultIndex);
		// Create a new alloc at the current location of the terminator.
		Value alloc = introduceBufferCopy(sourceValue, terminator);
		// Wire alloc and terminator operand.
		terminator->setOperand(resultIndex, alloc);
		herhut: This also needs to use the `RegionBranchOpInterface` to find all the terminators in the operation and their successor inputs (they might be in a different order than the op results). Querying the terminators directly makes assumptions about ordering that do not exist.
		});
		}
		}

		/// Creates a new memory allocation for the given source value and copies
		/// its content into the newly allocated buffer. The terminator operation is
		/// used to insert the alloc and copy operations at the right places.
		Value introduceBufferCopy(Value sourceValue, Operation *terminator) {
		// Create a new alloc at the current location of the terminator.
auto memRefType = sourceValue.getType().cast<MemRefType>();		auto memRefType = sourceValue.getType().cast<MemRefType>();
OpBuilder builder(terminator);		OpBuilder builder(terminator);

// Extract information about dynamically shaped types by		// Extract information about dynamically shaped types by
// extracting their dynamic dimensions.		// extracting their dynamic dimensions.
SmallVector<Value, 4> dynamicOperands;		SmallVector<Value, 4> dynamicOperands;
for (auto shapeElement : llvm::enumerate(memRefType.getShape())) {		for (auto shapeElement : llvm::enumerate(memRefType.getShape())) {
if (!ShapedType::isDynamic(shapeElement.value()))		if (!ShapedType::isDynamic(shapeElement.value()))
continue;		continue;
dynamicOperands.push_back(builder.create<DimOp>(		dynamicOperands.push_back(builder.create<DimOp>(
terminator->getLoc(), sourceValue, shapeElement.index()));		terminator->getLoc(), sourceValue, shapeElement.index()));
}		}

// TODO: provide a generic interface to create dialect-specific		// TODO: provide a generic interface to create dialect-specific
// Alloc and CopyOp nodes.		// Alloc and CopyOp nodes.
auto alloc = builder.create<AllocOp>(terminator->getLoc(), memRefType,		auto alloc = builder.create<AllocOp>(terminator->getLoc(), memRefType,
dynamicOperands);		dynamicOperands);
// Wire new alloc and successor operand.
branchInterface.getMutableSuccessorOperands(it.getSuccessorIndex())
.getValue()
.slice(blockArg.getArgNumber(), 1)
.assign(alloc);
// Create a new copy operation that copies the contents of the old		// Create a new copy operation that copies the contents of the old
// allocation to the new one.		// allocation to the new one.
builder.create<linalg::CopyOp>(terminator->getLoc(), sourceValue,		builder.create<linalg::CopyOp>(terminator->getLoc(), sourceValue, alloc);
alloc);
}

// Register the block argument to require a final dealloc. Note that		return alloc;
// we do not have to assign a block here since we do not want to
// move the allocation node to another location.
allocs.push_back({blockArg, nullptr, nullptr});
}
}		}

/// Finds associated deallocs that can be linked to our allocation nodes (if		/// Finds associated deallocs that can be linked to our allocation nodes (if
/// any).		/// any).
void findDeallocs() {		void findDeallocs() {
for (auto &entry : allocs) {		for (auto &entry : allocs) {
auto userIt =		auto userIt =
llvm::find_if(entry.allocValue.getUsers(), [&](Operation *user) {		llvm::find_if(entry.allocValue.getUsers(), [&](Operation *user) {
... (54 unchanged lines not shown, inside `for (auto &entry : allocs) {`) ...
}		}
// endOperation is the last operation behind which we can safely store		// endOperation is the last operation behind which we can safely store
// the dealloc taking all potential aliases into account.		// the dealloc taking all potential aliases into account.

// If there is an existing dealloc, move it to the right place.		// If there is an existing dealloc, move it to the right place.
if (entry.deallocOperation) {		if (entry.deallocOperation) {
entry.deallocOperation->moveAfter(endOperation);		entry.deallocOperation->moveAfter(endOperation);
} else {		} else {
// If the Dealloc position is at the terminator operation of the block,		// If the Dealloc position is at the terminator operation of the
// then the value should escape from a deallocation.		// block, then the value should escape from a deallocation.
Operation *nextOp = endOperation->getNextNode();		Operation *nextOp = endOperation->getNextNode();
if (!nextOp)		if (!nextOp)
continue;		continue;
// If there is no dealloc node, insert one in the right place.		// If there is no dealloc node, insert one in the right place.
OpBuilder builder(nextOp);		OpBuilder builder(nextOp);
builder.create<DeallocOp>(alloc.getLoc(), alloc);		builder.create<DeallocOp>(alloc.getLoc(), alloc);
}		}
}		}
... (100 unchanged lines not shown) ...
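
The worklist in `introduceCopies` above can be hard to follow in diff form. The following is a minimal standalone sketch of the same fixpoint idea, not the MLIR implementation: `ToyCfg`, `ValueId`, `BlockId`, and `findValuesToFree` are hypothetical stand-ins, values and blocks are plain integers, and dominance is answered by walking a hand-written immediate-dominator table instead of `DominanceInfo`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy types: values and blocks are plain integers here.
using ValueId = int;
using BlockId = int;

struct ToyCfg {
  // idom[b] is the immediate dominator of block b; the entry block is its
  // own immediate dominator.
  std::vector<BlockId> idom;
  // Block in which each aliasing value (e.g. a block argument) lives.
  std::map<ValueId, BlockId> parentBlock;
  // Direct aliases of a value, i.e. the values it may flow into.
  std::map<ValueId, std::vector<ValueId>> aliases;

  // Returns true if block a dominates block b (a block dominates itself).
  bool dominates(BlockId a, BlockId b) const {
    const BlockId numBlocks = static_cast<BlockId>(idom.size());
    assert(a >= 0 && a < numBlocks && b >= 0 && b < numBlocks);
    while (true) {
      if (a == b)
        return true;
      if (idom[b] == b)
        return false; // Reached the entry block without meeting a.
      b = idom[b];
    }
  }
};

// Collects all aliases whose block is not dominated by the block of the
// buffer they may carry. In the patch, such values receive a copy and a
// dedicated dealloc.
std::set<ValueId>
findValuesToFree(const ToyCfg &cfg,
                 const std::vector<std::pair<ValueId, BlockId>> &allocs) {
  std::set<ValueId> valuesToFree;
  std::set<std::pair<ValueId, BlockId>> visited;
  // Seed the worklist with every allocation and its placement block.
  std::vector<std::pair<ValueId, BlockId>> toProcess(allocs.begin(),
                                                     allocs.end());
  while (!toProcess.empty()) {
    auto [source, definingBlock] = toProcess.back();
    toProcess.pop_back();
    auto it = cfg.aliases.find(source);
    if (it == cfg.aliases.end())
      continue;
    for (ValueId value : it->second) {
      if (valuesToFree.count(value) > 0)
        continue;
      BlockId valueBlock = cfg.parentBlock.at(value);
      if (!cfg.dominates(definingBlock, valueBlock)) {
        // Not dominated: the alias needs its own buffer. Continue the
        // search from the alias, now anchored at its own block.
        valuesToFree.insert(value);
        toProcess.emplace_back(value, valueBlock);
      } else if (visited.insert({value, definingBlock}).second) {
        // Dominated: keep following the alias chain with the same block.
        toProcess.emplace_back(value, definingBlock);
      }
    }
  }
  return valuesToFree;
}
```

In a diamond-shaped CFG (entry block 0, branches 1 and 2, merge block 3), a buffer allocated in block 1 that flows into a block argument of block 3 is reported, while a buffer allocated in block 0 is not, because block 0 dominates block 3.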

mlir/test/Transforms/buffer-placement.mlir

	... (710 unchanged lines not shown) ...
	}			}
	// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[RESULT:.*]]: memref<5xf32>)			// CHECK: (%[[ARG0:.*]]: memref<5xf32>, %[[ARG1:.*]]: memref<10xf32>, %[[RESULT:.*]]: memref<5xf32>)
	// CHECK: %[[X:.*]] = alloc()			// CHECK: %[[X:.*]] = alloc()
	// CHECK: %[[Y:.*]] = alloc()			// CHECK: %[[Y:.*]] = alloc()
	// CHECK: linalg.copy			// CHECK: linalg.copy
	// CHECK: dealloc %[[Y]]			// CHECK: dealloc %[[Y]]
	// CHECK: return %[[ARG1]], %[[X]]			// CHECK: return %[[ARG1]], %[[X]]

				// -----

				// Test Case: nested region control flow
				// The alloc position of %1 does not need to be changed and flows through
				// both if branches until it is finally returned. Hence, it does not
				// require a specific dealloc operation. However, %3 requires a dealloc.

				// CHECK-LABEL: func @nested_region_control_flow
				func @nested_region_control_flow(
				%arg0 : index,
				%arg1 : index) -> memref<?x?xf32> {
				%0 = cmpi "eq", %arg0, %arg1 : index
				%1 = alloc(%arg0, %arg0) : memref<?x?xf32>
				%2 = scf.if %0 -> (memref<?x?xf32>) {
				scf.yield %1 : memref<?x?xf32>
				} else {
				%3 = alloc(%arg0, %arg1) : memref<?x?xf32>
				scf.yield %1 : memref<?x?xf32>
				}
				return %2 : memref<?x?xf32>
				}

				// CHECK: %[[ALLOC0:.*]] = alloc(%arg0, %arg0)
				// CHECK-NEXT: %[[ALLOC1:.*]] = scf.if
				// CHECK: scf.yield %[[ALLOC0]]
				// CHECK: %[[ALLOC2:.*]] = alloc(%arg0, %arg1)
				// CHECK-NEXT: dealloc %[[ALLOC2]]
				// CHECK-NEXT: scf.yield %[[ALLOC0]]
				// CHECK: return %[[ALLOC1]]

				// -----

				// Test Case: nested region control flow with a nested buffer allocation in a
				// divergent branch.
				// The alloc positions of %1 and %3 do not need to be changed since
				// BufferPlacement does not move allocs out of nested regions at the moment.
				// However, since %3 is allocated and "returned" in a divergent branch, we have
				// to allocate a temporary buffer (like in condBranchDynamicTypeNested).

				// CHECK-LABEL: func @nested_region_control_flow_div
				func @nested_region_control_flow_div(
				%arg0 : index,
				%arg1 : index) -> memref<?x?xf32> {
				%0 = cmpi "eq", %arg0, %arg1 : index
				%1 = alloc(%arg0, %arg0) : memref<?x?xf32>
				%2 = scf.if %0 -> (memref<?x?xf32>) {
				scf.yield %1 : memref<?x?xf32>
				} else {
				%3 = alloc(%arg0, %arg1) : memref<?x?xf32>
				scf.yield %3 : memref<?x?xf32>
				}
				return %2 : memref<?x?xf32>
				}

				// CHECK: %[[ALLOC0:.*]] = alloc(%arg0, %arg0)
				// CHECK-NEXT: %[[ALLOC1:.*]] = scf.if
				// CHECK: %[[ALLOC2:.*]] = alloc
				// CHECK-NEXT: linalg.copy(%[[ALLOC0]], %[[ALLOC2]])
				// CHECK: scf.yield %[[ALLOC2]]
				// CHECK: %[[ALLOC3:.*]] = alloc(%arg0, %arg1)
				// CHECK: %[[ALLOC4:.*]] = alloc
				herhut: There should be a `dealloc` here, right?
				// CHECK-NEXT: linalg.copy(%[[ALLOC3]], %[[ALLOC4]])
				// CHECK: dealloc %[[ALLOC3]]
				// CHECK: scf.yield %[[ALLOC4]]
				// CHECK: dealloc %[[ALLOC0]]
				// CHECK-NEXT: return %[[ALLOC1]]

				// -----

				// Test Case: deeply nested region control flow with a nested buffer allocation
				// in a divergent branch.
				// The alloc positions of %1, %4 and %5 do not need to be changed since
				// BufferPlacement does not move allocs out of nested regions at the moment.
				// However, since %4 is allocated and "returned" in a divergent branch, we have
				// to allocate several temporary buffers (like in condBranchDynamicTypeNested).

				// CHECK-LABEL: func @nested_region_control_flow_div_nested
				func @nested_region_control_flow_div_nested(
				%arg0 : index,
				%arg1 : index) -> memref<?x?xf32> {
				%0 = cmpi "eq", %arg0, %arg1 : index
				%1 = alloc(%arg0, %arg0) : memref<?x?xf32>
				%2 = scf.if %0 -> (memref<?x?xf32>) {
				%3 = scf.if %0 -> (memref<?x?xf32>) {
				scf.yield %1 : memref<?x?xf32>
				} else {
				%4 = alloc(%arg0, %arg1) : memref<?x?xf32>
				scf.yield %4 : memref<?x?xf32>
				}
				scf.yield %3 : memref<?x?xf32>
				} else {
				%5 = alloc(%arg1, %arg1) : memref<?x?xf32>
				scf.yield %5 : memref<?x?xf32>
				}
				return %2 : memref<?x?xf32>
				}
				// CHECK: %[[ALLOC0:.*]] = alloc(%arg0, %arg0)
				// CHECK-NEXT: %[[ALLOC1:.*]] = scf.if
				// CHECK-NEXT: %[[ALLOC2:.*]] = scf.if
				// CHECK: %[[ALLOC3:.*]] = alloc
				// CHECK-NEXT: linalg.copy(%[[ALLOC0]], %[[ALLOC3]])
				// CHECK: scf.yield %[[ALLOC3]]
				// CHECK: %[[ALLOC4:.*]] = alloc(%arg0, %arg1)
				// CHECK: %[[ALLOC5:.*]] = alloc
				// CHECK-NEXT: linalg.copy(%[[ALLOC4]], %[[ALLOC5]])
				// CHECK: dealloc %[[ALLOC4]]
				// CHECK: scf.yield %[[ALLOC5]]
				// CHECK: %[[ALLOC6:.*]] = alloc
				// CHECK-NEXT: linalg.copy(%[[ALLOC2]], %[[ALLOC6]])
				// CHECK: dealloc %[[ALLOC2]]
				// CHECK: scf.yield %[[ALLOC6]]
				// CHECK: %[[ALLOC7:.*]] = alloc(%arg1, %arg1)
				// CHECK: %[[ALLOC8:.*]] = alloc
				// CHECK-NEXT: linalg.copy(%[[ALLOC7]], %[[ALLOC8]])
				// CHECK: dealloc %[[ALLOC7]]
				// CHECK: scf.yield %[[ALLOC8]]
				// CHECK: dealloc %[[ALLOC0]]
				// CHECK-NEXT: return %[[ALLOC1]]
				No newline at end of file
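
The test comments above rely on the rule that BufferPlacement does not move allocs out of nested regions, which `getInitialAllocBlock` enforces by falling back to the front block of the owner's parent region. A minimal standalone sketch of that clamping rule follows. It assumes a toy model in which blocks carry a region id; `ToyBlock`, `ToyFunction`, and `clampToParentRegion` are hypothetical names and not part of MLIR.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model: each block only records the region it belongs to.
struct ToyBlock {
  int region;
};

struct ToyFunction {
  std::vector<ToyBlock> blocks;
  // regionFrontBlock[r] is the index of the first block of region r.
  std::vector<int> regionFrontBlock;
};

// Returns the block in which the alloc may be placed. If the computed
// dominator block lies in a different region than the alloc's owner, the
// alloc stays local and is placed in the front block of its own region.
int clampToParentRegion(const ToyFunction &fn, int dominatorBlock,
                        int ownerBlock) {
  const int numBlocks = static_cast<int>(fn.blocks.size());
  assert(dominatorBlock >= 0 && dominatorBlock < numBlocks);
  assert(ownerBlock >= 0 && ownerBlock < numBlocks);
  int ownerRegion = fn.blocks[ownerBlock].region;
  if (fn.blocks[dominatorBlock].region != ownerRegion)
    return fn.regionFrontBlock[ownerRegion];
  return dominatorBlock;
}
```

For the `else` branch in `@nested_region_control_flow_div`, this corresponds to keeping the nested `alloc(%arg0, %arg1)` inside the `scf.if` region even if a dominator block in the enclosing function body would otherwise be chosen.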