This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
mlir/
-
include/mlir/Transforms/
-
mlir/
-
Transforms/
-
BufferUtils.h
-
Passes.h
-
lib/Transforms/
-
Transforms/
2
BufferDeallocation.cpp

Differential D92562

[MLIR] Refactored BufferDeallocation pass to enable support for individual alloc/clone/free operations.
AbandonedPublic

Authored by dfki-mako on Dec 3 2020, 3:01 AM.

Download Raw Diff

Details

Reviewers

herhut
silvas
dfki-jugr
mehdi_amini

Summary

This CL refactors the BufferDeallocation pass to offer the opportunity to use custom alloc/clone/free operations. This overcomes the current limitations of emitting hard coded std::alloc, linalg::copy and std::dealloc operations.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dfki-mako created this revision.Dec 3 2020, 3:01 AM

Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2020, 3:01 AM

Herald added subscribers: teijeong, rdzhabarov, tatianashp and 14 others. · View Herald Transcript

dfki-mako requested review of this revision.Dec 3 2020, 3:01 AM

Herald added subscribers: limo1996, stephenneuendorffer, nicolasvasilache. · View Herald TranscriptDec 3 2020, 3:01 AM

dfki-mako added reviewers: herhut, silvas, dfki-jugr.Dec 3 2020, 3:02 AM

Harbormaster completed remote builds in B80935: Diff 309205.Dec 3 2020, 3:20 AM

Can we please finish the discussion in https://llvm.discourse.group/t/remove-tight-coupling-of-the-bufferdeallocation-pass-to-std-and-linalg-operations/2162/12 before moving forward with this? It's not clear to me that we all agreed that this was the right direction.

I disagree with @silvas on this. This particular change does not take any opinion on the discussion you reference. It merely makes the hard-wired current behavior configurable by choosing different ops. It does not mandate how the decision which free operation to use is done.

This unblocks users that do not want to use std.free or linalg.copy.

mlir/lib/Transforms/BufferDeallocation.cpp
505	Taking a reference of the contents of a shared pointer is dangerous, as the reference might outlive the lifetime of the shared pointer's content.
532	I just filed a bug for this that you can also reference here. https://bugs.llvm.org/show_bug.cgi?id=48385

In D92562#2433473, @herhut wrote:

I disagree with @silvas on this. This particular change does not take any opinion on the discussion you reference. It merely makes the hard-wired current behavior configurable by choosing different ops. It does not mandate how the decision which free operation to use is done.

To be clear, whether one should be able to customize which free operation is used and how that is done is specifically what is being discussed in that thread.

This change is literally implementing code to solve the problem under discussion in that thread, without taking into consideration any of the design considerations brought up in that thread. It's not that this patch "does not take any opinion" -- this patch does nothing other than take an opinion on that discussion.

This unblocks users that do not want to use std.free or linalg.copy.

The discussion in that thread is specifically about how we want to address this user need. Why would we move forward with this until we decide that this is the way we want to address that user need? For example, I don't see the advantage of this approach vs just creating "free" and "clone" ops and letting downstream users convert those to whatever they choose. This is best covered in that discussion thread.

As a general matter of policy in LLVM, we do not bypass ongoing design discussions to unblock downstream users.

As a general matter of policy in LLVM, we do not bypass ongoing design discussions to unblock downstream users.

Just a note. It's fair to say that they should have pinged the thread one more time, but that thread had a discussion, where dfki and herhut responded to the debate and then received no response for 17 days. That's not really an ongoing discussion at that point.

In D92562#2435368, @tpopp wrote:

As a general matter of policy in LLVM, we do not bypass ongoing design discussions to unblock downstream users.

Just a note. It's fair to say that they should have pinged the thread one more time, but that thread had a discussion, where dfki and herhut responded to the debate and then received no response for 17 days. That's not really an ongoing discussion at that point.

We can call it an "abandoned discussion", but that doesn't make it much better. No response doesn't mean "approved". I would expect an attempt to revive the discussion, rather than bypass it.

Also, some of the later posts were unrelated to my question, which is "why not lower through std.copy, possibly with an attribute for the resources". The last response related to that was from @herhut

If we can agree to have a std.copy, then using allocation resources seems a good way to model this. We could then go ahead and have a reference counted resource.

I interpreted this to mean that we wanted to pursue a direction of std.copy with resource attributes. I support that. I apologize that I did not follow up with a reply to that. I've made a reply in the thread.

@silvas Our intention was not to "bypass anything", sorry for the confusion. As you have mentioned, the discussion went a bit "idle" and we had to more forwards to simplify the integration of the BufferDeallocation pass into other third-party sub projects. Therefore, we proposed this CL that realizes an extremely light-weight version of the Pass Interface being discussed in the forums. However, the proposed solution in the scope of this CL is meant to unblock others on the one hand and being minimally invasive on the other hand. In other words: as soon as we can come up with a better solution based on the Discourse discussion, we can easily "revert" or "update" this pass interface. Please note further that existing code will not break or require further updates after merging this CL.

dfki-mako added a reviewer: mehdi_amini.Dec 7 2020, 2:42 AM

In D92562#2436641, @dfki-mako wrote:

@silvas Our intention was not to "bypass anything", sorry for the confusion. As you have mentioned, the discussion went a bit "idle" and we had to more forwards to simplify the integration of the BufferDeallocation pass into other third-party sub projects. Therefore, we proposed this CL that realizes an extremely light-weight version of the Pass Interface being discussed in the forums. However, the proposed solution in the scope of this CL is meant to unblock others on the one hand and being minimally invasive on the other hand. In other words: as soon as we can come up with a better solution based on the Discourse discussion, we can easily "revert" or "update" this pass interface. Please note further that existing code will not break or require further updates after merging this CL.

Typically, once downstream projects have something that unblocks them, they lose interest in contributing to a long-term solution. That is why in LLVM we tend to avoid "unblocking" downstream projects and insist on more up-front discussion that optimizes long-term community maintenance costs. Also, typically the party implementing the "better solution" ends up being different from the party that was "unblocked", which is unfair from a community maintenance perspective.

A recent example of this is my push to rationalize the bufferization infrastructure (as presented recently at the ODM). I spent about a month of time on that, and a large amount of it was undoing "minimally invasive" solutions that were added without significant discussion. For example, the "minimally invasive" features in https://reviews.llvm.org/D85133 ended up being a significant amount of that time -- I probably spent more time undoing those features than was involved in adding them in the first place. Also, to make things worse, neither of those features were still being used by downstream projects (yet I was still pushed by various people to generalize/rationalize them...), so they should not have been added upstream in the first place, and I think that a bit more up-front discussion would have revealed that (the best code is no code!).

First, the work @silvas did on improving how bufferization works in MLIR core brought a great cleanup with it and the outcome is much nicer than what we had before. Thank you for investing your time and thought on it. A lot of what got cleaned up had to do with how operations get bufferized and how we can do this partially and incrementally, a functionality that we did not have before and that we are already heavily using now. Awesome!

I have heard the argument that LLVM does not accept contributions until there is a fully vetted design before and I have not worked long enough in llvm to judge this. However, applying such a high bar to a code base as immature as MLIR means that we cannot evolve the system incrementally, which would hinder progress. There are many solutions in MLIR that are stop-gaps to unblock things while the full solution does not exist, yet. The modelling of side-effects went through this, the modelling of resources is still in that state, alias and dependence analysis, dominance on regions. There are many places where we have incremental solutions. Where I agree is that we should have a vision of where we want to get so that we do not build solutions that paint us into a corner.

Lets bring this discussion back to the change at hands here. In the discussion you referenced, the question is how we will handle different copy and free operations depending on how a memref was allocated. The goal in that discussion is to find a way to properly model mixed settings that use different kinds of allocation beyond the current support of alloc vs. alloca that is hard coded. This change will not enable that nor does it change the situation wrt. how this is currently modeled.

What this change does is making the used free and copy operations configurable, so that this can be used independently of linalg.copy and std.free. A general solution to the problem will also solve this small issue but solving this small issue will not make the general problem harder to solve, nor will it make a general solution any less desirable.

Therefore, I think that this change is a valid incremental step that brings us slightly closer to what we need without impacting the choices we have nor paint us in a corner. And that makes it good enough for me to land this approach.

Totally agreed that MLIR is in a youthful stage. I think it is a valuable discussion to be had explicitly with the community about how we approach the tradeoffs in adopting more incremental code vs more vetted designs. Getting clarity there would greatly simplify this discussion!

What this change does is making the used free and copy operations configurable, so that this can be used independently of linalg.copy and std.free. A general solution to the problem will also solve this small issue but solving this small issue will not make the general problem harder to solve, nor will it make a general solution any less desirable.

It isn't obvious to me that solving this small issue will not make the general problem harder to solve. My main concern, as I discussed above, is that it mean that somebody eventually implementing the more general functionality has to first figure out how to "undo" the solution to the "small issue" first. I'l give a concrete example:

Having this customizable with a callback will allow users to potentially start doing all sorts of random stuff inside. What if a user starts depending on that? For example, what if the callback references some side data structure and generates different clone/free ops depending on that, possibly in a less structured way than we would like to support upstream. That suddenly becomes much more difficult to unwind. Of course, implementing that callback is likely to be done in a downstream project, where doing something like this as a shortcut might seem more acceptable. But that shortcut then becomes a cost that the MLIR community needs to pay to unwind!

As evidenced in my recent refactorings, even functionality that ends up being functionally *unused* in downstream projects ends up being a significant maintenance cost that needs to be paid by the MLIR community.

+1 to @silvas here, we have to be careful with the balance in general.

In this case I still wonder why do we need to have customization of the alloc/free for a given type? I asked in the thread on Discourse but I don't think I really got an answer (or at least it is still unclear to me)

In D92562#2443647, @silvas wrote:

Having this customizable with a callback will allow users to potentially start doing all sorts of random stuff inside. What if a user starts depending on that? For example, what if the callback references some side data structure and generates different clone/free ops depending on that, possibly in a less structured way than we would like to support upstream. That suddenly becomes much more difficult to unwind. Of course, implementing that callback is likely to be done in a downstream project, where doing something like this as a shortcut might seem more acceptable. But that shortcut then becomes a cost that the MLIR community needs to pay to unwind!

We could, for example, explicitly state that this is not allowed and that the call back should always create the same operations. Then we have a contract of what we want to support and a clear mandate to break downstream that does not follow the contract. Would that approach work for you?

In D92562#2444456, @mehdi_amini wrote:

In this case I still wonder why do we need to have customization of the alloc/free for a given type? I asked in the thread on Discourse but I don't think I really got an answer (or at least it is still unclear to me)

This does not allow to configure the alloc operation, at all. Allocs are created by bufferization patterns and the authors of those currently have a free choice. Allowing to configure the used free operation is useful when this deallocation pass is used with other alloc operations (which it already supports via interfaces). Configuring copy operations is useful when the user of this deallocation pass has knowledge about how memrefs should be copied other than allocating + linalg.copy. We do not even know what the correct alloc operation is to insert there.

The issue with free can be solved by first inserting frees and then rewriting them, which is what we would do today. The issue with copy is more complex, as it currently does not introduce a single operation. To enable the latter, we have discussed to always insert a copy operation and then users can rewrite them at will. I think there is a forthcoming post from @dfki-mako about that. The idea essentially would be to have a bufferize dialect, that contains copy, cast and maybe free operations that can have semantics that are tailored to the needs of bufferization.

In D92562#2444990, @herhut wrote:

In D92562#2443647, @silvas wrote:

Having this customizable with a callback will allow users to potentially start doing all sorts of random stuff inside. What if a user starts depending on that? For example, what if the callback references some side data structure and generates different clone/free ops depending on that, possibly in a less structured way than we would like to support upstream. That suddenly becomes much more difficult to unwind. Of course, implementing that callback is likely to be done in a downstream project, where doing something like this as a shortcut might seem more acceptable. But that shortcut then becomes a cost that the MLIR community needs to pay to unwind!

We could, for example, explicitly state that this is not allowed and that the call back should always create the same operations. Then we have a contract of what we want to support and a clear mandate to break downstream that does not follow the contract. Would that approach work for you?

That seems to be functionally equivalent to my approach of having memref.clone(%0) {tag = "mytag"} and passing in "mytag" to this pass, but with the disadvantage that users might ignore the advice and misuse the callbacks (Hyrum's law). Also, I'm more worried about the things that I can't foresee. That is just one example. Another would be users calling getDefiningOp on the Value and making decisions based on that. Another would be users looking at the Value and seeing what function they are in to determine which clone/free to use. I'm quite sure there's a long list of possible other issues that we cannot foresee.

In D92562#2444456, @mehdi_amini wrote:

In this case I still wonder why do we need to have customization of the alloc/free for a given type? I asked in the thread on Discourse but I don't think I really got an answer (or at least it is still unclear to me)

This does not allow to configure the alloc operation, at all. Allocs are created by bufferization patterns and the authors of those currently have a free choice. Allowing to configure the used free operation is useful when this deallocation pass is used with other alloc operations (which it already supports via interfaces). Configuring copy operations is useful when the user of this deallocation pass has knowledge about how memrefs should be copied other than allocating + linalg.copy. We do not even know what the correct alloc operation is to insert there.

The issue with free can be solved by first inserting frees and then rewriting them, which is what we would do today. The issue with copy is more complex, as it currently does not introduce a single operation. To enable the latter, we have discussed to always insert a copy operation and then users can rewrite them at will. I think there is a forthcoming post from @dfki-mako about that. The idea essentially would be to have a bufferize dialect, that contains copy, cast and maybe free operations that can have semantics that are tailored to the needs of bufferization.

The current approach assumes that all memrefs that this pass will attempt to free can be freed/cloned by the ops produced by these callbacks. That seems incredibly fragile, considering what you said about bufferization patterns being able to freely choose their allocation operation -- I think that's an incredibly strong argument that we need a more general solution, probably something that rationalizes the invariants needed by this pass with what bufferization patterns are allowed to do. It seems that we would need to require all allocating ops to specify what resource/tag they allocate from so we can know how to free it, and then do some dataflow analysis to propagate that information, and fail BufferDeallocation if we cannot figure out which clone/free is legal to insert.

Maybe we can take a step back, talk concretely about the problem that we are actually trying to solve (on discourse), rather than some abstract goal of making this pass more generic. I would especially like to understand how we guarantee the invariant stated above that all allocations can be validly freed by the ops produced by these callbacks.

In D92562#2446427, @silvas wrote:

In D92562#2444990, @herhut wrote:

The issue with free can be solved by first inserting frees and then rewriting them, which is what we would do today. The issue with copy is more complex, as it currently does not introduce a single operation. To enable the latter, we have discussed to always insert a copy operation and then users can rewrite them at will. I think there is a forthcoming post from @dfki-mako about that. The idea essentially would be to have a bufferize dialect, that contains copy, cast and maybe free operations that can have semantics that are tailored to the needs of bufferization.

The current approach assumes that all memrefs that this pass will attempt to free can be freed/cloned by the ops produced by these callbacks. That seems incredibly fragile, considering what you said about bufferization patterns being able to freely choose their allocation operation -- I think that's an incredibly strong argument that we need a more general solution, probably something that rationalizes the invariants needed by this pass with what bufferization patterns are allowed to do. It seems that we would need to require all allocating ops to specify what resource/tag they allocate from so we can know how to free it, and then do some dataflow analysis to propagate that information, and fail BufferDeallocation if we cannot figure out which clone/free is legal to insert.

The current approach always uses free for operations that have an allocation effect on the default resource. It already traces values to their allocation. As there basically is only one allocation resource, conflict resolution is not yet implemented.

Maybe we can take a step back, talk concretely about the problem that we are actually trying to solve (on discourse), rather than some abstract goal of making this pass more generic. I would especially like to understand how we guarantee the invariant stated above that all allocations can be validly freed by the ops produced by these callbacks.

It does not change the status quo, which is that we cannot guarantee that. Currently it inserts the free operation from the standard dialect, as the interfaces are not rich enough to tell. Also see the stale discussion on discourse around that topic for reference.

@silvas @mehdi_amini @herhut based on the feedback and the discussion in the public Discourse forums, we will close this PR for now until there is a better solution to overcome these limitations in the future.

Revision Contents

Path

Size

mlir/

include/

mlir/

Transforms/

BufferUtils.h

21 lines

Passes.h

10 lines

lib/

Transforms/

BufferDeallocation.cpp

107 lines

Diff 309205

mlir/include/mlir/Transforms/BufferUtils.h

	Show All 19 Lines
	#include "mlir/IR/Builders.h"			#include "mlir/IR/Builders.h"
	#include "mlir/IR/Dominance.h"			#include "mlir/IR/Dominance.h"
	#include "mlir/IR/Function.h"			#include "mlir/IR/Function.h"
	#include "mlir/IR/Operation.h"			#include "mlir/IR/Operation.h"
	#include "mlir/Transforms/DialectConversion.h"			#include "mlir/Transforms/DialectConversion.h"

	namespace mlir {			namespace mlir {

				/// A configuration structure for the BufferDeallocation pass. It allows to
				/// define custom operations when allocating/cloning/deallocating buffers.
				struct BufferDeallocationConfig {
				virtual ~BufferDeallocationConfig() = default;

				/// Creates a default BufferDeallocationConfig that uses std::alloc and
				/// linalg::copy to clone buffers and std::delloc to free buffers.
				static std::shared_ptr<BufferDeallocationConfig> createDefault();

				/// Clones the given source value into a (potentially) new buffer. This allows
				/// to define custom create/copy operations for all allocations managed by
				/// the BufferDeallocation pass.
				virtual Value clone(OpBuilder &builder, Location location,
				Value source) const = 0;

				/// Frees the given source buffer. This allows to define custom free
				/// operations for all allocations managed by the BufferDeallocation pass.
				virtual void free(OpBuilder &builder, Location location,
				Value source) const = 0;
				};

	/// A simple analysis that detects allocation operations.			/// A simple analysis that detects allocation operations.
	class BufferPlacementAllocs {			class BufferPlacementAllocs {
	public:			public:
	/// Represents a tuple of allocValue and deallocOperation.			/// Represents a tuple of allocValue and deallocOperation.
	using AllocEntry = std::tuple<Value, Operation *>;			using AllocEntry = std::tuple<Value, Operation *>;

	/// Represents a list containing all alloc entries.			/// Represents a list containing all alloc entries.
	using AllocEntryList = SmallVector<AllocEntry, 8>;			using AllocEntryList = SmallVector<AllocEntry, 8>;
	▲ Show 20 Lines • Show All 90 Lines • Show Last 20 Lines

mlir/include/mlir/Transforms/Passes.h

	Show All 17 Lines
	#include "mlir/Transforms/LocationSnapshot.h"			#include "mlir/Transforms/LocationSnapshot.h"
	#include "mlir/Transforms/ViewOpGraph.h"			#include "mlir/Transforms/ViewOpGraph.h"
	#include "mlir/Transforms/ViewRegionGraph.h"			#include "mlir/Transforms/ViewRegionGraph.h"
	#include <limits>			#include <limits>

	namespace mlir {			namespace mlir {

	class AffineForOp;			class AffineForOp;
				struct BufferDeallocationConfig;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Passes			// Passes
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	/// Creates an instance of the BufferDeallocation pass to free all allocated			/// Creates an instance of the BufferDeallocation pass to free all allocated
	/// buffers.			/// buffers. This overload uses std::alloc, std::dealloc and linalg::copy to
				/// create, clone and free buffers.
	std::unique_ptr<Pass> createBufferDeallocationPass();			std::unique_ptr<Pass> createBufferDeallocationPass();

				/// Creates an instance of the BufferDeallocation pass to free all allocated
				/// buffers. This overload uses the given configruation to allocate/clone and
				/// free buffers.
				std::unique_ptr<Pass>
				createBufferDeallocationPass(std::shared_ptr<BufferDeallocationConfig> config);

	/// Creates a pass that moves allocations upwards to reduce the number of			/// Creates a pass that moves allocations upwards to reduce the number of
	/// required copies that are inserted during the BufferDeallocation pass.			/// required copies that are inserted during the BufferDeallocation pass.
	std::unique_ptr<Pass> createBufferHoistingPass();			std::unique_ptr<Pass> createBufferHoistingPass();

	/// Creates a pass that moves allocations upwards out of loops. This avoids			/// Creates a pass that moves allocations upwards out of loops. This avoids
	/// reallocations inside of loops.			/// reallocations inside of loops.
	std::unique_ptr<Pass> createBufferLoopHoistingPass();			std::unique_ptr<Pass> createBufferLoopHoistingPass();

	▲ Show 20 Lines • Show All 91 Lines • Show Last 20 Lines

mlir/lib/Transforms/BufferDeallocation.cpp

Show First 20 Lines • Show All 157 Lines • ▼ Show 20 Lines
// BufferDeallocation		// BufferDeallocation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// The buffer deallocation transformation which ensures that all allocs in the		/// The buffer deallocation transformation which ensures that all allocs in the
/// program have a corresponding de-allocation. As a side-effect, it might also		/// program have a corresponding de-allocation. As a side-effect, it might also
/// introduce copies that in turn leads to additional allocs and de-allocations.		/// introduce copies that in turn leads to additional allocs and de-allocations.
class BufferDeallocation : BufferPlacementTransformationBase {		class BufferDeallocation : BufferPlacementTransformationBase {
public:		public:
BufferDeallocation(Operation *op)		BufferDeallocation(Operation *op, const BufferDeallocationConfig &config)
: BufferPlacementTransformationBase(op), dominators(op),		: BufferPlacementTransformationBase(op), config(config), dominators(op),
postDominators(op) {}		postDominators(op) {}

/// Performs the actual placement/creation of all temporary alloc, copy and		/// Performs the actual placement/creation of all temporary alloc, copy and
/// dealloc nodes.		/// dealloc nodes.
void deallocate() {		void deallocate() {
// Add additional allocations and copies that are required.		// Add additional allocations and copies that are required.
introduceCopies();		introduceCopies();
// Place deallocations for all allocation entries.		// Place deallocations for all allocation entries.
▲ Show 20 Lines • Show All 207 Lines • ▼ Show 20 Lines	Value introduceBufferCopy(Value sourceValue, Operation *terminator) {
// another successor that returns to its parent operation. Note: that		// another successor that returns to its parent operation. Note: that
// copying copied buffers can introduce memory leaks since the invariant of		// copying copied buffers can introduce memory leaks since the invariant of
// BufferPlacement assumes that a buffer will be only copied once into a		// BufferPlacement assumes that a buffer will be only copied once into a
// temporary buffer. Hence, the construction of copy chains introduces		// temporary buffer. Hence, the construction of copy chains introduces
// additional allocations that are not tracked automatically by the		// additional allocations that are not tracked automatically by the
// algorithm.		// algorithm.
if (copiedValues.contains(sourceValue))		if (copiedValues.contains(sourceValue))
return sourceValue;		return sourceValue;
// Create a new alloc at the current location of the terminator.		// Create a new clone at the current location of the terminator.
auto memRefType = sourceValue.getType().cast<MemRefType>();
OpBuilder builder(terminator);		OpBuilder builder(terminator);
		Value clone = config.clone(builder, terminator->getLoc(), sourceValue);
// Extract information about dynamically shaped types by
// extracting their dynamic dimensions.
SmallVector<Value, 4> dynamicOperands;
for (auto shapeElement : llvm::enumerate(memRefType.getShape())) {
if (!ShapedType::isDynamic(shapeElement.value()))
continue;
dynamicOperands.push_back(builder.create<DimOp>(
terminator->getLoc(), sourceValue, shapeElement.index()));
}

// TODO: provide a generic interface to create dialect-specific
// Alloc and CopyOp nodes.
auto alloc = builder.create<AllocOp>(terminator->getLoc(), memRefType,
dynamicOperands);

// Create a new copy operation that copies to contents of the old
// allocation to the new one.
builder.create<linalg::CopyOp>(terminator->getLoc(), sourceValue, alloc);

// Remember the copy of original source value.		// Remember the copy of original source value.
copiedValues.insert(alloc);		copiedValues.insert(clone);
return alloc;		return clone;
}		}

/// Finds correct dealloc positions according to the algorithm described at		/// Finds correct dealloc positions according to the algorithm described at
/// the top of the file for all alloc nodes and block arguments that can be		/// the top of the file for all alloc nodes and block arguments that can be
/// handled by this analysis.		/// handled by this analysis.
void placeDeallocs() const {		void placeDeallocs() const {
// Move or insert deallocs using the previously computed information.		// Move or insert deallocs using the previously computed information.
// These deallocations will be linked to their associated allocation nodes		// These deallocations will be linked to their associated allocation nodes
▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines	for (const BufferPlacementAllocs::AllocEntry &entry : allocs) {
} else {		} else {
// If the Dealloc position is at the terminator operation of the		// If the Dealloc position is at the terminator operation of the
// block, then the value should escape from a deallocation.		// block, then the value should escape from a deallocation.
Operation *nextOp = endOperation->getNextNode();		Operation *nextOp = endOperation->getNextNode();
if (!nextOp)		if (!nextOp)
continue;		continue;
// If there is no dealloc node, insert one in the right place.		// If there is no dealloc node, insert one in the right place.
OpBuilder builder(nextOp);		OpBuilder builder(nextOp);
builder.create<DeallocOp>(alloc.getLoc(), alloc);		config.free(builder, alloc.getLoc(), alloc);
}		}
}		}
}		}

		/// The current buffer deallocation configuration to clone/free buffers.
		const BufferDeallocationConfig &config;

/// The dominator info to find the appropriate start operation to move the		/// The dominator info to find the appropriate start operation to move the
/// allocs.		/// allocs.
DominanceInfo dominators;		DominanceInfo dominators;

/// The post dominator info to move the dependent allocs in the right		/// The post dominator info to move the dependent allocs in the right
/// position.		/// position.
PostDominanceInfo postDominators;		PostDominanceInfo postDominators;

/// Stores already copied allocations to avoid additional copies of copies.		/// Stores already copied allocations to avoid additional copies of copies.
ValueSetT copiedValues;		ValueSetT copiedValues;
};		};

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// BufferDeallocationPass		// BufferDeallocationPass
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// The actual buffer deallocation pass that inserts and moves dealloc nodes		/// The actual buffer deallocation pass that inserts and moves dealloc nodes
/// into the right positions. Furthermore, it inserts additional allocs and		/// into the right positions. Furthermore, it inserts additional allocs and
/// copies if necessary. It uses the algorithm described at the top of the file.		/// copies if necessary. It uses the algorithm described at the top of the file.
struct BufferDeallocationPass : BufferDeallocationBase<BufferDeallocationPass> {		struct BufferDeallocationPass : BufferDeallocationBase<BufferDeallocationPass> {
		std::shared_ptr<BufferDeallocationConfig> config;

		BufferDeallocationPass(std::shared_ptr<BufferDeallocationConfig> config)
		: config(config) {}

void runOnFunction() override {		void runOnFunction() override {
// Ensure that there are supported loops only.		// Ensure that there are supported loops only.
Backedges backedges(getFunction());		Backedges backedges(getFunction());
if (backedges.size()) {		if (backedges.size()) {
getFunction().emitError(		getFunction().emitError(
"Structured control-flow loops are supported only.");		"Structured control-flow loops are supported only.");
return;		return;
}		}

// Place all required temporary alloc, copy and dealloc nodes.		// Place all required temporary alloc, copy and dealloc nodes.
BufferDeallocation deallocation(getFunction());		BufferDeallocation deallocation(getFunction(), *config.get());
		herhutUnsubmitted Not Done Reply Inline Actions Taking a reference of the contents of a shared pointer is dangerous, as the reference might outlive the lifetime of the shared pointer's content. herhut: Taking a reference of the contents of a shared pointer is dangerous, as the reference might…
deallocation.deallocate();		deallocation.deallocate();
}		}
};		};

		//===----------------------------------------------------------------------===//
		// DefaultBufferDeallocationConfig
		//===----------------------------------------------------------------------===//

		/// A default implementation of the abstract BufferDeallocationConfig structure
		/// that uses std::alloc, linalg::copy and std::dealloc operations.
		struct DefaultBufferDeallocationConfig : public BufferDeallocationConfig {
		/// Clones the given source value into a new buffer using std::alloc and
		/// linalg::copy.
		Value clone(OpBuilder &builder, Location location,
		Value source) const override {
		// Extract information about dynamically shaped types by extracting their
		// dynamic dimensions.
		auto memRefType = source.getType().cast<MemRefType>();
		SmallVector<Value, 4> dynamicOperands;
		for (auto shapeElement : llvm::enumerate(memRefType.getShape())) {
		if (!ShapedType::isDynamic(shapeElement.value()))
		continue;
		dynamicOperands.push_back(
		builder.create<DimOp>(location, source, shapeElement.index()));
		}

		// TODO: Once we have a single "std::clone"-like operation that can express
		herhutUnsubmitted Not Done Reply Inline Actions I just filed a bug for this that you can also reference here. https://bugs.llvm.org/show_bug.cgi?id=48385 herhut: I just filed a bug for this that you can also reference here. https://bugs.llvm.org/show_bug.
		// an allocation and a copy in one single operation, the following piece
		// of code needs to be adapted accordingly. This enables support for
		// reference-counted allocations, for instance.

		// Create a new std::alloc in this default implementation.
		auto alloc = builder.create<AllocOp>(location, memRefType, dynamicOperands);

		// Create a new linalg copy operation that copies to contents of the old
		// allocation to the new one.
		builder.create<linalg::CopyOp>(location, source, alloc);

		// Return the temporary allocation.
		return alloc;
		}

		/// Frees the given source buffer by emitting an std::dealloc.
		void free(OpBuilder &builder, Location location,
		Value source) const override {
		// Emit a default std::dealloc to free the source buffer.
		builder.create<DeallocOp>(location, source);
		};
		};

} // end anonymous namespace		} // end anonymous namespace

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
		// BufferDeallocationConfig
		//===----------------------------------------------------------------------===//

		/// Creates and returns a new instance of the DefaultBufferDeallocationConfig
		/// structure.
		std::shared_ptr<mlir::BufferDeallocationConfig>
		mlir::BufferDeallocationConfig::createDefault() {
		return std::make_shared<DefaultBufferDeallocationConfig>();
		}

		//===----------------------------------------------------------------------===//
// BufferDeallocationPass construction		// BufferDeallocationPass construction
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

std::unique_ptr<Pass> mlir::createBufferDeallocationPass() {		std::unique_ptr<Pass> mlir::createBufferDeallocationPass() {
return std::make_unique<BufferDeallocationPass>();		return createBufferDeallocationPass(
		BufferDeallocationConfig::createDefault());
		}

		std::unique_ptr<Pass> mlir::createBufferDeallocationPass(
		std::shared_ptr<BufferDeallocationConfig> config) {
		return std::make_unique<BufferDeallocationPass>(config);
}		}