This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

mlir/include/mlir/Dialect/Linalg/Transforms/ComprehensiveBufferize.h
mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp (36/37)
mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir (2/3)

Differential D111287

[mlir][linalg][bufferize] Rewrite RaW conflict detection
ClosedPublic

Authored by springerm on Oct 6 2021, 11:03 PM.


Details

Reviewers

nicolasvasilache

Commits

rGd3cb6bf2d462: [mlir][linalg][bufferize] Rewrite conflict detection

Summary

For each memory read, follow SSA use-def chains to find the op that produces the data being read (i.e., the most recent write). A memory write to an alias is a conflict if it takes place after the "most recent write" but before the read.

This CL introduces two main changes:

There is a concise definition of a conflict. Given a piece of IR with InPlaceSpec annotations and a computed alias set, it is easy to compute whether this program has a conflict. No need to consider multiple cases such as "read of operand after in-place write" etc.
No need to check for clobbering.
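
The conflict definition above can be sketched on a toy straight-line program. This is not the MLIR implementation: `ToyOp`, `findLastWrite`, and `hasConflict` are hypothetical names, ops are identified by their position in a single block, every op has at most one aliasing operand, and all values are assumed to alias the same buffer.

```cpp
#include <cassert>
#include <vector>

// Toy straight-line IR (an assumption for illustration): op i produces
// value i. `operand` is the index of the value it aliases, or -1 if there is
// none (e.g. a block argument or an allocation).
struct ToyOp {
  int operand; // aliasing operand, or -1
  bool writes; // does this op write to the shared buffer in place?
};

// Follow the use-def chain backwards from `value` to the most recent write.
// The end of the chain is conservatively treated as a write.
int findLastWrite(const std::vector<ToyOp> &ops, int value) {
  while (!ops[value].writes && ops[value].operand >= 0)
    value = ops[value].operand;
  return value;
}

// The op at position `reader` reads `readValue`. There is a conflict if some
// in-place write is positioned after the last write and before the reader.
bool hasConflict(const std::vector<ToyOp> &ops, int reader, int readValue) {
  int lastWrite = findLastWrite(ops, readValue);
  for (int w = 0; w < static_cast<int>(ops.size()); ++w)
    if (ops[w].writes && w > lastWrite && w < reader)
      return true;
  return false;
}
```

With ops `{-1, true}, {0, false}, {0, true}, {1, false}`, a read of value 1 by op 3 conflicts because op 2 writes between the last write (op 0) and the read, whereas a read of value 2 by op 3 does not.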

Depends On D111040

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

springerm created this revision. Oct 6 2021, 11:03 PM

Herald added subscribers: wenzhicui, wrengr, Chia-hungDuan and 20 others. Oct 6 2021, 11:03 PM

springerm requested review of this revision. Oct 6 2021, 11:03 PM

Herald added a project: Restricted Project. Oct 6 2021, 11:03 PM

Herald added subscribers: limo1996, stephenneuendorffer.

springerm retitled this revision from [mlir][linalg][bufferize] Rewrite conflict detection to [mlir][linalg][bufferize] Rewrite RaW conflict detection. Oct 6 2021, 11:14 PM

Harbormaster completed remote builds in B127447: Diff 377749. Oct 6 2021, 11:18 PM

Very nice improvement!

Please capture in the commit message and where appropriate in doc comments something like:

this version replaces alias-based analysis of clobbering with true last write analysis based on SSA use-def chains
the information captured is strictly better

I still need to check the new impl of wouldCreateReadAfterWriteInterference in more detail, the rest looks good modulo the comments I made.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
587	Turn this into a warning and degrade gracefully? Is this part of this CL?
924	There are TODOs that getAliasingOpOperand should return a list of operands. With the SSA-based analysis here, I think we need to tie this loose end and support the list of operands and traverse them all. For now I would just evolve the API and fail this analysis if I wonder whether there be considerations related to equivalence in the analysis otherwise the last write may be partial and this could be another sack of knots. OTOH equivalence classes are not yet formed as you go up the use-def chain. I think this should be using getInplaceableOpResult on each opOperand considered to check this and the last write needs to be a full write (and in the future it could be a write to a bigger region that fully covers)
994	s/"a new conflict can **only** be introduced" etc?
1024	do we want to give up on all the debug messages ?
1045–1046	"Would inplace bufferization of `op` create a conflict?"
mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir
367	very nice!
743	very nice and catching a wrong assumption: this example did not exhibit the need for intersection analysis because the SSA use-def chains already capture the info we want.
869	Mention the intersection / copy a little more of the other comment here ?

rebase

Harbormaster completed remote builds in B127505: Diff 377824. Oct 7 2021, 6:33 AM

rebase

springerm marked an inline comment as done. Oct 7 2021, 7:35 AM

springerm added inline comments.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
587	I think this can stay as is. The CallOpInterface case was missing.
587–588	I missed this before sending out for review. Have to see what's going on here...

improve handling of uWrite

Harbormaster completed remote builds in B127524: Diff 377849. Oct 7 2021, 8:04 AM

nicolasvasilache added a subscriber: jpienaar. Oct 7 2021, 11:56 AM

nicolasvasilache added inline comments.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
587	seems CallOpInterface addition should be split out in its own CL
1062	findLastPrecedingWrite? Reverse use-def chain is an impl detail.
1082	As we had discussed, not being in between the ops is not good enough in the non-straight-line case. If either readingOp dominates conflictingWritingOp, or conflictingWritingOp dominates lastWrite, then continue; otherwise no.
1087	I do not understand this case. Can you elaborate?
1092	Can you also keep terminology like "no self-conflict" or a "use cannot conflict with itself" ?
1100	I would add the inplace notation to the production of %1 and %2 (and would drop this line). %0 is not yet inplace, we need to determine if inplace creates a conflict. E.g.
// %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1]. // can this bufferize inplace ?
// %1 = linalg.fill %cst, %0 // bufferizes inplace
// %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1]. // bufferizes inplace
1103	mention that uConflictingWrite is "the use of %0 in linalg.fill"
1106	Hmm I had not realized you were also using SSA use-def chain for finding the insertSliceOp. I'll need to think about this more...
1107	Please spell this out:
lastWrite <- ...
uRead <- %t in `tensor.insert_slice`
uConflictingWrite <- %0 in `linalg.fill`
It is unclear to me who lastWrite is here.
1119	Please add a TODO to replace the magic constant by `insertSliceOp.getDestOpOperand` when available (cc @jpienaar @mehdi_amini @rriddle with whom we discussed this feature)
1121	Please spell this out:
lastWrite <- ...
uRead <- %1 in `tensor.insert_slice`
uConflictingWrite <- t in `tensor.insert_slice`
It is unclear to me who lastWrite is here.
1124	Please drop this: (Keep in mind that all three results in the example are considered inplace.) marking the ops that bufferize inplace in the example IR is simpler to follow.

rebase

Harbormaster completed remote builds in B127673: Diff 378079. Oct 7 2021, 8:35 PM

address comments

Harbormaster completed remote builds in B128022: Diff 378548. Oct 10 2021, 7:03 PM

address comments

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
1024	I think we should keep some of them. These particular ones were not very helpful for my debugging. They actually cluttered the output quite a lot. (And they are somewhat repetitive.) I think the important ones are "READ =", "CONFLICTING WRITE =", "WRITE =" below. What do you think?
1082	This is exactly what `isOpBetweenValueAndOp` is doing. Maybe the function name is not good. Or should I just merge `isOpBetweenValueAndOp` back into `hasReadAfterWriteInterference`?
1087	The for loop (`uConflictingWrite`) iterates over all writes. Including the one found by `findLastPrecedingWrite`. `isOpBetweenValueAndOp` checks for proper domination and does not handle the case where both are the same. This case is similar to the case below. In both cases, we want to check if an op `O` is in-between two things. These two checks (requirement 2, requirement 3) handle the case where `O` is the boundary (upper boundary in case of requirement 2 and lower boundary in case of requirement 3 when you think about reading code from top to bottom).
1100	The analysis is independent of which OpOperand we are trying to bufferize at the moment. We simply want to know, given a piece of IR and bufferization decisions, is there a conflict. We may as well be bufferizing %2 in the example and %0 is already bufferized. (The heuristic imposes a certain order of bufferization, but the analysis should work with any order.) Added the inplace annotation to all 3 ops.
1107	You were probably referring to this part? // Requirement 4: No matching ExtractSliceOp/InsertSliceOp pair. If // uRead is an InsertSliceOp... It does not matter what `lastWrite` is. `lastWrite` is not used for "requirement 4".
1121	Added comments for `uRead` and `uConflictingWrite`. The value of `lastWrite` is irrelevant.

Harbormaster completed remote builds in B128040: Diff 378571. Oct 10 2021, 11:27 PM

springerm added inline comments. Oct 11 2021, 11:07 PM

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
1082	This is actually trickier than I thought. I'm not 100% sure what's the right way to handle this with branches. But for the straight-line case this is definitely correct. I would suggest keeping this as is for now and updating this when we add support for scf::IfOp. (I'm working on that right now.) Then we have concrete examples that we can look at. I'm afraid, we may be missing cases otherwise.

nicolasvasilache added inline comments. Oct 12 2021, 8:14 AM

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
825	Can you replace all this with a getBackwardSlice that would use a filter to set true when you encounter your condition? This should be more general and avoids duplicating code. The only downside is that the SetVector is filled but as soon as you need to support more than a straight line this is needed anyway.
859	you should be able to plug getBackwardSlice here and just drop everything else.
1024	yup, removing clutter is def important, if this is the set that you found useful debugging let's go with it
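
The filtered backward traversal suggested in comment 825 can be pictured on a toy value graph. This is only a sketch under stated assumptions: `ToyGraph` and `backwardValueSlice` are hypothetical names, the traversal yields values rather than operations, and it stops at the first matching value on each path. It is not the signature or behavior of MLIR's `getBackwardSlice`.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy value graph (an assumption for illustration): value i is computed from
// the values listed in defs[i].
using ToyGraph = std::vector<std::vector<int>>;

// Collect every value reachable backwards from `root` for which `condition`
// holds, without duplicates. The traversal does not continue past a match.
std::vector<int> backwardValueSlice(const ToyGraph &defs, int root,
                                    const std::function<bool(int)> &condition) {
  std::vector<int> result;
  std::vector<int> worklist = {root};
  std::vector<bool> visited(defs.size(), false);
  while (!worklist.empty()) {
    int v = worklist.back();
    worklist.pop_back();
    if (visited[v])
      continue;
    visited[v] = true;
    if (condition(v)) {
      result.push_back(v); // Matching value found: stop on this path.
      continue;
    }
    for (int operand : defs[v])
      worklist.push_back(operand);
  }
  return result;
}
```

Because the traversal follows every operand of a value, it handles values computed from more than one operand, which a single-chain walk does not.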

nicolasvasilache added inline comments. Oct 12 2021, 11:59 AM

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
916	You should be able to fold this into req 1. (once it is implemented as "mayOpBeBetweenValueAndOp") if you use `dominates` instead of `properlyDominates` in the right place.
941	You have 2 subcases with an `InsertSliceOp` here. Do you know that you don't need more or is this still TBD ? Is it possible some case is missing in light of my comment on the impl. of `getAliasingOpResult` below ?
1082	It should be "mayOpBeBetweenValueAndOp" and you can only disprove the cases that I listed. Anything else is more tricky. Do you have issues when trying to implement this simple change now? If so, we need to understand the cases that break since you are changing the conflict detection algorithm. This needs to be done as part of this revision.

address comments

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
825	I tried this, but unfortunately, `getBackwardSlice` only gives us `Operation` and not `OpOperand` or `Value`. This would work fine if all ops had only one operand and one output. This is not something that can be fixed by specifying a fancy filter function. E.g., there's no way to filter for OpOperands that bufferize to a memory write. We may want to add another variant of `getBackwardSlice`. One that returns a SetVector<Value> and has a `std::function<bool(Value)>` condition.
916	We have to make sure that it is the exact same OpOperand and not two different ones of the same op. Otherwise, things could break around bufferization of InsertSliceOp etc.
924	Sorry, I can no longer see which line of code this comment is associated with. With respect to `lastWrite`, there may be multiple in the case of branches. Whatever analysis we do here stays the same. It just has to be done (and hold) for every `lastWrite` that we found (`llvm::all_of`).
941	I did not come across any other cases in our current examples, so I think we are good here. Note, these are cases in which there is no conflict. Even if we miss a case, bufferization will work correctly. It may just introduce unnecessary copies.
1082	I merged the `isOpBetween` function back into the caller. Makes it easier to follow the flow of the function. Thinking about double negations etc. just adds unnecessary complexity.
In summary, what we are looking for: There is no conflict, if:
properlyDominates(readingOp, conflictingWritingOp) or
properlyDominates(conflictingWritingOp, writingOp)
Let's not think about "in-between" or "maybe in-between". Instead, think of situations when there is no conflict.
Lessons learned:
We should use `properlyDominates` instead of `dominates`. The case where the two ops are the same needs special rules (e.g., has to be the exact same use).
Do not use `!dominates` or `!properlyDominates`. This gets tricky when two ops are in two different branches. However, if `properlyDominates(A, B)` says `true`, we can be certain that `A` is before `B`, regardless of the absence or presence of branches. In our code, this is always the safe solution. Worst case, even if we miss a case, we do not `continue` and report something as a conflict that should not be a conflict.
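
A minimal sketch of the "no conflict" rule summarized above, on a toy structured IR. `ToyBlock`, `ToyOp`, and `provablyNoConflict` are hypothetical names, and `properlyDominates` here is a simplified stand-in that only answers `true` when one op is provably executed before the other; it is not `DominanceInfo::properlyDominates`.

```cpp
#include <cassert>
#include <vector>

// Toy structured IR (an assumption for illustration). A block is nested in
// the op at (parentBlock, parentPos); the top-level block has
// parentBlock == -1.
struct ToyBlock {
  int parentBlock;
  int parentPos;
};
struct ToyOp {
  int block;
  int pos;
};

// Simplified stand-in for a proper-dominance query: true only if `a` is
// provably executed before `b`.
bool properlyDominates(const std::vector<ToyBlock> &blocks, ToyOp a, ToyOp b) {
  // Hoist `b` to its ancestor op in the block of `a`, if there is one.
  while (b.block != a.block) {
    const ToyBlock &blk = blocks[b.block];
    if (blk.parentBlock < 0)
      return false; // `a` is not in any block enclosing `b`.
    b = ToyOp{blk.parentBlock, blk.parentPos};
  }
  return a.pos < b.pos;
}

// Only positive dominance facts are used to rule out a conflict. Anything
// that cannot be proven is conservatively reported as a potential conflict.
bool provablyNoConflict(const std::vector<ToyBlock> &blocks, ToyOp readingOp,
                        ToyOp conflictingWritingOp, ToyOp writingOp) {
  return properlyDominates(blocks, readingOp, conflictingWritingOp) ||
         properlyDominates(blocks, conflictingWritingOp, writingOp);
}
```

For two ops in sibling branches the query returns `false` in both directions, so negating it would not establish any ordering. The sketch therefore never uses the negated form and errs on the side of reporting a conflict.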

Harbormaster completed remote builds in B128759: Diff 379576. Oct 13 2021, 6:56 PM

springerm added a child revision: D111775: [mlir][linalg][bufferize] Handle scf::ForOp correctly in bufferizesToMemoryRead. Oct 13 2021, 10:09 PM

Great, thanks!

This revision is now accepted and ready to land. Oct 14 2021, 7:11 AM

This revision was landed with ongoing or failed builds. Oct 14 2021, 6:32 PM

Closed by commit rGd3cb6bf2d462: [mlir][linalg][bufferize] Rewrite conflict detection (authored by springerm).

This revision was automatically updated to reflect the committed changes.

springerm added a commit: rGd3cb6bf2d462: [mlir][linalg][bufferize] Rewrite conflict detection.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Linalg/Transforms/ComprehensiveBufferize.h	88 lines
mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp	434 lines
mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir	135 lines

Diff 379897

mlir/include/mlir/Dialect/Linalg/Transforms/ComprehensiveBufferize.h

(65 unchanged lines not shown; enclosing context: public:)

 /// Set the inPlace bufferization spec to true.
 /// Merge result's and operand's aliasing sets and iterate to a fixed point.
 void bufferizeInPlace(OpResult result, OpOperand &operand);

 /// Set the inPlace bufferization spec to false.
 void bufferizeOutOfPlace(OpResult result);

-/// Return true if it is possible to find an inplace write W among `usesWrite`
-/// and a read R among `usesRead`, such that W and R interfere.
-/// Such a (W, R) pair is an interference to the inplace bufferization of
-/// opResult when:
-/// 1. R is not known properly dominate W (i.e. the effects of the write may
-/// be visible from R).
-/// 2. one cannot find an intermediate clobbering write `C` to W, such that
-/// C interleaved between W and R (i.e. W -> C -> R where -> denotes
-/// dominance).
-bool wouldCreateReadAfterWriteInterference(
-Operation *opToBufferize, DenseSet<OpOperand *> &usesRead,
-DenseSet<OpOperand *> &usesWrite, const DominanceInfo &domInfo) const;
+/// Return true if `value` has an ExtractSliceOp matching the given
+/// InsertSliceOp in its reverse SSA use-def chain.
+bool hasMatchingExtractSliceOp(Value value,
+tensor::InsertSliceOp insertOp) const;

 /// Return true if bufferizing `opOperand` inplace with `opResult` would
 /// create a write to a non-writable buffer.
 bool wouldCreateWriteToNonWritableBuffer(OpOperand &opOperand,
 OpResult opResult) const;

 /// Assume that result bufferizes in-place with one of the operation's
-/// operands. Return true if it is possible to find an inplace write W (resp.
-/// a read R) among the uses of `aliasInfo[result]`, and a read R (resp. an
-/// inplace write W) among the uses of
-/// `aliasInfo[getAliasingOpOperand(result)]`, such that W and R interfere.
-/// Interference detection is needed to determine which cases may bufferize
-/// inplace without interferences. Such cases comprise:
-///
-/// ```
-/// %0 = op_to_bufferize(%1)
-/// read(%1)
-///
-/// %0 = op_to_bufferize(%1)
-/// write(%0)
-/// read(%1)
-///
-/// %0 = op_to_bufferize(%1)
-/// write(%1)
-/// read(%0)
-/// ```
+/// operands. Return true if it is possible to find an inplace write W that
+/// creates a conflict.
 bool
 wouldCreateReadAfterWriteInterference(OpOperand &operand, OpResult result,
 const DominanceInfo &domInfo) const;

 /// Return true if `v1` and `v2` bufferize to equivalent buffers.
 bool areEquivalentBufferizedValues(Value v1, Value v2) const {
 return equivalentInfo.getLeaderValue(v1) ==
 equivalentInfo.getLeaderValue(v2);
(46 unchanged lines not shown; enclosing context: private:)

 /// equivalent operand / result and same offset/sizes/strides specification).
 ///
 /// This is one particular type of relationship between ops on tensors that
 /// reduce to an equivalence on buffers. This should be generalized and
 /// exposed as interfaces on the proper types.
 bool areEquivalentExtractSliceOps(tensor::ExtractSliceOp st,
 tensor::InsertSliceOp sti) const;

-/// Return true if there is a `candidateOp` that would write to memory after
-/// bufferization and such that:
-/// 1. The written buffer is equivalent to either `aliasingRead` or
-/// `aliasingWrite` under the inPlace bufferization decisions taken
-/// so far.
-/// 2. `aliasingWrite` properly dominates `candidateOp`.
-/// 3. `candidateOp` properly dominates `aliasingReadOp`.
-// TODO: richer clobbering analysis with container-containee relationship
-// instead of equivalence.
-bool existsInterleavedValueClobber(OpOperand &aliasingRead,
-OpOperand &aliasingWrite,
-const DominanceInfo &domInfo) const;
-
-/// Return true if there is a write that:
-/// 1. Properly dominates aliasingReadOp.
-/// 2. Is properly dominated by aliasingWriteOp.
-/// 3. Clobbers the write that would be interfering with the read.
-///
-/// Case discussion:
-/// ================
-/// Case 1: opOperand is produced by opToBufferize,
-/// Case 2: opResult is produced by opToBufferize,
-/// Common case:
-/// - aliasingReadOp is a read to an alias of opOperand.
-/// - aliasingWriteOp is an inplace write to an alias of opResult.
-/// - aliasingWriteOp dominates aliasingReadOp.
-///
-/// ```
-/// // Either case 1:
-/// %opOperand = opToBufferize(%opResult)
-/// aliasingWriteOp(%aliasingWrite = alias(%opResult)) // inplace
-/// aliasingReadOp( %aliasingRead = alias(%opOperand))
-/// ```
-///
-/// ```
-/// // Or case 2:
-/// %opResult = opToBufferize(%opOperand)
-/// aliasingWriteOp(%aliasingWrite = alias(%opResult)) // inplace
-/// aliasingReadOp( %aliasingRead = alias(%opOperand))
-/// ```
-///
-/// Capture possible cases where `aliasingWriteOp(alias(%opResult))` has no
-/// visible effect on `aliasingReadOp(alias(%opOperand))`.
-bool isClobberedWriteBeforeRead(Operation *opToBufferize,
-OpOperand &aliasingRead,
-OpOperand &aliasingWrite,
+/// Given sets of uses and writes, return true if there is a RaW conflict
+/// under the assumption that all given reads/writes alias the same buffer and
+/// that all given writes bufferize inplace.
+bool hasReadAfterWriteInterference(const DenseSet<OpOperand *> &usesRead,
+const DenseSet<OpOperand *> &usesWrite,
 const DominanceInfo &domInfo) const;

 /// Set of tensors that are known to bufferize to writable memory.
 llvm::DenseSet<Value> bufferizeToWritableMemory;

 /// Auxiliary structure to store all the values a given value aliases with.
 /// These are the conservative cases that can further decompose into
 /// "equivalent" buffer relationships.
 llvm::EquivalenceClasses<ValueWrapper> aliasInfo;

(23 unchanged lines not shown)

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp

(578 unchanged lines not shown; enclosing context: TypeSwitch<Operation *>(result.getDefiningOp()))

 })
 .Case([&](vector::TransferWriteOp op) {
 r.push_back(&op->getOpOperand(1));
 })
 .Case<arith::ConstantOp, ConstantOp, CallOpInterface, InitTensorOp>(
 [&](auto op) {})
 .Default([&](Operation *op) {
 op->dump();
 llvm_unreachable("unexpected defining op");
 });
 return r;
 }

 /// If the an ExtractSliceOp is bufferized in-place, the source operand will
 /// alias with the result.
 static OpResult getAliasingOpResult(ExtractSliceOp op, OpOperand &opOperand) {
 if (op.source() == opOperand.get())
 return op->getResult(0);
(214 unchanged lines not shown; enclosing context: void BufferizationAliasInfo::bufferizeInPlace(OpResult result,)

 LLVM_DEBUG(dumpEquivalences());
 }

 /// Set the inPlace bufferization spec to false.
 void BufferizationAliasInfo::bufferizeOutOfPlace(OpResult result) {
 setInPlaceOpResult(result, InPlaceSpec::False);
 }

-/// Return true if it is possible to find an inplace write W among `usesWrite`
-/// and a read R among `usesRead`, such that W and R interfere.
-bool BufferizationAliasInfo::wouldCreateReadAfterWriteInterference(
-Operation *opToBufferize, DenseSet<OpOperand *> &usesRead,
-DenseSet<OpOperand *> &usesWrite, const DominanceInfo &domInfo) const {
+/// Starting from `value`, follow the use-def chain in reverse, always selecting
+/// the corresponding aliasing OpOperand. Try to find and return a Value for
+/// which `condition` evaluates to true for the aliasing OpOperand. Return an
+/// empty Value if no such Value was found. If `returnLast`, return the last
+/// Value (at the end of the chain), even if it does not satisfy the condition.
+static Value
+findValueInReverseUseDefChain(Value value,
+std::function<bool(OpOperand &)> condition,
+bool returnLast = false) {
+while (value.isa<OpResult>()) {
+auto opResult = value.cast<OpResult>();
+SmallVector<OpOperand *> opOperands = getAliasingOpOperand(opResult);
+assert(opOperands.size() <= 1 && "more than 1 OpOperand not supported yet");
+if (opOperands.empty())
+// No aliasing OpOperand. This could be an unsupported op or an op without
+// a tensor arg such as InitTensorOp. This is the end of the chain.
+return returnLast ? value : Value();
+OpOperand *opOperand = opOperands.front();
+if (condition(*opOperand))
+return value;
+value = opOperand->get();
+}
+// Value is a BlockArgument. Reached the end of the chain.
+return returnLast ? value : Value();
+}
+
+/// Find the Value (result) of the last preceding write of a given Value.
+///
+/// Note: Unknown ops are handled conservatively and assumed to be writes.
+/// Furthermore, BlockArguments are also assumed to be writes. There is no
+/// analysis across block boundaries.
+static Value findLastPrecedingWrite(Value value) {
+return findValueInReverseUseDefChain(value, bufferizesToMemoryWrite, true);
+}
+
+/// Return true if `value` is originating from an ExtractSliceOp that matches
+/// the given InsertSliceOp.
+bool BufferizationAliasInfo::hasMatchingExtractSliceOp(
+Value value, InsertSliceOp insertOp) const {
+return static_cast<bool>(
+findValueInReverseUseDefChain(value, [&](OpOperand &opOperand) {
+if (auto extractOp = dyn_cast<ExtractSliceOp>(opOperand.getOwner()))
+if (areEquivalentExtractSliceOps(extractOp, insertOp))
+return true;
+return false;
+}));
+}
+
+/// Given sets of uses and writes, return true if there is a RaW conflict under
+/// the assumption that all given reads/writes alias the same buffer and that
+/// all given writes bufferize inplace.
+///
+/// A conflict is: According to SSA use-def chains, a read R is supposed to read
+/// the result of a write W1. But because of bufferization decisions, R actually
+/// reads another write W2.
+bool BufferizationAliasInfo::hasReadAfterWriteInterference(
+const DenseSet<OpOperand *> &usesRead,
+const DenseSet<OpOperand *> &usesWrite,
+const DominanceInfo &domInfo) const {

for (OpOperand *uRead : usesRead) {		for (OpOperand *uRead : usesRead) {
Operation *aliasingReadOp = uRead->getOwner();		Operation *readingOp = uRead->getOwner();
LDBG("----++++aliasRead -> #"
<< uRead->getOperandNumber()		// Find most recent write of uRead by following the SSA use-def chain. E.g.:
<< " in: " << printOperationInfo(aliasingReadOp) << '\n');		//
for (OpOperand *uWrite : usesWrite) {		// %0 = "writing_op"(%t) : tensor<?x32> -> tensor<?xf32>
// The same operand may both read and write.		// %1 = "aliasing_op"(%0) : tensor<?x32> -> tensor<?xf32>
// Don't consider self-use of the same operand for interference.		// %2 = "reading_op"(%1) : : tensor<?x32> -> not_a_tensor_type
// Multiple different uses within the same op is fair game though.		//
if (uWrite == uRead)		// In the above example, if uRead is the OpOperand of reading_op, lastWrite
continue;		// is %0. Note that operations that create an alias but do not write (such
		// as ExtractSliceOp) are skipped.
Operation *aliasingWriteOp = uWrite->getOwner();		// TODO: With branches this should probably be a list of Values.
LDBG("---- aliasWrite -> #"		Value lastWrite = findLastPrecedingWrite(uRead->get());
<< uWrite->getOperandNumber()
<< " in: " << printOperationInfo(aliasingWriteOp) << '\n');		// Look for conflicting memory writes. Potential conflicts are writes to an
// If the candidate write is the one that produces the read value (in the		// alias that have been decided to bufferize inplace.
// SSA def-use sense), this is not considered an interference.		for (OpOperand *uConflictingWrite : usesWrite) {
if (getInplaceableOpResult(*uWrite) == uRead->get())		// Throughout this loop, check for multiple requirements that have to be
continue;		// met for uConflictingWrite to be an actual conflict.
// If aliasingReadOp properly dominates aliasingWriteOp, the read cannot		Operation *conflictingWritingOp = uConflictingWrite->getOwner();
// be affected by the write: there is no interference.
if (domInfo.properlyDominates(aliasingReadOp, aliasingWriteOp))		// Print some debug info.
continue;		LDBG("Found potential conflict:\n");
// At this point, aliasingWriteOp properly dominates aliasingReadOp or		LDBG("READ = #" << uRead->getOperandNumber() << " of "
// there is no clear dominance and we need to be conservative.		<< printOperationInfo(readingOp) << "\n");
LDBG("---->found RaW interference between:\n");		LDBG("WRITE = #" << printValueInfo(lastWrite) << "\n");
LDBG(" OpToBufferize -> " << printOperationInfo(opToBufferize)		LDBG("CONFLICTING WRITE = #"
<< '\n');		<< uConflictingWrite->getOperandNumber() << " of "
LDBG(" Interfering write -> #"		<< printOperationInfo(conflictingWritingOp) << "\n");
<< uWrite->getOperandNumber() << ":"
<< printOperationInfo(aliasingWriteOp) << '\n');		// No conflict if the readingOp dominates conflictingWritingOp, i.e., the
LDBG(" Target read -> #" << uRead->getOperandNumber() << ":"		// write is not visible when reading.
<< printOperationInfo(aliasingReadOp)		if (domInfo.properlyDominates(readingOp, conflictingWritingOp))
<< '\n');		continue;
LDBG("---->opportunity to clobber RaW interference\n");
if (isClobberedWriteBeforeRead(opToBufferize, uRead, uWrite, domInfo)) {		// No conflict if the conflicting write happens before the last write.
LDBG("---->clobbered! -> skip\n");		if (Operation *writingOp = lastWrite.getDefiningOp()) {
		nicolasvasilacheUnsubmitted Done Reply Inline Actions You should be able to fold this into req 1. (once it is implemented as "mayOpBeBetweenValueAndOp") if you use `dominates` instead of `properlyDominates` in the right place. nicolasvasilache: You should be able to fold this into req 1. (once it is implemented as…
		springermAuthorUnsubmitted Done Reply Inline Actions We have to make sure that it is the exact same OpOperand and not two different ones of the same op. Otherwise, things could break around bufferization of InsertSliceOp etc. springerm: We have to make sure that it is the exact same OpOperand and not two different ones of the same…
		if (domInfo.properlyDominates(conflictingWritingOp, writingOp))
		// conflictingWritingOp happens before writingOp. No conflict.
		continue;
		} else {
		auto bbArg = lastWrite.cast<BlockArgument>();
		Block *block = bbArg.getOwner();
		if (!block->findAncestorOpInBlock(*conflictingWritingOp))
		// conflictingWritingOp happens outside of the block. No
		nicolasvasilache (Not Done): There are TODOs that getAliasingOpOperand should return a list of operands. With the SSA-based analysis here, I think we need to tie this loose end and support the list of operands and traverse them all. For now I would just evolve the API and fail this analysis if I wonder whether there be considerations related to equivalence in the analysis otherwise the last write may be partial and this could be another sack of knots. OTOH equivalence classes are not yet formed as you go up the use-def chain. I think this should be using getInplaceableOpResult on each opOperand considered to check this and the last write needs to be a full write (and in the future it could be a write to a bigger region that fully covers)
		springerm: Sorry, I can no longer see which line of code this comment is associated with. Wrt. `lastWrite`, there may be multiple in the case of branches. Whatever analysis we do here stays the same. It just has to be done (and hold) for every `lastWrite` that we found (`llvm::all_of`).
		// conflict.
continue;		continue;
}		}
LDBG("---->not clobbered -> found an interference\n");
		// No conflict if the conflicting write and the last write are the same
		// use.
		if (getAliasingOpResult(*uConflictingWrite) == lastWrite)
		continue;

		// No conflict if the read and the conflicting write are the same use. A
		// use cannot conflict with itself.
		if (uConflictingWrite == uRead)
		continue;

		// Special rules for matching ExtractSliceOp/InsertSliceOp pairs. If
		// uRead is an InsertSliceOp...
		if (auto insertSliceOp = dyn_cast<InsertSliceOp>(readingOp)) {
		nicolasvasilache: You have 2 subcases with an `InsertSliceOp` here. Do you know that you don't need more or is this still TBD ? Is it possible some case is missing in light of my comment on the impl. of `getAliasingOpResult` below ?
		springerm: I did not come across any other cases in our current examples, so I think we are good here. Note, these are cases in which there is no conflict. Even if we miss a case, bufferization will work correctly. It may just introduce unnecessary copies.
		// As an example, consider the following IR.
		//
		// %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1] {inplace= [true] }
		// %1 = linalg.fill %cst, %0 {inplace= [true] }
		// %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1]
		// {inplace= [true] }

		// TODO: Use insertSliceOp.getDestOpOperand etc. when available.
		if (uRead == &insertSliceOp->getOpOperand(1) /*dest*/ &&
		hasMatchingExtractSliceOp(uConflictingWrite->get(), insertSliceOp))
		// Case 1: The main insight is that InsertSliceOp reads only part of
		// the destination tensor. The overwritten area is not read. If
		// uConflictingWrite writes into exactly the memory location that is
		// being read by uRead, this is not a conflict.
		//
		// In the above example:
		// uRead = OpOperand 1 (%t) of tensor.insert_slice
		// uConflictingWrite = OpOperand 1 (%0) of linalg.fill
		//
		// The read of %t does not conflict with the write of the FillOp
		// (same aliases!) because the area that the FillOp operates on is
		// exactly the one that is not read via %t.
		continue;

		if (uRead == &insertSliceOp->getOpOperand(0) /*source*/ &&
		uConflictingWrite == &insertSliceOp->getOpOperand(1) /*dest*/ &&
		hasMatchingExtractSliceOp(uRead->get(), insertSliceOp))
		// Case 2: The read of the source tensor and the write to the dest
		// tensor via an InsertSliceOp is not a conflict if the read is
		// reading exactly that part of an equivalent tensor that the
		// InsertSliceOp is writing.
		//
		// In the above example:
		// uRead = OpOperand 0 (%1) of tensor.insert_slice
		// uConflictingWrite = OpOperand 1 (%t) of tensor.insert_slice
		continue;
		}

		// All requirements are met. Conflict found!
		LDBG("CONFLICT CONFIRMED!\n\n");
return true;		return true;
}		}
}		}
LDBG("----No interference found\n");
		LDBG("NOT A CONFLICT!\n\n");
return false;		return false;
}		}
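
The no-conflict checks in the function above can be sketched on a toy model. This is only an illustration under strong assumptions: ops live in a single straight-line block and are identified by their position, so "A properly dominates B" reduces to `A < B`. The `Use` struct and the `hasConflict` helper are hypothetical names and are not part of the MLIR API or of this revision. The ExtractSliceOp/InsertSliceOp special cases are left out.

```cpp
#include <cassert>
#include <vector>

// Toy model (assumption, not the MLIR API): an op is its position in one
// straight-line block; a use is (owning op position, operand index).
struct Use {
  int op;
  int operand;
};

static bool sameUse(const Use &a, const Use &b) {
  return a.op == b.op && a.operand == b.operand;
}

// In straight-line code, proper dominance is simply "comes strictly before".
static bool properlyDominates(int a, int b) { return a < b; }

// Return true if any write in `writes` conflicts with `read`, where
// `lastWrite` is the write use that produced the value being read.
static bool hasConflict(const Use &read, const Use &lastWrite,
                        const std::vector<Use> &writes) {
  for (const Use &w : writes) {
    // The write happens after the read: it is not visible when reading.
    if (properlyDominates(read.op, w.op))
      continue;
    // The write happens before the last write: it gets overwritten anyway.
    if (properlyDominates(w.op, lastWrite.op))
      continue;
    // The write is the last write itself (exact same use).
    if (sameUse(w, lastWrite))
      continue;
    // A use cannot conflict with itself.
    if (sameUse(w, read))
      continue;
    return true;
  }
  return false;
}
```

With the ops of the transfer_write example numbered 0 to 4 from top to bottom, the read is `Use{4, 0}`, the last write is `Use{1, 1}`, and the write at position 3 is reported as a conflict because none of the four exits applies. Every exit only proves the absence of a conflict, so a missed case costs an extra copy rather than correctness.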

/// Return true if it is possible to find an inplace write W among the uses of		/// Return true if bufferizing result inplace would create a conflict. A read R
/// aliasInfo[result], and a read R among the uses of aliasInfo[result],		/// and a write W of the same alias set is a conflict if inplace bufferization
/// such that W and R interfere.		/// of W changes the value read by R to a value different from the one that
/// Such a (W, R) pair is an interference to the inplace bufferization of		/// would be expected by tracing back R's origin through SSA use-def chains.
/// opResult when:		/// A conflict can only be introduced by a new alias and/or an inplace
		nicolasvasilache: s/"a new conflict can **only** be introduced" etc?
/// 1. R is not known to properly dominate W (i.e. the effects of the write		/// bufferization decision.
/// may be visible from R).		///
/// 2. one cannot find an intermediate clobbering write `C` to W, such that		/// Example:
/// C interleaved between W and R (i.e. W -> C -> R where -> denotes		/// %0 = tensor.extract_slice %t[...][...][1, 1] {inplace?}
/// dominance).		/// %1 = vector.transfer_write %v1, %t {inplace} : vector<5xf32>, tensor<?xf32>
		/// %e = tensor.extract_slice %1
		/// %2 = vector.transfer_write %v2, %0 {inplace} : vector<6xf32>, tensor<?xf32>
		/// %3 = vector.transfer_read %e, %cst : tensor<?xf32>, vector<7xf32>
		///
		/// In the above example, the two TransferWriteOps have already been decided to
		/// bufferize inplace. Bufferizing the ExtractSliceOp inplace would create a
		/// conflict because:
		/// * According to SSA use-def chains, we expect to read the result of %1.
		/// * However, adding an alias {%0, %t} would mean that the second
		/// TransferWriteOp overwrites the first one. Therefore, the TransferReadOp
		/// would no longer be reading the result of %1.
bool BufferizationAliasInfo::wouldCreateReadAfterWriteInterference(		bool BufferizationAliasInfo::wouldCreateReadAfterWriteInterference(
OpOperand &operand, OpResult result, const DominanceInfo &domInfo) const {		OpOperand &operand, OpResult result, const DominanceInfo &domInfo) const {
#ifndef NDEBUG		#ifndef NDEBUG
SmallVector<OpOperand *> opOperands = getAliasingOpOperand(result);		SmallVector<OpOperand *> opOperands = getAliasingOpOperand(result);
assert(llvm::find(opOperands, &operand) != opOperands.end() &&		assert(llvm::find(opOperands, &operand) != opOperands.end() &&
"operand and result do not match");		"operand and result do not match");
#endif // NDEBUG		#endif // NDEBUG

Operation *opToBufferize = result.getDefiningOp();		// Helper function to iterate on aliases of `root` and capture the reads.
Value opResult = result;
Value opOperand = operand.get();

LDBG("----Start wouldCreateReadAfterWriteInterference\n");
LDBG("--------consider all aliases to root read: "
<< printValueInfo(opOperand) << "\n");
LDBG("--------consider all aliases to root write: "
<< printValueInfo(opResult) << "\n");

/// Helper function to iterate on aliases of `root` and capture the reads.
auto getAliasingReads = [&](DenseSet<OpOperand *> &res, Value root) {		auto getAliasingReads = [&](DenseSet<OpOperand *> &res, Value root) {
for (Value alias : getAliases(root)) {		for (Value alias : getAliases(root))
for (auto &use : alias.getUses()) {		for (auto &use : alias.getUses())
// Read to a value that aliases root.		// Read to a value that aliases root.
if (bufferizesToMemoryRead(use)) {		if (bufferizesToMemoryRead(use))
		nicolasvasilache: do we want to give up on all the debug messages ?
		springerm: I think we should keep some of them. These particular ones were not very helpful for my debugging. They actually cluttered the output quite a lot. (And they are somewhat repetitive.) I think the important ones are "READ =", "CONFLICTING WRITE =", "WRITE =" below. What do you think?
		nicolasvasilache: yup, removing clutter is def important, if this is the set that you found useful debugging let's go with it
LDBG("------------bufferizesToMemoryRead: "
<< use.getOwner()->getName().getStringRef() << "\n");
res.insert(&use);		res.insert(&use);
}
}
}
};		};

/// Helper function to iterate on aliases of `root` and capture the writes.		// Helper function to iterate on aliases of `root` and capture the writes.
auto getAliasingInplaceWrites = [&](DenseSet<OpOperand *> &res, Value root) {		auto getAliasingInplaceWrites = [&](DenseSet<OpOperand *> &res, Value root) {
for (Value alias : getAliases(root)) {		for (Value alias : getAliases(root))
for (auto &use : alias.getUses()) {		for (auto &use : alias.getUses())
// Inplace write to a value that aliases root.		// Inplace write to a value that aliases root.
if (isInplaceMemoryWrite(use)) {		if (isInplaceMemoryWrite(use))
LDBG("------------bufferizesToMemoryWrite: "
<< use.getOwner()->getName().getStringRef() << "\n");
res.insert(&use);		res.insert(&use);
}
}
}
};		};

// Check if we can find any interference between reads to aliases[`opOperand`]		// Collect reads and writes of all aliases of OpOperand and OpResult.
// and writes to aliases[`opResult`]. This handles the case:
//
// ```
// %0 = op_to_bufferize_maybe_inplace(%1)
// %2 = some_alias(%0)
// inplace_write(%2)
// %3 = some_alias(%1)
// read(%3)
// ```
DenseSet<OpOperand *> usesRead, usesWrite;		DenseSet<OpOperand *> usesRead, usesWrite;
LDBG("--------\n");		getAliasingReads(usesRead, operand.get());
LDBG("--------Test reads(opOperand) vs writes(opResult)\n");		getAliasingReads(usesRead, result);
getAliasingReads(usesRead, opOperand);		getAliasingInplaceWrites(usesWrite, operand.get());
getAliasingInplaceWrites(usesWrite, opResult);		getAliasingInplaceWrites(usesWrite, result);
// Additionally, `result` is not yet bufferized and we need to check for
// interferences as if it were bufferized inplace: add `operand` if it is a
// write. This handles the case:
//
// ```
// %0 = op_to_bufferize_maybe_inplace(%1)
// %2 = some_alias(%1)
// read(%2)
// ```
if (bufferizesToMemoryWrite(operand))		if (bufferizesToMemoryWrite(operand))
usesWrite.insert(&operand);		usesWrite.insert(&operand);
if (wouldCreateReadAfterWriteInterference(opToBufferize, usesRead, usesWrite,
domInfo))
return true;

// Check if we can find any interference between writes to		return hasReadAfterWriteInterference(usesRead, usesWrite, domInfo);
		nicolasvasilache: "Would inplace bufferization of `op` create a conflict?"
// aliases[`opOperand`] and reads to aliases[`opResult`]. This handles the
// case:
//
// ```
// %0 = op_to_bufferize_maybe_inplace(%1)
// %2 = some_alias(%1)
// inplace_write(%2)
// %3 = some_alias(%0)
// read(%3)
// ```
LDBG("--------\n");
LDBG("--------Test reads(opResult) vs writes(opOperand)\n");
usesRead.clear();
usesWrite.clear();
getAliasingReads(usesRead, opResult);
getAliasingInplaceWrites(usesWrite, opOperand);
return wouldCreateReadAfterWriteInterference(opToBufferize, usesRead,
usesWrite, domInfo);
}		}
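
The new version of the function above gathers reads and writes over the alias sets of both the operand and the result before running a single conflict check. A minimal sketch of that gathering step follows, assuming a toy union-find over integer value ids. `AliasSets`, `ToyUse`, and `collect` are hypothetical names for illustration and do not mirror `BufferizationAliasInfo`.

```cpp
#include <cassert>
#include <numeric>
#include <set>
#include <vector>

// Toy alias analysis (assumption): values are small integers and alias sets
// are kept in a union-find structure.
class AliasSets {
public:
  explicit AliasSets(int numValues) : parent(numValues) {
    std::iota(parent.begin(), parent.end(), 0);
  }
  int find(int v) {
    while (parent[v] != v) {
      parent[v] = parent[parent[v]]; // path halving
      v = parent[v];
    }
    return v;
  }
  void merge(int a, int b) { parent[find(a)] = find(b); }
  bool alias(int a, int b) { return find(a) == find(b); }

private:
  std::vector<int> parent;
};

// A use of `value`, tagged with whether it reads and/or writes memory.
struct ToyUse {
  int id;
  int value;
  bool reads;
  bool writes;
};

// Collect the ids of all uses that write (or read) any alias of `root`.
static std::set<int> collect(AliasSets &sets, const std::vector<ToyUse> &uses,
                             int root, bool wantWrites) {
  std::set<int> result;
  for (const ToyUse &u : uses)
    if (sets.alias(u.value, root) && (wantWrites ? u.writes : u.reads))
      result.insert(u.id);
  return result;
}
```

Merging the alias sets of two values, which is what an inplace decision amounts to in this model, can only grow the collected sets. This matches the doc comment's remark that a conflict can only be introduced by a new alias or an inplace decision.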

/// Return true if bufferizing `opOperand` inplace with `opResult` would create		/// Return true if bufferizing `opOperand` inplace with `opResult` would create
/// a write to a non-writable buffer.		/// a write to a non-writable buffer.
bool BufferizationAliasInfo::wouldCreateWriteToNonWritableBuffer(		bool BufferizationAliasInfo::wouldCreateWriteToNonWritableBuffer(
OpOperand &opOperand, OpResult opResult) const {		OpOperand &opOperand, OpResult opResult) const {
#ifndef NDEBUG		#ifndef NDEBUG
SmallVector<OpOperand *> opOperands = getAliasingOpOperand(opResult);		SmallVector<OpOperand *> opOperands = getAliasingOpOperand(opResult);
assert(llvm::find(opOperands, &opOperand) != opOperands.end() &&		assert(llvm::find(opOperands, &opOperand) != opOperands.end() &&
"operand and result do not match");		"operand and result do not match");
#endif // NDEBUG		#endif // NDEBUG

// Certain buffers are not writeable:		// Certain buffers are not writeable:
// 1. A function bbArg that is not inplaceable or		// 1. A function bbArg that is not inplaceable or
// 2. A constant op.		// 2. A constant op.
assert(!aliasesNonWritableBuffer(opResult) &&		assert(!aliasesNonWritableBuffer(opResult) &&
		nicolasvasilache: findLastPrecedingWrite? Reverse use-def chain is an impl detail.
"expected that opResult does not alias non-writable buffer");		"expected that opResult does not alias non-writable buffer");
bool nonWritable = aliasesNonWritableBuffer(opOperand.get());		bool nonWritable = aliasesNonWritableBuffer(opOperand.get());
if (!nonWritable)		if (!nonWritable)
return false;		return false;

// This is a problem only if the buffer is written to via some alias.		// This is a problem only if the buffer is written to via some alias.
bool hasWrite = aliasesInPlaceWrite(opResult) ||		bool hasWrite = aliasesInPlaceWrite(opResult) ||
aliasesInPlaceWrite(opOperand.get()) ||		aliasesInPlaceWrite(opOperand.get()) ||
bufferizesToMemoryWrite(opOperand);		bufferizesToMemoryWrite(opOperand);
if (!hasWrite)		if (!hasWrite)
return false;		return false;

LDBG("->the corresponding buffer is not writeable\n");		LDBG("->the corresponding buffer is not writeable\n");
return true;		return true;
}		}

/// Return true if the source of a `insertSliceOp` bufferizes to an		/// Return true if the source of a `insertSliceOp` bufferizes to an
/// equivalent ExtractSliceOp that bufferizes inplace.		/// equivalent ExtractSliceOp that bufferizes inplace.
bool BufferizationAliasInfo::isSourceEquivalentToAMatchingInplaceExtractSliceOp(		bool BufferizationAliasInfo::isSourceEquivalentToAMatchingInplaceExtractSliceOp(
InsertSliceOp insertSliceOp) const {		InsertSliceOp insertSliceOp) const {
		nicolasvasilache: As we had discussed, not being in between the ops is not good enough in the non-straight-line case. If either: readingOp dominates conflictingWritingOp dominates lastWrite then continue, otherwise no.
		springerm: This is exactly what `isOpBetweenValueAndOp` is doing. Maybe the function name is not good. Or should I just merge `isOpBetweenValueAndOp` back into `hasReadAfterWriteInterference`?
		springerm: This is actually trickier than I thought. I'm not 100% sure what's the right way to handle this with branches. But for the straight-line case this is definitely correct. I would suggest keeping this as is for now and updating this when we add support for scf::IfOp. (I'm working on that right now.) Then we have concrete examples that we can look at. I'm afraid we may be missing cases otherwise.
		nicolasvasilache: It should be "mayOpBeBetweenValueAndOp" and you can only disprove the cases that I listed. Anything else is more tricky. Do you have issues if trying to implement this simple change now? If so, we need to understand the cases that break since you are changing the conflict detection algorithm. This needs to be done as part of this revision.
		springerm: I merged the `isOpBetween` function back into the caller. Makes it easier to follow the flow of the function. Thinking about double negations etc. just adds unnecessary complexity.
		In summary, what we are looking for: There is no conflict, if:
		- properlyDominates(readingOp, conflictingWritingOp) or
		- properlyDominates(conflictingWritingOp, writingOp)
		Let's not think about "in-between" or "maybe in-between". Instead, think of situations when there is no conflict.
		Lessons learned:
		- We should use `properlyDominates` instead of `dominates`. The case where the two ops are the same needs special rules (e.g., has to be the exact same use).
		- Do not use `!dominates` or `!properlyDominates`. This gets tricky when two ops are in two different branches. However, if `properlyDominates(A, B)` says `true`, we can be certain that `A` is before `B`, regardless of the absence or presence of branches. In our code, this is always the safe solution. Worst case, even if we miss a case, we do not `continue` and report something as a conflict that should not be a conflict.
LDBG("isSourceEquivalentToAMatchingInplaceExtractSliceOp: " << *insertSliceOp		LDBG("isSourceEquivalentToAMatchingInplaceExtractSliceOp: " << *insertSliceOp
<< '\n');		<< '\n');
auto leaderIt = equivalentInfo.findLeader(insertSliceOp.source());		auto leaderIt = equivalentInfo.findLeader(insertSliceOp.source());
for (auto mit = leaderIt, meit = equivalentInfo.member_end(); mit != meit;		for (auto mit = leaderIt, meit = equivalentInfo.member_end(); mit != meit;
++mit) {		++mit) {
		nicolasvasilache: I do not understand this case. Can you elaborate?
		springerm: The for loop (`uConflictingWrite`) iterates over all writes. Including the one found by `findLastPrecedingWrite`. `isOpBetweenValueAndOp` checks for proper domination and does not handle the case where both are the same. This case is similar to the case below. In both cases, we want to check if an op `O` is in-between two things. These two checks (requirement 2, requirement 3) handle the case where `O` is the boundary (upper boundary in case of requirement 2 and lower boundary in case of requirement 3 when you think about reading code from top to bottom).
auto extractSliceOp =		auto extractSliceOp =
dyn_cast_or_null<ExtractSliceOp>(mit->v.getDefiningOp());		dyn_cast_or_null<ExtractSliceOp>(mit->v.getDefiningOp());
if (extractSliceOp &&		if (extractSliceOp &&
areEquivalentExtractSliceOps(extractSliceOp, insertSliceOp) &&		areEquivalentExtractSliceOps(extractSliceOp, insertSliceOp) &&
getInPlace(extractSliceOp.result()) == InPlaceSpec::True) {		getInPlace(extractSliceOp.result()) == InPlaceSpec::True) {
		nicolasvasilache: Can you also keep terminology like "no self-conflict" or a "use cannot conflict with itself" ?
LDBG("\tfound: " << *mit->v.getDefiningOp() << '\n');		LDBG("\tfound: " << *mit->v.getDefiningOp() << '\n');
return true;		return true;
}		}
}		}
LDBG("\tnot equivalent\n");		LDBG("\tnot equivalent\n");
return false;		return false;
}		}

		nicolasvasilache: I would add the inplace notation to the production of %1 and %2 (and would drop this line). %0 is not yet inplace, we need to determine if inplace creates a conflict. E.g.
		// %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1]. // can this bufferize inplace ?
		// %1 = linalg.fill %cst, %0 // bufferizes inplace
		// %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1]. // bufferizes inplace
		springerm: The analysis is independent of which OpOperand we are trying to bufferize at the moment. We simply want to know, given a piece of IR and bufferization decisions, is there a conflict. We may as well be bufferizing %2 in the example and %0 is already bufferized. (The heuristic imposes a certain order of bufferization, but the analysis should work with any order.) Added the inplace annotation to all 3 ops.
/// Apply `fun` to all the members of the equivalence class of `v`.		/// Apply `fun` to all the members of the equivalence class of `v`.
void BufferizationAliasInfo::applyOnEquivalenceClass(		void BufferizationAliasInfo::applyOnEquivalenceClass(
Value v, function_ref<void(Value)> fun) const {		Value v, function_ref<void(Value)> fun) const {
		nicolasvasilache: mention that uConflictingWrite is "the use of %0 in linalg.fill"
auto leaderIt = equivalentInfo.findLeader(v);		auto leaderIt = equivalentInfo.findLeader(v);
for (auto mit = leaderIt, meit = equivalentInfo.member_end(); mit != meit;		for (auto mit = leaderIt, meit = equivalentInfo.member_end(); mit != meit;
++mit) {		++mit) {
		nicolasvasilache: Hmm I had not realized you were also using SSA use-def chain for finding the insertSliceOp. I'll need to think about this more...
fun(mit->v);		fun(mit->v);
		nicolasvasilache: Please spell this out:
		lastWrite <- ...
		uRead <- %t in `tensor.insert_slice`
		uConflictingWrite <- %0 in `linalg.fill`
		it is unclear to me who lastWrite is here.
		springerm: You were probably referring to this part?
		// Requirement 4: No matching ExtractSliceOp/InsertSliceOp pair. If
		// uRead is an InsertSliceOp...
		It does not matter what `lastWrite` is. `lastWrite` is not used for "requirement 4".
}		}
}		}

void BufferizationAliasInfo::printAliases(raw_ostream &os) const {		void BufferizationAliasInfo::printAliases(raw_ostream &os) const {
os << "\n/===================== AliasInfo =====================\n";		os << "\n/===================== AliasInfo =====================\n";
for (auto it = aliasInfo.begin(), eit = aliasInfo.end(); it != eit; ++it) {		for (auto it = aliasInfo.begin(), eit = aliasInfo.end(); it != eit; ++it) {
if (!it->isLeader())		if (!it->isLeader())
continue;		continue;
Value leader = it->getData();		Value leader = it->getData();
os << "|\n| -- leader: " << printValueInfo(leader, /*prefix=*/false)		os << "|\n| -- leader: " << printValueInfo(leader, /*prefix=*/false)
<< '\n';		<< '\n';
for (auto mit = aliasInfo.member_begin(it), meit = aliasInfo.member_end();		for (auto mit = aliasInfo.member_begin(it), meit = aliasInfo.member_end();
		nicolasvasilache: Please add a TODO to replace the magic constant by `insertSliceOp.getDestOpOperand` when available (cc @jpienaar @mehdi_amini @rriddle with whom we discussed this feature)
mit != meit; ++mit) {		mit != meit; ++mit) {
Value v = static_cast<Value>(*mit);		Value v = static_cast<Value>(*mit);
		nicolasvasilache: Please spell this out:
		lastWrite <- ...
		uRead <- %1 in `tensor.insert_slice`
		uConflictingWrite <- %t in `tensor.insert_slice`
		it is unclear to me who lastWrite is here.
		springerm: Added comments for `uRead` and `uConflictingWrite`. The value of `lastWrite` is irrelevant.
os << "| ---- aliasing member: " << printValueInfo(v, /*prefix=*/false)		os << "| ---- aliasing member: " << printValueInfo(v, /*prefix=*/false)
<< '\n';		<< '\n';
}		}
		nicolasvasilache: Please drop this: (Keep in mind that all three results in the example are considered inplace.) marking the ops that bufferize inplace in the example IR is simpler to follow.
}		}
os << "\n/===================== End AliasInfo =====================\n\n";		os << "\n/===================== End AliasInfo =====================\n\n";
}		}

void BufferizationAliasInfo::printEquivalences(raw_ostream &os) const {		void BufferizationAliasInfo::printEquivalences(raw_ostream &os) const {
os << "\n/******************* Equivalent Buffers *******************\n";		os << "\n/******************* Equivalent Buffers *******************\n";
for (auto it = equivalentInfo.begin(), eit = equivalentInfo.end(); it != eit;		for (auto it = equivalentInfo.begin(), eit = equivalentInfo.end(); it != eit;
++it) {		++it) {
Show All 37 Lines
bool BufferizationAliasInfo::areEquivalentExtractSliceOps(		bool BufferizationAliasInfo::areEquivalentExtractSliceOps(
ExtractSliceOp st, InsertSliceOp sti) const {		ExtractSliceOp st, InsertSliceOp sti) const {
if (!st || !sti)		if (!st || !sti)
return false;		return false;
if (!equivalentInfo.isEquivalent(st.source(), sti.dest()))		if (!equivalentInfo.isEquivalent(st.source(), sti.dest()))
return false;		return false;
if (!sameOffsetsSizesAndStrides(st, sti, isEqualConstantIntOrValue))		if (!sameOffsetsSizesAndStrides(st, sti, isEqualConstantIntOrValue))
return false;		return false;
		// TODO: Is the following needed?
if (!equivalentInfo.isEquivalent(st.result(), sti.source()))		if (!equivalentInfo.isEquivalent(st.result(), sti.source()))
return false;		return false;
return true;		return true;
}		}
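
As a rough illustration of the matching test above, the sketch below compares an extract and an insert on a toy slice descriptor. It makes two simplifying assumptions: tensors are compared by a plain integer id rather than by buffer equivalence, and the final check that the extract result is equivalent to the insert source is omitted. `ToySlice`, `sameRegion`, and `matches` are hypothetical names and are not MLIR APIs.

```cpp
#include <cassert>
#include <vector>

// Toy slice descriptor (assumption): `tensor` is the id of the tensor that is
// sliced from (extract) or inserted into (insert).
struct ToySlice {
  int tensor;
  std::vector<long> offsets, sizes, strides;
};

// Both slices address exactly the same region.
static bool sameRegion(const ToySlice &a, const ToySlice &b) {
  return a.offsets == b.offsets && a.sizes == b.sizes &&
         a.strides == b.strides;
}

// An extract/insert pair matches if it touches the same region of the same
// tensor. The real check uses equivalence classes instead of id equality.
static bool matches(const ToySlice &extract, const ToySlice &insert) {
  return extract.tensor == insert.tensor && sameRegion(extract, insert);
}
```

A pair that differs in a single offset fails the test, which is the conservative outcome: the analysis then treats the access as a potential conflict.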

/// Return true if there is a `candidateOp` that would write to memory after
/// bufferization and such that:
/// 1. The written buffer is equivalent to either `aliasingRead` or
/// `aliasingWrite` under the inPlace bufferization decisions taken
/// so far.
/// 2. `aliasingWrite` properly dominates `candidateOp`.
/// 3. `candidateOp` properly dominates `aliasingReadOp`.
// TODO: richer clobbering analysis with container-containee relationship
// instead of equivalence.
bool BufferizationAliasInfo::existsInterleavedValueClobber(
OpOperand &aliasingRead, OpOperand &aliasingWrite,
const DominanceInfo &domInfo) const {
Operation *aliasingReadOp = aliasingRead.getOwner();
Operation *aliasingWriteOp = aliasingWrite.getOwner();
assert(!domInfo.properlyDominates(aliasingReadOp, aliasingWriteOp) &&
"Unexpected aliasingReadOp properly dominates aliasingWriteOp");

for (Value valueToClobber : {aliasingRead.get(), aliasingWrite.get()}) {
auto leaderIt = equivalentInfo.findLeader(valueToClobber);
for (auto mit = leaderIt, meit = equivalentInfo.member_end(); mit != meit;
++mit) {
Operation *candidateOp = mit->v.getDefiningOp();
if (!candidateOp)
continue;
SmallVector<OpOperand *> operands =
getAliasingOpOperand(mit->v.cast<OpResult>());
assert(operands.size() <= 1 && "more than 1 OpOperand not supported yet");
// TODO: Should we check for isInplaceMemoryWrite instead?
if (operands.empty() || !bufferizesToMemoryWrite(*operands.front()))
continue;
LDBG("---->clobbering candidate: " << printOperationInfo(candidateOp)
<< '\n');
if (domInfo.properlyDominates(aliasingWriteOp, candidateOp) &&
domInfo.properlyDominates(candidateOp, aliasingReadOp))
return true;
}
}
return false;
}

/// Return true if there is a write that:
/// 1. Properly dominates aliasingReadOp.
/// 2. Is properly dominated by aliasingWriteOp.
/// 3. Clobbers the write that would be interfering with the read.
///
bool BufferizationAliasInfo::isClobberedWriteBeforeRead(
Operation *opToBufferize, OpOperand &aliasingRead, OpOperand &aliasingWrite,
const DominanceInfo &domInfo) const {
Operation *aliasingReadOp = aliasingRead.getOwner();
Operation *aliasingWriteOp = aliasingWrite.getOwner();
assert(!domInfo.properlyDominates(aliasingReadOp, aliasingWriteOp) &&
"Unexpected aliasingReadOp properly dominates aliasingWriteOp");

// Bail if the write does not dominate the read: it may clobber but only on
// a strict subset of paths, which is not enough for safety.
if (!domInfo.dominates(aliasingWriteOp, aliasingReadOp)) {
LDBG("---->no clobbering: write does not dominate read\n");
return false;
}

// The case `opToBufferize` isa ExtractSliceOp is important enough that we
// look for it specifically. The key information to discover is whether the
// aliasing read or write come from a matching InsertSliceOp.
// Such a pattern is introduced by tiling and is the key inplace condition
// not to miss.
if (auto extractSliceOp = dyn_cast<ExtractSliceOp>(opToBufferize)) {
if (auto insertSliceOp = dyn_cast<InsertSliceOp>(aliasingReadOp)) {
// %1 = extract_slice %0[%offset_sizes_and_strides_1]
//
// ... // 0 or more of inplace compute that reduces to: %X is an
// // aliasingWrite equivalent to %1.
// %W = inplace_write(%1)
//
// // aliasingRead %Y in insert_slice
// ... = insert_slice %W into %R[%offset_sizes_and_strides_1]
if (aliasingRead.get() == insertSliceOp.dest() &&
// TODO: This is currently too restrictive and misses clobberings.
// When available, use container-containee analysis: the condition
// should be that the `aliasingWrite` is contained within
// `insertSliceOp.source()`.
equivalentInfo.isEquivalent(aliasingWrite.get(),
insertSliceOp.source()) &&
areEquivalentExtractSliceOps(extractSliceOp, insertSliceOp)) {
LDBG("---->clobbering matching extract_slice/insert_slice\n");
return true;
}
// %1 = extract_slice %0[%offset_sizes_and_strides_1]
//
// ... // bunch of inplace ops that reduce to %X, equivalent to %1.
// %X = inplace_write(%1)
//
// // aliasingRead %X in insert_slice
// // aliasingWrite %Y in insert_slice
// ... = insert_slice %X into %Y[%offset_sizes_and_strides_1]
if (aliasingReadOp == aliasingWriteOp) {
assert(aliasingRead.get() == insertSliceOp.source() &&
"expected read to source of insert_slice");
assert(aliasingWrite.get() == insertSliceOp.dest() &&
"expected write to dest of insert_slice");
if (areEquivalentExtractSliceOps(extractSliceOp, insertSliceOp)) {
LDBG("---->clobbering matching extract_slice/insert_slice\n");
return true;
}
}
}
}

// General case: look for a properly interleaved clobber of either exactly
// `aliasingRead` or `aliasingWrite`.
// TODO: Relax this to inclusion instead of double inclusion (a.k.a
// equivalence). We will need to compute container-containee relationship.
return existsInterleavedValueClobber(aliasingRead, aliasingWrite, domInfo);
}
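
For contrast with the removed case-by-case clobbering checks above, the conflict definition from the revision summary (a write to an alias that lands after the "most recent write" but before the read) can be sketched on a toy model. This is only an illustration under stated assumptions: `ToyRead`, `ToyWrite` and `hasRaWConflict` are hypothetical names, ops are ordered by a plain integer position in a straight-line program, and aliasing is reduced to comparing buffer ids. The actual analysis works on SSA values, alias sets and dominance rather than on positions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not the MLIR data structures: ops live in a straight-line
// program and are identified by their position in it.
struct ToyRead {
  std::size_t readPos;       // position of the op that performs the read
  std::size_t lastWritePos;  // position of the op that defines the value read
                             // (found by following the use-def chain)
  int buffer;                // hypothetical alias-set id of the buffer read
};

struct ToyWrite {
  std::size_t pos;  // position of the op that performs the write
  int buffer;       // hypothetical alias-set id of the buffer written
};

// Returns true if some write to an aliasing buffer happens strictly after
// the most recent write and strictly before the read.
bool hasRaWConflict(const ToyRead &read, const std::vector<ToyWrite> &writes) {
  assert(read.lastWritePos < read.readPos && "definition must precede use");
  for (const ToyWrite &w : writes) {
    bool aliases = w.buffer == read.buffer;
    bool inBetween = read.lastWritePos < w.pos && w.pos < read.readPos;
    if (aliases && inBetween)
      return true;
  }
  return false;
}
```

In this model a read at position 3 of a value defined at position 0 conflicts with an aliasing write at position 2, but not with a write to a different buffer or with the defining write itself.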

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Forward declarations.		// Forward declarations.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Return the op with Allocate MemoryEffect if `v` is equivalent to an such		/// Return the op with Allocate MemoryEffect if `v` is equivalent to an such
/// an op. Return null otherwise.		/// an op. Return null otherwise.
static Operation *getEquivalentAlloc(Value value,		static Operation *getEquivalentAlloc(Value value,
const BufferizationAliasInfo &aliasInfo);		const BufferizationAliasInfo &aliasInfo);
[... 790 lines not shown; context: auto subviewMemRefType = ...]
.cast<MemRefType>();		.cast<MemRefType>();

// A copy of the source buffer is needed if either:		// A copy of the source buffer is needed if either:
// - The producer of `source` is not inplace. This is the case where a		// - The producer of `source` is not inplace. This is the case where a
// slice is computed out of place into the inplace full tensor.		// slice is computed out of place into the inplace full tensor.
// - The result is not inplace. This is the case where the whole tensor is		// - The result is not inplace. This is the case where the whole tensor is
// cloned and the clone needs to be updated.		// cloned and the clone needs to be updated.
auto inPlace = getInPlace(insertSliceOp->getResult(0));		auto inPlace = getInPlace(insertSliceOp->getResult(0));
		// TODO: Is this necessary?
if (!aliasInfo.isSourceEquivalentToAMatchingInplaceExtractSliceOp(		if (!aliasInfo.isSourceEquivalentToAMatchingInplaceExtractSliceOp(
insertSliceOp) \|\|		insertSliceOp) \|\|
inPlace != InPlaceSpec::True) {		inPlace != InPlaceSpec::True) {
LDBG("insert_slice needs extra source copy: " << insertSliceOp.source()		LDBG("insert_slice needs extra source copy: " << insertSliceOp.source()
<< " -> copy\n");		<< " -> copy\n");
// Take a subview of the dst.		// Take a subview of the dst.
Value subView = b.create<memref::SubViewOp>(		Value subView = b.create<memref::SubViewOp>(
loc, subviewMemRefType, dstMemref, insertSliceOp.getMixedOffsets(),		loc, subviewMemRefType, dstMemref, insertSliceOp.getMixedOffsets(),
[... 788 lines not shown ...]

mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir

[... 358 lines not shown; context: func @nested_extract_slice_and_insert( ...]
%rsA = tensor.insert_slice %FA into %sA[0, 0][4, 4][1, 1] : tensor<4x4xf32> into tensor<?x?xf32>		%rsA = tensor.insert_slice %FA into %sA[0, 0][4, 4][1, 1] : tensor<4x4xf32> into tensor<?x?xf32>
%rA = tensor.insert_slice %rsA into %A[0, 0][%idx, %idx][1, 1] : tensor<?x?xf32> into tensor<?x?xf32>		%rA = tensor.insert_slice %rsA into %A[0, 0][%idx, %idx][1, 1] : tensor<?x?xf32> into tensor<?x?xf32>

// 3-level matching tensor.extract_slice / tensor.insert_slice into		// 3-level matching tensor.extract_slice / tensor.insert_slice into
// inplaceable %B.		// inplaceable %B.
// CHECK-NEXT: tensor.extract_slice		// CHECK-NEXT: tensor.extract_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: tensor.extract_slice		// CHECK-NEXT: tensor.extract_slice
// Atm, this 2nd tensor.extract_slice fails to bufferize inplace because		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
		Inline comment (nicolasvasilache, marked Done): very nice!
// clobbering analysis conservatively test for equivalent buffers.
// TODO: This is currently too restrictive and misses clobberings.
// When available, use container-containee analysis.
// CHECK-SAME: {__inplace_results_attr__ = ["false"]}
// CHECK-NEXT: tensor.extract_slice		// CHECK-NEXT: tensor.extract_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: fill		// CHECK-NEXT: fill
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: tensor.insert_slice		// CHECK-NEXT: tensor.insert_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: tensor.insert_slice		// CHECK-NEXT: tensor.insert_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
[... 359 lines not shown; context: func @insert_slice_chain( ...]
%c0 = arith.constant 0 : index		%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32		%cst = arith.constant 0.000000e+00 : f32

// CHECK: linalg.fill		// CHECK: linalg.fill
// CHECK-SAME: {__inplace_results_attr__ = ["true"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
%0 = linalg.fill(%cst, %arg2) : f32, tensor<62x90xf32> -> tensor<62x90xf32>		%0 = linalg.fill(%cst, %arg2) : f32, tensor<62x90xf32> -> tensor<62x90xf32>

// CHECK: tensor.extract_slice		// CHECK: tensor.extract_slice
// CHECK-SAME: {__inplace_results_attr__ = ["false"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		Inline comment (nicolasvasilache, marked Done): very nice and catching a wrong assumption: this example did not exhibit the need for intersection analysis because the SSA use-def chains already capture the info we want.
// TODO: in order to have this extract_slice bufferize inplace, we need to write a range
// analysis and determine that intersection([0, 32)x[0, 90), [32, 62)x[0, 90)) is empty.
%2 = tensor.extract_slice %0[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>		%2 = tensor.extract_slice %0[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>
// CHECK: vector.transfer_write		// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_results_attr__ = ["true"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
%7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>		%7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>
// CHECK: tensor.insert_slice		// CHECK: tensor.insert_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
%8 = tensor.insert_slice %7 into %0[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>		%8 = tensor.insert_slice %7 into %0[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>

[... 30 lines not shown; context: %r = scf.for %arg0 = %c0 to %c257 step %c256 iter_args(%arg1 = %t) -> (tensor<10x20xf32>) { ...]
%t11 = tensor.extract_slice %t1[0, 0] [5, %y] [1, 1] : tensor<5x?xf32> to tensor<5x?xf32>		%t11 = tensor.extract_slice %t1[0, 0] [5, %y] [1, 1] : tensor<5x?xf32> to tensor<5x?xf32>
%t2 = vector.transfer_write %v, %t11[%c0, %c0] : vector<5x6xf32>, tensor<5x?xf32>		%t2 = vector.transfer_write %v, %t11[%c0, %c0] : vector<5x6xf32>, tensor<5x?xf32>
%t3 = tensor.insert_slice %t2 into %arg1[%x, 0] [5, %y] [1, 1] : tensor<5x?xf32> into tensor<10x20xf32>		%t3 = tensor.insert_slice %t2 into %arg1[%x, 0] [5, %y] [1, 1] : tensor<5x?xf32> into tensor<10x20xf32>
scf.yield %t3 : tensor<10x20xf32>		scf.yield %t3 : tensor<10x20xf32>
}		}
return %r : tensor<10x20xf32>		return %r : tensor<10x20xf32>
}		}

		// -----

		#accesses = [
		affine_map<(i) -> (i)>,
		affine_map<(i) -> (i)>,
		affine_map<(i) -> (i)>
		]
		#trait = {
		indexing_maps = #accesses,
		iterator_types = ["parallel"]
		}

		// CHECK-LABEL: func @linalg_op_same_out_tensors
		func @linalg_op_same_out_tensors(
		%t1: tensor<?xf32> {linalg.inplaceable = true},
		%t2: tensor<?xf32> {linalg.inplaceable = true}) -> (tensor<?xf32>, tensor<?xf32>){

		// CHECK: linalg.generic
		// CHECK-SAME: {__inplace_results_attr__ = ["true", "false"]
		%o:2 = linalg.generic #trait ins(%t1 : tensor<?xf32>)
		outs (%t2, %t2 : tensor<?xf32>, tensor<?xf32>) {
		^bb(%0: f32, %1: f32, %2 : f32) :
		linalg.yield %0, %0 : f32, f32
		} -> (tensor<?xf32>, tensor<?xf32>)
		return %o#0, %o#1 : tensor<?xf32>, tensor<?xf32>
		}

		// -----

		// CHECK-LABEL: func @double_insert_slice_into_alias
		func @double_insert_slice_into_alias(
		%v1: vector<32x90xf32>,
		%v2: vector<30x90xf32>,
		%arg2: tensor<62x90xf32> {linalg.inplaceable = true},
		%s1: index, %s2: index, %s3: index, %s4: index)
		-> (tensor<62x90xf32>, tensor<?x?xf32>)
		{
		%c0 = arith.constant 0 : index

		// Cannot bufferize inplace this extract_slice because both operand and result
		// are modified and returned separately.
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["false"]
		%e = tensor.extract_slice %arg2[%s1, %s2][%s3, %s4][1, 1] : tensor<62x90xf32> to tensor<?x?xf32>

		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%2 = tensor.extract_slice %arg2[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>
		// CHECK: vector.transfer_write
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>
		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%8 = tensor.insert_slice %7 into %arg2[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>

		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%10 = tensor.extract_slice %e[32, 0] [30, 90] [1, 1] : tensor<?x?xf32> to tensor<30x90xf32>
		// CHECK: vector.transfer_write
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%14 = vector.transfer_write %v2, %10[%c0, %c0] {in_bounds = [true, true]} : vector<30x90xf32>, tensor<30x90xf32>
		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%15 = tensor.insert_slice %14 into %e[32, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<?x?xf32>

		return %8, %15 : tensor<62x90xf32>, tensor<?x?xf32>
		}

		// -----

		// CHECK-LABEL: func @interleaved_extract_insert_slice_chain_1
		func @interleaved_extract_insert_slice_chain_1(
		%arg2: tensor<62x90xf32> {linalg.inplaceable = true})
		-> (tensor<62x90xf32>)
		{
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%2 = tensor.extract_slice %arg2[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>

		// TODO: This should bufferize inplace once we have a proper range analysis.
		Inline comment (nicolasvasilache, marked Not Done): Mention the intersection / copy a little more of the other comment here?
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["false"]
		%10 = tensor.extract_slice %arg2[32, 0] [30, 90] [1, 1] : tensor<62x90xf32> to tensor<30x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%8 = tensor.insert_slice %2 into %arg2[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%15 = tensor.insert_slice %10 into %8[32, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<62x90xf32>

		return %15 : tensor<62x90xf32>
		}

		// -----

		// CHECK-LABEL: func @interleaved_extract_insert_slice_chain_2
		func @interleaved_extract_insert_slice_chain_2(
		%arg2: tensor<62x90xf32> {linalg.inplaceable = true})
		-> (tensor<62x90xf32>)
		{
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%2 = tensor.extract_slice %arg2[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>

		// The slices are overlapping, so this can never bufferize inplace.
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["false"]
		%10 = tensor.extract_slice %arg2[31, 0] [30, 90] [1, 1] : tensor<62x90xf32> to tensor<30x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%8 = tensor.insert_slice %2 into %arg2[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%15 = tensor.insert_slice %10 into %8[31, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<62x90xf32>

		return %15 : tensor<62x90xf32>
		}
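
The two `interleaved_extract_insert_slice_chain` tests above differ only in whether the two slices of `%arg2` overlap, which is what the range analysis mentioned in the TODO would have to decide. A minimal sketch of such an overlap test for static offsets and sizes with unit strides follows. `Slice2D` and `slicesOverlap` are hypothetical names introduced for illustration; nothing like them is part of this revision, and dynamic offsets or non-unit strides are not handled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical description of a static 2-D slice with unit strides:
// dimension d covers the half-open range [offsets[d], offsets[d] + sizes[d]).
struct Slice2D {
  std::array<int64_t, 2> offsets;
  std::array<int64_t, 2> sizes;
};

// Two hyperrectangles overlap iff their ranges overlap in every dimension.
// An empty slice (some size == 0) overlaps with nothing.
bool slicesOverlap(const Slice2D &a, const Slice2D &b) {
  for (int d = 0; d < 2; ++d) {
    assert(a.sizes[d] >= 0 && b.sizes[d] >= 0 && "sizes must be non-negative");
    if (a.sizes[d] == 0 || b.sizes[d] == 0)
      return false;
    int64_t aBegin = a.offsets[d], aEnd = aBegin + a.sizes[d];
    int64_t bBegin = b.offsets[d], bEnd = bBegin + b.sizes[d];
    // Disjoint in this dimension means disjoint overall.
    if (aEnd <= bBegin || bEnd <= aBegin)
      return false;
  }
  return true;
}
```

With the values from the tests, `[0, 0] [32, 90]` against `[32, 0] [30, 90]` does not overlap (rows `[0, 32)` and `[32, 62)` are disjoint), while `[0, 0] [32, 90]` against `[31, 0] [30, 90]` does, since row 31 belongs to both slices.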


Diff 379897
