This is an archive of the discontinued LLVM Phabricator instance.

Differential D111287

[mlir][linalg][bufferize] Rewrite RaW conflict detection
ClosedPublic

Authored by springerm on Oct 6 2021, 11:03 PM.

Details

Reviewers

nicolasvasilache

Commits

rGd3cb6bf2d462: [mlir][linalg][bufferize] Rewrite conflict detection

Summary

For each memory read, follow SSA use-def chains to find the op that produces the data being read (i.e., the most recent write). A memory write to an alias is a conflict if it takes place after the "most recent write" but before the read.

This CL introduces two main changes:

There is a concise definition of a conflict. Given a piece of IR with InPlaceSpec annotations and a computed alias set, it is easy to compute whether this program has a conflict. No need to consider multiple cases such as "read of operand after in-place write" etc.
No need to check for clobbering.
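
As a rough illustration of the conflict definition in this summary (a minimal sketch, not the code from this revision): in a straight-line program every op can be identified by its position, so "after the most recent write but before the read" reduces to an index comparison. The names `ToyRead` and `hasRaWConflict` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model of a straight-line program: every op is identified
// by its position in the block. These are not MLIR data structures.
struct ToyRead {
  int readPos;      // position of the op that reads the buffer
  int lastWritePos; // position of the op that produced the data being read
};

// Returns true if some write to an aliasing buffer happens strictly after the
// most recent write and strictly before the read.
bool hasRaWConflict(const ToyRead &read,
                    const std::vector<int> &aliasingWritePositions) {
  for (int writePos : aliasingWritePositions)
    if (read.lastWritePos < writePos && writePos < read.readPos)
      return true;
  return false;
}
```

With a read at position 4 whose data was produced at position 1, an aliasing write at position 3 is a conflict in this model, while aliasing writes at position 1 (the last write itself) or position 5 (after the read) are not.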

Depends On D111040

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

springerm created this revision. Oct 6 2021, 11:03 PM

Herald added subscribers: wenzhicui, wrengr, Chia-hungDuan and 20 others. Oct 6 2021, 11:03 PM

springerm requested review of this revision. Oct 6 2021, 11:03 PM

Herald added a project: Restricted Project. Oct 6 2021, 11:03 PM

Herald added subscribers: limo1996, stephenneuendorffer.

springerm retitled this revision from [mlir][linalg][bufferize] Rewrite conflict detection to [mlir][linalg][bufferize] Rewrite RaW conflict detection. Oct 6 2021, 11:14 PM

Harbormaster completed remote builds in B127447: Diff 377749. Oct 6 2021, 11:18 PM

Very nice improvement!

Please capture in the commit message and where appropriate in doc comments something like:

this version replaces alias-based analysis of clobbering with true last write analysis based on SSA use-def chains
the information captured is strictly better

I still need to check the new impl of wouldCreateReadAfterWriteInterference in more detail, the rest looks good modulo the comments I made.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
582–583	Turn this into a warning and degrade gracefully? Is this part of this CL?
825	There are TODOs that getAliasingOpOperand should return a list of operands. With the SSA-based analysis here, I think we need to tie this loose end and support the list of operands and traverse them all. For now I would just evolve the API and fail this analysis if I wonder whether there be considerations related to equivalence in the analysis otherwise the last write may be partial and this could be another sack of knots. OTOH equivalence classes are not yet formed as you go up the use-def chain. I think this should be using getInplaceableOpResult on each opOperand considered to check this and the last write needs to be a full write (and in the future it could be a write to a bigger region that fully covers)
883	s/"a new conflict can **only** be introduced" etc?
907	do we want to give up on all the debug messages ?
929–932	"Would inplace bufferization of `op` create a conflict?"
mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir
367	very nice!
743	very nice and catching a wrong assumption: this example did not exhibit the need for intersection analysis because the SSA use-def chains already capture the info we want.
869	Mention the intersection / copy a little more of the other comment here ?

rebase

Harbormaster completed remote builds in B127505: Diff 377824. Oct 7 2021, 6:33 AM

rebase

springerm marked an inline comment as done. Oct 7 2021, 7:35 AM

springerm added inline comments.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
582–583	I think this can stay as is. The CallOpInterface case was missing.
582–584	I missed this before sending out for review. Have to see what's going on here...

improve handling of uWrite

Harbormaster completed remote builds in B127524: Diff 377849. Oct 7 2021, 8:04 AM

nicolasvasilache added a subscriber: jpienaar. Oct 7 2021, 11:56 AM

nicolasvasilache added inline comments.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
582–583	seems CallOpInterface addition should be split out in its own CL
950	findLastPrecedingWrite? Reverse use-def chain is an impl detail.
970	As we had discussed, not being in between the ops is not good enough in the non-straight-line case. If either readingOp dominates conflictingWritingOp, or conflictingWritingOp dominates lastWrite, then continue; otherwise no.
975	I do not understand this case. Can you elaborate?
980	Can you also keep terminology like "no self-conflict" or a "use cannot conflict with itself" ?
988	I would add the inplace notation to the production of %1 and %2 (and would drop this line). %0 is not yet inplace, we need to determine if inplace creates a conflict. E.g.
// %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1] // can this bufferize inplace?
// %1 = linalg.fill %cst, %0 // bufferizes inplace
// %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1] // bufferizes inplace
991	mention that uConflictingWrite is "the use of %0 in linalg.fill"
994	Hmm I had not realized you were also using SSA use-def chain for finding the insertSliceOp. I'll need to think about this more...
995	Please spell this out:
lastWrite <- ...
uRead <- %t in `tensor.insert_slice`
uConflictingWrite <- %0 in `linalg.fill`
It is unclear to me who lastWrite is here.
1007	Please add a TODO to replace the magic constant by `insertSliceOp.getDestOpOperand` when available (cc @jpienaar @mehdi_amini @rriddle with whom we discussed this feature)
1009	Please spell this out:
lastWrite <- ...
uRead <- %1 in `tensor.insert_slice`
uConflictingWrite <- %t in `tensor.insert_slice`
It is unclear to me who lastWrite is here.
1012	Please drop this: (Keep in mind that all three results in the example are considered inplace.) marking the ops that bufferize inplace in the example IR is simpler to follow.
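
To make the reverse use-def walk behind the proposed name `findLastPrecedingWrite` more concrete, here is a minimal sketch on a toy IR. It only mirrors the control flow of `findValueInReverseUseDefChain` from the diff; `ToyValue`, `findInReverseChain`, `exampleChain` and `isWrite` are hypothetical names, values are plain indices, and an `aliasingOperand` of -1 stands for both a block argument and an op without an aliasing operand.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical toy IR, not the MLIR classes.
struct ToyValue {
  int aliasingOperand; // value consumed by the aliasing operand, or -1 if none
  bool definedByWrite; // does the defining op write to memory via that operand?
};

// Walk from `start` towards the definitions. Return the first value whose
// defining op satisfies `condition`. If the chain ends first, return the last
// value when `returnLast` is set, and -1 otherwise.
int findInReverseChain(const std::vector<ToyValue> &values, int start,
                       const std::function<bool(const ToyValue &)> &condition,
                       bool returnLast = false) {
  int current = start;
  while (values[current].aliasingOperand != -1) {
    if (condition(values[current]))
      return current;
    current = values[current].aliasingOperand;
  }
  // End of the chain: block argument or op without an aliasing operand.
  return returnLast ? current : -1;
}

// Example chain: v0 is a block argument, v1 writes into v0, v2 and v3 are
// slices that only alias their operand.
std::vector<ToyValue> exampleChain() {
  return {{-1, false}, {0, true}, {1, false}, {2, false}};
}

bool isWrite(const ToyValue &value) { return value.definedByWrite; }
```

Starting from v3, the walk skips the two slices and stops at v1, which plays the role of the last preceding write in this model.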

rebase

Harbormaster completed remote builds in B127673: Diff 378079. Oct 7 2021, 8:35 PM

address comments

Harbormaster completed remote builds in B128022: Diff 378548. Oct 10 2021, 7:03 PM

address comments

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
907	I think we should keep some of them. These particular ones were not very helpful for my debugging. They actually cluttered the output quite a lot. (And they are somewhat repetitive.) I think the important ones are "READ =", "CONFLICTING WRITE =", "WRITE =" below. What do you think?
970	This is exactly what `isOpBetweenValueAndOp` is doing. Maybe the function name is not good. Or should I just merge `isOpBetweenValueAndOp` back into `hasReadAfterWriteInterference`?
975	The for loop (`uConflictingWrite`) iterates over all writes. Including the one found by `findLastPrecedingWrite`. `isOpBetweenValueAndOp` checks for proper domination and does not handle the case where both are the same. This case is similar to the case below. In both cases, we want to check if an op `O` is in-between two things. These two checks (requirement 2, requirement 3) handle the case where `O` is the boundary (upper boundary in case of requirement 2 and lower boundary in case of requirement 3 when you think about reading code from top to bottom).
988	The analysis is independent of which OpOperand we are trying to bufferize at the moment. We simply want to know, given a piece of IR and bufferization decisions, is there a conflict. We may as well be bufferizing %2 in the example and %0 is already bufferized. (The heuristic imposes a certain order of bufferization, but the analysis should work with any order.) Added the inplace annotation to all 3 ops.
995	You were probably referring to this part? // Requirement 4: No matching ExtractSliceOp/InsertSliceOp pair. If // uRead is an InsertSliceOp... It does not matter what `lastWrite` is. `lastWrite` is not used for "requirement 4".
1009	Added comments for `uRead` and `uConflictingWrite`. The value of `lastWrite` is irrelevant.

Harbormaster completed remote builds in B128040: Diff 378571. Oct 10 2021, 11:27 PM

springerm added inline comments. Oct 11 2021, 11:07 PM

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
970	This is actually trickier than I thought. I'm not 100% sure what's the right way to handle this with branches. But for the straight-line case this is definitely correct. I would suggest keeping this as is for now and updating this when we add support for scf::IfOp. (I'm working on that right now.) Then we have concrete examples that we can look at. I'm afraid, we may be missing cases otherwise.

nicolasvasilache added inline comments. Oct 12 2021, 8:14 AM

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
819	Can you replace all this with a getBackwardSlice that would use a filter to set true when you encounter your condition? This should be more general and avoids duplicating code. The only downside is that the SetVector is filled but as soon as you need to support more than a straight line this is needed anyway.
853	you should be able to plug getBackwardSlice here and just drop everything else.
907	yup, removing clutter is def important, if this is the set that you found useful debugging let's go with it

nicolasvasilache added inline comments. Oct 12 2021, 11:59 AM

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
838	You should be able to fold this into req 1. (once it is implemented as "mayOpBeBetweenValueAndOp") if you use `dominates` instead of `properlyDominates` in the right place.
841	You have 2 subcases with an `InsertSliceOp` here. Do you know that you don't need more or is this still TBD ? Is it possible some case is missing in light of my comment on the impl. of `getAliasingOpResult` below ?
970	It should be "mayOpBeBetweenValueAndOp" and you can only disprove the cases that I listed. Anything else is more tricky. Do you have issues if trying to implement this simple change now? If so, we need to understand the cases that break since you are changing the conflict detection algorithm. This needs to be done as part of this revision.
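
The "may be in between unless disproved" idea from the comment on line 970 can be sketched on a toy IR with one if-like op. This is only an illustration under simplifying assumptions: `ToyOp`, `ToyProgram`, `toyProperlyDominates` and `mayBeConflict` are hypothetical, and the toy dominance check is a stand-in that does not claim to match `DominanceInfo::properlyDominates` in every case.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy IR with structured control flow.
struct ToyOp {
  int block; // index of the block that contains the op
  int pos;   // position of the op inside that block
};
struct ToyProgram {
  std::vector<ToyOp> ops;
  std::vector<int> blockParentOp; // enclosing op per block, -1 for top level
};

// Toy stand-in for a proper-dominance query: walk `b` up to its ancestor in
// the block of `a`, then compare positions. Returns false if no such
// ancestor exists, e.g. when the two ops sit in different branches.
bool toyProperlyDominates(const ToyProgram &p, int a, int b) {
  int current = b;
  while (p.ops[current].block != p.ops[a].block) {
    int parent = p.blockParentOp[p.ops[current].block];
    if (parent == -1)
      return false;
    current = parent;
  }
  return p.ops[a].pos < p.ops[current].pos;
}

// A write is ruled out as a conflict only if one of the two facts is proven.
// Everything else is conservatively treated as "may be in between".
bool mayBeConflict(const ToyProgram &p, int lastWrite, int conflictingWrite,
                   int read) {
  if (toyProperlyDominates(p, read, conflictingWrite))
    return false; // the write happens after the read
  if (toyProperlyDominates(p, conflictingWrite, lastWrite))
    return false; // the write happens before the last write
  return true;
}

// op0 (top level), op1 = if-like op (top level), op2 in its "then" block,
// op3 in its "else" block, op4 (top level).
ToyProgram exampleProgram() {
  ToyProgram p;
  p.ops = {{0, 0}, {0, 1}, {1, 0}, {2, 0}, {0, 2}};
  p.blockParentOp = {-1, 1, 1};
  return p;
}
```

In this model neither branch op dominates the other, so a negated dominance query says nothing about execution order. That is why the sketch only ever concludes "no conflict" from a positive answer.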

address comments

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
819	I tried this, but unfortunately, `getBackwardSlice` only gives us `Operation` and not `OpOperand` or `Value`. This would work fine if all ops had only one operand and one output. This is not something that can be fixed by specifying a fancy filter function. E.g., there's no way to filter for OpOperands that bufferize to a memory write. We may want to add another variant of `getBackwardSlice`. One that returns a SetVector<Value> and has a `std::function<bool(Value)>` condition.
825	Sorry, I can no longer see which line of code this comment is associated to. Wrt. to `lastWrite`, there may be multiple in the case of branches. Whatever analysis we do here, stays the same. It just has to be done (and hold) for every `lastWrite` that we found (`llvm::all_of`).
838	We have to make sure that it is the exact same OpOperand and not two different ones of the same op. Otherwise, things could break around bufferization of InsertSliceOp etc.
841	I did not come across any other cases in our current examples, so I think we are good here. Note, these are cases in which there is no conflict. Even if we miss a case, bufferization will work correctly. It may just introduce unnecessary copies.
970	I merged the `isOpBetween` function back into the caller. Makes it easier to follow the flow of the function. Thinking about double negations etc. just adds unnecessary complexity.
In summary, what we are looking for: There is no conflict if:
properlyDominates(readingOp, conflictingWritingOp) or
properlyDominates(conflictingWritingOp, writingOp)
Let's not think about "in-between" or "maybe in-between". Instead, think of situations when there is no conflict.
Lessons learned:
We should use `properlyDominates` instead of `dominates`. The case where the two ops are the same needs special rules (e.g., has to be the exact same use).
Do not use `!dominates` or `!properlyDominates`. This gets tricky when two ops are in two different branches. However, if `properlyDominates(A, B)` says `true`, we can be certain that `A` is before `B`, regardless of the absence or presence of branches. In our code, this is always the safe solution. Worst case, even if we miss a case, we do not `continue` and report something as a conflict that should not be a conflict.
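
The value-based backward slice floated in the reply on line 819 above does not exist in this revision. A rough sketch of what such a helper could look like, on a toy graph instead of MLIR values, is given here. All names (`ToyGraph`, `toyBackwardValueSlice`, `exampleGraph`, `exampleIsWrite`) are hypothetical, and a `std::vector` plus `std::set` stands in for `SetVector<Value>`.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Hypothetical toy IR: operands[i] lists the values that value i is computed
// from (empty for block arguments).
using ToyGraph = std::vector<std::vector<int>>;

// Collect, in discovery order and without duplicates, every value reachable
// backwards from `root` (including `root`) for which `condition` holds.
std::vector<int>
toyBackwardValueSlice(const ToyGraph &operands, int root,
                      const std::function<bool(int)> &condition) {
  std::vector<int> slice;
  std::set<int> visited;
  std::vector<int> worklist = {root};
  while (!worklist.empty()) {
    int value = worklist.back();
    worklist.pop_back();
    if (!visited.insert(value).second)
      continue; // already handled
    if (condition(value))
      slice.push_back(value);
    for (int operand : operands[value])
      worklist.push_back(operand);
  }
  return slice;
}

// Example: v2 = op(v0, v1), v3 = op(v2, v0).
ToyGraph exampleGraph() {
  ToyGraph graph(4);
  graph[2] = {0, 1};
  graph[3] = {2, 0};
  return graph;
}

// In this example, v0 and v2 are assumed to be produced by memory writes.
bool exampleIsWrite(int value) { return value == 0 || value == 2; }
```

Unlike the single-chain walk in the diff, this variant follows every operand, which is the situation the reply describes for ops with more than one operand.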

Harbormaster completed remote builds in B128759: Diff 379576. Oct 13 2021, 6:56 PM

springerm added a child revision: D111775: [mlir][linalg][bufferize] Handle scf::ForOp correctly in bufferizesToMemoryRead. Oct 13 2021, 10:09 PM

Great, thanks!

This revision is now accepted and ready to land. Oct 14 2021, 7:11 AM

This revision was landed with ongoing or failed builds. Oct 14 2021, 6:32 PM

Closed by commit rGd3cb6bf2d462: [mlir][linalg][bufferize] Rewrite conflict detection (authored by springerm).

This revision was automatically updated to reflect the committed changes.

springerm added a commit: rGd3cb6bf2d462: [mlir][linalg][bufferize] Rewrite conflict detection.

Revision Contents

Path	Size
mlir/include/mlir/Dialect/Linalg/Transforms/ComprehensiveBufferize.h	85 lines
mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp	451 lines
mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir	135 lines

Diff 377824

mlir/include/mlir/Dialect/Linalg/Transforms/ComprehensiveBufferize.h

[65 lines collapsed; context: public:]

   /// Set the inPlace bufferization spec to true.
   /// Merge result's and operand's aliasing sets and iterate to a fixed point.
   void bufferizeInPlace(OpResult result, OpOperand &operand);

   /// Set the inPlace bufferization spec to false.
   void bufferizeOutOfPlace(OpResult result);

-  /// Return true if it is possible to find an inplace write W among `usesWrite`
-  /// and a read R among `usesRead`, such that W and R interfere.
-  /// Such a (W, R) pair is an interference to the inplace bufferization of
-  /// opResult when:
-  ///   1. R is not known properly dominate W (i.e. the effects of the write may
-  ///      be visible from R).
-  ///   2. one cannot find an intermediate clobbering write `C` to W, such that
-  ///      C interleaved between W and R (i.e. W -> C -> R where -> denotes
-  ///      dominance).
-  bool wouldCreateReadAfterWriteInterference(
-      Operation *opToBufferize, DenseSet<OpOperand *> &usesRead,
-      DenseSet<OpOperand *> &usesWrite, const DominanceInfo &domInfo) const;
+  /// Return true if `value` has an ExtractSliceOp matching the given
+  /// InsertSliceOp in its reverse SSA use-def chain.
+  bool hasMatchingExtractSliceOp(Value value,
+                                 tensor::InsertSliceOp insertOp) const;

   /// Assume that result bufferizes in-place with one of the operation's
-  /// operands. Return true if it is possible to find an inplace write W (resp.
-  /// a read R) among the uses of `aliasInfo[result]`, and a read R (resp. an
-  /// inplace write W) among the uses of
-  /// `aliasInfo[getAliasingOpOperand(result)]`, such that W and R interfere.
-  /// Interference detection is needed to determine which cases may bufferize
-  /// inplace without interferences. Such cases comprise:
-  ///
-  /// ```
-  ///  %0 = op_to_bufferize(%1)
-  ///  read(%1)
-  ///
-  ///  %0 = op_to_bufferize(%1)
-  ///  write(%0)
-  ///  read(%1)
-  ///
-  ///  %0 = op_to_bufferize(%1)
-  ///  write(%1)
-  ///  read(%0)
-  /// ```
+  /// operands. Return true if it is possible to find an inplace write W that
+  /// creates a conflict.
   bool
   wouldCreateReadAfterWriteInterference(OpOperand &operand, OpResult result,
                                         const DominanceInfo &domInfo) const;

   /// Return true if bufferizing `opOperand` inplace with `opResult` would
   /// create a write to a non-writable buffer.
   bool wouldCreateWriteToNonWritableBuffer(OpOperand &opOperand,
                                            OpResult opResult) const;

[51 lines collapsed; context: private:]

   /// equivalent operand / result and same offset/sizes/strides specification).
   ///
   /// This is one particular type of relationship between ops on tensors that
   /// reduce to an equivalence on buffers. This should be generalized and
   /// exposed as interfaces on the proper types.
   bool areEquivalentExtractSliceOps(tensor::ExtractSliceOp st,
                                     tensor::InsertSliceOp sti) const;

-  /// Return true if there is a `candidateOp` that would write to memory after
-  /// bufferization and such that:
-  ///   1. The written buffer is equivalent to either `aliasingRead` or
-  ///      `aliasingWrite` under the inPlace bufferization decisions taken
-  ///      so far.
-  ///   2. `aliasingWrite` properly dominates `candidateOp`.
-  ///   3. `candidateOp` properly dominates `aliasingReadOp`.
-  // TODO: richer clobbering analysis with container-containee relationship
-  // instead of equivalence.
-  bool existsInterleavedValueClobber(OpOperand &aliasingRead,
-                                     OpOperand &aliasingWrite,
-                                     const DominanceInfo &domInfo) const;
-
-  /// Return true if there is a write that:
-  ///   1. Properly dominates aliasingReadOp.
-  ///   2. Is properly dominated by aliasingWriteOp.
-  ///   3. Clobbers the write that would be interfering with the read.
-  ///
-  /// Case discussion:
-  /// ================
-  /// Case 1: opOperand is produced by opToBufferize,
-  /// Case 2: opResult is produced by opToBufferize,
-  /// Common case:
-  ///   - aliasingReadOp is a read to an alias of opOperand.
-  ///   - aliasingWriteOp is an inplace write to an alias of opResult.
-  ///   - aliasingWriteOp dominates aliasingReadOp.
-  ///
-  /// ```
-  ///    // Either case 1:
-  ///    %opOperand = opToBufferize(%opResult)
-  ///    aliasingWriteOp(%aliasingWrite = alias(%opResult)) // inplace
-  ///     aliasingReadOp( %aliasingRead = alias(%opOperand))
-  /// ```
-  ///
-  /// ```
-  ///    // Or case 2:
-  ///    %opResult = opToBufferize(%opOperand)
-  ///    aliasingWriteOp(%aliasingWrite = alias(%opResult)) // inplace
-  ///     aliasingReadOp( %aliasingRead = alias(%opOperand))
-  /// ```
-  ///
-  /// Capture possible cases where `aliasingWriteOp(alias(%opResult))` has no
-  /// visible effect on `aliasingReadOp(alias(%opOperand))`.
-  bool isClobberedWriteBeforeRead(Operation *opToBufferize,
-                                  OpOperand &aliasingRead,
-                                  OpOperand &aliasingWrite,
-                                  const DominanceInfo &domInfo) const;
-
   /// Set of tensors that are known to bufferize to writable memory.
   llvm::DenseSet<Value> bufferizeToWritableMemory;

   /// Auxiliary structure to store all the values a given value aliases with.
   /// These are the conservative cases that can further decompose into
   /// "equivalent" buffer relationships.
   llvm::EquivalenceClasses<ValueWrapper> aliasInfo;

[22 lines collapsed]

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp

Show First 20 Lines • Show All 573 Lines • ▼ Show 20 Lines	return TypeSwitch<Operation , OpOperand >(result.getDefiningOp())
})		})
.Case([&](TiledLoopOp op) {		.Case([&](TiledLoopOp op) {
// TODO: TiledLoopOp helper method to avoid leaking impl details.		// TODO: TiledLoopOp helper method to avoid leaking impl details.
return &op->getOpOperand(op.getNumControlOperands() +		return &op->getOpOperand(op.getNumControlOperands() +
op.getNumInputs() + result.getResultNumber());		op.getNumInputs() + result.getResultNumber());
})		})
.Case([&](vector::TransferWriteOp op) { return &op->getOpOperand(1); })		.Case([&](vector::TransferWriteOp op) { return &op->getOpOperand(1); })
.Default([&](Operation *op) {		.Default([&](Operation *op) {
op->dump();		// op->dump();
llvm_unreachable("unexpected defining op");		// llvm_unreachable("unexpected defining op");
		nicolasvasilacheUnsubmitted Done Reply Inline Actions Turn this into a warning a degrade gracefully? Is this part of this CL ? nicolasvasilache: Turn this into a warning a degrade gracefully? Is this part of this CL ?
		springermAuthorUnsubmitted Done Reply Inline Actions I think this can stay as is. The CallOpInterface case was missing. springerm: I think this can stay as is. The CallOpInterface case was missing.
		nicolasvasilacheUnsubmitted Done Reply Inline Actions seems CallOpInterface addition should be split out in its own CL nicolasvasilache: seems CallOpInterface addition should be split out in its own CL
return nullptr;		return nullptr;
		springermAuthorUnsubmitted Done Reply Inline Actions I missed this before sending out for review. Have to see what's going on here... springerm: I missed this before sending out for review. Have to see what's going on here...
});		});
}		}

/// If the an ExtractSliceOp is bufferized in-place, the source operand will		/// If the an ExtractSliceOp is bufferized in-place, the source operand will
/// alias with the result.		/// alias with the result.
static OpResult getAliasingOpResult(ExtractSliceOp op, OpOperand &opOperand) {		static OpResult getAliasingOpResult(ExtractSliceOp op, OpOperand &opOperand) {
if (op.source() == opOperand.get())		if (op.source() == opOperand.get())
return op->getResult(0);		return op->getResult(0);
▲ Show 20 Lines • Show All 213 Lines • ▼ Show 20 Lines	void BufferizationAliasInfo::bufferizeInPlace(OpResult result,
LLVM_DEBUG(dumpEquivalences());		LLVM_DEBUG(dumpEquivalences());
}		}

/// Set the inPlace bufferization spec to false.		/// Set the inPlace bufferization spec to false.
void BufferizationAliasInfo::bufferizeOutOfPlace(OpResult result) {		void BufferizationAliasInfo::bufferizeOutOfPlace(OpResult result) {
setInPlaceOpResult(result, InPlaceSpec::False);		setInPlaceOpResult(result, InPlaceSpec::False);
}		}

/// Return true if it is possible to find an inplace write W among `usesWrite`		/// Starting from `value`, follow the use-def chain in reverse, always selecting
/// and a read R among `usesRead`, such that W and R interfere.		/// the corresponding aliasing OpOperand. Try to find and return a Value for
bool BufferizationAliasInfo::wouldCreateReadAfterWriteInterference(		/// which `condition` evaluates to true for the aliasing OpOperand. Return an
Operation opToBufferize, DenseSet<OpOperand > &usesRead,		/// empty Value if no such Value was found. If `returnLast`, return the last
DenseSet<OpOperand *> &usesWrite, const DominanceInfo &domInfo) const {		/// Value (at the end of the chain), even if it does not satisfy the condition.
for (OpOperand *uRead : usesRead) {		static Value
		nicolasvasilacheUnsubmitted Done Reply Inline Actions Can you replace all this with a getBackwardSlice that would use a filter to set true when you encounter your condition? This should be more general and avoids duplicating code. The only downside is that the SetVector is filled but as soon as you need to support more than a straight line this is needed anyway. nicolasvasilache: Can you replace all this with a getBackwardSlice that would use a filter to set true when you…
		springermAuthorUnsubmitted Done Reply Inline Actions I tried this, but unfortunately, `getBackwardSlice` only gives us `Operation` and not `OpOperand` or `Value`. This would work fine if all ops had only one operand and one output. This is not something that can be fixed by specifying a fancy filter function. E.g., we there's no way to filter for OpOperands that bufferize to a memory write. We may want to add another variant of `getBackwardSlice`. One that returns a SetVector<Value> and has a `std::function<bool(Value)>` condition. springerm: I tried this, but unfortunately, `getBackwardSlice` only gives us `Operation*` and not…
Operation *aliasingReadOp = uRead->getOwner();		findValueInReverseUseDefChain(Value value,
LDBG("----++++aliasRead -> #"		std::function<bool(OpOperand &)> condition,
<< uRead->getOperandNumber()		bool returnLast = false) {
<< " in: " << printOperationInfo(aliasingReadOp) << '\n');		while (value.isa<OpResult>()) {
for (OpOperand *uWrite : usesWrite) {		auto opResult = value.cast<OpResult>();
// The same operand may both read and write.		OpOperand *opOperand = getAliasingOpOperand(opResult);
		nicolasvasilacheUnsubmitted Not Done Reply Inline Actions There are TODOs that getAliasingOpOperand should return a list of operands. With the SSA-based analysis here, I think we need to tie this loose end and support the list of operands and traverse them all. For now I would just evolve the API and fail this analysis if I wonder whether there be considerations related to equivalence in the analysis otherwise the last write may be partial and this could be another sack of knots. OTOH equivalence classes are not yet formed as you go up the use-def chain. I think this should be using getInplaceableOpResult on each opOperand considered to check this and the last write needs to be a full write (and in the future it could be a write to a bigger region that fully covers) nicolasvasilache: There are TODOs that getAliasingOpOperand should return a list of operands. With the SSA-based…
		springermAuthorUnsubmitted Done Reply Inline Actions Sorry, I can no longer see which line of code this comment is associated to. Wrt. to `lastWrite`, there may be multiple in the case of branches. Whatever analysis we do here, stays the same. It just has to be done (and hold) for every `lastWrite` that we found (`llvm::all_of`). springerm: Sorry, I can no longer see which line of code this comment is associated to. Wrt. to…
// Don't consider self-use of the same operand for interference.		if (!opOperand)
// Multiple different uses within the same op is fair game though.		// No aliasing OpOperand. This could be an unsupported op or an op without
if (uWrite == uRead)		// a tensor arg such as InitTensorOp. This is the end of the chain.
continue;		return returnLast ? value : Value();
		if (condition(*opOperand))
Operation *aliasingWriteOp = uWrite->getOwner();		return value;
LDBG("---- aliasWrite -> #"		value = opOperand->get();
<< uWrite->getOperandNumber()		}
<< " in: " << printOperationInfo(aliasingWriteOp) << '\n');		// Value is a BlockArgument. Reached the end of the chain.
// If the candidate write is the one that produces the read value (in the		return returnLast ? value : Value();
// SSA def-use sense), this is not considered an interference.		}
if (getInplaceableOpResult(*uWrite) == uRead->get())
continue;		/// Return true if `value` is originating from an ExtractSliceOp that matches
		nicolasvasilacheUnsubmitted Done Reply Inline Actions You should be able to fold this into req 1. (once it is implemented as "mayOpBeBetweenValueAndOp") if you use `dominates` instead of `properlyDominates` in the right place. nicolasvasilache: You should be able to fold this into req 1. (once it is implemented as…
		springermAuthorUnsubmitted Done Reply Inline Actions We have to make sure that it is the exact same OpOperand and not two different ones of the same op. Otherwise, things could break around bufferization of InsertSliceOp etc. springerm: We have to make sure that it is the exact same OpOperand and not two different ones of the same…
// If aliasingReadOp properly dominates aliasingWriteOp, the read cannot		/// the given InsertSliceOp.
// be affected by the write: there is no interference.		bool BufferizationAliasInfo::hasMatchingExtractSliceOp(
if (domInfo.properlyDominates(aliasingReadOp, aliasingWriteOp))		Value value, InsertSliceOp insertOp) const {
		nicolasvasilacheUnsubmitted Done Reply Inline Actions You have 2 subcases with an `InsertSliceOp` here. Do you know that you don't need more or is this still TBD ? Is it possible some case is missing in light of my comment on the impl. of `getAliasingOpResult` below ? nicolasvasilache: You have 2 subcases with an `InsertSliceOp` here. Do you know that you don't need more or is…
		springermAuthorUnsubmitted Done Reply Inline Actions I did not come across any other cases in our current examples, so I think we are good here. Note, these are cases in which there is no conflict. Even if we miss a case, bufferization will work correctly. It may just introduce unnecessary copies. springerm: I did not come across any other cases in our current examples, so I think we are good here.
continue;		return static_cast<bool>(
// At this point, aliasingWriteOp properly dominates aliasingReadOp or		findValueInReverseUseDefChain(value, [&](OpOperand &opOperand) {
// there is no clear dominance and we need to be conservative.		if (auto extractOp = dyn_cast<ExtractSliceOp>(opOperand.getOwner()))
LDBG("---->found RaW interference between:\n");		if (areEquivalentExtractSliceOps(extractOp, insertOp))
LDBG(" OpToBufferize -> " << printOperationInfo(opToBufferize)
<< '\n');
LDBG(" Interfering write -> #"
<< uWrite->getOperandNumber() << ":"
<< printOperationInfo(aliasingWriteOp) << '\n');
LDBG(" Target read -> #" << uRead->getOperandNumber() << ":"
<< printOperationInfo(aliasingReadOp)
<< '\n');
LDBG("---->opportunity to clobber RaW interference\n");
if (isClobberedWriteBeforeRead(opToBufferize, uRead, uWrite, domInfo)) {
LDBG("---->clobbered! -> skip\n");
continue;
}
LDBG("---->not clobbered -> found an interference\n");
return true;		return true;
		return false;
		}));
}		}

		/// Return true if `op` is between `begin` and `end` as per op dominance.
		///
		/// If `begin` is a BlockArgument, return true if `op` is in the same block
		nicolasvasilacheUnsubmitted Done Reply Inline Actions you should be able to plug getBackwardSlice here and just drop everything else. nicolasvasilache: you should be able to plug getBackwardSlice here and just drop everything else.
		/// before `end`. Return false if `op` is the same op as `begin` or `end`.
		static bool isOpBetweenValueAndOp(Operation op, Value begin, Operation end,
		const DominanceInfo &domInfo) {
		if (Operation *beginOp = begin.getDefiningOp()) {
		// `begin` is an operation.
		if (domInfo.properlyDominates(op, beginOp))
		// `op` happens before `begin`. (Or they are the same op.)
		return false;
		} else {
		// `begin` is a BlockArgument.
		auto bbArg = begin.cast<BlockArgument>();

		Block *block = bbArg.getOwner();
		if (!block->findAncestorOpInBlock(*op))
		// `op` happens before or after the block.
		return false;
}		}
LDBG("----No interference found\n");
		if (domInfo.properlyDominates(end, op))
		// `op` happens after the `end`. (Or they are the same op.)
return false;		return false;

		return true;
}		}

/// Return true if it is possible to find an inplace write W among the uses of		/// Return true if bufferizing result inplace would create a conflict. A read R
/// aliasInfo[result], and a read R among the uses of aliasInfo[result],		/// and a write W of the same alias set is a conflict if inplace bufferization
/// such that W and R interfere.		/// of W changes the value read by R to a value different from the one that
/// Such a (W, R) pair is an interference to the inplace bufferization of		/// would be expected by tracing back R's origin through SSA use-def chains.
/// opResult when:		/// A conflict can be introduced by a new alias and/or an inplace bufferization
		nicolasvasilache: s/"a new conflict can **only** be introduced" etc?
/// 1. R is not known to properly dominate W (i.e. the effects of the write		/// decision.
/// may be visible from R).		///
/// 2. one cannot find an intermediate clobbering write `C` to W, such that		/// Example:
/// C interleaved between W and R (i.e. W -> C -> R where -> denotes		/// %0 = tensor.extract_slice %t[...][...][1, 1] {inplace?}
/// dominance).		/// %1 = vector.transfer_write %v1, %t {inplace} : vector<5xf32>, tensor<?xf32>
		/// %e = tensor.extract_slice %1
		/// %2 = vector.transfer_write %v2, %0 {inplace} : vector<6xf32>, tensor<?xf32>
		/// %3 = vector.transfer_read %e, %cst : tensor<?xf32>, vector<7xf32>
		///
		/// In the above example, the two TransferWriteOps have already been decided to
		/// bufferize inplace. Bufferizing the ExtractSliceOp inplace would create a
		/// conflict because:
		/// * According to SSA use-def chains, we expect to read the result of %1.
		/// * However, adding an alias {%0, %t} would mean that the second
		/// TransferWriteOp overwrites the first one. Therefore, the TransferReadOp
		/// would no longer be reading the result of %1.
bool BufferizationAliasInfo::wouldCreateReadAfterWriteInterference(		bool BufferizationAliasInfo::wouldCreateReadAfterWriteInterference(
OpOperand &operand, OpResult result, const DominanceInfo &domInfo) const {		OpOperand &operand, OpResult result, const DominanceInfo &domInfo) const {
Operation *opToBufferize = result.getDefiningOp();		// Helper function to iterate on aliases of `root` and capture the reads.
Value opResult = result;
Value opOperand = operand.get();

LDBG("----Start wouldCreateReadAfterWriteInterference\n");
LDBG("--------consider all aliases to root read: "
<< printValueInfo(opOperand) << "\n");
LDBG("--------consider all aliases to root write: "
<< printValueInfo(opResult) << "\n");

/// Helper function to iterate on aliases of `root` and capture the reads.
auto getAliasingReads = [&](DenseSet<OpOperand *> &res, Value root) {		auto getAliasingReads = [&](DenseSet<OpOperand *> &res, Value root) {
for (Value alias : getAliases(root)) {		for (Value alias : getAliases(root))
for (auto &use : alias.getUses()) {		for (auto &use : alias.getUses())
// Read to a value that aliases root.		// Read to a value that aliases root.
if (bufferizesToMemoryRead(use)) {		if (bufferizesToMemoryRead(use))
		nicolasvasilache: do we want to give up on all the debug messages ?
		springerm: I think we should keep some of them. These particular ones were not very helpful for my debugging. They actually cluttered the output quite a lot. (And they are somewhat repetitive.) I think the important ones are "READ =", "CONFLICTING WRITE =", "WRITE =" below. What do you think?
		nicolasvasilache: yup, removing clutter is def important, if this is the set that you found useful debugging let's go with it
LDBG("------------bufferizesToMemoryRead: "
<< use.getOwner()->getName().getStringRef() << "\n");
res.insert(&use);		res.insert(&use);
}
}
}
};		};

/// Helper function to iterate on aliases of `root` and capture the writes.		// Helper function to iterate on aliases of `root` and capture the writes.
auto getAliasingInplaceWrites = [&](DenseSet<OpOperand *> &res, Value root) {		auto getAliasingInplaceWrites = [&](DenseSet<OpOperand *> &res, Value root) {
for (Value alias : getAliases(root)) {		for (Value alias : getAliases(root))
for (auto &use : alias.getUses()) {		for (auto &use : alias.getUses())
// Inplace write to a value that aliases root.		// Inplace write to a value that aliases root.
if (isInplaceMemoryWrite(use)) {		if (isInplaceMemoryWrite(use))
LDBG("------------bufferizesToMemoryWrite: "
<< use.getOwner()->getName().getStringRef() << "\n");
res.insert(&use);		res.insert(&use);
}
}
}
};		};

// Check if we can find any interference between reads to aliases[`opOperand`]		// Collect reads and writes of all aliases of OpOperand and OpResult.
// and writes to aliases[`opResult`]. This handles the case:
//
// ```
// %0 = op_to_bufferize_maybe_inplace(%1)
// %2 = some_alias(%0)
// inplace_write(%2)
// %3 = some_alias(%1)
// read(%3)
// ```
DenseSet<OpOperand *> usesRead, usesWrite;		DenseSet<OpOperand *> usesRead, usesWrite;
LDBG("--------\n");		getAliasingReads(usesRead, operand.get());
LDBG("--------Test reads(opOperand) vs writes(opResult)\n");		getAliasingReads(usesRead, result);
getAliasingReads(usesRead, opOperand);		getAliasingInplaceWrites(usesWrite, operand.get());
getAliasingInplaceWrites(usesWrite, opResult);		getAliasingInplaceWrites(usesWrite, result);
// Additionally, `result` is not yet bufferized and we need to check for
// interferences as if it were bufferized inplace: add `operand` if it is a
// write. This handles the case:
//
// ```
// %0 = op_to_bufferize_maybe_inplace(%1)
// %2 = some_alias(%1)
// read(%2)
// ```
if (bufferizesToMemoryWrite(operand))		if (bufferizesToMemoryWrite(operand))
usesWrite.insert(&operand);		usesWrite.insert(&operand);
if (wouldCreateReadAfterWriteInterference(opToBufferize, usesRead, usesWrite,
		// Assuming that the op is bufferized in-place, is there a conflict? For each
		// read R (of any alias) find the last write W1 that is being read (according
		// to SSA use-def chains). For every write W2 (of any alias), check if W2
		// causes a conflict based on current bufferization decisions.
		nicolasvasilache: "Would inplace bufferization of `op` create a conflict?"
		//
		// A conflict is: According to SSA use-def chains, R is supposed to read W1.
		// But because of bufferization decisions, R actually reads W2. (Assuming that
		// W1 != W2.)
		for (OpOperand *uRead : usesRead) {
		Operation *readingOp = uRead->getOwner();

		// Find most recent write of uRead by following the SSA use-def chain. E.g.:
		//
		// %0 = "writing_op"(%t) : tensor<?xf32> -> tensor<?xf32>
		// %1 = "aliasing_op"(%0) : tensor<?xf32> -> tensor<?xf32>
		// %2 = "reading_op"(%1) : tensor<?xf32> -> not_a_tensor_type
		//
		// In the above example, if uRead is the OpOperand of reading_op, lastWrite
		// is %0. Note that operations that create an alias but do not write (such
		// as ExtractSliceOp) are skipped.
		// TODO: With branches this should probably be a list of Values.
		Value lastWrite = findValueInReverseUseDefChain(
		nicolasvasilache: findLastPrecedingWrite? Reverse use-def chain is an impl detail.
		uRead->get(), bufferizesToMemoryWrite, /*returnLast=*/true);
		// Note: `writingOp` is nullptr in case of a BlockArgument.
		Operation *writingOp = lastWrite.getDefiningOp();
		OpOperand *uWrite =
		writingOp ? getAliasingOpOperand(lastWrite.cast<OpResult>()) : nullptr;

		// Look for conflicting memory writes. Potential conflicts are writes to an
		// alias that have been decided to bufferize inplace.
		for (OpOperand *uConflictingWrite : usesWrite) {
		// Throughout this loop, check for multiple requirements that have to be
		// met for uConflictingWrite to be an actual conflict.
		Operation *conflictingWritingOp = uConflictingWrite->getOwner();

		// Print some debug info.
		LDBG("Found potential conflict:\n");
		LDBG("READ = #" << uRead->getOperandNumber() << " of "
		<< printOperationInfo(readingOp) << "\n");
		if (writingOp) {
		LDBG("WRITE = #" << uWrite->getOperandNumber() << " of "
		<< printOperationInfo(writingOp) << "\n");
		nicolasvasilache: As we had discussed, not being in between the ops is not good enough in the non-straight-line case. If either: readingOp dominates conflictingWritingOp dominates lastWrite then continue, otherwise no.
		springerm: This is exactly what `isOpBetweenValueAndOp` is doing. Maybe the function name is not good. Or should I just merge `isOpBetweenValueAndOp` back into `hasReadAfterWriteInterference`?
		springerm: This is actually trickier than I thought. I'm not 100% sure what's the right way to handle this with branches. But for the straight-line case this is definitely correct. I would suggest keeping this as is for now and updating this when we add support for scf::IfOp. (I'm working on that right now.) Then we have concrete examples that we can look at. I'm afraid, we may be missing cases otherwise.
		nicolasvasilache: It should be "mayOpBeBetweenValueAndOp" and you can only disprove the cases that I listed. Anything else is more tricky. Do you have issues if trying to implement this simple change now? If so, we need to understand the cases that break since you are changing the conflict detection algorithm. This need to be done as part of this revision.
		springerm: I merged the `isOpBetween` function back into the caller. Makes it easier to follow the flow of the function. Thinking about double negations etc. just adds unnecessary complexity. In summary, what we are looking for: There is no conflict, if: properlyDominates(readingOp, conflictingWritingOp) or properlyDominates(conflictingWritingOp, writingOp) Let's not think about "in-between" or "maybe in-between". Instead, think of situations when there is no conflict. Lessons learned: We should use `properlyDominates` instead of `dominates`. The case where the two ops are the same need special rules (e.g., has to be the exact same use). Do not use `!dominates` or `!properlyDominates`. This gets tricky when two ops are in two different branches. However, if `properlyDominates(A, B)` says `true`, we can be certain that `A` is before `B`, regardless of the absence or presence of branches. In our code, this is always the safe solution. Worst case, even if we miss a case, we do not `continue` and report something as a conflict that should not be a conflict.
		} else {
		LDBG("READ = BlockArgument #"
		<< lastWrite.cast<BlockArgument>().getArgNumber() << "\n");
		}
		LDBG("CONFLICTING WRITE = #"
		nicolasvasilache: I do not understand this case. Can you elaborate?
		springerm: The for loop (`uConflictingWrite`) iterates over all writes. Including the one found by `findLastPrecedingWrite`. `isOpBetweenValueAndOp` checks for proper domination and does not handle the case where both are the same. This case is similar to the case below. In both cases, we want to check if an op `O` is in-between two things. These two checks (requirement 2, requirement 3) handle the case where `O` is the boundary (upper boundary in case of requirement 2 and lower boundary in case of requirement 3 when you think about reading code from top to bottom).
		<< uConflictingWrite->getOperandNumber() << " of "
		<< printOperationInfo(conflictingWritingOp) << "\n");

		// Requirement 1: uConflictingWrite is between lastWrite and readingOp.
		if (!isOpBetweenValueAndOp(conflictingWritingOp, lastWrite, readingOp,
		nicolasvasilache: Can you also keep terminology like "no self-conflict" or a "use cannot conflict with itself" ?
domInfo))		domInfo))
		continue;

		// Requirement 2: uConflictingWrite and uWrite are not the same OpOperand.
		// Only applicable if lastWrite is not a BlockArgument.
		if (uConflictingWrite == uWrite)
		continue;

		nicolasvasilache: I would add the inplace notation to the production of %1 and %2 (and would drop this line). %0 is not yet inplace, we need to determine if inplace creates a conflict. E.g. // %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1]. // can this bufferize inplace ? // %1 = linalg.fill %cst, %0 // bufferizes inplace // %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1]. // bufferizes inplace
		springerm: The analysis is independent of which OpOperand we are trying to bufferize at the moment. We simply want to know, given a piece of IR and bufferization decisions, is there a conflict. We may as well be bufferizing %2 in the example and %0 is already bufferized. (The heuristic imposes a certain order of bufferization, but the analysis should work with any order.) Added the inplace annotation to all 3 ops.
		// Requirement 3: uConflictingWrite and uRead are not the same. If the
		// same OpOperand reads and writes, this is not a conflict.
		if (uConflictingWrite == uRead)
		nicolasvasilache: mention that uConflictingWrite is "the use of %0 in linalg.fill"
		// The same OpOperand may read and write.
		continue;

		nicolasvasilache: Hmm I had not realized you were also using SSA use-def chain for finding the insertSliceOp. I'll need to think about this more...
		// Requirement 4: No matching ExtractSliceOp/InsertSliceOp pair. If
		nicolasvasilache: Please spell this out: lastWrite <- ... uRead <- %t in `tensor.insert_slice` uConflictingWrite <- %0 in `linalg.fill` it is unclear to me who lastWrite is here.
		springerm: You were probably referring to this part? // Requirement 4: No matching ExtractSliceOp/InsertSliceOp pair. If // uRead is an InsertSliceOp... It does not matter what `lastWrite` is. `lastWrite` is not used for "requirement 4".
		// uRead is an InsertSliceOp...
		if (auto insertSliceOp = dyn_cast<InsertSliceOp>(readingOp)) {
		// As an example, consider the following IR. All results are assumed to
		// be inplace.
		//
		// %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1]
		// %1 = linalg.fill %cst, %0
		// %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1]
		if (uRead->get() == insertSliceOp.dest() &&
		hasMatchingExtractSliceOp(uConflictingWrite->get(), insertSliceOp))
		// Case 4.1: The main insight is that InsertSliceOp reads only part of
		// the destination tensor. The overwritten area is not read. If
		nicolasvasilache: Please add a TODO to replace the magic constant by `insertSliceOp.getDestOpOperand` when available (cc @jpienaar @mehdi_amini @rriddle with whom we discussed this feature)
		// uConflictingWrite writes into exactly the memory location that is
		// being read by uRead, this is not a conflict.
		nicolasvasilache: Please spell this out: lastWrite <- ... uRead <- %1 in `tensor.insert_slice` uConflictingWrite <- t in `tensor.insert_slice` it is unclear to me who lastWrite is here.
		springerm: Added comments for `uRead` and `uConflictingWrite`. The value of `lastWrite` is irrelevant.
		//
		// In the InsertSliceOp in the above example, the read of %t does not
		// conflict with the write of the FillOp (same aliases!) because the
		nicolasvasilache: Please drop this: (Keep in mind that all three results in the example are considered inplace.) marking the ops that bufferize inplace in the example IR is simpler to follow.
		// area that the FillOp operates on is exactly the one that is not read
		// via %t.
		continue;

		if (uRead->get() == insertSliceOp.source() &&
		uConflictingWrite == &insertSliceOp->getOpOperand(1) &&
		hasMatchingExtractSliceOp(uRead->get(), insertSliceOp))
		// Case 4.2: In the InsertSliceOp in the above example, two aliasing
		// tensors are read (%1) and written (%t) via two OpOperands. This is
		// usually a conflict. (Keep in mind that all three results in the
		// example are considered inplace.) This is not a conflict if the area
		// written to by the FillOp is the same as the one that the %t of the
		// InsertSliceOp writes to.
		continue;
		}

		// All requirements are met. Conflict found!
		LDBG("CONFLICT CONFIRMED!\n\n");
return true;		return true;
		}
		}

// Check if we can find any interference between writes to		LDBG("NOT A CONFLICT!\n\n");
// aliases[`opOperand`] and reads to aliases[`opResult`]. This handles the		return false;
// case:
//
// ```
// %0 = op_to_bufferize_maybe_inplace(%1)
// %2 = some_alias(%1)
// inplace_write(%2)
// %3 = some_alias(%0)
// read(%3)
// ```
LDBG("--------\n");
LDBG("--------Test reads(opResult) vs writes(opOperand)\n");
usesRead.clear();
usesWrite.clear();
getAliasingReads(usesRead, opResult);
getAliasingInplaceWrites(usesWrite, opOperand);
return wouldCreateReadAfterWriteInterference(opToBufferize, usesRead,
usesWrite, domInfo);
}		}
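
Note: the "no conflict" rule that springerm summarizes in the inline comments above can be sketched for straight-line code as follows. This is a toy model and not MLIR code: an op is just its position in a single block, `Use`, `sameUse` and `isConflict` are hypothetical names, and `properlyDominates` is reduced to "comes strictly earlier". Requirement 4 (matching ExtractSliceOp/InsertSliceOp pairs) is left out.

```cpp
#include <cassert>

// Toy model (assumption, not MLIR API): an op is identified by its position
// in one straight-line block; a use is an (op, operand index) pair.
struct Use {
  int op;
  int operand;
};

bool sameUse(Use a, Use b) { return a.op == b.op && a.operand == b.operand; }

// In a single block, A properly dominates B iff A comes strictly before B.
bool properlyDominates(int a, int b) { return a < b; }

// Return true if `candidate` (an inplace write to an alias) conflicts with
// `read`, whose data was produced by `lastWrite` according to use-def chains.
bool isConflict(Use read, Use lastWrite, Use candidate) {
  // Precondition of the toy model: the producing write precedes the read.
  assert(properlyDominates(lastWrite.op, read.op));

  // The candidate write happens after the read: no conflict.
  if (properlyDominates(read.op, candidate.op))
    return false;
  // The candidate write happens before the last write: no conflict.
  if (properlyDominates(candidate.op, lastWrite.op))
    return false;
  // A use cannot conflict with itself (Requirements 2 and 3).
  if (sameUse(candidate, lastWrite) || sameUse(candidate, read))
    return false;
  return true;
}
```

Only positive `properlyDominates` answers are used to rule out a conflict, so anything the model cannot disprove is reported as a conflict, which is the conservative direction described in the review discussion.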

/// Return true if bufferizing `opOperand` inplace with `opResult` would create		/// Return true if bufferizing `opOperand` inplace with `opResult` would create
/// a write to a non-writable buffer.		/// a write to a non-writable buffer.
bool BufferizationAliasInfo::wouldCreateWriteToNonWritableBuffer(		bool BufferizationAliasInfo::wouldCreateWriteToNonWritableBuffer(
OpOperand &opOperand, OpResult opResult) const {		OpOperand &opOperand, OpResult opResult) const {
// Certain buffers are not writeable:		// Certain buffers are not writeable:
// 1. A function bbArg that is not inplaceable or		// 1. A function bbArg that is not inplaceable or
[108 lines not shown]
bool BufferizationAliasInfo::areEquivalentExtractSliceOps(		bool BufferizationAliasInfo::areEquivalentExtractSliceOps(
ExtractSliceOp st, InsertSliceOp sti) const {		ExtractSliceOp st, InsertSliceOp sti) const {
if (!st || !sti)		if (!st || !sti)
return false;		return false;
if (!equivalentInfo.isEquivalent(st.source(), sti.dest()))		if (!equivalentInfo.isEquivalent(st.source(), sti.dest()))
return false;		return false;
if (!sameOffsetsSizesAndStrides(st, sti, isEqualConstantIntOrValue))		if (!sameOffsetsSizesAndStrides(st, sti, isEqualConstantIntOrValue))
return false;		return false;
		// TODO: Is the following needed?
if (!equivalentInfo.isEquivalent(st.result(), sti.source()))		if (!equivalentInfo.isEquivalent(st.result(), sti.source()))
return false;		return false;
return true;		return true;
}		}

/// Return true if there is a `candidateOp` that would write to memory after
/// bufferization and such that:
/// 1. The written buffer is equivalent to either `aliasingRead` or
/// `aliasingWrite` under the inPlace bufferization decisions taken
/// so far.
/// 2. `aliasingWrite` properly dominates `candidateOp`.
/// 3. `candidateOp` properly dominates `aliasingReadOp`.
// TODO: richer clobbering analysis with container-containee relationship
// instead of equivalence.
bool BufferizationAliasInfo::existsInterleavedValueClobber(
OpOperand &aliasingRead, OpOperand &aliasingWrite,
const DominanceInfo &domInfo) const {
Operation *aliasingReadOp = aliasingRead.getOwner();
Operation *aliasingWriteOp = aliasingWrite.getOwner();
assert(!domInfo.properlyDominates(aliasingReadOp, aliasingWriteOp) &&
"Unexpected aliasingReadOp properly dominates aliasingWriteOp");

for (Value valueToClobber : {aliasingRead.get(), aliasingWrite.get()}) {
auto leaderIt = equivalentInfo.findLeader(valueToClobber);
for (auto mit = leaderIt, meit = equivalentInfo.member_end(); mit != meit;
++mit) {
Operation *candidateOp = mit->v.getDefiningOp();
if (!candidateOp)
continue;
OpOperand *operand = getAliasingOpOperand(mit->v.cast<OpResult>());
// TODO: Should we check for isInplaceMemoryWrite instead?
if (!operand || !bufferizesToMemoryWrite(*operand))
continue;
LDBG("---->clobbering candidate: " << printOperationInfo(candidateOp)
<< '\n');
if (domInfo.properlyDominates(aliasingWriteOp, candidateOp) &&
domInfo.properlyDominates(candidateOp, aliasingReadOp))
return true;
}
}
return false;
}

/// Return true if there is a write that:
/// 1. Properly dominates aliasingReadOp.
/// 2. Is properly dominated by aliasingWriteOp.
/// 3. Clobbers the write that would be interfering with the read.
///
bool BufferizationAliasInfo::isClobberedWriteBeforeRead(
Operation *opToBufferize, OpOperand &aliasingRead, OpOperand &aliasingWrite,
const DominanceInfo &domInfo) const {
Operation *aliasingReadOp = aliasingRead.getOwner();
Operation *aliasingWriteOp = aliasingWrite.getOwner();
assert(!domInfo.properlyDominates(aliasingReadOp, aliasingWriteOp) &&
"Unexpected aliasingReadOp properly dominates aliasingWriteOp");

// Bail if the write does not dominate the read: it may clobber but only on
// a strict subset of paths, which is not enough for safety.
if (!domInfo.dominates(aliasingWriteOp, aliasingReadOp)) {
LDBG("---->no clobbering: write does not dominate read\n");
return false;
}

// The case `opToBufferize` isa ExtractSliceOp is important enough that we
// look for it specifically. The key information to discover is whether the
// aliasing read or write come from a matching InsertSliceOp.
// Such a pattern is introduced by tiling and is the key inplace condition
// not to miss.
if (auto extractSliceOp = dyn_cast<ExtractSliceOp>(opToBufferize)) {
if (auto insertSliceOp = dyn_cast<InsertSliceOp>(aliasingReadOp)) {
// %1 = extract_slice %0[%offset_sizes_and_strides_1]
//
// ... // 0 or more of inplace compute that reduces to: %X is an
// // aliasingWrite equivalent to %1.
// %W = inplace_write(%1)
//
// // aliasingRead %Y in insert_slice
// ... = insert_slice %W into %R[%offset_sizes_and_strides_1]
if (aliasingRead.get() == insertSliceOp.dest() &&
// TODO: This is currently too restrictive and misses clobberings.
// When available, use container-containee analysis: the condition
// should be that the `aliasingWrite` is contained within
// `insertSliceOp.source()`.
equivalentInfo.isEquivalent(aliasingWrite.get(),
insertSliceOp.source()) &&
areEquivalentExtractSliceOps(extractSliceOp, insertSliceOp)) {
LDBG("---->clobbering matching extract_slice/insert_slice\n");
return true;
}
// %1 = extract_slice %0[%offset_sizes_and_strides_1]
//
// ... // bunch of inplace ops that reduce to %X, equivalent to %1.
// %X = inplace_write(%1)
//
// // aliasingRead %X in insert_slice
// // aliasingWrite %Y in insert_slice
// ... = insert_slice %X into %Y[%offset_sizes_and_strides_1]
if (aliasingReadOp == aliasingWriteOp) {
assert(aliasingRead.get() == insertSliceOp.source() &&
"expected read to source of insert_slice");
assert(aliasingWrite.get() == insertSliceOp.dest() &&
"expected write to dest of insert_slice");
if (areEquivalentExtractSliceOps(extractSliceOp, insertSliceOp)) {
LDBG("---->clobbering matching extract_slice/insert_slice\n");
return true;
}
}
}
}

// General case: look for a properly interleaved clobber of either exactly
// `aliasingRead` or `aliasingWrite`.
// TODO: Relax this to inclusion instead of double inclusion (a.k.a
// equivalence). We will need to compute container-containee relationship.
return existsInterleavedValueClobber(aliasingRead, aliasingWrite, domInfo);
}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Forward declarations.		// Forward declarations.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Return the op with Allocate MemoryEffect if `value` is equivalent to such		/// Return the op with Allocate MemoryEffect if `value` is equivalent to such
/// an op. Return null otherwise.		/// an op. Return null otherwise.
static Operation *getEquivalentAlloc(Value value,		static Operation *getEquivalentAlloc(Value value,
const BufferizationAliasInfo &aliasInfo);		const BufferizationAliasInfo &aliasInfo);
[788 lines not shown]
		auto subviewMemRefType =
.cast<MemRefType>();		.cast<MemRefType>();

// A copy of the source buffer is needed if either:		// A copy of the source buffer is needed if either:
// - The producer of `source` is not inplace. This is the case where a		// - The producer of `source` is not inplace. This is the case where a
// slice is computed out of place into the inplace full tensor.		// slice is computed out of place into the inplace full tensor.
// - The result is not inplace. This is the case where the whole tensor is		// - The result is not inplace. This is the case where the whole tensor is
// cloned and the clone needs to be updated.		// cloned and the clone needs to be updated.
auto inPlace = getInPlace(insertSliceOp->getResult(0));		auto inPlace = getInPlace(insertSliceOp->getResult(0));
		// TODO: Is this necessary?
if (!aliasInfo.isSourceEquivalentToAMatchingInplaceExtractSliceOp(		if (!aliasInfo.isSourceEquivalentToAMatchingInplaceExtractSliceOp(
insertSliceOp) ||		insertSliceOp) ||
inPlace != InPlaceSpec::True) {		inPlace != InPlaceSpec::True) {
LDBG("insert_slice needs extra source copy: " << insertSliceOp.source()		LDBG("insert_slice needs extra source copy: " << insertSliceOp.source()
<< " -> copy\n");		<< " -> copy\n");
// Take a subview of the dst.		// Take a subview of the dst.
Value subView = b.create<memref::SubViewOp>(		Value subView = b.create<memref::SubViewOp>(
loc, subviewMemRefType, dstMemref, insertSliceOp.getMixedOffsets(),		loc, subviewMemRefType, dstMemref, insertSliceOp.getMixedOffsets(),
[764 lines not shown]
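
Note: the example in the doc comment of `wouldCreateReadAfterWriteInterference` can be replayed on a plain array to see why the extra alias matters. The sketch below is only an illustration under assumptions: `writeInPlace` and `replay` are hypothetical helpers, the buffer size is made up, and all slice offsets are taken to be 0 because the original example elides them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an inplace vector.transfer_write: overwrite
// buf[offset, offset + data.size()).
void writeInPlace(std::vector<float> &buf, std::size_t offset,
                  const std::vector<float> &data) {
  assert(offset + data.size() <= buf.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    buf[offset + i] = data[i];
}

// Replay the example. If `sliceAliasesT` is true, %0 shares the buffer of %t
// (ExtractSliceOp bufferized inplace); otherwise %0 gets a private copy.
// Returns the first element seen by the final transfer_read.
float replay(bool sliceAliasesT) {
  std::vector<float> t(8, 0.0f);                        // buffer of %t
  std::vector<float> copy;
  if (!sliceAliasesT)
    copy = t;                                           // %0 out of place
  std::vector<float> &slice = sliceAliasesT ? t : copy; // %0

  writeInPlace(t, 0, std::vector<float>(5, 1.0f));      // %1: write %v1 into %t
  // %e = tensor.extract_slice %1 is a view of `t` in this model.
  writeInPlace(slice, 0, std::vector<float>(6, 2.0f));  // %2: write %v2 into %0
  return t[0];                                          // %3: read from %e
}
```

With a private copy the read still sees the data written by %1 (`replay(false)` yields 1.0f). With the alias {%0, %t} the second write lands in the same memory and the read sees %v2 instead (`replay(true)` yields 2.0f), which is the conflict the analysis is meant to reject.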

mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir

[358 lines not shown]
func @nested_extract_slice_and_insert(
%rsA = tensor.insert_slice %FA into %sA[0, 0][4, 4][1, 1] : tensor<4x4xf32> into tensor<?x?xf32>		%rsA = tensor.insert_slice %FA into %sA[0, 0][4, 4][1, 1] : tensor<4x4xf32> into tensor<?x?xf32>
%rA = tensor.insert_slice %rsA into %A[0, 0][%idx, %idx][1, 1] : tensor<?x?xf32> into tensor<?x?xf32>		%rA = tensor.insert_slice %rsA into %A[0, 0][%idx, %idx][1, 1] : tensor<?x?xf32> into tensor<?x?xf32>

// 3-level matching tensor.extract_slice / tensor.insert_slice into		// 3-level matching tensor.extract_slice / tensor.insert_slice into
// inplaceable %B.		// inplaceable %B.
// CHECK-NEXT: tensor.extract_slice		// CHECK-NEXT: tensor.extract_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: tensor.extract_slice		// CHECK-NEXT: tensor.extract_slice
// Atm, this 2nd tensor.extract_slice fails to bufferize inplace because		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
		nicolasvasilache: very nice!
// clobbering analysis conservatively test for equivalent buffers.
// TODO: This is currently too restrictive and misses clobberings.
// When available, use container-containee analysis.
// CHECK-SAME: {__inplace_results_attr__ = ["false"]}
// CHECK-NEXT: tensor.extract_slice		// CHECK-NEXT: tensor.extract_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: fill		// CHECK-NEXT: fill
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: tensor.insert_slice		// CHECK-NEXT: tensor.insert_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
// CHECK-NEXT: tensor.insert_slice		// CHECK-NEXT: tensor.insert_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]}		// CHECK-SAME: {__inplace_results_attr__ = ["true"]}
[359 lines not shown]
func @insert_slice_chain(
%c0 = constant 0 : index		%c0 = constant 0 : index
%cst = constant 0.000000e+00 : f32		%cst = constant 0.000000e+00 : f32

// CHECK: linalg.fill		// CHECK: linalg.fill
// CHECK-SAME: {__inplace_results_attr__ = ["true"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
%0 = linalg.fill(%cst, %arg2) : f32, tensor<62x90xf32> -> tensor<62x90xf32>		%0 = linalg.fill(%cst, %arg2) : f32, tensor<62x90xf32> -> tensor<62x90xf32>

// CHECK: tensor.extract_slice		// CHECK: tensor.extract_slice
// CHECK-SAME: {__inplace_results_attr__ = ["false"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		nicolasvasilache: very nice and catching a wrong assumption: this example did not exhibit the need for intersection analysis because the SSA use-def chains already capture the info we want.
// TODO: in order to have this extract_slice bufferize inplace, we need to write a range
// analysis and determine that intersection([0, 32)x[0, 90), [32, 62)x[0, 90)) is empty.
%2 = tensor.extract_slice %0[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>		%2 = tensor.extract_slice %0[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>
// CHECK: vector.transfer_write		// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_results_attr__ = ["true"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
%7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>		%7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>
// CHECK: tensor.insert_slice		// CHECK: tensor.insert_slice
// CHECK-SAME: {__inplace_results_attr__ = ["true"]		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
%8 = tensor.insert_slice %7 into %0[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>		%8 = tensor.insert_slice %7 into %0[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>

[30 lines hidden]	%r = scf.for %arg0 = %c0 to %c257 step %c256 iter_args(%arg1 = %t) -> (tensor<10x20xf32>) {
%t11 = tensor.extract_slice %t1[0, 0] [5, %y] [1, 1] : tensor<5x?xf32> to tensor<5x?xf32>		%t11 = tensor.extract_slice %t1[0, 0] [5, %y] [1, 1] : tensor<5x?xf32> to tensor<5x?xf32>
%t2 = vector.transfer_write %v, %t11[%c0, %c0] : vector<5x6xf32>, tensor<5x?xf32>		%t2 = vector.transfer_write %v, %t11[%c0, %c0] : vector<5x6xf32>, tensor<5x?xf32>
%t3 = tensor.insert_slice %t2 into %arg1[%x, 0] [5, %y] [1, 1] : tensor<5x?xf32> into tensor<10x20xf32>		%t3 = tensor.insert_slice %t2 into %arg1[%x, 0] [5, %y] [1, 1] : tensor<5x?xf32> into tensor<10x20xf32>
scf.yield %t3 : tensor<10x20xf32>		scf.yield %t3 : tensor<10x20xf32>
}		}
return %r : tensor<10x20xf32>		return %r : tensor<10x20xf32>
}		}

		// -----

		#accesses = [
		affine_map<(i) -> (i)>,
		affine_map<(i) -> (i)>,
		affine_map<(i) -> (i)>
		]
		#trait = {
		indexing_maps = #accesses,
		iterator_types = ["parallel"]
		}

		// CHECK-LABEL: func @linalg_op_same_out_tensors
		func @linalg_op_same_out_tensors(
		%t1: tensor<?xf32> {linalg.inplaceable = true},
		%t2: tensor<?xf32> {linalg.inplaceable = true}) -> (tensor<?xf32>, tensor<?xf32>){

		// CHECK: linalg.generic
		// CHECK-SAME: {__inplace_results_attr__ = ["true", "false"]
		%o:2 = linalg.generic #trait ins(%t1 : tensor<?xf32>)
		outs (%t2, %t2 : tensor<?xf32>, tensor<?xf32>) {
		^bb(%0: f32, %1: f32, %2 : f32) :
		linalg.yield %0, %0 : f32, f32
		} -> (tensor<?xf32>, tensor<?xf32>)
		return %o#0, %o#1 : tensor<?xf32>, tensor<?xf32>
		}

		// -----

		// CHECK-LABEL: func @double_insert_slice_into_alias
		func @double_insert_slice_into_alias(
		%v1: vector<32x90xf32>,
		%v2: vector<30x90xf32>,
		%arg2: tensor<62x90xf32> {linalg.inplaceable = true},
		%s1: index, %s2: index, %s3: index, %s4: index)
		-> (tensor<62x90xf32>, tensor<?x?xf32>)
		{
		%c0 = constant 0 : index

		// Cannot bufferize inplace this extract_slice because both operand and result
		// are modified and returned separately.
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["false"]
		%e = tensor.extract_slice %arg2[%s1, %s2][%s3, %s4][1, 1] : tensor<62x90xf32> to tensor<?x?xf32>

		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%2 = tensor.extract_slice %arg2[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>
		// CHECK: vector.transfer_write
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>
		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%8 = tensor.insert_slice %7 into %arg2[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>

		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%10 = tensor.extract_slice %e[32, 0] [30, 90] [1, 1] : tensor<?x?xf32> to tensor<30x90xf32>
		// CHECK: vector.transfer_write
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%14 = vector.transfer_write %v2, %10[%c0, %c0] {in_bounds = [true, true]} : vector<30x90xf32>, tensor<30x90xf32>
		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%15 = tensor.insert_slice %14 into %e[32, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<?x?xf32>

		return %8, %15 : tensor<62x90xf32>, tensor<?x?xf32>
		}

		// -----

		// CHECK-LABEL: func @interleaved_extract_insert_slice_chain_1
		func @interleaved_extract_insert_slice_chain_1(
		%arg2: tensor<62x90xf32> {linalg.inplaceable = true})
		-> (tensor<62x90xf32>)
		{
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%2 = tensor.extract_slice %arg2[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>

		// TODO: This should bufferize inplace once we have a proper range analysis.
		nicolasvasilache: Mention the intersection / copy a little more of the other comment here?
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["false"]
		%10 = tensor.extract_slice %arg2[32, 0] [30, 90] [1, 1] : tensor<62x90xf32> to tensor<30x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%8 = tensor.insert_slice %2 into %arg2[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%15 = tensor.insert_slice %10 into %8[32, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<62x90xf32>

		return %15 : tensor<62x90xf32>
		}
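
The TODO in this test refers to a range analysis that would prove the two extracted slices, [0, 32)x[0, 90) and [32, 62)x[0, 90), do not intersect. A minimal C++ sketch of such a check follows, assuming static offsets and sizes and unit strides. The names `Range`, `Slice`, `makeSlice` and `isIntersectionEmpty` are hypothetical and are not part of ComprehensiveBufferize.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open interval [lo, hi) along one tensor dimension.
struct Range {
  int64_t lo;
  int64_t hi;
};

// A slice with static offsets/sizes and unit strides, one Range per dimension.
using Slice = std::vector<Range>;

// Hypothetical helper: builds a Slice from static offsets and sizes.
Slice makeSlice(const std::vector<int64_t> &offsets,
                const std::vector<int64_t> &sizes) {
  assert(offsets.size() == sizes.size());
  Slice s;
  for (std::size_t i = 0; i < offsets.size(); ++i)
    s.push_back({offsets[i], offsets[i] + sizes[i]});
  return s;
}

// Two hyper-rectangles are disjoint iff their ranges are disjoint in at
// least one dimension.
bool isIntersectionEmpty(const Slice &a, const Slice &b) {
  assert(a.size() == b.size());
  for (std::size_t d = 0; d < a.size(); ++d) {
    int64_t lo = std::max(a[d].lo, b[d].lo);
    int64_t hi = std::min(a[d].hi, b[d].hi);
    if (lo >= hi)
      return true; // No overlap in dimension d.
  }
  return false; // The ranges overlap in every dimension.
}
```

With the slices from this test, `isIntersectionEmpty(makeSlice({0, 0}, {32, 90}), makeSlice({32, 0}, {30, 90}))` holds, whereas moving the second offset to row 31 makes the slices share a row. Dynamic offsets or sizes, as in `%e` of `@double_insert_slice_into_alias`, are outside the scope of this sketch.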

		// -----

		// CHECK-LABEL: func @interleaved_extract_insert_slice_chain_2
		func @interleaved_extract_insert_slice_chain_2(
		%arg2: tensor<62x90xf32> {linalg.inplaceable = true})
		-> (tensor<62x90xf32>)
		{
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%2 = tensor.extract_slice %arg2[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>

		// The slices are overlapping, so this can never bufferize inplace.
		// CHECK: tensor.extract_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["false"]
		%10 = tensor.extract_slice %arg2[31, 0] [30, 90] [1, 1] : tensor<62x90xf32> to tensor<30x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%8 = tensor.insert_slice %2 into %arg2[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>


		// CHECK: tensor.insert_slice
		// CHECK-SAME: {__inplace_results_attr__ = ["true"]
		%15 = tensor.insert_slice %10 into %8[31, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<62x90xf32>

		return %15 : tensor<62x90xf32>
		}
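
The revision summary defines a conflict as a write to an alias that takes place after the most recent write but before the read. The following C++ toy model illustrates only that ordering rule. It assumes that all ops sit in one block and are identified by their position, and that the caller already knows which positions hold in-place writes to an alias. `ReadInfo` and `hasRaWConflict` are hypothetical names, not the API introduced by this revision.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened description of one memory read.
struct ReadInfo {
  std::size_t lastWritePos; // Position of the op that produced the data read.
  std::size_t readPos;      // Position of the op that reads the data.
};

// Returns true if some aliasing write lies strictly between the most recent
// write and the read.
bool hasRaWConflict(const ReadInfo &read,
                    const std::vector<std::size_t> &aliasWritePositions) {
  for (std::size_t w : aliasWritePositions)
    if (w > read.lastWritePos && w < read.readPos)
      return true;
  return false;
}
```

As a loose illustration with made-up positions: if the data of a slice is produced at position 0 and read at position 4, an in-place write to an aliasing buffer at position 3 is a conflict, so `hasRaWConflict(ReadInfo{0, 4}, {3})` is true. Whether a particular op counts as a write to an alias is decided by the actual analysis and is simply an input here.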

Diff 377824
