This is an archive of the discontinued LLVM Phabricator instance.

[mlir][linalg][bufferize] Rewrite RaW conflict detection
ClosedPublic

Authored by springerm on Oct 6 2021, 11:03 PM.

Details

Summary

For each memory read, follow SSA use-def chains to find the op that produces the data being read (i.e., the most recent write). A memory write to an alias is a conflict if it takes places after the "most recent write" but before the read.

This CL introduces two main changes:

  • There is a concise definition of a conflict. Given a piece of IR with InPlaceSpec annotations and a computes alias set, it is easy to compute whether this program has a conflict. No need to consider multiple cases such as "read of operand after in-place write" etc.
  • No need to check for clobbering.

Depends On D111040

Diff Detail

Event Timeline

springerm created this revision.Oct 6 2021, 11:03 PM
springerm requested review of this revision.Oct 6 2021, 11:03 PM
springerm retitled this revision from [mlir][linalg][bufferize] Rewrite conflict detection to [mlir][linalg][bufferize] Rewrite RaW conflict detection.Oct 6 2021, 11:14 PM

Very nice improvement!

Please capture in the commit message and where appropriate in doc comments something like:

  • this version replaces alias-based analysis of clobbering with true last write analysis based on SSA use-def chains
  • the information captured is strictly better

I still need to check the new impl of wouldCreateReadAfterWriteInterference in more detail, the rest looks good modulo the comments I made.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
586

Turn this into a warning a degrade gracefully?
Is this part of this CL ?

950

There are TODOs that getAliasingOpOperand should return a list of operands.
With the SSA-based analysis here, I think we need to tie this loose end and support the list of operands and traverse them all.
For now I would just evolve the API and fail this analysis if

I wonder whether there be considerations related to equivalence in the analysis otherwise the last write may be partial and this could be another sack of knots.
OTOH equivalence classes are not yet formed as you go up the use-def chain.
I think this should be using getInplaceableOpResult on each opOperand considered to check this and the last write needs to be a full write (and in the future it could be a write to a bigger region that fully covers)

998

s/"a new conflict can * * only * * be introduced" etc?

1028

do we want to give up on all the debug messages ?

1049–1050

"Would inplace bufferization of op create a conflict?"

mlir/test/Dialect/Linalg/comprehensive-module-bufferize-analysis.mlir
367

very nice!

743

very nice and catching a wrong assumption: this example did not exhibit the need for intersection analysis because the SSA use-def chains already capture the info we want.

869

Mention the intersection / copy a little more of the other comment here ?

springerm marked an inline comment as done.Oct 7 2021, 7:35 AM
springerm added inline comments.
mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
586

I think this can stay as is. The CallOpInterface case was missing.

586–587

I missed this before sending out for review. Have to see what's going on here...

springerm updated this revision to Diff 377849.Oct 7 2021, 7:46 AM
springerm marked an inline comment as done.

improve handling of uWrite

nicolasvasilache added inline comments.
mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
586

seems CallOpInterface addition should be split out in its own CL

1066

findLastPrecedingWrite?

Reverse use-def chain is an impl detail.

1086

As we had discussed, not being in between the ops is not good enough in the non-straight-line case.
If either:

  1. readingOp dominates
  2. conflictingWritingOp dominates lastWrite

then continue, otherwise no.

1091

I do not understand this case. Can you elaborate?

1096

Can you also keep terminology like "no self-conflict" or a "use cannot conflict with itself" ?

1104

I would add the inplace notation to the production of %1 and %2 (and would drop this line).
%0 is not yet inplace, we need to determine if inplace creates a conflict.

E.g.

// %0 = tensor.extract_slice %t[%a, %b][%c, %d][1, 1].             // can this bufferize inplace ?
// %1 = linalg.fill %cst, %0                                                            // bufferizes inplace
// %2 = tensor.insert_slice %1 into %t[%a, %b][%c, %d][1, 1]. // bufferizes inplace
1107

mention that uConflictingWrite is "the use of %0 in linalg.fill"

1110

Hmm I had not realized you were also using SSA use-def chain for finding the insertSliceOp.
I'll need to think about this more...

1111

Please spell this out:

lastWrite <- ...
uRead <- %t in `tensor.insert_slice`
uConflictingWrite <- %0 in `linalg.fill`

it is unclear to me who lastWrite is here.

1123

Please add a TODO to replace the magic constant by insertSliceOp.getDestOpOperand when available (cc @jpienaar @mehdi_amini @rriddle with whom we discussed this feature)

1125

Please spell this out:

lastWrite <- ...
uRead <- %1 in tensor.insert_slice
uConflictingWrite <- t in tensor.insert_slice

it is unclear to me who lastWrite is here.

1128

Please drop this: (Keep in mind that all three results in the example are considered inplace.)

marking the ops that bufferize inplace in the example IR is simpler to follow.

springerm updated this revision to Diff 378548.Oct 10 2021, 6:52 PM
springerm marked 6 inline comments as done.

address comments

springerm marked 8 inline comments as done.

address comments

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
1028

I think we should keep some of them. These particular ones were not very helpful for my debugging. They actually cluttered the output quite a lot. (And they are somewhat repetitive.)

I think the important ones are "READ =", "CONFLICTING WRITE =", "WRITE =" below. What do you think?

1086

This is exactly what isOpBetweenValueAndOp is doing. Maybe the function name is not good. Or should I just merge isOpBetweenValueAndOp back into hasReadAfterWriteInterference?

1091

The for loop (uConflictingWrite) iterates over all writes. Including the one found by findLastPrecedingWrite. isOpBetweenValueAndOp checks for proper domination and does not handle the case where both are the same.

This case is similar to the case below. In both cases, we want to check if an op O is in-between two things. These two checks (requirement 2, requirement 3) handle the case where O is the boundary (upper boundary in case of requirement 2 and lower boundary in case of requirement 3 when you think about reading code from top to bottom).

1104

The analysis is independent of which OpOperand we are trying to bufferize at the moment. We simply want to know, given a piece of IR and bufferization decisions, is there a conflict.

We may as well be bufferizing %2 in the example and %0 is already bufferized. (The heuristic imposes a certain order of bufferization, but the analysis should work with any order.)

Added the inplace annotation to all 3 ops.

1111

You were probably referring to this part?

// Requirement 4: No matching ExtractSliceOp/InsertSliceOp pair. If
// uRead is an InsertSliceOp...

It does not matter what lastWrite is. lastWrite is not used for "requirement 4".

1125

Added comments for uRead and uConflictingWrite. The value of lastWrite is irrelevant.

springerm added inline comments.Oct 11 2021, 11:07 PM
mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
1086

This is actually trickier than I thought. I'm not 100% sure what's the right way to handle this with branches. But for the straight-line case this is definitely correct.

I would suggest keeping this as is for now and updating this when we add support for scf::IfOp. (I'm working on that right now.) Then we have concrete examples that we can look at. I'm afraid, we may be missing cases otherwise.

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
823

Can you replace all this with a getBackwardSlice that would use a filter to set true when you encounter your condition?
This should be more general and avoids duplicating code.
The only downside is that the SetVector is filled but as soon as you need to support more than a straight line this is needed anyway.

857

you should be able to plug getBackwardSlice here and just drop everything else.

1028

yup, removing clutter is def important, if this is the set that you found useful debugging let's go with it

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
942

You should be able to fold this into req 1. (once it is implemented as "mayOpBeBetweenValueAndOp") if you use dominates instead of properlyDominates in the right place.

952

You have 2 subcases with an InsertSliceOp here.
Do you know that you don't need more or is this still TBD ?

Is it possible some case is missing in light of my comment on the impl. of getAliasingOpResult below ?

1086

It should be "mayOpBeBetweenValueAndOp" and you can only disprove the cases that I listed.
Anything else is more tricky.

Do you have issues if trying to implement this simple change now?
If so, we need to understand the cases that break since you are changing the conflict detection algorithm.
This need to be done as part of this revision.

springerm updated this revision to Diff 379576.Oct 13 2021, 6:43 PM
springerm marked 9 inline comments as done.

address comments

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
823

I tried this, but unfortunately, getBackwardSlice only gives us Operation* and not OpOperand* or Value. This would work fine if all ops had only one operand and one output.

This is not something that can be fixed by specifying a fancy filter function. E.g., we there's no way to filter for OpOperands that bufferize to a memory write.

We may want to add another variant of getBackwardSlice. One that returns a SetVector<Value> and has a std::function<bool(Value)> condition.

942

We have to make sure that it is the exact same OpOperand and not two different ones of the same op. Otherwise, things could break around bufferization of InsertSliceOp etc.

950

Sorry, I can no longer see which line of code this comment is associated to.

Wrt. to lastWrite, there may be multiple in the case of branches. Whatever analysis we do here, stays the same. It just has to be done (and hold) for every lastWrite that we found (llvm::all_of).

952

I did not come across any other cases in our current examples, so I think we are good here.

Note, these are cases in which there is no conflict. Even if we miss a case, bufferization will work correctly. It may just introduce unnecessary copies.

1086

I merged the isOpBetween function back into the caller. Makes it easier to follow the flow of the function. Thinking about double negations etc. just adds unnecessary complexity.

In summary, what we are looking for: There is *no* conflict, if:

  • properlyDominates(readingOp, conflictingWritingOp)
  • or properlyDominates(conflictingWritingOp, writingOp)

Let's not think about "in-between" or "maybe in-between". Instead, think of situations when there is no conflict.

Lessons learned:

  1. We should use properlyDominates instead of dominates. The case where the two ops are the same need special rules (e.g., has to be the exact same use).
  2. Do not use !dominates or !properlyDominates. This gets tricky when two ops are in two different branches. However, if properlyDominates(A, B) says true, we can be certain that A is before B, regardless of the absence or presence of branches. In our code, this is always the safe solution. Worst case, even if we miss a case, we do not continue and report something as a conflict that should not be a conflict.
nicolasvasilache accepted this revision.Oct 14 2021, 7:11 AM

Great, thanks!

This revision is now accepted and ready to land.Oct 14 2021, 7:11 AM
This revision was landed with ongoing or failed builds.Oct 14 2021, 6:32 PM
This revision was automatically updated to reflect the committed changes.