As discussed on D123492, there are some use cases downstream
where doing the equivalence on pointers is not desired. This patch
makes the equivalence checks configurable with a new flag so the previous
behavior is restored and the CSE pass can still make use of the pointers
equivalence on commutative operations.
Details
- Reviewers
mehdi_amini rriddle
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).
mlir/include/mlir/IR/OperationSupport.h | ||
---|---|---|
1201 | This should mention how dangerous it is I think. |
Is your idea to have a canonicalization that make commutative op operands ordering and then the CSE can just take over an eliminate the equivalent ops without special trick? Yeah I think that's a valid solution as well.
mlir/include/mlir/IR/OperationSupport.h | ||
---|---|---|
1201 | I will update. |
Ok. Sounds reasonable to me. I'm not sure when I can work on this but I'll give it a try.
Hi :)
Would it still be possible to merge this change until the reordering is done in canonicalization? This is still a bug that is affecting us since we are using OperationEquivalence::isRegionEquivalentTo and we hit cases where this method returns false for two identical regions because of this sorting. To make these methods backwards compatible we can instead have an IgnoreCommutativeTrait for new users that want to avoid this and later on deprecate it in favor of the canonicalization solution.
Thanks!
Seeing how this had been dragging for so long, I'm concerned that we if land this we won't fix it ever, so I'd rather revert D123492, take the hit on CSE for now, and implement the canonicalization: it does not seem like a difficult canonicalization to implement I think?
I agree that given that the logic is faulty, it's better to revert than have it as is.
It depends how long it would take to get the canonicalization change in, as this bug is making OperationEquivalence::isRegionEquivalentTo unusable for us
There is one mitigation that would be simple, we first try to compare operands in the given order, then if this fails we check if the op is commutative and do this sorting then.
This will unblock us because we don't care about the commutative property when using this method, it's a nice to have.
Wouldn't it make more sense to revert it when the canonicalization is in place?
Aren't we talking about a performance hit vs a correctness issue?
It works in our use case and has been upstream for more than a year. If it's blocking you feel free to revert. Looks like nobody will work to make it work correctly for all.
Actually this mitigation will only work for OperationEquivalence::isEquivalentTo but not for OperationEquivalence::computeHash, which seems inconsistent as two identical ops might get a different hash although isEquivalentTo will return true for them. So I'm leaning toward reverting until we have a proper solution, what do you think?
One solution I had in mind for picking a deterministic order for two operands that are op results is this (though I won't have time to implement it) is this: find the lowest common ancestor block of the two defining ops, and check which ancestor op within that block is before which. This solution can also be used in canonicalization.
I think one issue with moving this to canonicalization is that you have to add a pattern for each commutative ops. I don't think there is a mechanism to run canonicalization pattern on traits.
By the way there is already a pattern that does this canonicalization: https://github.com/llvm/llvm-project/blob/a57bdc8fe68753c341cac0fcecabcd4d35e45466/mlir/lib/Transforms/Utils/CommutativityUtils.cpp#L230
It's not part of the default canonicalize pass because we were concerned about the cost. But I wonder if we could run this pattern at the beginning of the CSE pass? (maybe with an option to disable it)
Something I haven't thought enough about is whether during the application of CSE we may break this canonicalization (in which case it'll be suboptimal compared to what we have now)
Yeah I think it makes sense to move it to the beginning of the CSE pass which is already costly.
Why would we break this canonicalization? If we sort the operands at the beginning, and then during CSE only compare operands in the order they are in.
If we look at a simple example:
func (a, b) { c = a + b; d = b + a; return c + d; }
Here the pre-sorting phase would turn it into:
func (a, b) { c = a + b; d = a + b; return c + d; }
At which point CSE will de-dup c and d, all is fine, but note that c and d were sorted according to a and b whose order didn't change.
Now if we have: something a bit more interesting:
func (a, b, c) { add1 = a + b add2 = a + c; res1 = add1 + add2 add3 = a + b; add4 = a + c; res2 = add3 + add4 return res1 + res2; }
Here everything is already sorted, CSE will deduce add1 with add4, and add2 with add3.
However after this transformation, the order of add3 and add4 is reversed, so the operands of res2 are no longer sorted:
func (a, b, c) { add1 = a + b add2 = a + c; res1 = add1 + add2 res2 = add2 + add1 return res1 + res2; }
That's what I mean by CSE will break the canonicalization.
That makes sense, thank you! This makes me think that we should either:
- continue to sort the operands during CSE so that 'res2 = add2 + add1' will get sorted in your example and them deduped.
- do this as part of IsEquivalentTo like we do now, but with a deterministic algorithm that doesn't depend on pointers.
Regardless, this shouldn't block D154699 right?
This should mention how dangerous it is I think.