This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Make commutative equivalence as an option
Needs ReviewPublic

Authored by clementval on Jul 11 2022, 4:47 AM.

Details

Summary

As discussed on D123492, there are some use cases downstream
where doing the equivalence on pointers is not desired. This patch
makes the equivalence checks configurable with a new flag so the previous
behavior is restored and the CSE pass can still make use of the pointers
equivalence on commutative operations.

Diff Detail

Event Timeline

clementval created this revision.Jul 11 2022, 4:47 AM
Herald added a project: Restricted Project. · View Herald Transcript
clementval requested review of this revision.Jul 11 2022, 4:47 AM

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

mlir/include/mlir/IR/OperationSupport.h
1201

This should mention how dangerous it is I think.

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

Is your idea to have a canonicalization that make commutative op operands ordering and then the CSE can just take over an eliminate the equivalent ops without special trick? Yeah I think that's a valid solution as well.

mlir/include/mlir/IR/OperationSupport.h
1201

I will update.

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

Is your idea to have a canonicalization that make commutative op operands ordering and then the CSE can just take over an eliminate the equivalent ops without special trick? Yeah I think that's a valid solution as well.

Yes that is what I have in mind.

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

Is your idea to have a canonicalization that make commutative op operands ordering and then the CSE can just take over an eliminate the equivalent ops without special trick? Yeah I think that's a valid solution as well.

Yes that is what I have in mind.

Ok. Sounds reasonable to me. I'm not sure when I can work on this but I'll give it a try.

Hi :)

Would it still be possible to merge this change until the reordering is done in canonicalization? This is still a bug that is affecting us since we are using OperationEquivalence::isRegionEquivalentTo and we hit cases where this method returns false for two identical regions because of this sorting. To make these methods backwards compatible we can instead have an IgnoreCommutativeTrait for new users that want to avoid this and later on deprecate it in favor of the canonicalization solution.

Thanks!

Seeing how this had been dragging for so long, I'm concerned that we if land this we won't fix it ever, so I'd rather revert D123492, take the hit on CSE for now, and implement the canonicalization: it does not seem like a difficult canonicalization to implement I think?

Seeing how this had been dragging for so long, I'm concerned that we if land this we won't fix it ever, so I'd rather revert D123492, take the hit on CSE for now, and implement the canonicalization: it does not seem like a difficult canonicalization to implement I think?

I agree that given that the logic is faulty, it's better to revert than have it as is.

Wouldn't it make more sense to revert it when the canonicalization is in place?

It depends how long it would take to get the canonicalization change in, as this bug is making OperationEquivalence::isRegionEquivalentTo unusable for us

There is one mitigation that would be simple, we first try to compare operands in the given order, then if this fails we check if the op is commutative and do this sorting then.

This will unblock us because we don't care about the commutative property when using this method, it's a nice to have.

It depends how long it would take to get the canonicalization change in, as this bug is making OperationEquivalence::isRegionEquivalentTo unusable for us

I personally don't have time to work on this now.

Wouldn't it make more sense to revert it when the canonicalization is in place?

Aren't we talking about a performance hit vs a correctness issue?

Wouldn't it make more sense to revert it when the canonicalization is in place?

Aren't we talking about a performance hit vs a correctness issue?

It works in our use case and has been upstream for more than a year. If it's blocking you feel free to revert. Looks like nobody will work to make it work correctly for all.

There is one mitigation that would be simple, we first try to compare operands in the given order, then if this fails we check if the op is commutative and do this sorting then.

This will unblock us because we don't care about the commutative property when using this method, it's a nice to have.

Isn't this better than reverting?

There is one mitigation that would be simple, we first try to compare operands in the given order, then if this fails we check if the op is commutative and do this sorting then.

This will unblock us because we don't care about the commutative property when using this method, it's a nice to have.

Isn't this better than reverting?

yeah.

Actually this mitigation will only work for OperationEquivalence::isEquivalentTo but not for OperationEquivalence::computeHash, which seems inconsistent as two identical ops might get a different hash although isEquivalentTo will return true for them. So I'm leaning toward reverting until we have a proper solution, what do you think?

One solution I had in mind for picking a deterministic order for two operands that are op results is this (though I won't have time to implement it) is this: find the lowest common ancestor block of the two defining ops, and check which ancestor op within that block is before which. This solution can also be used in canonicalization.

Actually this mitigation will only work for OperationEquivalence::isEquivalentTo but not for OperationEquivalence::computeHash, which seems inconsistent as two identical ops might get a different hash although isEquivalentTo will return true for them. So I'm leaning toward reverting until we have a proper solution, what do you think?

One solution I had in mind for picking a deterministic order for two operands that are op results is this (though I won't have time to implement it) is this: find the lowest common ancestor block of the two defining ops, and check which ancestor op within that block is before which. This solution can also be used in canonicalization.

I think one issue with moving this to canonicalization is that you have to add a pattern for each commutative ops. I don't think there is a mechanism to run canonicalization pattern on traits.

By the way there is already a pattern that does this canonicalization: https://github.com/llvm/llvm-project/blob/a57bdc8fe68753c341cac0fcecabcd4d35e45466/mlir/lib/Transforms/Utils/CommutativityUtils.cpp#L230

It's not part of the default canonicalize pass because we were concerned about the cost. But I wonder if we could run this pattern at the beginning of the CSE pass? (maybe with an option to disable it)

Something I haven't thought enough about is whether during the application of CSE we may break this canonicalization (in which case it'll be suboptimal compared to what we have now)

By the way there is already a pattern that does this canonicalization: https://github.com/llvm/llvm-project/blob/a57bdc8fe68753c341cac0fcecabcd4d35e45466/mlir/lib/Transforms/Utils/CommutativityUtils.cpp#L230

It's not part of the default canonicalize pass because we were concerned about the cost. But I wonder if we could run this pattern at the beginning of the CSE pass? (maybe with an option to disable it)

Something I haven't thought enough about is whether during the application of CSE we may break this canonicalization (in which case it'll be suboptimal compared to what we have now)

Yeah I think it makes sense to move it to the beginning of the CSE pass which is already costly.

Why would we break this canonicalization? If we sort the operands at the beginning, and then during CSE only compare operands in the order they are in.

If we look at a simple example:

func (a, b) {
  c = a + b;
  d = b + a;
  return c + d;
}

Here the pre-sorting phase would turn it into:

func (a, b) {
  c = a + b;
  d = a + b;
  return c + d;
}

At which point CSE will de-dup c and d, all is fine, but note that c and d were sorted according to a and b whose order didn't change.
Now if we have: something a bit more interesting:

func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2

  add3 = a + b;
  add4 = a + c;
  res2 = add3 + add4

  return res1 + res2;
}

Here everything is already sorted, CSE will deduce add1 with add4, and add2 with add3.
However after this transformation, the order of add3 and add4 is reversed, so the operands of res2 are no longer sorted:

func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2
  res2 = add2 + add1

  return res1 + res2;
}

That's what I mean by CSE will break the canonicalization.

If we look at a simple example:

func (a, b) {
  c = a + b;
  d = b + a;
  return c + d;
}

Here the pre-sorting phase would turn it into:

func (a, b) {
  c = a + b;
  d = a + b;
  return c + d;
}

At which point CSE will de-dup c and d, all is fine, but note that c and d were sorted according to a and b whose order didn't change.
Now if we have: something a bit more interesting:

func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2

  add3 = a + b;
  add4 = a + c;
  res2 = add3 + add4

  return res1 + res2;
}

Here everything is already sorted, CSE will deduce add1 with add4, and add2 with add3.
However after this transformation, the order of add3 and add4 is reversed, so the operands of res2 are no longer sorted:

func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2
  res2 = add2 + add1

  return res1 + res2;
}

That's what I mean by CSE will break the canonicalization.

That makes sense, thank you! This makes me think that we should either:

  • continue to sort the operands during CSE so that 'res2 = add2 + add1' will get sorted in your example and them deduped.
  • do this as part of IsEquivalentTo like we do now, but with a deterministic algorithm that doesn't depend on pointers.

Regardless, this shouldn't block D154699 right?