This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Make commutative equivalence as an option
Needs Review · Public

Authored by clementval on Jul 11 2022, 4:47 AM.

Details

Reviewers

mehdi_amini
rriddle

Summary

As discussed on D123492, there are some downstream use cases
where checking equivalence on operand pointers is not desired. This patch
makes the commutative equivalence check configurable through a new flag: the
previous behavior is restored by default, and the CSE pass can still opt in to
pointer-based equivalence on commutative operations.
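
As a rough illustration of the opt-in described above, here is a minimal, self-contained sketch. It does not use MLIR: `ToyOp`, `ToyFlags`, `keyOperands`, and `isEquivalent` are hypothetical stand-ins, and values are plain integer ids sorted numerically, whereas the patch sorts by opaque pointer.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-ins, NOT the MLIR classes: a value is an integer id and an
// operation is a name plus an operand list.
struct ToyOp {
  std::string name;
  std::vector<int> operands;
  bool isCommutative = false;
};

// Mirrors the shape of the flag enum in the patch (names reused for clarity).
enum ToyFlags : unsigned {
  None = 0,
  IgnoreLocations = 1,
  HandleCommutativeOps = 2,
};

// Returns the operand list used for comparison: sorted only when the op is
// commutative AND the caller opted in through the flag.
std::vector<int> keyOperands(const ToyOp &op, unsigned flags) {
  std::vector<int> operands = op.operands;
  if (op.isCommutative && (flags & HandleCommutativeOps))
    std::sort(operands.begin(), operands.end());
  return operands;
}

bool isEquivalent(const ToyOp &lhs, const ToyOp &rhs, unsigned flags) {
  return lhs.name == rhs.name &&
         keyOperands(lhs, flags) == keyOperands(rhs, flags);
}
```

With this shape, callers that pass only `IgnoreLocations` get strict operand-order comparison, while a CSE-like caller can add `HandleCommutativeOps` to treat `add(1, 2)` and `add(2, 1)` as equivalent.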

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

clementval created this revision. Jul 11 2022, 4:47 AM

Herald added a reviewer: rriddle. Jul 11 2022, 4:47 AM

Herald added a project: Restricted Project.

Herald added subscribers: bzcheeseman, sdasgup3, wenzhicui and 17 others.

clementval requested review of this revision. Jul 11 2022, 4:47 AM

Herald added a project: Restricted Project. Jul 11 2022, 4:47 AM

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

clementval mentioned this in D123492: [mlir][CSE] Add ability to remove commutative operations. Jul 11 2022, 4:48 AM

Harbormaster completed remote builds in B174641: Diff 443602. Jul 11 2022, 5:06 AM

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

mlir/include/mlir/IR/OperationSupport.h
1201	This should mention how dangerous it is I think.

In D129480#3659999, @mehdi_amini wrote:

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

Is your idea to have a canonicalization that puts the operands of commutative ops in a fixed order, so that CSE can then just take over and eliminate the equivalent ops without any special trick? Yeah, I think that's a valid solution as well.
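
A minimal sketch of what such an operand-sorting canonicalization could look like, assuming a single block where every value is defined before it is used. `ToyOp` and `sortCommutativeOperands` are hypothetical names for illustration; the numbering scheme (function arguments first, then op results in program order) is one possible choice, not something the thread settled on.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an operation: NOT an MLIR class.
struct ToyOp {
  std::string result;                 // name of the value this op defines
  std::vector<std::string> operands;  // names of the values it uses
  bool isCommutative = false;
};

// Numbers function arguments first, then op results in program order, and
// stably sorts the operands of each commutative op by that number.
// Assumes every operand is an argument or the result of an earlier op.
void sortCommutativeOperands(const std::vector<std::string> &args,
                             std::vector<ToyOp> &ops) {
  std::unordered_map<std::string, int> number;
  int next = 0;
  for (const std::string &arg : args)
    number[arg] = next++;
  for (ToyOp &op : ops) {
    if (op.isCommutative)
      std::stable_sort(op.operands.begin(), op.operands.end(),
                       [&](const std::string &a, const std::string &b) {
                         return number.at(a) < number.at(b);
                       });
    number[op.result] = next++;
  }
}
```

After this runs, `c = a + b` and `d = b + a` have identical operand lists, so a CSE that compares operands strictly in order can merge them.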

mlir/include/mlir/IR/OperationSupport.h
1201	I will update.

In D129480#3665380, @clementval wrote:

In D129480#3659999, @mehdi_amini wrote:

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

Is your idea to have a canonicalization that puts the operands of commutative ops in a fixed order, so that CSE can then just take over and eliminate the equivalent ops without any special trick? Yeah, I think that's a valid solution as well.

Yes that is what I have in mind.

In D129480#3665400, @mehdi_amini wrote:

In D129480#3665380, @clementval wrote:

In D129480#3659999, @mehdi_amini wrote:

I don't think you answered a question I had in your previous revision: can we try instead to move this to a canonicalization: we could sort operands (we'd need some stable sort based on some numbering in the region maybe).

Is your idea to have a canonicalization that puts the operands of commutative ops in a fixed order, so that CSE can then just take over and eliminate the equivalent ops without any special trick? Yeah, I think that's a valid solution as well.

Yes that is what I have in mind.

Ok. Sounds reasonable to me. I'm not sure when I can work on this but I'll give it a try.

Hi :)

Would it still be possible to merge this change until the reordering is done in canonicalization? This is still a bug that is affecting us since we are using OperationEquivalence::isRegionEquivalentTo and we hit cases where this method returns false for two identical regions because of this sorting. To make these methods backwards compatible we can instead have an IgnoreCommutativeTrait for new users that want to avoid this and later on deprecate it in favor of the canonicalization solution.

Thanks!

Herald added subscribers: bviyer, Moerafaat, zero9178. Jul 6 2023, 7:20 AM

Seeing how long this has been dragging on, I'm concerned that if we land this we won't ever fix it, so I'd rather revert D123492, take the hit on CSE for now, and implement the canonicalization. It does not seem like a difficult canonicalization to implement, I think?

In D129480#4477500, @mehdi_amini wrote:

Seeing how long this has been dragging on, I'm concerned that if we land this we won't ever fix it, so I'd rather revert D123492, take the hit on CSE for now, and implement the canonicalization. It does not seem like a difficult canonicalization to implement, I think?

I agree that given that the logic is faulty, it's better to revert than have it as is.

SGTM

Wouldn't it make more sense to revert it when the canonicalization is in place?

It depends on how long it would take to get the canonicalization change in, as this bug is making OperationEquivalence::isRegionEquivalentTo unusable for us.

There is one simple mitigation: we first try to compare operands in the given order, and only if this fails do we check whether the op is commutative and do the sorting.

This will unblock us because we don't care about the commutative property when using this method; it's a nice-to-have.
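
The two-step comparison proposed above could be sketched as follows. This is a toy model, not the MLIR implementation: `ToyOp` and `isEquivalent` are hypothetical, values are integer ids compared directly, and the fallback sorts by id rather than by pointer.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for an operation: NOT an MLIR class.
struct ToyOp {
  std::string name;
  std::vector<int> operands;  // toy value ids
  bool isCommutative = false;
};

bool isEquivalent(const ToyOp &lhs, const ToyOp &rhs) {
  if (lhs.name != rhs.name || lhs.operands.size() != rhs.operands.size())
    return false;
  // Step 1: operands match in the order they are written.
  if (lhs.operands == rhs.operands)
    return true;
  // Step 2: only commutative ops may still match after sorting.
  if (!lhs.isCommutative)
    return false;
  std::vector<int> l = lhs.operands, r = rhs.operands;
  std::sort(l.begin(), l.end());
  std::sort(r.begin(), r.end());
  return l == r;
}
```

The point of the first step is that two textually identical ops are always reported as equivalent, whatever order the sort would have produced.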

In D129480#4477662, @tomnatan wrote:

It depends how long it would take to get the canonicalization change in, as this bug is making OperationEquivalence::isRegionEquivalentTo unusable for us

I personally don't have time to work on this now.

Wouldn't it make more sense to revert it when the canonicalization is in place?

Aren't we talking about a performance hit vs a correctness issue?

In D129480#4477709, @mehdi_amini wrote:

Wouldn't it make more sense to revert it when the canonicalization is in place?

Aren't we talking about a performance hit vs a correctness issue?

It works in our use case and has been upstream for more than a year. If it's blocking you feel free to revert. Looks like nobody will work to make it work correctly for all.

In D129480#4477664, @tomnatan wrote:

There is one simple mitigation: we first try to compare operands in the given order, and only if this fails do we check whether the op is commutative and do the sorting.

This will unblock us because we don't care about the commutative property when using this method; it's a nice-to-have.

Isn't this better than reverting?

In D129480#4477759, @tomnatan wrote:

In D129480#4477664, @tomnatan wrote:

There is one simple mitigation: we first try to compare operands in the given order, and only if this fails do we check whether the op is commutative and do the sorting.

This will unblock us because we don't care about the commutative property when using this method; it's a nice-to-have.

Isn't this better than reverting?

yeah.

Actually this mitigation will only work for OperationEquivalence::isEquivalentTo but not for OperationEquivalence::computeHash, which seems inconsistent as two identical ops might get a different hash although isEquivalentTo will return true for them. So I'm leaning toward reverting until we have a proper solution, what do you think?

One solution I had in mind for picking a deterministic order for two operands that are op results (though I won't have time to implement it) is this: find the lowest common ancestor block of the two defining ops, and check which of their ancestor ops within that block comes first. This solution can also be used in canonicalization.
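
A minimal sketch of that ordering, using a toy op/block tree rather than the MLIR classes. `Op`, `Block`, `ancestors`, and `definedBefore` are hypothetical names, each op stores its position in its block as a plain index, and block arguments are not handled.

```cpp
#include <vector>

struct Block;

// Toy stand-ins, NOT the MLIR classes.
struct Op {
  Block *parent = nullptr;  // block containing this op
  int index = 0;            // position of this op within `parent`
};

struct Block {
  Op *parentOp = nullptr;  // op whose region holds this block; null at top
};

// Chain of ops from `op` outward: op, its enclosing op, and so on.
static std::vector<Op *> ancestors(Op *op) {
  std::vector<Op *> chain;
  for (; op; op = op->parent ? op->parent->parentOp : nullptr)
    chain.push_back(op);
  return chain;
}

// Returns true if, in the innermost block that contains both ops (directly
// or through nesting), the ancestor of `a` comes strictly before that of `b`.
bool definedBefore(Op *a, Op *b) {
  std::vector<Op *> chainA = ancestors(a), chainB = ancestors(b);
  // Chains run inner to outer, so the first match is the innermost block.
  for (Op *x : chainA)
    for (Op *y : chainB)
      if (x->parent && x->parent == y->parent)
        return x->index < y->index;
  return false;  // no common block
}
```

Because the result depends only on program structure, two structurally identical regions would order their operands the same way, which is not guaranteed when sorting by pointer.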

In D129480#4479977, @tomnatan wrote:

Actually this mitigation will only work for OperationEquivalence::isEquivalentTo but not for OperationEquivalence::computeHash, which seems inconsistent as two identical ops might get a different hash although isEquivalentTo will return true for them. So I'm leaning toward reverting until we have a proper solution, what do you think?

One solution I had in mind for picking a deterministic order for two operands that are op results (though I won't have time to implement it) is this: find the lowest common ancestor block of the two defining ops, and check which of their ancestor ops within that block comes first. This solution can also be used in canonicalization.

I think one issue with moving this to canonicalization is that you have to add a pattern for each commutative op. I don't think there is a mechanism to run canonicalization patterns on traits.

By the way there is already a pattern that does this canonicalization: https://github.com/llvm/llvm-project/blob/a57bdc8fe68753c341cac0fcecabcd4d35e45466/mlir/lib/Transforms/Utils/CommutativityUtils.cpp#L230

It's not part of the default canonicalize pass because we were concerned about the cost. But I wonder if we could run this pattern at the beginning of the CSE pass? (maybe with an option to disable it)

Something I haven't thought enough about is whether during the application of CSE we may break this canonicalization (in which case it'll be suboptimal compared to what we have now)

In D129480#4482423, @mehdi_amini wrote:

By the way there is already a pattern that does this canonicalization: https://github.com/llvm/llvm-project/blob/a57bdc8fe68753c341cac0fcecabcd4d35e45466/mlir/lib/Transforms/Utils/CommutativityUtils.cpp#L230

It's not part of the default canonicalize pass because we were concerned about the cost. But I wonder if we could run this pattern at the beginning of the CSE pass? (maybe with an option to disable it)

Something I haven't thought enough about is whether during the application of CSE we may break this canonicalization (in which case it'll be suboptimal compared to what we have now)

Yeah I think it makes sense to move it to the beginning of the CSE pass which is already costly.

Why would we break this canonicalization? If we sort the operands at the beginning, then during CSE we only need to compare operands in the order they are in.

If we look at a simple example:

func (a, b) {
  c = a + b;
  d = b + a;
  return c + d;
}

Here the pre-sorting phase would turn it into:

func (a, b) {
  c = a + b;
  d = a + b;
  return c + d;
}

At which point CSE will de-dup c and d, all is fine, but note that c and d were sorted according to a and b whose order didn't change.
Now if we have something a bit more interesting:

func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2

  add3 = a + c;
  add4 = a + b;
  res2 = add3 + add4

  return res1 + res2;
}

Here everything is already sorted, CSE will de-dup add1 with add4, and add2 with add3.
However, after this transformation the order of add3 and add4 is reversed, so the operands of res2 are no longer sorted:

func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2
  res2 = add2 + add1

  return res1 + res2;
}

That's what I mean by CSE will break the canonicalization.
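
The effect can be reproduced with a small simulation. This is a hedged sketch, not the MLIR CSE pass: `ToyAdd` and `runCSE` are hypothetical, every op is a commutative add, and the optional re-sort uses lexicographic order of value names as a stand-in for whatever deterministic order is chosen. It assumes the second group is `add3 = a + c` and `add4 = a + b`, which is what the add1/add4 and add2/add3 pairing implies.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a commutative add: NOT an MLIR class.
struct ToyAdd {
  std::string result;
  std::string lhs, rhs;
};

// Toy CSE over a list of adds. Operands are compared in the order written
// unless `resort` is set, in which case each key is re-sorted after earlier
// replacements have been applied. Returns the surviving result names.
std::vector<std::string> runCSE(std::vector<ToyAdd> ops, bool resort) {
  std::map<std::string, std::string> replacement;  // erased -> kept
  std::map<std::pair<std::string, std::string>, std::string> seen;
  std::vector<std::string> kept;
  for (ToyAdd &op : ops) {
    // Rewrite operands that point at ops CSE has already erased.
    if (replacement.count(op.lhs)) op.lhs = replacement[op.lhs];
    if (replacement.count(op.rhs)) op.rhs = replacement[op.rhs];
    std::pair<std::string, std::string> key{op.lhs, op.rhs};
    if (resort && key.second < key.first)
      std::swap(key.first, key.second);
    auto it = seen.find(key);
    if (it != seen.end()) {
      replacement[op.result] = it->second;  // duplicate: erase it
    } else {
      seen[key] = op.result;
      kept.push_back(op.result);
    }
  }
  return kept;
}
```

Fed the six adds of the example, the in-order variant keeps `res2` because its operands have become `(add2, add1)`, while the re-sorting variant folds it into `res1`.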

In D129480#4487217, @mehdi_amini wrote:
If we look at a simple example:
func (a, b) {
  c = a + b;
  d = b + a;
  return c + d;
}
Here the pre-sorting phase would turn it into:
func (a, b) {
  c = a + b;
  d = a + b;
  return c + d;
}
At which point CSE will de-dup c and d, all is fine, but note that c and d were sorted according to a and b whose order didn't change.
Now if we have something a bit more interesting:
func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2

  add3 = a + c;
  add4 = a + b;
  res2 = add3 + add4

  return res1 + res2;
}
Here everything is already sorted, CSE will de-dup add1 with add4, and add2 with add3.
However, after this transformation the order of add3 and add4 is reversed, so the operands of res2 are no longer sorted:
func (a, b, c) {
  add1 = a + b
  add2 = a + c;
  res1 = add1 + add2
  res2 = add2 + add1

  return res1 + res2;
}
That's what I mean by CSE will break the canonicalization.

That makes sense, thank you! This makes me think that we should either:

- continue to sort the operands during CSE, so that 'res2 = add2 + add1' in your example gets sorted and then deduped; or
- do this as part of isEquivalentTo like we do now, but with a deterministic algorithm that doesn't depend on pointers.

Regardless, this shouldn't block D154699 right?

Revision Contents

Path                                       Size
mlir/include/mlir/IR/OperationSupport.h    6 lines
mlir/lib/IR/OperationSupport.cpp           6 lines
mlir/lib/Transforms/CSE.cpp                6 lines

Diff 443602

mlir/include/mlir/IR/OperationSupport.h

@@ ... @@
 /// equivalent.
 struct OperationEquivalence {
   enum Flags {
     None = 0,

     // When provided, the location attached to the operation are ignored.
     IgnoreLocations = 1,

-    LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue = */ IgnoreLocations)
+    // When provided, operands on commutative operations are checked with their
+    // pointers.
+    HandleCommutativeOps = 2,
+
+    LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue = */ HandleCommutativeOps)
   };

   /// Compute a hash for the given operation.
   /// The `hashOperands` and `hashResults` callbacks are expected to return a
   /// unique hash_code for a given Value.
   static llvm::hash_code computeHash(
       Operation *op,
       function_ref<llvm::hash_code(Value)> hashOperands =

mlir/lib/IR/OperationSupport.cpp

@@ ... @@ llvm::hash_code OperationEquivalence::computeHash(
   // - Attributes
   // - Result Types
   llvm::hash_code hash = llvm::hash_combine(
       op->getName(), op->getAttrDictionary(), op->getResultTypes());

   // - Operands
   ValueRange operands = op->getOperands();
   SmallVector<Value> operandStorage;
-  if (op->hasTrait<mlir::OpTrait::IsCommutative>()) {
+  if (op->hasTrait<mlir::OpTrait::IsCommutative>() &&
+      (flags & OperationEquivalence::HandleCommutativeOps)) {
     operandStorage.append(operands.begin(), operands.end());
     llvm::sort(operandStorage, [](Value a, Value b) -> bool {
       return a.getAsOpaquePointer() < b.getAsOpaquePointer();
     });
     operands = operandStorage;
   }
   for (Value operand : operands)
     hash = llvm::hash_combine(hash, hashOperands(operand));
@@ ... @@ if (lhs->getName() != rhs->getName() ||
       lhs->getNumOperands() != rhs->getNumOperands() ||
       lhs->getNumResults() != rhs->getNumResults())
     return false;
   if (!(flags & IgnoreLocations) && lhs->getLoc() != rhs->getLoc())
     return false;

   ValueRange lhsOperands = lhs->getOperands(), rhsOperands = rhs->getOperands();
   SmallVector<Value> lhsOperandStorage, rhsOperandStorage;
-  if (lhs->hasTrait<mlir::OpTrait::IsCommutative>()) {
+  if (lhs->hasTrait<mlir::OpTrait::IsCommutative>() &&
+      (flags & OperationEquivalence::HandleCommutativeOps)) {
     lhsOperandStorage.append(lhsOperands.begin(), lhsOperands.end());
     llvm::sort(lhsOperandStorage, [](Value a, Value b) -> bool {
       return a.getAsOpaquePointer() < b.getAsOpaquePointer();
     });
     lhsOperands = lhsOperandStorage;

     rhsOperandStorage.append(rhsOperands.begin(), rhsOperands.end());
     llvm::sort(rhsOperandStorage, [](Value a, Value b) -> bool {
 [...]

mlir/lib/Transforms/CSE.cpp

@@ ... @@
 namespace {
 struct SimpleOperationInfo : public llvm::DenseMapInfo<Operation *> {
   static unsigned getHashValue(const Operation *opC) {
     return OperationEquivalence::computeHash(
         const_cast<Operation *>(opC),
         /*hashOperands=*/OperationEquivalence::directHashValue,
         /*hashResults=*/OperationEquivalence::ignoreHashValue,
-        OperationEquivalence::IgnoreLocations);
+        OperationEquivalence::IgnoreLocations |
+            OperationEquivalence::HandleCommutativeOps);
   }
   static bool isEqual(const Operation *lhsC, const Operation *rhsC) {
     auto *lhs = const_cast<Operation *>(lhsC);
     auto *rhs = const_cast<Operation *>(rhsC);
     if (lhs == rhs)
       return true;
     if (lhs == getTombstoneKey() || lhs == getEmptyKey() ||
         rhs == getTombstoneKey() || rhs == getEmptyKey())
       return false;
     return OperationEquivalence::isEquivalentTo(
         const_cast<Operation *>(lhsC), const_cast<Operation *>(rhsC),
         /*mapOperands=*/OperationEquivalence::exactValueMatch,
         /*mapResults=*/OperationEquivalence::ignoreValueEquivalence,
-        OperationEquivalence::IgnoreLocations);
+        OperationEquivalence::IgnoreLocations |
+            OperationEquivalence::HandleCommutativeOps);
   }
 };
 } // namespace

 namespace {
 /// Simple common sub-expression elimination.
 struct CSE : public CSEBase<CSE> {
   /// Shared implementation of operation elimination and scoped map definitions.
 [...]
