Page MenuHomePhabricator

[InstCombine] use commutative matchers for patterns with commutative operators
ClosedPublic

Authored by spatel on Sep 9 2016, 12:13 PM.

Details

Summary

This is a bandage on a wound that needs stitches.

Background/motivation: I was circling back around to https://llvm.org/bugs/show_bug.cgi?id=28296 . I made a simple patch for that and noticed some regressions. I added test cases for those with rL281055, and this is hopefully the minimal fix for just those cases.

But as you can see from the surrounding untouched folds, we are missing commuted patterns all over the place, and of course there are no regression tests to cover any of those cases.

It seems like we could sprinkle "m_c_" dust all over this file and catch most of the missing folds, but then we still wouldn't have test coverage, and we'd still miss some fraction of commuted patterns because they require adjustments to the match order.

Another thought: are there any cases where we do *not* want to match the commuted variants of a pattern? If there are none of those, then could we just change the definitions for all commutative pattern matchers, so "m_c_" becomes automatic?

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 70876.Sep 9 2016, 12:13 PM
spatel retitled this revision from to [InstCombine] use commutative matchers for patterns with commutative operators .
spatel updated this object.
spatel added reviewers: majnemer, efriedma, hfinkel.
spatel added a subscriber: llvm-commits.
sanjoy added a subscriber: sanjoy.Sep 22 2016, 2:36 PM

Have you considered treating this as a canonicalization problem? That is, always canonicalize (X & A) | (A ^ Y) to (A & X) | (A ^ Y) (for all commutative ops)?

Have you considered treating this as a canonicalization problem? That is, always canonicalize (X & A) | (A ^ Y) to (A & X) | (A ^ Y) (for all commutative ops)?

Hmmm...no, I didn't think of that. Can you describe the rule that we would use with this example?

This case is the 2nd to last code diff in this patch:
(A & ~B) ^ (~A & B) -> A ^ B
(~B & A) ^ (~A & B) -> A ^ B

I'm not sure how to canonicalize this without possibly conflicting with other canonicalization rules based on operand complexity. Ie, we order binop operands depending on whether they are themselves binop/unop/params/constants.

sanjoy added a comment.Oct 2 2016, 3:04 PM

I'm not sure how to canonicalize this without possibly conflicting
with other canonicalization rules based on operand complexity. Ie, we
order binop operands depending on whether they are themselves
binop/unop/params/constants.

Yeah, I can't come up with a scheme that will obviously not clash with
InstCombine's operand complexity order.

An alternative may be to canonicalize "internally" -- that is first
split the expression into (Op0 ITy0 Op1) ITy1 (Op2 ITy2 Op3) (that
is, "explode" the contents of the operation into local variables) and
then std::swap the Op0 / Op1 or the Op2 / Op3 pairs in
whatever order that makes matching these easier. That may involve too
much refactoring though.

I'm not sure how to canonicalize this without possibly conflicting
with other canonicalization rules based on operand complexity. Ie, we
order binop operands depending on whether they are themselves
binop/unop/params/constants.

Yeah, I can't come up with a scheme that will obviously not clash with
InstCombine's operand complexity order.

An alternative may be to canonicalize "internally" -- that is first
split the expression into (Op0 ITy0 Op1) ITy1 (Op2 ITy2 Op3) (that
is, "explode" the contents of the operation into local variables) and
then std::swap the Op0 / Op1 or the Op2 / Op3 pairs in
whatever order that makes matching these easier. That may involve too
much refactoring though.

Ah, yes - "local" canonicalization. Some of that does already exist in these functions, but of course it's not done consistently. IMO, using "m_c_" is cleaner. Eg, in some places we're maintaining the swap state (see "SwappedForXor" - not sure if that's actually necessary), so it makes it harder to read the code.

I've looked over this file again, and I'm estimating ~50 potential places where we should account for commuted patterns. So it's not *that* bad. :)

If there are no objections to using "m_c_" in this way, I'd like to move forward on this patch. I can add "FIXME" comments ahead of visitAnd/visitOr/visitXor, and then we can add tests and clean up the rest of the matchers in follow-ups.

efriedma added inline comments.Oct 13 2016, 2:01 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2350 ↗(On Diff #70876)

This isn't actually equivalent to four separate checks in the case where both operands of the xor are "not" operations. I guess that doesn't really matter much in practice for this particular case, but we need a better approach in general.

spatel added inline comments.Oct 19 2016, 11:04 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2350 ↗(On Diff #70876)

Is it possible to have both operands 'notted'? We should always be able to fold:

%negx = xor i32 %x, -1
%negy = xor i32 %y, -1
%xor = xor i32 %negx, %negy

to:

xor i32 %x, %y

before we reach here?

efriedma added inline comments.Oct 19 2016, 11:29 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2350 ↗(On Diff #70876)

Yes, in this particular case, it probably doesn't matter. I don't want to have to reason that out for every possible use of commuted patterns, though.

spatel added a comment.Dec 8 2016, 7:03 AM

Ping.

D27531 reminded me of this patch. Another possibility (this is in InstSimplify) looks like this:

if (auto *ICILHS = dyn_cast<ICmpInst>(Op0)) {
  if (auto *ICIRHS = dyn_cast<ICmpInst>(Op1)) {
    if (Value *V = SimplifyAndOfICmps(ICILHS, ICIRHS))
      return V;
    if (Value *V = SimplifyAndOfICmps(ICIRHS, ICILHS))
      return V;
  }
}

So put a pile of folds in a helper function and then call it with all permutations of the commutative parameters. Better than using m_c_*?

majnemer accepted this revision.Dec 16 2016, 12:00 PM
majnemer edited edge metadata.

LGTM

This revision is now accepted and ready to land.Dec 16 2016, 12:00 PM

Thanks, David. I'll add some FIXME notes as I mentioned on Oct 12 and check this in. Then we can move forward on PR28296.

A better overall solution for commuted matching may emerge after some refactoring.

This revision was automatically updated to reflect the committed changes.