
Details

Reviewers

spatel
efriedma
craig.topper
davide

Summary

Addresses PR#32791 (https://bugs.llvm.org//show_bug.cgi?id=32791).

When attempting to simplify an 'or' instruction, check whether its operands are
the results of 'select' instructions, and whether those instructions are the
inverse of one another (that is, their predicates are inverses of one another,
but their true and false operands are in the same positions). If they are, the
'or' can be simplified to a single 'select' instruction.
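To make the claimed equivalence concrete, here is a minimal standalone C++ sketch (not part of the patch). It models the two selects with zero false operands as ternaries and compares them against the single select over a small input range. The function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: or (select P, C, 0), (select !P, D, 0).
inline std::int32_t orOfSelects(bool p, std::int32_t c, std::int32_t d) {
  std::int32_t sel1 = p ? c : 0;
  std::int32_t sel2 = !p ? d : 0;
  return sel1 | sel2;
}

// Folded shape: select P, C, D.
inline std::int32_t singleSelect(bool p, std::int32_t c, std::int32_t d) {
  return p ? c : d;
}

// Compare both shapes for every combination in a small input range.
inline bool foldIsEquivalent() {
  for (int p = 0; p <= 1; ++p)
    for (std::int32_t c = -8; c <= 8; ++c)
      for (std::int32_t d = -8; d <= 8; ++d)
        if (orOfSelects(p != 0, c, d) != singleSelect(p != 0, c, d))
          return false;
  return true;
}
```

Exactly one of the two selects yields a non-zero-arm value for any given predicate, so the 'or' with zero just passes that value through.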

Diff Detail

Build Status

Buildable 6614
Build 6614: arc lint + arc unit

Event Timeline

modocache created this revision. May 14 2017, 10:50 AM

milseman added a subscriber: milseman. May 14 2017, 10:58 AM

milseman added inline comments.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2287	You should probably follow the style of the surrounding code by using the pattern matchers. See PatternMatch.h, and an example at e.g. line 2261 above.

Use PatternMatch.h. Also, actually perform checks for zero operands (whoops!).

modocache marked an inline comment as done. May 14 2017, 3:43 PM

spatel added inline comments. May 14 2017, 4:48 PM

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2268–2269	This comment didn't read clearly to me. The 'or' (not a comparison) is based on the selects. Have the formula in the comment line up with the variable names in the code:

  or (select (cmp Pred1, A, B), C, 0), (select (cmp Pred2, A, B), D, 0) --> select (cmp Pred1, A, B), C, D
  when Pred1 is the inverse of Pred2.

2284–2289	Use another pattern matcher here (warning: untested code):

  if (match(X, m_Cmp(Pred1, m_Value(A), m_Value(B))) &&
      match(Y, m_Cmp(Pred2, m_Specific(A), m_Specific(B))) &&
      CmpInst::getInversePredicate(Pred1) == Pred2)

test/Transforms/InstCombine/logical-select.ll
67	We need more tests:
  1. Check the case where the 0 operands are on the left instead of the right.
  2. The code should work with fcmps, so we need a test for that.
  3. The code should work with vectors, so we need a test for that.
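On the fcmp request: the inverse of an ordered predicate is an unordered one (the fcmp test in this revision pairs 'olt' with 'uge'), because a NaN operand makes every ordered compare false. The following standalone C++ sketch models that behaviour with ordinary float comparisons. The helper names are hypothetical, and it assumes IEEE NaN semantics (i.e. no fast-math flags).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Model of 'fcmp olt': ordered less-than, false if either operand is NaN.
inline bool fcmpOLT(float a, float b) { return a < b; }

// Model of 'fcmp oge': ordered greater-or-equal, false if either operand is NaN.
inline bool fcmpOGE(float a, float b) { return a >= b; }

// Model of 'fcmp uge': true if either operand is NaN, or if a >= b.
inline bool fcmpUGE(float a, float b) {
  return std::isnan(a) || std::isnan(b) || a >= b;
}

// Convenience accessor for a quiet NaN.
inline float quietNaN() { return std::numeric_limits<float>::quiet_NaN(); }
```

With a NaN operand, `fcmpOLT` and `fcmpOGE` are both false, so 'oge' cannot be the logical negation of 'olt'; `fcmpUGE` is.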

Use a match() in the nested condition, update comment for clarity. To come: additional tests.

modocache marked 2 inline comments as done. May 14 2017, 6:24 PM

Add tests, and fix a bug that the 'reversed' tests uncovered.

Thanks again, @spatel! I think I've addressed all of your feedback.

I stared a bit at this patch, and I think it's correct.
OTOH, I have a feeling of perplexity that arises from the way we're looking at this.
In particular, it seems like we're widening the peephole a little too much -> or (select (cmp.
I'm fairly sure we do the same in other places in InstCombine, so there's a precedent.
I don't have a strong opinion on this, but I'd like it if @majnemer or @efriedma could take a look.

After giving this some more careful thought, I'm not sure InstCombine is the right pass to solve this problem (or at least, there's an alternative solution).
I've been convincing myself this is a value-range propagation/copy-propagation problem, and in fact, this is how GCC optimizes this code.
If I understand correctly, you start with some C code that looks like this: https://godbolt.org/g/0HBds4

Now you have some pseudo-IR that looks like this (actually, this is GIMPLE, so we can map to something real :))

<bb 2>:
if (a_3(D) < b_4(D))
  goto <bb 4>;
else
  goto <bb 3>;

<bb 3>:

<bb 4>:
# iftmp.0_1 = PHI <c_5(D)(2), 0(3)>
if (a_3(D) >= b_4(D))
  goto <bb 6>;
else
  goto <bb 5>;

<bb 5>:

<bb 6>:
# iftmp.1_2 = PHI <d_6(D)(4), 0(5)>
_7 = iftmp.0_1 | iftmp.1_2;
return _7;

your VRP algorithm can prove after SSA replacement (where LHS replaces RHS)

a_9 -> { a_3(D) }
a_10 -> { a_3(D) }
b_11 -> { b_4(D) }
b_12 -> { b_4(D) }

that your SSA values have the following ranges:

a_9: [-INF, b_4(D) + -1]
a_10: [b_4(D), +INF]
b_11: [a_9 + 1, +INF]
b_12: [-INF, a_10]

therefore removing the second compare and transforming your IR into:

<bb 2>:
if (a_3(D) < b_4(D))
  goto <bb 5>;
else
  goto <bb 4>;

<bb 3>:
# iftmp.1_2 = PHI <0(5), d_6(D)(4)>
# iftmp.0_15 = PHI <iftmp.0_14(5), iftmp.0_13(4)>
_7 = iftmp.0_15 | iftmp.1_2;
return _7;

<bb 4>:
# iftmp.0_13 = PHI <0(2)>
goto <bb 3>;

<bb 5>:
# iftmp.0_14 = PHI <c_5(D)(2)>
goto <bb 3>;

(At that point you have a bunch of trivial phi nodes that copy propagation or something similar could probably clean up; you get the idea.)

This revision now requires changes to proceed. May 14 2017, 10:51 PM

I agree that we're missing some kind of more general transform. Eg:

  int goo(int x, int y, int r) {
    if (x > y) r += 42;
    if (x <= y) r += 12;
    return r;
  }

define i32 @goo(i32, i32, i32) {
  %4 = icmp sgt i32 %0, %1
  %5 = add nsw i32 %2, 42
  %6 = select i1 %4, i32 %5, i32 %2
  %7 = add nsw i32 %6, 12
  %8 = select i1 %4, i32 %5, i32 %7
  ret i32 %8
}

https://godbolt.org/g/tszo72
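As a rough illustration of what a more general transform could aim for, here is the C function from above next to one possible simplified form that uses a single compare and a single select. This is a hypothetical target shape written in C++, not output from any existing pass; `gooSingleSelect` and `gooFormsAgree` are names made up for this sketch.

```cpp
#include <cassert>

// The source-level function from the comment above.
inline int goo(int x, int y, int r) {
  if (x > y) r += 42;
  if (x <= y) r += 12;
  return r;
}

// Hypothetical simplified form: one compare selecting between the two sums.
inline int gooSingleSelect(int x, int y, int r) {
  return x > y ? r + 42 : r + 12;
}

// Compare both forms over a small range of inputs.
inline bool gooFormsAgree() {
  for (int x = -4; x <= 4; ++x)
    for (int y = -4; y <= 4; ++y)
      for (int r = -100; r <= 100; r += 7)
        if (goo(x, y, r) != gooSingleSelect(x, y, r))
          return false;
  return true;
}
```

The two conditions are exact inverses, so exactly one of the additions ever executes.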

It's worth noting that one motivation for PR32791 (although I forgot to write it there) is that I think we should canonicalize all:
not (cmp P, X, Y) --> cmp P', X, Y
Ie, invert the predicate to eliminate an explicit 'not' (xor) because then we have eliminated a dependency in the IR. This was discussed in D32725.
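A small self-contained C++ model of that canonicalization, for integer predicates only. The `Pred` enum and helpers here are hypothetical stand-ins written for this sketch; in LLVM the corresponding helper is CmpInst::getInversePredicate, which the patch under review already uses.

```cpp
#include <cassert>

// Toy subset of signed integer predicates (not LLVM's CmpInst::Predicate).
enum class Pred { EQ, NE, SLT, SGE, SGT, SLE };

// Map each predicate P to its inverse P'.
inline Pred inversePred(Pred p) {
  switch (p) {
    case Pred::EQ:  return Pred::NE;
    case Pred::NE:  return Pred::EQ;
    case Pred::SLT: return Pred::SGE;
    case Pred::SGE: return Pred::SLT;
    case Pred::SGT: return Pred::SLE;
    case Pred::SLE: return Pred::SGT;
  }
  return p;  // not reached for valid enumerators
}

// Evaluate 'cmp P, x, y'.
inline bool evalCmp(Pred p, int x, int y) {
  switch (p) {
    case Pred::EQ:  return x == y;
    case Pred::NE:  return x != y;
    case Pred::SLT: return x < y;
    case Pred::SGE: return x >= y;
    case Pred::SGT: return x > y;
    case Pred::SLE: return x <= y;
  }
  return false;  // not reached for valid enumerators
}

// Check that not (cmp P, x, y) == cmp P', x, y over a small input range.
inline bool notCmpEqualsInvertedCmp() {
  const Pred all[] = {Pred::EQ, Pred::NE, Pred::SLT,
                      Pred::SGE, Pred::SGT, Pred::SLE};
  for (Pred p : all)
    for (int x = -3; x <= 3; ++x)
      for (int y = -3; y <= 3; ++y)
        if ((!evalCmp(p, x, y)) != evalCmp(inversePred(p), x, y))
          return false;
  return true;
}
```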

So we should look at how the existing InstCombine test ("@poo" is just above the new test here) fails if we implement that canonicalization:

define i32 @poo5(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %and1 = select i1 %cmp, i32 %c, i32 0
  %cmpnot = xor i1 %cmp, -1   ; <--- this can be replaced by "icmp sge i32 %a, %b", and we have the first testcase in this patch (the test from PR32791)
  %and2 = select i1 %cmpnot, i32 %d, i32 0
  %or = or i32 %and1, %and2
  ret i32 %or
}

define i32 @poo6(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %and1 = select i1 %cmp, i32 %c, i32 0
  %and2 = select i1 %cmp, i32 0, i32 %d  ; <--- we fold a 'not' compare by swapping the select operands
  %or = or i32 %and1, %and2
  ret i32 %or
}

For this pattern, there's a fold in InstCombiner::SimplifyUsingDistributiveLaws():

// (op (select (a, c, b)), (select (a, d, b))) -> (select (a, (op c, d), 0))
// (op (select (a, b, c)), (select (a, b, d))) -> (select (a, 0, (op c, d)))

Note: the implementation for that fold seems bogus. We create a new binop and a new select for *both* of the current selects in this example and then don't use one pair.

So if we want a more general InstCombine solution than what is currently proposed here, I think we should enhance that to account for an inverted predicate as well as matching predicates:
binop (select P, A, B), (select P', C, D) --> binop (select P, A, B), (select P, D, C)
and let the existing fold take it from there.

Swapping select operands to eliminate an inverted compare seems like a generally good fold for some other pass too. Would that be GVN?

Not entirely sure I follow this last one.
NewGVN finds whether two expressions e1 and e2 are equivalent (actually, not general equivalence as checking equivalence of program expressions is undecidable, so we approximate with Herbrand equivalence). If you have an example (IR), that will help (and I can take a look).

declare i32 @bar(i32, i32)

define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %sel1 = select i1 %cmp, i32 %c, i32 0
  %cmpnot = icmp sge i32 %a, %b
  %sel2 = select i1 %cmpnot, i32 %d, i32 0
  %call = tail call i32 @bar(i32 %sel1, i32 %sel2)
  ret i32 %call
}

I think this should be canonicalized to:

define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %sel1 = select i1 %cmp, i32 %c, i32 0
  %sel2 = select i1 %cmp, i32 0,  i32 %d
  %call = tail call i32 @bar(i32 %sel1, i32 %sel2)
  ret i32 %call
}

The general problem is that we should recognize when a compare value used by a select already exists in inverted form. If it does, swap the select operands and eliminate the inverted usage.
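A minimal sketch of that idea on a toy instruction model, written in standalone C++ rather than against the LLVM API. Everything here (`Cmp`, `Select`, `reuseInvertedCompare`, `fooExample`) is hypothetical and exists only to illustrate the rewrite; it handles just two mutually inverse predicates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Pred { SLT, SGE };  // two mutually inverse predicates, for brevity

struct Cmp { Pred pred; std::string lhs, rhs; };

// 'cond' is an index into the list of compares seen so far.
struct Select { std::size_t cond; std::string tval, fval; };

inline Pred inversePred(Pred p) {
  return p == Pred::SLT ? Pred::SGE : Pred::SLT;
}

// If the select's condition is the inverse of an earlier compare on the same
// operands, point the select at the earlier compare and swap its arms.
inline bool reuseInvertedCompare(const std::vector<Cmp> &cmps, Select &sel) {
  assert(sel.cond < cmps.size());
  const Cmp &cur = cmps[sel.cond];
  for (std::size_t i = 0; i < sel.cond; ++i) {
    const Cmp &prev = cmps[i];
    if (prev.pred == inversePred(cur.pred) && prev.lhs == cur.lhs &&
        prev.rhs == cur.rhs) {
      sel.cond = i;
      std::swap(sel.tval, sel.fval);
      return true;
    }
  }
  return false;
}

// Models %sel2 from @foo above: select (icmp sge %a, %b), %d, 0.
inline Select fooExample() {
  std::vector<Cmp> cmps = {{Pred::SLT, "%a", "%b"}, {Pred::SGE, "%a", "%b"}};
  Select sel2{1, "%d", "0"};
  reuseInvertedCompare(cmps, sel2);
  return sel2;
}
```

After the rewrite, the modelled %sel2 uses the first compare with its arms swapped, matching the canonicalized @foo shown above; the inverted compare is then dead if it has no other users.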

The problem may be more general than that. It's really about recognizing inverted compares, so let's take selects out of the equation (using div ops to thwart hoisting/sinking here, but that's just to produce a minimal IR example):

define i32 @inverted_compares(i32 %a, i32 %b, i32 %c) {
entry:
  %cmp = icmp slt i32 %a, %b
  br i1 %cmp, label %t1, label %f1

t1:
  %div1 = sdiv i32 42, %c
  br label %endif1

f1:
  %div2 = srem i32 43, %c
  br label %endif1

endif1:
  %phi1 = phi i32 [ %div1, %t1 ], [ %div2, %f1 ]
  %cmpnot = icmp sge i32 %a, %b
  br i1 %cmpnot, label %t2, label %f2

t2:
  %div3 = sdiv i32 44, %phi1
  br label %endif2

f2:
  %div4 = srem i32 45, %phi1
  br label %endif2

endif2:
  %phi2 = phi i32 [ %div3, %t2 ], [ %div4, %f2 ]
  %call = call i32 @bar(i32 %phi1, i32 %phi2)

  ret i32 %call
}

Because of instcombine predicate canonicalization rules, we optimize this to one compare:

define i32 @inverted_compares_optimized(i32 %a, i32 %b, i32 %c) {
entry:
  %cmp = icmp slt i32 %a, %b
  br i1 %cmp, label %f2, label %t2

t2:               
  %div2 = srem i32 43, %c
  %div3 = udiv i32 44, %div2
  br label %endif2

f2:                               
  %div1 = sdiv i32 42, %c
  %div4 = srem i32 45, %div1
  br label %endif2

endif2:                               
  %phi12 = phi i32 [ %div2, %t2 ], [ %div1, %f2 ]
  %phi2 = phi i32 [ %div3, %t2 ], [ %div4, %f2 ]
  %call = tail call i32 @bar(i32 %phi12, i32 %phi2)
  ret i32 %call
}

But if the compare has an extra use, the instcombine doesn't fire, and this won't optimize at -O2:

define i32 @inverted_compares_and_one(i32 %a, i32 %b, i32 %c) {
entry:
  %cmp = icmp slt i32 %a, %b
  br i1 %cmp, label %t1, label %f1

t1:
  %div1 = sdiv i32 42, %c
  br label %endif1

f1:
  %div2 = srem i32 43, %c
  br label %endif1

endif1:
  %phi1 = phi i32 [ %div1, %t1 ], [ %div2, %f1 ]
  %cmpnot = icmp sge i32 %a, %b
  br i1 %cmpnot, label %t2, label %f2

t2:
  %div3 = sdiv i32 44, %phi1
  br label %endif2

f2:
  %div4 = srem i32 45, %phi1
  br label %endif2

endif2:
  %phi2 = phi i32 [ %div3, %t2 ], [ %div4, %f2 ]
  %call = call i32 @bar_extra_cmp_use(i32 %phi1, i32 %phi2, i1 %cmpnot)
  ret i32 %call
}

We could have gotten this to:

define i32 @inverted_compares_optimized(i32 %a, i32 %b, i32 %c) {
entry:
  %cmp = icmp slt i32 %a, %b
  br i1 %cmp, label %f2, label %t2

t2:               
  %div2 = srem i32 43, %c
  %div3 = udiv i32 44, %div2
  br label %endif2

f2:                               
  %div1 = sdiv i32 42, %c
  %div4 = srem i32 45, %div1
  br label %endif2

endif2:                               
  %phi12 = phi i32 [ %div2, %t2 ], [ %div1, %f2 ]
  %phi2 = phi i32 [ %div3, %t2 ], [ %div4, %f2 ]
  %cmpnot = icmp sge i32 %a, %b
  %call = call i32 @bar_extra_cmp_use(i32 %phi12, i32 %phi2, i1 %cmpnot)
  ret i32 %call
}

In D33172#755146, @spatel wrote:
declare i32 @bar(i32, i32)

define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %sel1 = select i1 %cmp, i32 %c, i32 0
  %cmpnot = icmp sge i32 %a, %b
  %sel2 = select i1 %cmpnot, i32 %d, i32 0
  %call = tail call i32 @bar(i32 %sel1, i32 %sel2)
  ret i32 %call
}
I think this should be canonicalized to:
define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %sel1 = select i1 %cmp, i32 %c, i32 0
  %sel2 = select i1 %cmp, i32 0,  i32 %d
  %call = tail call i32 @bar(i32 %sel1, i32 %sel2)
  ret i32 %call
}

If you really want to canonicalize (and that still needs to be evaluated), I think you might consider instead canonicalizing to:

declare i32 @bar(i32, i32)

define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) {
  %cmp = icmp slt i32 %a, %b
  %sel1 = select i1 %cmp, i32 %c, i32 0
  %cmpnot = icmp slt i32 %a, %b
  %sel2 = select i1 %cmpnot, i32 0, i32 %d
  %call = tail call i32 @bar(i32 %sel1, i32 %sel2)
  ret i32 %call
}

and then let NewGVN discover that %cmpnot and %cmp are actually in the same congruence class (GVN will also get to the same conclusion, but it doesn't have a real analysis/notion of congruence class).
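To illustrate why identical predicates help here, below is a deliberately simplified, purely syntactic value-numbering sketch in standalone C++. It is not NewGVN's algorithm (which works with congruence classes and is far more capable); the `ValueNumbering` class and `cmpAndCmpnotCongruent` helper are hypothetical names for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Assigns the same number to expressions with identical opcode and operands.
class ValueNumbering {
  std::map<std::tuple<std::string, std::string, std::string>, int> table;
  int next = 0;

public:
  int number(const std::string &op, const std::string &lhs,
             const std::string &rhs) {
    auto key = std::make_tuple(op, lhs, rhs);
    auto it = table.find(key);
    if (it != table.end())
      return it->second;  // seen before: reuse the existing number
    int fresh = next++;
    table[key] = fresh;
    return fresh;
  }
};

// %cmp and the rewritten %cmpnot are both 'icmp slt %a, %b', so they receive
// the same number; the original 'icmp sge %a, %b' would not.
inline bool cmpAndCmpnotCongruent() {
  ValueNumbering vn;
  int cmp = vn.number("icmp slt", "%a", "%b");
  int cmpnot = vn.number("icmp slt", "%a", "%b");
  int inverted = vn.number("icmp sge", "%a", "%b");
  return cmp == cmpnot && cmp != inverted;
}
```

Once both compares are spelled identically, even this naive scheme finds them equivalent, so the second one can be replaced by the first.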

a.elovikov added a subscriber: a.elovikov. May 15 2017, 11:20 AM

spatel mentioned this in D33242: [InstCombine] add motivational comment for tests; NFC. May 16 2017, 8:19 AM

spatel mentioned this in rL303185: [InstCombine] add motivational comment for tests; NFC. May 16 2017, 9:44 AM

A couple of updates:

Added the tests included here to show the current IR with: rL303133 / rL303185

Refactored some code that may be relevant for this transform:

rL303261

I think the predicate canonicalization that we do for icmp+branch is worthwhile because it increases the likelihood of creating duplicate values (fcmp needs fixes IMO). It also can make pattern matching for other transforms easier if we know that half of the possible predicates are eliminated by canonicalization (at least for the single-use case). For the same reasons, I favor making that canonicalization apply to cmp+select as well. Ie, I would invert the predicates / swap the select operands shown in the tests here. This probably requires preliminary patches to avoid regressions though.

Rebased onto rL303133. Sorry, I'm a new contributor, so I'm having trouble understanding how to proceed here. Is there anything specifically I should change in this diff? Should I be using the isCanonicalPredicate helper added in rL303261? Or should I be using a different approach altogether?

It turns out this simple patch raises deep philosophical questions for LLVM. :)
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113184.html

I agreed with Davide that the patch as written is too specific. We should handle this problem more generally, and likely that's a shortcoming of some other pass. IMO and regardless of that, InstCombine is still missing a canonicalization of the cmp predicate that is feeding the select. I did not see any other comments to know if anyone else agrees or disagrees with that position. If you disagree with that, then I assume you would also want to see the related cmp+branch canonicalization removed? I suspect that would require some thorough perf analysis to justify.

But since we have that canonicalization today, I think the immediate problem will be solved either here in InstCombine or somewhere else using existing logic.

There's also a question about whether the existing fold that I pointed out:
// (op (select (a, c, b)), (select (a, d, b))) -> (select (a, (op c, d), 0))
should exist in InstCombine. Again, whether you think that belongs in InstCombine or not, IMO is irrelevant to the immediate goal of solving PR32791 because the fold already exists, and there's no evidence that some other pass handles that case. If you advocate removing that, then you have to come up with a better solution...which could be a major undertaking at this point.

So to answer the question: yes, I think we should use the helper added in rL303261 to canonicalize cmps used by selects. However, doing that likely requires not introducing any known regressions to instcombine, and I already know there will be one. :)

Let me see if I can solve that one non-controversially and report back here on that.

There are at least a couple of current patches that delve into the same philosophical territory, so I think it's worth following the discussions here too:
https://reviews.llvm.org/D33342
https://reviews.llvm.org/D33338

Ah OK, thanks! I'll keep this open and try to follow along with the discussions as best I can. Thanks for the reviews!

Nothing's easy with this one...the transform that I thought could incur a regression may not even be legal...but it's not clear if anyone knows for certain. :)
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html

spatel mentioned this in D34242: [InstCombine] canonicalize icmp predicate feeding select. Jun 15 2017, 9:32 AM

This is a bit beyond my depth at this point, sorry! Going to abandon this one :)

spatel mentioned this in rL306435: [InstCombine] canonicalize icmp predicate feeding select. Jun 27 2017, 10:53 AM

spatel mentioned this in D35182: [InstCombine] remove one-use restriction for not (cmp P, A, B) --> cmp P', A, B. Jul 9 2017, 9:11 AM

Diff 99665

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

       if (Op0->hasOneUse() && Op1->hasOneUse() &&
           match(Op0, m_Select(m_Value(X), m_Value(A), m_Value(B))) &&
           match(Op1, m_Select(m_Value(Y), m_Value(C), m_Value(D))) && X == Y) {
         Value *orTrue = Builder->CreateOr(A, C);
         Value *orFalse = Builder->CreateOr(B, D);
         return SelectInst::Create(X, orTrue, orFalse);
       }
     }
 
+  // Change (or (select (cmp Pred1, A, B), C, 0), (select (cmp Pred2, A, B), D, 0)) --> (select (cmp Pred1, A, B), C, D)
+  // when Pred1 is the inverse of Pred2. That is, if the or is based on the
+  // results of two select instructions, check whether the conditions of those
+  // select instructions are inverse icmp instructions with zero operands. If
+  // so, simplify to a single select on one of the conditions.
+  {
+    Value *X = nullptr, *Y = nullptr;
+    // Match both:
+    // (or (X ? 0 : A), (!X ? 0 : C))
+    // (or (X ? A : 0), (!X ? C : 0))
+    if ((match(Op0, m_Select(m_Value(X), m_Value(C), m_Zero())) &&
+         match(Op1, m_Select(m_Value(Y), m_Value(D), m_Zero()))) ||
+        (match(Op0, m_Select(m_Value(Y), m_Zero(), m_Value(C))) &&
+         match(Op1, m_Select(m_Value(X), m_Zero(), m_Value(D))))) {
+      // Only transform into a select if X and Y have inverted predicates
+      // with identical operands.
+      CmpInst::Predicate Pred1, Pred2;
+      if (match(X, m_Cmp(Pred1, m_Value(A), m_Value(B))) &&
+          match(Y, m_Cmp(Pred2, m_Specific(A), m_Specific(B))) &&
+          CmpInst::getInversePredicate(Pred1) == Pred2) {
+        return SelectInst::Create(X, C, D);
+      }
+    }
+  }
+
   return Changed ? &I : nullptr;
 }
 
 /// A ^ B can be specified using other logic ops in a variety of patterns. We
 /// can fold these early and efficiently by morphing an existing instruction.
 static Instruction *foldXorToXor(BinaryOperator &I) {
   assert(I.getOpcode() == Instruction::Xor);
   Value *Op0 = I.getOperand(0);

test/Transforms/InstCombine/logical-select.ll

 ;
   %iftmp = select i1 %t0, i32 0, i32 -1
   %t2 = and i32 %iftmp, %d
   %t3 = or i32 %t1, %t2
   ret i32 %t3
 }
 
 ; TODO: For the next 4 tests, are there potential canonicalizations and/or folds for these
 ; in InstCombine? Independent of that, tests like this that may not show any transforms
 ; still have value because they can help identify conflicting canonicalization rules that
 ; lead to infinite looping.
 
 ; PR32791 - https://bugs.llvm.org//show_bug.cgi?id=32791
 ; Fold two selects with inverted predicates and zero operands.
 define i32 @fold_inverted_icmp_preds(i32 %a, i32 %b, i32 %c, i32 %d) {
 ; CHECK-LABEL: @fold_inverted_icmp_preds(
-; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i32 %a, %b
-; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CMP1]], i32 %c, i32 0
-; CHECK-NEXT: [[CMP2:%.*]] = icmp sge i32 %a, %b
-; CHECK-NEXT: [[SEL2:%.*]] = select i1 [[CMP2]], i32 %d, i32 0
-; CHECK-NEXT: [[OR:%.*]] = or i32 [[SEL1]], [[SEL2]]
-; CHECK-NEXT: ret i32 [[OR]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 %a, %b
+; CHECK-NEXT: [[SEL:%.*]] = select i1 [[CMP]], i32 %c, i32 %d
+; CHECK-NEXT: ret i32 [[SEL]]
 ;
   %cmp1 = icmp slt i32 %a, %b
   %sel1 = select i1 %cmp1, i32 %c, i32 0
   %cmp2 = icmp sge i32 %a, %b
   %sel2 = select i1 %cmp2, i32 %d, i32 0
   %or = or i32 %sel1, %sel2
   ret i32 %or
 }
 
 define i32 @fold_inverted_icmp_preds_reverse(i32 %a, i32 %b, i32 %c, i32 %d) {
 ; CHECK-LABEL: @fold_inverted_icmp_preds_reverse(
-; CHECK-NEXT: [[CMP1:%.*]] = icmp slt i32 %a, %b
-; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CMP1]], i32 0, i32 %c
-; CHECK-NEXT: [[CMP2:%.*]] = icmp sge i32 %a, %b
-; CHECK-NEXT: [[SEL2:%.*]] = select i1 [[CMP2]], i32 0, i32 %d
-; CHECK-NEXT: [[OR:%.*]] = or i32 [[SEL1]], [[SEL2]]
-; CHECK-NEXT: ret i32 [[OR]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp sge i32 %a, %b
+; CHECK-NEXT: [[SEL:%.*]] = select i1 [[CMP]], i32 %c, i32 %d
+; CHECK-NEXT: ret i32 [[SEL]]
 ;
   %cmp1 = icmp slt i32 %a, %b
   %sel1 = select i1 %cmp1, i32 0, i32 %c
   %cmp2 = icmp sge i32 %a, %b
   %sel2 = select i1 %cmp2, i32 0, i32 %d
   %or = or i32 %sel1, %sel2
   ret i32 %or
 }
 
 define i32 @fold_inverted_fcmp_preds(float %a, float %b, i32 %c, i32 %d) {
 ; CHECK-LABEL: @fold_inverted_fcmp_preds(
-; CHECK-NEXT: [[CMP1:%.*]] = fcmp olt float %a, %b
-; CHECK-NEXT: [[SEL1:%.*]] = select i1 [[CMP1]], i32 %c, i32 0
-; CHECK-NEXT: [[CMP2:%.*]] = fcmp uge float %a, %b
-; CHECK-NEXT: [[SEL2:%.*]] = select i1 [[CMP2]], i32 %d, i32 0
-; CHECK-NEXT: [[OR:%.*]] = or i32 [[SEL1]], [[SEL2]]
-; CHECK-NEXT: ret i32 [[OR]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float %a, %b
+; CHECK-NEXT: [[SEL:%.*]] = select i1 [[CMP]], i32 %c, i32 %d
+; CHECK-NEXT: ret i32 [[SEL]]
 ;
   %cmp1 = fcmp olt float %a, %b
   %sel1 = select i1 %cmp1, i32 %c, i32 0
   %cmp2 = fcmp uge float %a, %b
   %sel2 = select i1 %cmp2, i32 %d, i32 0
   %or = or i32 %sel1, %sel2
   ret i32 %or
 }
 
 define <2 x i32> @fold_inverted_icmp_vector_preds(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c, <2 x i32> %d) {
 ; CHECK-LABEL: @fold_inverted_icmp_vector_preds(
-; CHECK-NEXT: [[CMP1:%.*]] = icmp ne <2 x i32> %a, %b
-; CHECK-NEXT: [[SEL1:%.*]] = select <2 x i1> [[CMP1]], <2 x i32> %c, <2 x i32> zeroinitializer
-; CHECK-NEXT: [[CMP2:%.*]] = icmp eq <2 x i32> %a, %b
-; CHECK-NEXT: [[SEL2:%.*]] = select <2 x i1> [[CMP2]], <2 x i32> %d, <2 x i32> zeroinitializer
-; CHECK-NEXT: [[OR:%.*]] = or <2 x i32> [[SEL1]], [[SEL2]]
-; CHECK-NEXT: ret <2 x i32> [[OR]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp ne <2 x i32> %a, %b
+; CHECK-NEXT: [[SEL:%.*]] = select <2 x i1> [[CMP]], <2 x i32> %c, <2 x i32> %d
+; CHECK-NEXT: ret <2 x i32> [[SEL]]
 ;
   %cmp1 = icmp ne <2 x i32> %a, %b
   %sel1 = select <2 x i1> %cmp1, <2 x i32> %c, <2 x i32> <i32 0, i32 0>
   %cmp2 = icmp eq <2 x i32> %a, %b
   %sel2 = select <2 x i1> %cmp2, <2 x i32> %d, <2 x i32> <i32 0, i32 0>
   %or = or <2 x i32> %sel1, %sel2
   ret <2 x i32> %or
 }

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Simpify inverted predicates in 'or'
Abandoned, Public
