This is an archive of the discontinued LLVM Phabricator instance.

Differential D51964

[InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible.
ClosedPublic

Authored by craig.topper on Sep 11 2018, 11:30 PM.

Details

Reviewers

spatel
anna

Commits

rG8fc05ce34016: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y…
rL342163: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y…

Summary

This allows the xor to be removed completely.

This might help with recommitting r341674, but seems good regardless.
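
The fold relies on bitwise not being order-reversing, so that ~max(X, Y) == min(~X, ~Y) and ~min(X, Y) == max(~X, ~Y) for both signed and unsigned comparison. As a rough illustration only, the following C++ sketch checks the identity exhaustively for 8-bit values; the function name is hypothetical and not part of LLVM.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper, not LLVM code: returns true if, for every pair of
// values of the 8-bit type T, inverting a min/max equals the opposite
// max/min of the inverted operands.
template <typename T>
bool notOfMinMaxIsInvertedMinMax() {
  static_assert(sizeof(T) == 1, "exhaustive check is intended for 8-bit types");
  const int lo = std::numeric_limits<T>::min();
  const int hi = std::numeric_limits<T>::max();
  for (int i = lo; i <= hi; ++i) {
    for (int j = lo; j <= hi; ++j) {
      const T x = static_cast<T>(i);
      const T y = static_cast<T>(j);
      // ~ promotes to int, so cast back to T to get the 8-bit result.
      const T notX = static_cast<T>(~x);
      const T notY = static_cast<T>(~y);
      if (static_cast<T>(~std::max(x, y)) != std::min(notX, notY))
        return false;
      if (static_cast<T>(~std::min(x, y)) != std::max(notX, notY))
        return false;
    }
  }
  return true;
}
```

Instantiating it with std::int8_t covers the smin/smax flavors and std::uint8_t covers umin/umax; both are expected to return true.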

Diff Detail

Repository: rL LLVM

Event Timeline

craig.topper created this revision. Sep 11 2018, 11:30 PM

Harbormaster completed remote builds in B22518: Diff 165011. Sep 11 2018, 11:30 PM

spatel added inline comments. Sep 12 2018, 7:11 AM

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2903 ↗	(On Diff #165011)	typo
2908–2910 ↗	(On Diff #165011)	Move createMinMax() from InstCombineSelect.cpp, so we can use it here and above?
test/Transforms/InstCombine/xor.ll
813–816 ↗	(On Diff #165011)	The test should start in canonical min/max form, so we're not depending on other folds to exercise the new one:
  %a = sub i32 -2, %x
  %b = icmp sgt i32 %a, 0
  %c = select i1 %b, i32 %a, i32 0
  %d = xor i32 %c, -1
...and of course, I'd prefer to have the test in place ahead of the patch.
827–830 ↗	(On Diff #165011)	This doesn't provide much extra coverage - just changing the constant? Better to have a test with a vector type and/or change the sub to an add?

More test cases and fix an infinite loop.

Can you split off the inf-loop fix? That solves PR38915, right?
https://bugs.llvm.org/show_bug.cgi?id=38915

I don’t know if it fixes that PR. This transform causes an infinite loop on the tests where both max/min operands are invertible but not constants: test50 and test51.
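
For context on why the operands in test50 and test51 count as freely invertible: each is an add of a constant or a sub from a constant, and a bitwise not can be absorbed into the constant without adding an instruction. The sketch below shows the arithmetic in wrapping 32-bit terms; the helper names are hypothetical and are not LLVM API.

```cpp
#include <cstdint>

// Hypothetical helpers, not LLVM API. uint32_t arithmetic wraps like i32 in IR.

// ~(x + c) == (~c) - x
constexpr std::uint32_t notOfAdd(std::uint32_t x, std::uint32_t c) {
  return ~c - x;
}

// ~(c - x) == x + (~c)
constexpr std::uint32_t notOfSub(std::uint32_t c, std::uint32_t x) {
  return x + ~c;
}

// Spot checks with the constants used in test50 (-2) and test51 (2).
constexpr std::uint32_t kMinus2 = 0xFFFFFFFEu;
static_assert(notOfAdd(5u, kMinus2) == ~(5u + kMinus2), "~(x + -2) == 1 - x");
static_assert(notOfSub(kMinus2, 5u) == ~(kMinus2 - 5u), "~(-2 - y) == y + 1");
static_assert(notOfAdd(5u, 2u) == ~(5u + 2u), "~(x + 2) == -3 - x");
static_assert(notOfSub(2u, 5u) == ~(2u - 5u), "~(2 - y) == y + -3");
```

These rewritten forms line up with the operands in the updated CHECK lines for test50 and test51 (sub 1, X and add Y, 1; sub -3, X and add Y, -3).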

This whole patch does fix the infinite loop from PR38915, but it requires the whole patch and not just the change in InstCombineSelect.cpp

Fix the typo in the comment. I missed it earlier in my haste to board a plane.

Harbormaster completed remote builds in B22579: Diff 165203. Sep 12 2018, 11:21 PM

In D51964#1232736, @craig.topper wrote:

This whole patch does fix the infinite loop from PR38915, but it requires the whole patch and not just the change in InstCombineSelect.cpp

The whole patch fixes our internal regression as well (PR38915 was the bugpoint-reduced version of the regression). Thanks for the fix!

I looked a bit closer at what's going on in PR38915. This patch fixes that particular case, but I'm worried that we still have the same bug lurking if we commit this without solving the underlying problem.

I think these combines are fighting because we don't have a way to accurately assess 'one-use' in terms of another min/max within IsFreeToInvert(). This could be the best argument yet for integer min/max intrinsics because we're going to have to keep special-casing / adding those !X->hasNUsesOrMore(3) hacks to account for min/max transforms...or maybe this + whatever else is needed to recommit rL341674 is enough?

The problem might also be solved by adjusting the min/max bailout that we have within visitICmp. We don't optimize cmps that are part of min/max because we're scared to break a min/max.

There's another underlying problem that I'd like to look at but have been putting off: we have a bunch of min/max of min/max transforms in instcombine, but they really belong in instsimplify because we're just deleting ops. It's not a complete solution, but if we fix that, we're less likely to have this problem in instcombine because some patterns would be killed before we could get into trouble.

spatel mentioned this in rL342147: [InstCombine] reorder folds to reduce chance of infinite loops. Sep 13 2018, 9:05 AM

spatel mentioned this in rL342149: [InstCombine] remove checks for IsFreeToInvert(). Sep 13 2018, 9:21 AM

I don't know how to solve the infinite loop bug completely yet (if anyone else does, please suggest), and given that this solves the existing bug while adding an optimization, I'll say LGTM.

But please do:

1. Commit the tests with baseline checks.
2. Add a TODO comment about using createMinMax() (we're dropping metadata in all of the related transforms, so it would be good to handle that in one place if possible).
3. Add this test for the existing bug:

define i32 @PR38915(i32 %x, i32 %y, i32 %z) {
  %xn = sub i32 0, %x
  %yn = sub i32 0, %y
  %c1 = icmp sgt i32 %xn, %yn
  %m1 = select i1 %c1, i32 %xn, i32 %yn
  %m1n = xor i32 %m1, -1
  %c2 = icmp sgt i32 %m1n, %z
  %m2 = select i1 %c2, i32 %m1n, i32 %z
  %m2n = xor i32 %m2, -1
  ret i32 %m2n
}
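
To make the expected effect on this test easier to follow, here is a rough C++ model of it. The first function transcribes the IR directly; the second applies ~smax(-x, -y) == smin(x - 1, y - 1) to the inner min/max, which matches the shape of the CHECK lines in pr38915.ll. Function names are hypothetical, and the conversion from uint32_t back to int32_t is assumed to wrap (guaranteed from C++20, and the usual behavior on two's-complement targets before that).

```cpp
#include <algorithm>
#include <cstdint>

// Wrapping negate and decrement, done in uint32_t to avoid signed-overflow UB.
static std::int32_t wrapNeg(std::int32_t v) {
  return static_cast<std::int32_t>(0u - static_cast<std::uint32_t>(v));
}
static std::int32_t wrapDec(std::int32_t v) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(v) - 1u);
}

// Direct transcription of @PR38915 (hypothetical name).
std::int32_t pr38915Original(std::int32_t x, std::int32_t y, std::int32_t z) {
  const std::int32_t xn = wrapNeg(x);
  const std::int32_t yn = wrapNeg(y);
  const std::int32_t m1 = std::max(xn, yn);   // smax
  const std::int32_t m1n = ~m1;
  const std::int32_t m2 = std::max(m1n, z);   // smax
  return ~m2;
}

// Same computation with the inner 'not' folded into the first min/max.
std::int32_t pr38915InnerFolded(std::int32_t x, std::int32_t y, std::int32_t z) {
  const std::int32_t m1n = std::min(wrapDec(x), wrapDec(y));  // smin
  const std::int32_t m2 = std::max(m1n, z);                   // smax
  return ~m2;
}
```

The outer xor stays in both versions because %z is a plain argument and so is not freely invertible.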

This revision is now accepted and ready to land. Sep 13 2018, 10:40 AM

Closed by commit rL342163: [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y… (authored by ctopper). Sep 13 2018, 11:54 AM

This revision was automatically updated to reflect the committed changes.

Diffusion mentioned this in rL342162: [InstCombine] Add test cases for D51964. NFC.

spatel mentioned this in D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A. Sep 25 2018, 7:02 AM

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp	11 lines
llvm/trunk/lib/Transforms/InstCombine/InstCombineSelect.cpp	2 lines
llvm/trunk/test/Transforms/InstCombine/pr38915.ll	24 lines
llvm/trunk/test/Transforms/InstCombine/xor.ll	64 lines

Diff 165342

llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

@@ if (SelectPatternResult::isMinOrMax(SPF)) {

       // It's possible we get here before the not has been simplified, so make
       // sure the input to the not isn't freely invertible.
       if (match(RHS, m_Not(m_Value(Y))) && !IsFreeToInvert(Y, Y->hasOneUse())) {
         Value *NotX = Builder.CreateNot(LHS);
         return SelectInst::Create(
             Builder.CreateICmp(getInverseMinMaxPred(SPF), NotX, Y), NotX, Y);
       }
+
+      // If both sides are freely invertible, then we can get rid of the xor
+      // completely.
+      if (IsFreeToInvert(LHS, !LHS->hasNUsesOrMore(3)) &&
+          IsFreeToInvert(RHS, !RHS->hasNUsesOrMore(3))) {
+        Value *NotLHS = Builder.CreateNot(LHS);
+        Value *NotRHS = Builder.CreateNot(RHS);
+        return SelectInst::Create(
+            Builder.CreateICmp(getInverseMinMaxPred(SPF), NotLHS, NotRHS),
+            NotLHS, NotRHS);
+      }
     }
   }

   if (Instruction *NewXor = sinkNotIntoXor(I, Builder))
     return NewXor;

   return nullptr;
 }

llvm/trunk/lib/Transforms/InstCombine/InstCombineSelect.cpp

@@ if (SelectPatternResult::isMinOrMax(SPF)) {
       Value *NewCast = Builder.CreateCast(CastOp, NewSI, SelType);
       return replaceInstUsesWith(SI, NewCast);
     }

     // MAX(~a, ~b) -> ~MIN(a, b)
     // MIN(~a, ~b) -> ~MAX(a, b)
     Value *A, *B;
     if (match(LHS, m_Not(m_Value(A))) && match(RHS, m_Not(m_Value(B))) &&
+        !IsFreeToInvert(A, A->hasOneUse()) &&
+        !IsFreeToInvert(B, B->hasOneUse()) &&
         (!LHS->hasNUsesOrMore(3) || !RHS->hasNUsesOrMore(3))) {
       CmpInst::Predicate InvertedPred = getInverseMinMaxPred(SPF);
       Value *InvertedCmp = Builder.CreateICmp(InvertedPred, A, B);
       Value *NewSel = Builder.CreateSelect(InvertedCmp, A, B);
       return BinaryOperator::CreateNot(NewSel);
     }

     if (Instruction *I = factorizeMinMaxTree(SPF, LHS, RHS, Builder))

llvm/trunk/test/Transforms/InstCombine/pr38915.ll

+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt %s -instcombine -S | FileCheck %s
+
+define i32 @PR38915(i32 %x, i32 %y, i32 %z) {
+; CHECK-LABEL: @PR38915(
+; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[X:%.*]], -1
+; CHECK-NEXT:    [[TMP2:%.*]] = add i32 [[Y:%.*]], -1
+; CHECK-NEXT:    [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    [[M1N:%.*]] = select i1 [[TMP3]], i32 [[TMP1]], i32 [[TMP2]]
+; CHECK-NEXT:    [[C2:%.*]] = icmp sgt i32 [[M1N]], [[Z:%.*]]
+; CHECK-NEXT:    [[M2:%.*]] = select i1 [[C2]], i32 [[M1N]], i32 [[Z]]
+; CHECK-NEXT:    [[M2N:%.*]] = xor i32 [[M2]], -1
+; CHECK-NEXT:    ret i32 [[M2N]]
+;
+  %xn = sub i32 0, %x
+  %yn = sub i32 0, %y
+  %c1 = icmp sgt i32 %xn, %yn
+  %m1 = select i1 %c1, i32 %xn, i32 %yn
+  %m1n = xor i32 %m1, -1
+  %c2 = icmp sgt i32 %m1n, %z
+  %m2 = select i1 %c2, i32 %m1n, i32 %z
+  %m2n = xor i32 %m2, -1
+  ret i32 %m2n
+}

llvm/trunk/test/Transforms/InstCombine/xor.ll

   %umin = xor i32 %umax, -1
   %add = add i32 %umax, %z
   %res = mul i32 %umin, %add
   ret i32 %res
 }

 define i32 @test48(i32 %x) {
 ; CHECK-LABEL: @test48(
-; CHECK-NEXT:    [[A:%.*]] = sub i32 -2, [[X:%.*]]
-; CHECK-NEXT:    [[B:%.*]] = icmp sgt i32 [[A]], 0
-; CHECK-NEXT:    [[C:%.*]] = select i1 [[B]], i32 [[A]], i32 0
-; CHECK-NEXT:    [[D:%.*]] = xor i32 [[C]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[X:%.*]], 1
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp slt i32 [[TMP1]], -1
+; CHECK-NEXT:    [[D:%.*]] = select i1 [[TMP2]], i32 [[TMP1]], i32 -1
 ; CHECK-NEXT:    ret i32 [[D]]
 ;
   %a = sub i32 -2, %x
   %b = icmp sgt i32 %a, 0
   %c = select i1 %b, i32 %a, i32 0
   %d = xor i32 %c, -1
   ret i32 %d
 }

 define <2 x i32> @test48vec(<2 x i32> %x) {
 ; CHECK-LABEL: @test48vec(
-; CHECK-NEXT:    [[A:%.*]] = sub <2 x i32> <i32 -2, i32 -2>, [[X:%.*]]
-; CHECK-NEXT:    [[B:%.*]] = icmp sgt <2 x i32> [[A]], zeroinitializer
-; CHECK-NEXT:    [[C:%.*]] = select <2 x i1> [[B]], <2 x i32> [[A]], <2 x i32> zeroinitializer
-; CHECK-NEXT:    [[D:%.*]] = xor <2 x i32> [[C]], <i32 -1, i32 -1>
+; CHECK-NEXT:    [[TMP1:%.*]] = add <2 x i32> [[X:%.*]], <i32 1, i32 1>
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp slt <2 x i32> [[TMP1]], <i32 -1, i32 -1>
+; CHECK-NEXT:    [[D:%.*]] = select <2 x i1> [[TMP2]], <2 x i32> [[TMP1]], <2 x i32> <i32 -1, i32 -1>
 ; CHECK-NEXT:    ret <2 x i32> [[D]]
 ;
   %a = sub <2 x i32> <i32 -2, i32 -2>, %x
   %b = icmp sgt <2 x i32> %a, zeroinitializer
   %c = select <2 x i1> %b, <2 x i32> %a, <2 x i32> zeroinitializer
   %d = xor <2 x i32> %c, <i32 -1, i32 -1>
   ret <2 x i32> %d
 }

 define i32 @test49(i32 %x) {
 ; CHECK-LABEL: @test49(
-; CHECK-NEXT:    [[A:%.*]] = add i32 [[X:%.*]], -2
-; CHECK-NEXT:    [[B:%.*]] = icmp slt i32 [[A]], -1
-; CHECK-NEXT:    [[C:%.*]] = select i1 [[B]], i32 [[A]], i32 -1
-; CHECK-NEXT:    [[D:%.*]] = xor i32 [[C]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 1, [[X:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt i32 [[TMP1]], 0
+; CHECK-NEXT:    [[D:%.*]] = select i1 [[TMP2]], i32 [[TMP1]], i32 0
 ; CHECK-NEXT:    ret i32 [[D]]
 ;
   %a = add i32 %x, -2
   %b = icmp slt i32 %a, -1
   %c = select i1 %b, i32 %a, i32 -1
   %d = xor i32 %c, -1
   ret i32 %d
 }

 define <2 x i32> @test49vec(<2 x i32> %x) {
 ; CHECK-LABEL: @test49vec(
-; CHECK-NEXT:    [[A:%.*]] = add <2 x i32> [[X:%.*]], <i32 -2, i32 -2>
-; CHECK-NEXT:    [[B:%.*]] = icmp slt <2 x i32> [[A]], <i32 -1, i32 -1>
-; CHECK-NEXT:    [[C:%.*]] = select <2 x i1> [[B]], <2 x i32> [[A]], <2 x i32> <i32 -1, i32 -1>
-; CHECK-NEXT:    [[D:%.*]] = xor <2 x i32> [[C]], <i32 -1, i32 -1>
+; CHECK-NEXT:    [[TMP1:%.*]] = sub <2 x i32> <i32 1, i32 1>, [[X:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp sgt <2 x i32> [[TMP1]], zeroinitializer
+; CHECK-NEXT:    [[D:%.*]] = select <2 x i1> [[TMP2]], <2 x i32> [[TMP1]], <2 x i32> zeroinitializer
 ; CHECK-NEXT:    ret <2 x i32> [[D]]
 ;
   %a = add <2 x i32> %x, <i32 -2, i32 -2>
   %b = icmp slt <2 x i32> %a, <i32 -1, i32 -1>
   %c = select <2 x i1> %b, <2 x i32> %a, <2 x i32> <i32 -1, i32 -1>
   %d = xor <2 x i32> %c, <i32 -1, i32 -1>
   ret <2 x i32> %d
 }

 define i32 @test50(i32 %x, i32 %y) {
 ; CHECK-LABEL: @test50(
-; CHECK-NEXT:    [[A:%.*]] = add i32 [[X:%.*]], -2
-; CHECK-NEXT:    [[B:%.*]] = sub i32 -2, [[Y:%.*]]
-; CHECK-NEXT:    [[C:%.*]] = icmp slt i32 [[A]], [[B]]
-; CHECK-NEXT:    [[D:%.*]] = select i1 [[C]], i32 [[A]], i32 [[B]]
-; CHECK-NEXT:    [[E:%.*]] = xor i32 [[D]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 1, [[X:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = add i32 [[Y:%.*]], 1
+; CHECK-NEXT:    [[TMP3:%.*]] = icmp slt i32 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    [[E:%.*]] = select i1 [[TMP3]], i32 [[TMP1]], i32 [[TMP2]]
 ; CHECK-NEXT:    ret i32 [[E]]
 ;
   %a = add i32 %x, -2
   %b = sub i32 -2, %y
   %c = icmp slt i32 %a, %b
   %d = select i1 %c, i32 %a, i32 %b
   %e = xor i32 %d, -1
   ret i32 %e
 }

 define <2 x i32> @test50vec(<2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @test50vec(
-; CHECK-NEXT:    [[A:%.*]] = add <2 x i32> [[X:%.*]], <i32 -2, i32 -2>
-; CHECK-NEXT:    [[B:%.*]] = sub <2 x i32> <i32 -2, i32 -2>, [[Y:%.*]]
-; CHECK-NEXT:    [[C:%.*]] = icmp slt <2 x i32> [[A]], [[B]]
-; CHECK-NEXT:    [[D:%.*]] = select <2 x i1> [[C]], <2 x i32> [[A]], <2 x i32> [[B]]
-; CHECK-NEXT:    [[E:%.*]] = xor <2 x i32> [[D]], <i32 -1, i32 -1>
+; CHECK-NEXT:    [[TMP1:%.*]] = sub <2 x i32> <i32 1, i32 1>, [[X:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = add <2 x i32> [[Y:%.*]], <i32 1, i32 1>
+; CHECK-NEXT:    [[TMP3:%.*]] = icmp slt <2 x i32> [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    [[E:%.*]] = select <2 x i1> [[TMP3]], <2 x i32> [[TMP1]], <2 x i32> [[TMP2]]
 ; CHECK-NEXT:    ret <2 x i32> [[E]]
 ;
   %a = add <2 x i32> %x, <i32 -2, i32 -2>
   %b = sub <2 x i32> <i32 -2, i32 -2>, %y
   %c = icmp slt <2 x i32> %a, %b
   %d = select <2 x i1> %c, <2 x i32> %a, <2 x i32> %b
   %e = xor <2 x i32> %d, <i32 -1, i32 -1>
   ret <2 x i32> %e
 }

 define i32 @test51(i32 %x, i32 %y) {
 ; CHECK-LABEL: @test51(
-; CHECK-NEXT:    [[A:%.*]] = add i32 [[X:%.*]], 2
-; CHECK-NEXT:    [[B:%.*]] = sub i32 2, [[Y:%.*]]
-; CHECK-NEXT:    [[C:%.*]] = icmp sgt i32 [[A]], [[B]]
-; CHECK-NEXT:    [[D:%.*]] = select i1 [[C]], i32 [[A]], i32 [[B]]
-; CHECK-NEXT:    [[E:%.*]] = xor i32 [[D]], -1
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 -3, [[X:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = add i32 [[Y:%.*]], -3
+; CHECK-NEXT:    [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    [[E:%.*]] = select i1 [[TMP3]], i32 [[TMP1]], i32 [[TMP2]]
 ; CHECK-NEXT:    ret i32 [[E]]
 ;
   %a = add i32 %x, 2
   %b = sub i32 2, %y
   %c = icmp sgt i32 %a, %b
   %d = select i1 %c, i32 %a, i32 %b
   %e = xor i32 %d, -1
   ret i32 %e
 }

 define <2 x i32> @test51vec(<2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @test51vec(
-; CHECK-NEXT:    [[A:%.*]] = add <2 x i32> [[X:%.*]], <i32 2, i32 2>
-; CHECK-NEXT:    [[B:%.*]] = sub <2 x i32> <i32 2, i32 2>, [[Y:%.*]]
-; CHECK-NEXT:    [[C:%.*]] = icmp sgt <2 x i32> [[A]], [[B]]
-; CHECK-NEXT:    [[D:%.*]] = select <2 x i1> [[C]], <2 x i32> [[A]], <2 x i32> [[B]]
-; CHECK-NEXT:    [[E:%.*]] = xor <2 x i32> [[D]], <i32 -1, i32 -1>
+; CHECK-NEXT:    [[TMP1:%.*]] = sub <2 x i32> <i32 -3, i32 -3>, [[X:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = add <2 x i32> [[Y:%.*]], <i32 -3, i32 -3>
+; CHECK-NEXT:    [[TMP3:%.*]] = icmp sgt <2 x i32> [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    [[E:%.*]] = select <2 x i1> [[TMP3]], <2 x i32> [[TMP1]], <2 x i32> [[TMP2]]
 ; CHECK-NEXT:    ret <2 x i32> [[E]]
 ;
   %a = add <2 x i32> %x, <i32 2, i32 2>
   %b = sub <2 x i32> <i32 2, i32 2>, %y
   %c = icmp sgt <2 x i32> %a, %b
   %d = select <2 x i1> %c, <2 x i32> %a, <2 x i32> %b
   %e = xor <2 x i32> %d, <i32 -1, i32 -1>
   ret <2 x i32> %e
 }