This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Teach foldSelectICmpAnd to recognize (icmp slt (trunc X), 0) and (icmp sgt (trunc X), -1) as equivalent to an 'and' with the sign bit of the truncated type
ClosedPublic

Authored by craig.topper on Aug 8 2017, 6:02 PM.

Details

Summary

This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.

As I'm writing this I wonder if we're also missing (icmp slt X, 0) and (icmp sgt X, -1) without any truncates.

The vector tests added should already work without this change, since we don't turn compares with ands into truncates for vectors.
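
For illustration (the constants here are made up, not taken from the tests), the pattern this teaches foldSelectICmpAnd to handle is roughly:

%t = trunc i32 %x to i8
%c = icmp slt i8 %t, 0
%r = select i1 %c, i32 42, i32 40

The icmp is true exactly when bit 7 of %x is set, so this behaves the same as testing 'and i32 %x, 128' against zero and can be folded to the usual shift/mask arithmetic instead of a select.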

Diff Detail

Event Timeline

craig.topper created this revision.Aug 8 2017, 6:02 PM
spatel edited edge metadata.Aug 14 2017, 10:14 AM

I've been operating under the assumption that we want to transform in the opposite direction in IR. I.e., preserve and probably create more select-of-constants (see https://reviews.llvm.org/D24480, which I am planning to abandon). We should be able to reduce selects to logic/math in the DAG where it makes sense (e.g., https://reviews.llvm.org/rL310717). Some reasons to prefer select were noted in:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html

Not sure if it was listed there, but another reason we might actually want to keep a cmp/select in IR is in the case where we have profile info / branch stats. If we know in IR that a cmp always goes one direction, then the target might prefer to turn the select into control flow instead of turning it into logic ops. There's some infrastructure for this in CGP already. See also: https://reviews.llvm.org/D36081

This patch is really just making InstCombine self-consistent. We currently optimize this case differently depending on whether i8 is legal in the datalayout.

define i32 @test71(i32 %x) {
; CHECK-LABEL: @test71(
; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 [[X:%.*]], 6
; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 2
; CHECK-NEXT: [[TMP3:%.*]] = or i32 [[TMP2]], 40
; CHECK-NEXT: ret i32 [[TMP3]]
;

%1 = and i32 %x, 128
%2 = icmp eq i32 %1, 0
%3 = select i1 %2, i32 40, i32 42
ret i32 %3

}

If we want to remove foldSelectICmpAnd, that's a different question.

Ah, I didn't recognize what was going on. This is a sibling to D22537. Can you include a test that has a trunc in it from the start, so we are not dependent on the other combine? A code comment to show the complete transform would also make it a bit clearer for me.

FWIW, test71 is converted to math in the x86 backend for all 3 possibilities, but this doesn't happen for AArch64 or PPC, where it's also likely a win. And for x86, it's different asm in all 3 cases:

With mask+cmp+sel:

%1 = and i32 %x, 128
%2 = icmp eq i32 %1, 0
%3 = select i1 %2, i32 40, i32 42
ret i32 %3
-->
andl	$128, %edi
shrl	$6, %edi
leal	40(%rdi), %eax

With shift+mask+or:

%1 = lshr i32 %x, 6
%2 = and i32 %1, 2
%3 = or i32 %2, 40
-->
shrl	$6, %edi
andl	$2, %edi
leal	40(%rdi), %eax

With trunc+cmp+sel:

%1 = trunc i32 %x to i8
%2 = icmp sgt i8 %1, -1
%3 = select i1 %2, i32 40, i32 42
-->
xorl	%eax, %eax
testb	%dil, %dil
sets	%al
leal	40(%rax,%rax), %eax

Rewritten to use decomposeBitTestICmp.

The use of decomposeBitTestICmp removes the truncate check I had in the old patch; it's not really needed, since we just care about the mask. This exposed that I almost broke another transform in here that uses ashr to generate constants for selects. I reordered the combines a little and added a TODO to try to merge them. We may want to look at turning the lshr+and+or sequence into the ashr pattern independently too.
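
For context, the ashr idiom mentioned above broadcasts the sign bit with an arithmetic shift; in its simplest form (illustrative values, not one of the tests here):

%c = icmp slt i32 %x, 0
%r = select i1 %c, i32 -1, i32 0

becomes

%r = ashr i32 %x, 31

and other select constants can then be built on top of that -1/0 value.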

Looks like this also exposed that canEvaluateZExtd can't handle vector shifts correctly, which prevented the zext and trunc from being folded in the vector tests. I'll submit a follow-up patch for that.
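
Roughly the kind of vector pattern that gets stuck (an illustrative sketch, not one of the tests in this patch):

%t = trunc <2 x i32> %x to <2 x i8>
%s = lshr <2 x i8> %t, <i8 7, i8 7>
%z = zext <2 x i8> %s to <2 x i32>

For scalars the equivalent trunc/zext pair gets evaluated away and the shift is done in the wide type, but canEvaluateZExtd doesn't handle the vector shift here.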

spatel added inline comments.Aug 15 2017, 2:22 PM
lib/Transforms/InstCombine/InstCombineSelect.cpp
638

'IC->getPredicate()' can be shortened to 'Pred'?

641

IsEqualZero is not initialized on this path.

test/Transforms/InstCombine/select-with-bitwise-ops.ll
496–498

This is miscompiling:
http://rise4fun.com/Alive/ACO

Should be the same ops as the scalar test?

craig.topper added inline comments.Aug 15 2017, 2:31 PM
test/Transforms/InstCombine/select-with-bitwise-ops.ll
496–498

Yeah, it should, but it isn't. See the child revision of this, D36763.

craig.topper added inline comments.Aug 15 2017, 2:49 PM
test/Transforms/InstCombine/select-with-bitwise-ops.ll
496–498

Ah, oops. I thought you were asking about the other vector tests, not this one. I believe the scalar matching code uses ConstantInt. I can submit another patch to fix that too.

Use Pred more consistently throughout. Remove EqualAnd and just use Pred directly later.

The comment about multiple uses of the compare in D36711 made me wonder what we're doing here. Should we check for one use before trying to transform?

define i32 @test71_multi_use_cmp(i32 %x) {
  %t1 = and i32 %x, 128
  %t2 = icmp ne i32 %t1, 0
  %t3 = select i1 %t2, i32 40, i32 42
  %t4 = select i1 %t2, i32 60, i32 82
  %add = add i32 %t3, %t4
  ret i32 %add
}

$ ./opt -instcombine test71.ll -S

define i32 @test71_multi_use_cmp(i32 %x) {
  %t1 = and i32 %x, 128
  %t2 = icmp eq i32 %t1, 0
  %1 = lshr exact i32 %t1, 6
  %2 = xor i32 %1, 42
  %t4 = select i1 %t2, i32 82, i32 60
  %add = add nuw nsw i32 %2, %t4
  ret i32 %add
}

Yeah, we probably need to check for one use, but I'd like to do that separately from this.

I want to reimplement foldSelectICmpAndOr by calling foldSelectICmpAnd from foldSelectIntoOp when the immediate in foldSelectIntoOp is a power of 2. This will allow the transform in foldSelectICmpAndOr to support Xor as well, because there was no reason to restrict it to just Or, and of course it will remove some very similar duplicated code. To do that I'll need to revisit all the one-use enhancements that were already added to foldSelectICmpAndOr anyway.
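
As a rough sketch of the Xor case (constants chosen arbitrarily, not taken from the tests), the same rewrite foldSelectICmpAndOr already does for Or would apply:

%a = and i32 %x, 8
%c = icmp eq i32 %a, 0
%y2 = xor i32 %y, 2
%r = select i1 %c, i32 %y, i32 %y2

could become

%b = lshr i32 %a, 2
%r = xor i32 %b, %y

since xoring with 0 is a no-op when the tested bit is clear, and xoring with 2 matches the original select arm when it is set.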

Rebase after fixing the canEvaluateTruncated/ZExtd issues with vectors.

Fix bad comment placement

spatel added inline comments.Aug 16 2017, 1:26 PM
test/Transforms/InstCombine/select-with-bitwise-ops.ll
425–431

Sorry if I missed it, but do you know why canEvaluateZExtd() failed on this?
If you have a fix, I'd rather see that go in first because this doesn't look better in IR or x86 codegen.

Rebasing this on top of D36781, which fixes the ashr transform for vectors.

@spatel is this ok now? I'd like to work on replacing the foldSelectICmpAndOr implementation using this.

spatel added inline comments.Aug 18 2017, 3:32 PM
test/Transforms/InstCombine/select-with-bitwise-ops.ll
425–431

This question came in close proximity to the last update, so you might not have seen it.

I believe the issue is that this needs to handle And slightly differently. I think if MaskedValueIsZero returns true for the And, we should reset the BitsToClear to 0 before returning true.

// If the operation is an AND/OR/XOR and the bits to clear are zero in the
// other side, BitsToClear is ok.
if (Tmp == 0 && I->isBitwiseLogicOp()) {
  // We use MaskedValueIsZero here for generality, but the case we care
  // about the most is constant RHS.
  unsigned VSize = V->getType()->getScalarSizeInBits();
  if (IC.MaskedValueIsZero(I->getOperand(1),
                           APInt::getHighBitsSet(VSize, BitsToClear),
                           0, CxtI))
    return true;
}

Looks like that problem will be fixed in D36944. I think that will make this one good, but please update after that.

I'm still trying to get the backend prepared for IR to go in the other direction with patches like D36840.

Update after r311343

spatel accepted this revision.Aug 21 2017, 10:47 AM

LGTM.

This revision is now accepted and ready to land.Aug 21 2017, 10:47 AM
This revision was automatically updated to reflect the committed changes.