Details

Reviewers

lebedev.ri
efriedma
spatel
craig.topper

Commits

rG4626613ffe06: [InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x…
rL364255: [InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x…

Summary

To generate simplified IR, make sure fold

(X & ~C) ==/!= 0 --> X u</u>= C+1

is scheduled before fold

((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.

https://rise4fun.com/Alive/7ZN
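
The first fold in the summary can be sanity-checked by brute force on 8-bit values. The sketch below is not part of the patch: it is a standalone C++14 check, and the helper names (`isPowerOf2`, `foldHoldsForMask`, `foldHoldsForAllMasks`) are hypothetical. It uses the form from the revision title, (X & C) == 0 --> X u< -C iff -C is a power of two, which is the summary's form with the mask written as C rather than ~C (for a mask M = ~C, -M equals C+1). The ne/uge variant follows by negating both sides.

```cpp
#include <cstdint>

// True iff v has exactly one bit set.
constexpr bool isPowerOf2(std::uint8_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Exhaustively checks, over all i8 values of x, that
//   (x & c) == 0  <=>  x u< -c
// for one mask c. Returns false if -c is not a power of two,
// because the fold's precondition is then not met.
constexpr bool foldHoldsForMask(std::uint8_t c) {
  const std::uint8_t negC = static_cast<std::uint8_t>(0u - c);
  if (!isPowerOf2(negC))
    return false;
  for (unsigned x = 0; x < 256; ++x) {
    const bool original = (x & c) == 0; // icmp eq (and %x, C), 0
    const bool folded = x < negC;       // icmp ult %x, -C
    if (original != folded)
      return false;
  }
  return true;
}

// Checks every i8 mask that satisfies the precondition.
constexpr bool foldHoldsForAllMasks() {
  for (unsigned c = 1; c < 256; ++c) {
    const std::uint8_t mask = static_cast<std::uint8_t>(c);
    if (isPowerOf2(static_cast<std::uint8_t>(0u - c)) &&
        !foldHoldsForMask(mask))
      return false;
  }
  return true;
}
```

Because the helpers are `constexpr`, the check can be evaluated at compile time with `static_assert(foldHoldsForAllMasks(), "")`. This only covers i8; the Alive link above is the actual proof for arbitrary widths.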

Diff Detail

Repository: rL LLVM

Event Timeline

huihuiz created this revision. Jun 18 2019, 11:19 AM

Herald added a project: Restricted Project. Jun 18 2019, 11:19 AM

Herald added a subscriber: hiraditya.

This is the second fold we talked about, split from D63026.

lebedev.ri edited the summary of this revision. Jun 18 2019, 11:34 AM

Looks good for IR, but it looks like this needs an undo fold in the backend?
https://godbolt.org/z/giY9Cp
At least for AArch64 this seems to result in worse ASM.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1643–1645 ↗	(On Diff #205403)	Why are we restricting this fold to single-use `and`?
1655 ↗	(On Diff #205403)	`((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two.`
1656–1657 ↗	(On Diff #205403)	Why do we care about that? Seems rather arbitrary.

huihuiz updated this revision to Diff 205409. Jun 18 2019, 11:54 AM

huihuiz marked 2 inline comments as done.

huihuiz marked an inline comment as done.

huihuiz added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1643–1645 ↗	(On Diff #205403)	Same reason as in D63026 (resolving PR10267).
1656–1657 ↗	(On Diff #205403)	Take this test as an example:

  define i1 @test_shift_and_cmp_changed2(i8 %p) {
    %shlp = shl i8 %p, 5
    %andp = and i8 %shlp, -64
    %cmp = icmp ult i8 %andp, 32
    ret i1 %cmp
  }

We do miss the fold for (X >> C3) & C2 != C1 --> (X & (C2 << C3)) != (C1 << C3).

lebedev.ri added inline comments. Jun 18 2019, 12:03 PM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1647 ↗	(On Diff #205409)	Despite how pointless it will look, please add the comment to each of the folds. Nothing prevents other folds from being added, and this comment from getting out of date.
1656–1657 ↗	(On Diff #205403)	Is that IR without this restriction? We currently seem to handle it just fine https://godbolt.org/z/8DB1xD

huihuiz marked an inline comment as done. Jun 18 2019, 12:03 PM

huihuiz added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	For this test, it would be better to transform into this:

  define i1 @test_shift_and_cmp_changed2(i8 %p) {
    %andp = and i8 %p, 6
    %cmp = icmp eq i8 %andp, 0
    ret i1 %cmp
  }

However, by scheduling ((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two earlier, we end up with:

  define i1 @test_shift_and_cmp_changed2(i8 %p) {
    %shlp = shl i8 %p, 5
    %cmp = icmp ult i8 %shlp, 64
    ret i1 %cmp
  }

huihuiz updated this revision to Diff 205413. Jun 18 2019, 12:12 PM

huihuiz marked 2 inline comments as done.

huihuiz added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	Yes, that is the IR without adding this restriction. We currently handle it fine. But when we try to schedule (X & ~C) ==/!= 0 --> X u</u>= C+1 earlier, we end up with:

  define i1 @test_shift_and_cmp_changed2(i8 %p) {
    %shlp = shl i8 %p, 5
    %cmp = icmp ult i8 %shlp, 64
    ret i1 %cmp
  }
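
All three shapes of `test_shift_and_cmp_changed2` discussed in this thread compute the same i1 result; the disagreement is only about which one InstCombine should produce. A minimal sketch that checks this exhaustively for i8 follows. It is a standalone C++14 illustration, not LLVM code, and the function names (`originalForm`, `andEqZeroForm`, `shlUltForm`, `allFormsAgree`) are hypothetical.

```cpp
#include <cstdint>

// Input IR: icmp ult (and (shl %p, 5), -64), 32
constexpr bool originalForm(std::uint8_t p) {
  const std::uint8_t shl = static_cast<std::uint8_t>(p << 5);
  const std::uint8_t masked = shl & 0xC0; // -64 as an i8 bit pattern
  return masked < 32;
}

// Output before the reorder: icmp eq (and %p, 6), 0
constexpr bool andEqZeroForm(std::uint8_t p) { return (p & 6) == 0; }

// Output after the reorder: icmp ult (shl %p, 5), 64
constexpr bool shlUltForm(std::uint8_t p) {
  return static_cast<std::uint8_t>(p << 5) < 64;
}

// True iff the three forms agree for every i8 input.
constexpr bool allFormsAgree() {
  for (unsigned p = 0; p < 256; ++p) {
    const std::uint8_t v = static_cast<std::uint8_t>(p);
    if (originalForm(v) != andEqZeroForm(v) ||
        originalForm(v) != shlUltForm(v))
      return false;
  }
  return true;
}
```

All three forms test whether bits 1 and 2 of `%p` are clear. The `and`/`icmp eq` form needs no shift, which is why the `shl`/`icmp ult` output is treated as a regression here.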

lebedev.ri added inline comments. Jun 18 2019, 12:55 PM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	Well, nice find. Like I guessed, this rescheduling is exposing all kinds of missing folds :)

lebedev.ri added inline comments. Jun 18 2019, 2:11 PM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	Can you give me a spoiler, if you drop this restriction, are there any other regressions in `check-llvm`?

huihuiz marked an inline comment as done. Jun 18 2019, 2:31 PM

huihuiz added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

1656–1657 ↗	(On Diff #205403)	Here are the failing tests when dropping this restriction:

test file: test/Transforms/InstCombine/pr17827.ll

test_shift_and_cmp_changed2 and its vector version

test file: test/Transforms/InstCombine/lshr-and-negC-icmpeq-zero.ll

scalar_i32_lshr_and_negC_eq_X_is_constant1
and
scalar_i32_lshr_and_negC_eq_X_is_constant2

 define i1 @scalar_i32_lshr_and_negC_eq_X_is_constant1(i32 %y) {
 ; CHECK-LABEL: @scalar_i32_lshr_and_negC_eq_X_is_constant1(
 ; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 12345, [[Y:%.*]]
-; CHECK-NEXT:    [[AND:%.*]] = and i32 [[LSHR]], -8
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i32 [[LSHR]], 8
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i32 12345, %y
   %and = and i32 %lshr, 4294967288  ; ~7
   %r = icmp eq i32 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i32_lshr_and_negC_eq_X_is_constant2(i32 %y) {
 ; CHECK-LABEL: @scalar_i32_lshr_and_negC_eq_X_is_constant2(
 ; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 268435456, [[Y:%.*]]
-; CHECK-NEXT:    [[AND:%.*]] = and i32 [[LSHR]], -8
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i32 [[LSHR]], 8
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i32 268435456, %y
   %and = and i32 %lshr, 4294967288  ; ~7
   %r = icmp eq i32 %and, 0
   ret i1 %r
 }

similarly for test/Transforms/InstCombine/shl-and-negC-icmpeq-zero.ll

lebedev.ri added inline comments. Jun 18 2019, 2:40 PM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	I'm confused, 2. and 3. are improvements, right? So only `test_shift_and_cmp_changed2` regresses? A missing fold should simply be added then, i'd say. I'm not sure what it is yet though.

huihuiz marked an inline comment as done. Jun 18 2019, 2:51 PM

huihuiz added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	Test files 2 and 3 are improvements. But the improved test cases were created mostly to check for correctness; I am not sure the improved test cases are actually popular in real applications. Test file 1 is a regression: the failing test is simplified from the bit-field test bug in pr17827, and is probably more prevalent in application code. Adding the restriction tends to benefit more cases.

lebedev.ri added inline comments. Jun 18 2019, 3:08 PM

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656–1657 ↗	(On Diff #205403)	(TLDR: It's all such a mess, isn't it?)

> I am not sure the improved test cases are actually popular in real applications.

Given just how many of these transforms there are, it's not particularly relevant what can and what can't be encountered in "actually popular real applications"; what matters is what IR patterns we can encounter in "actually popular real applications" after the middle-end opt pipeline.

> Test file 1 is a regression: the failing test is simplified from the bit-field test bug in pr17827, and is probably more prevalent in application code.

So naturally, if that new pattern is now being encountered in a testcase reduced from actual code, then by definition it can be encountered in "actually popular real applications", no?

> Adding the restriction tends to benefit more cases.

But do we know that? Sadly, all this pattern matching is done blindly, with no attempt at cost modelling or at ultimately producing the least number of instructions, etc.; it just tries to combine instructions when they can be combined.

Yes, this reorder exposes yet another missing fold, seen as the regression in test/Transforms/InstCombine/pr17827.ll.

Simplifying the 'shl' inequality test into an 'and' equality test should fix this issue. I am posting this as another differential; link shortly.

icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0

huihuiz mentioned this in D63675: [InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0. Jun 21 2019, 2:57 PM

Looks good. I'll accept this at the same time as D63675.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
1656 ↗	(On Diff #206042)	s/to/only for/

lebedev.ri added inline comments. Jun 22 2019, 2:15 AM

llvm/test/Transforms/InstCombine/pr17827.ll
7 ↗	(On Diff #206042)	Please can you regenerate this test (and just directly commit, no review) to get rid of this noise.

huihuiz mentioned this in rL364224: [InstCombine] Regenerate test pr17827. NFCI. Jun 24 2019, 12:52 PM

huihuiz mentioned this in rG94b43160963d: [InstCombine] Regenerate test pr17827. NFCI.

addressed review comments

Looks good, thank you.

lebedev.ri accepted this revision. Jun 24 2019, 3:14 PM

This revision is now accepted and ready to land. Jun 24 2019, 3:14 PM

Closed by commit rL364255: [InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x… (authored by huihuiz). Jun 24 2019, 5:10 PM

This revision was automatically updated to reflect the committed changes.

Diff 206336

llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp

[1,646 lines not shown; context: Instruction *InstCombiner::foldICmpAndConstConst(ICmpInst &Cmp,]
   if (Cmp.isEquality() && C1.isNullValue()) {
     // Restrict this fold to single-use 'and' (PR10267).
     // Replace (and X, (1 << size(X)-1) != 0) with X s< 0
     if (C2->isSignMask()) {
       Constant *Zero = Constant::getNullValue(X->getType());
       auto NewPred = isICMP_NE ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_SGE;
       return new ICmpInst(NewPred, X, Zero);
     }
+
+    // Restrict this fold only for single-use 'and' (PR10267).
+    // ((%x & C) == 0) --> %x u< (-C) iff (-C) is power of two.
+    if ((~(*C2) + 1).isPowerOf2()) {
+      Constant *NegBOC =
+          ConstantExpr::getNeg(cast<Constant>(And->getOperand(1)));
+      auto NewPred = isICMP_NE ? ICmpInst::ICMP_UGE : ICmpInst::ICMP_ULT;
+      return new ICmpInst(NewPred, X, NegBOC);
+    }
   }

   // If the LHS is an 'and' of a truncate and we can widen the and/compare to
   // the input width without changing the value produced, eliminate the cast:
   //
   // icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1'
   //
   // We can do this transformation if the constants do not have their sign bits
[1,129 lines not shown; context: Instruction *InstCombiner::foldICmpBinOpEqualityWithConstant(ICmpInst &Cmp,]
   }
   case Instruction::And: {
     const APInt *BOC;
     if (match(BOp1, m_APInt(BOC))) {
       // If we have ((X & C) == C), turn it into ((X & C) != 0).
       if (C == *BOC && C.isPowerOf2())
         return new ICmpInst(isICMP_NE ? ICmpInst::ICMP_EQ : ICmpInst::ICMP_NE,
                             BO, Constant::getNullValue(RHS->getType()));
-
-      // Don't perform the following transforms if the AND has multiple uses
-      if (!BO->hasOneUse())
-        break;
-
-      // ((X & ~7) == 0) --> X < 8
-      if (C.isNullValue() && (~(*BOC) + 1).isPowerOf2()) {
-        Constant *NegBOC = ConstantExpr::getNeg(cast<Constant>(BOp1));
-        auto NewPred = isICMP_NE ? ICmpInst::ICMP_UGE : ICmpInst::ICMP_ULT;
-        return new ICmpInst(NewPred, BOp0, NegBOC);
-      }
     }
     break;
   }
   case Instruction::Mul:
     if (C.isNullValue() && BO->hasNoSignedWrap()) {
       const APInt *BOC;
       if (match(BOp1, m_APInt(BOC)) && !BOC->isNullValue()) {
         // The trivial case (mul X, 0) is handled by InstSimplify.
[2,825 lines not shown]

llvm/trunk/test/Transforms/InstCombine/lshr-and-negC-icmpeq-zero.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt %s -instcombine -S | FileCheck %s

 ; For pattern ((X l>> Y) & ~C) ==/!= 0; when C+1 is power of 2
 ; it may be optimal to fold into (X l>> Y) </>= C+1
 ; rather than X & (~C << Y) ==/!= 0

 ; Scalar tests

 define i1 @scalar_i8_lshr_and_negC_eq(i8 %x, i8 %y) {
 ; CHECK-LABEL: @scalar_i8_lshr_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = shl i8 -4, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i8 [[TMP2]], 0
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i8 [[LSHR]], 4
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i8 %x, %y
   %and = and i8 %lshr, 252  ; ~3
   %r = icmp eq i8 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i16_lshr_and_negC_eq(i16 %x, i16 %y) {
 ; CHECK-LABEL: @scalar_i16_lshr_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = shl i16 -128, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i16 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i16 [[TMP2]], 0
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i16 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i16 [[LSHR]], 128
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i16 %x, %y
   %and = and i16 %lshr, 65408  ; ~127
   %r = icmp eq i16 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i32_lshr_and_negC_eq(i32 %x, i32 %y) {
 ; CHECK-LABEL: @scalar_i32_lshr_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = shl i32 -262144, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i32 [[TMP2]], 0
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i32 [[LSHR]], 262144
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i32 %x, %y
   %and = and i32 %lshr, 4294705152  ; ~262143
   %r = icmp eq i32 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i64_lshr_and_negC_eq(i64 %x, i64 %y) {
 ; CHECK-LABEL: @scalar_i64_lshr_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = shl i64 -8589934592, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i64 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i64 [[TMP2]], 0
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i64 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i64 [[LSHR]], 8589934592
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i64 %x, %y
   %and = and i64 %lshr, 18446744065119617024  ; ~8589934591
   %r = icmp eq i64 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i32_lshr_and_negC_ne(i32 %x, i32 %y) {
 ; CHECK-LABEL: @scalar_i32_lshr_and_negC_ne(
-; CHECK-NEXT:    [[TMP1:%.*]] = shl i32 -262144, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp ne i32 [[TMP2]], 0
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ugt i32 [[LSHR]], 262143
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %lshr = lshr i32 %x, %y
   %and = and i32 %lshr, 4294705152  ; ~262143
   %r = icmp ne i32 %and, 0  ; check 'ne' predicate
   ret i1 %r
 }

 ; Vector tests

 define <4 x i1> @vec_4xi32_lshr_and_negC_eq(<4 x i32> %x, <4 x i32> %y) {
 ; CHECK-LABEL: @vec_4xi32_lshr_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = shl <4 x i32> <i32 -8, i32 -8, i32 -8, i32 -8>, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and <4 x i32> [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq <4 x i32> [[TMP2]], zeroinitializer
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr <4 x i32> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult <4 x i32> [[LSHR]], <i32 8, i32 8, i32 8, i32 8>
 ; CHECK-NEXT:    ret <4 x i1> [[R]]
 ;
   %lshr = lshr <4 x i32> %x, %y
   %and = and <4 x i32> %lshr, <i32 4294967288, i32 4294967288, i32 4294967288, i32 4294967288>  ; ~7
   %r = icmp eq <4 x i32> %and, <i32 0, i32 0, i32 0, i32 0>
   ret <4 x i1> %r
 }

[149 lines not shown]

llvm/trunk/test/Transforms/InstCombine/pr17827.ll

[60 lines not shown]
 ;
   %ashr = ashr <2 x i8> %shl, <i8 5, i8 5>
   %cmp = icmp slt <2 x i8> %ashr, <i8 1, i8 1>
   ret <2 x i1> %cmp
 }

 ; Unsigned compare allows a transformation to compare against 0.
 define i1 @test_shift_and_cmp_changed2(i8 %p) {
 ; CHECK-LABEL: @test_shift_and_cmp_changed2(
-; CHECK-NEXT:    [[ANDP:%.*]] = and i8 [[P:%.*]], 6
-; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[ANDP]], 0
+; CHECK-NEXT:    [[SHLP:%.*]] = shl i8 [[P:%.*]], 5
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[SHLP]], 64
 ; CHECK-NEXT:    ret i1 [[CMP]]
 ;
   %shlp = shl i8 %p, 5
   %andp = and i8 %shlp, -64
   %cmp = icmp ult i8 %andp, 32
   ret i1 %cmp
 }

 define <2 x i1> @test_shift_and_cmp_changed2_vec(<2 x i8> %p) {
 ; CHECK-LABEL: @test_shift_and_cmp_changed2_vec(
-; CHECK-NEXT:    [[ANDP:%.*]] = and <2 x i8> [[P:%.*]], <i8 6, i8 6>
-; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[ANDP]], zeroinitializer
+; CHECK-NEXT:    [[SHLP:%.*]] = shl <2 x i8> [[P:%.*]], <i8 5, i8 5>
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult <2 x i8> [[SHLP]], <i8 64, i8 64>
 ; CHECK-NEXT:    ret <2 x i1> [[CMP]]
 ;
   %shlp = shl <2 x i8> %p, <i8 5, i8 5>
   %andp = and <2 x i8> %shlp, <i8 -64, i8 -64>
   %cmp = icmp ult <2 x i8> %andp, <i8 32, i8 32>
   ret <2 x i1> %cmp
 }

[25 lines not shown]

llvm/trunk/test/Transforms/InstCombine/shl-and-negC-icmpeq-zero.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt %s -instcombine -S | FileCheck %s

 ; For pattern ((X << Y) & ~C) ==/!= 0; when C+1 is power of 2
 ; it may be optimal to fold into (X << Y) </>= C+1
 ; rather than X & (~C l>> Y) ==/!= 0

 ; Scalar tests

 define i1 @scalar_i8_shl_and_negC_eq(i8 %x, i8 %y) {
 ; CHECK-LABEL: @scalar_i8_shl_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i8 -4, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i8 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i8 [[TMP2]], 0
+; CHECK-NEXT:    [[SHL:%.*]] = shl i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i8 [[SHL]], 4
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %shl = shl i8 %x, %y
   %and = and i8 %shl, 252  ; ~3
   %r = icmp eq i8 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i16_shl_and_negC_eq(i16 %x, i16 %y) {
 ; CHECK-LABEL: @scalar_i16_shl_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i16 -128, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i16 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i16 [[TMP2]], 0
+; CHECK-NEXT:    [[SHL:%.*]] = shl i16 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i16 [[SHL]], 128
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %shl = shl i16 %x, %y
   %and = and i16 %shl, 65408  ; ~127
   %r = icmp eq i16 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i32_shl_and_negC_eq(i32 %x, i32 %y) {
 ; CHECK-LABEL: @scalar_i32_shl_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i32 -262144, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i32 [[TMP2]], 0
+; CHECK-NEXT:    [[SHL:%.*]] = shl i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i32 [[SHL]], 262144
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %shl = shl i32 %x, %y
   %and = and i32 %shl, 4294705152  ; ~262143
   %r = icmp eq i32 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i64_shl_and_negC_eq(i64 %x, i64 %y) {
 ; CHECK-LABEL: @scalar_i64_shl_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i64 -8589934592, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i64 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq i64 [[TMP2]], 0
+; CHECK-NEXT:    [[SHL:%.*]] = shl i64 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i64 [[SHL]], 8589934592
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %shl = shl i64 %x, %y
   %and = and i64 %shl, 18446744065119617024  ; ~8589934591
   %r = icmp eq i64 %and, 0
   ret i1 %r
 }

 define i1 @scalar_i32_shl_and_negC_ne(i32 %x, i32 %y) {
 ; CHECK-LABEL: @scalar_i32_shl_and_negC_ne(
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr i32 -262144, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and i32 [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp ne i32 [[TMP2]], 0
+; CHECK-NEXT:    [[SHL:%.*]] = shl i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ugt i32 [[SHL]], 262143
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %shl = shl i32 %x, %y
   %and = and i32 %shl, 4294705152  ; ~262143
   %r = icmp ne i32 %and, 0  ; check 'ne' predicate
   ret i1 %r
 }

 ; Vector tests

 define <4 x i1> @vec_4xi32_shl_and_negC_eq(<4 x i32> %x, <4 x i32> %y) {
 ; CHECK-LABEL: @vec_4xi32_shl_and_negC_eq(
-; CHECK-NEXT:    [[TMP1:%.*]] = lshr <4 x i32> <i32 -8, i32 -8, i32 -8, i32 -8>, [[Y:%.*]]
-; CHECK-NEXT:    [[TMP2:%.*]] = and <4 x i32> [[TMP1]], [[X:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = icmp eq <4 x i32> [[TMP2]], zeroinitializer
+; CHECK-NEXT:    [[SHL:%.*]] = shl <4 x i32> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = icmp ult <4 x i32> [[SHL]], <i32 8, i32 8, i32 8, i32 8>
 ; CHECK-NEXT:    ret <4 x i1> [[R]]
 ;
   %shl = shl <4 x i32> %x, %y
   %and = and <4 x i32> %shl, <i32 4294967288, i32 4294967288, i32 4294967288, i32 4294967288>  ; ~7
   %r = icmp eq <4 x i32> %and, <i32 0, i32 0, i32 0, i32 0>
   ret <4 x i1> %r
 }

[148 lines not shown]

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier.
Closed, Public
