This is an archive of the discontinued LLVM Phabricator instance.

Differential D49179

[InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y)
ClosedPublic

Authored by lebedev.ri on Jul 11 2018, 6:03 AM.

Details

Reviewers

spatel
craig.topper

Commits

rG68d54cf5b39c: [InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y)
rL336834: [InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y)

Summary

https://bugs.llvm.org/show_bug.cgi?id=38123

This pattern will be produced by the Implicit Integer Truncation sanitizer
(https://reviews.llvm.org/D48958, https://bugs.llvm.org/show_bug.cgi?id=21530)
in the unsigned case, so it is probably a good idea to improve it.

https://rise4fun.com/Alive/Rny
^ There are more opportunities for folds; I will follow up with them afterwards.

Caveat: this somehow exposes a missed optimization
in test/Transforms/InstCombine/icmp-logical.ll.
The problem seems to be in foldLogOpOfMaskedICmps() in InstCombineAndOrXor.cpp,
but I'm not quite sure what is wrong, because it calls getMaskedTypeForICmpPair(),
which calls decomposeBitTestICmp(), which should already work for these cases...
I would love to have some pointers on how to address it.
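
The fold in the title is small enough to check by brute force on 8-bit values. The sketch below is not part of the patch: the helper names (lowBitMask8, maskedEq, uleMask, countMismatches) are made up for illustration, and shift amounts of 8 or more are excluded on the assumption that such a shift is poison in IR.

```cpp
#include <cassert>
#include <cstdint>

// Builds the i8 mask (-1 >> y). Shift amounts of 8 or more are not modeled
// (assumption: such a shift yields poison in LLVM IR).
uint8_t lowBitMask8(unsigned y) {
  assert(y < 8 && "shift amount must be smaller than the bit width");
  return static_cast<uint8_t>(0xFFu >> y);
}

// Source pattern: (x & (-1 >> y)) == x
bool maskedEq(uint8_t x, unsigned y) {
  return static_cast<uint8_t>(x & lowBitMask8(y)) == x;
}

// Folded pattern: x u<= (-1 >> y)
bool uleMask(uint8_t x, unsigned y) {
  return x <= lowBitMask8(y);
}

// Counts the (x, y) pairs on which the two patterns disagree.
int countMismatches() {
  int mismatches = 0;
  for (unsigned y = 0; y < 8; ++y)
    for (unsigned x = 0; x < 256; ++x) {
      uint8_t v = static_cast<uint8_t>(x);
      if (maskedEq(v, y) != uleMask(v, y))
        ++mismatches;
    }
  return mismatches;
}
```

The equivalence holds because the mask is a run of low ones: the `and` leaves x unchanged exactly when x has no bits set above the mask, which is the same as x being no larger than the mask.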

Diff Detail

Repository: rL LLVM

Event Timeline

lebedev.ri created this revision.Jul 11 2018, 6:03 AM

lebedev.ri added inline comments.Jul 11 2018, 6:08 AM

test/Transforms/InstCombine/canonicalize-low-bit-mask-and-icmp-eq-to-icmp-ule.ll
17–19 ↗	(On Diff #154978)	To save time: yes, this is correct, https://rise4fun.com/Alive/J71

lebedev.ri added inline comments.Jul 11 2018, 6:30 AM

test/Transforms/InstCombine/icmp-mul-zext.ll
14–15 ↗	(On Diff #154978)	This too, https://rise4fun.com/Alive/twl2; note that the `br` is inverted, too.

I'm not too worried about the missing fold in foldLogOpOfMaskedICmps(). That's a complex mess, so not surprising that it has logic holes. I've wondered if we could rewrite that using ranges to simplify that code.

lib/Transforms/InstCombine/InstCombineCompares.cpp
4447–4449 ↗	(On Diff #154978)	We're also missing signed folds like:
Pre: isPowerOf2(C1+1)
  %masked = and i32 %arg, C1
  %truncheck = icmp sge i32 %masked, %arg
=>
  %truncheck = icmp sle i32 %arg, C1
https://rise4fun.com/Alive/K8H
...so I'd give this function a more general name: foldICmpWithMaskedVal?
4452–4458 ↗	(On Diff #154978)	Untested, but this could be less indent-crazy if we give some of it a local name like:
auto m_Mask = m_CombineOr(m_LShr(m_AllOnes(), m_Value()), m_LowBitMask());
if (!match(&I, m_c_ICmp(SrcPred, m_c_And(m_CombineAnd(m_Mask, m_Value(M)), m_Value(X)), m_Deferred(X))))
4462 ↗	(On Diff #154978)	I think you're planning to extend this soon, but it's best to leave a TODO comment with an explanation just in case that gets delayed.
4659–4660 ↗	(On Diff #154978)	There's no logic to fold ordering within visitICmpInst()...the new function could be called from within foldICmpBinOp() rather than the top-level?
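
The signed fold mentioned in the first inline comment above can be checked the same way. This is only a sketch on 8-bit values (the comment uses i32), with hypothetical helper names; it is not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Interprets an 8-bit pattern as a signed value without relying on
// implementation-defined narrowing conversions.
int asSigned8(uint8_t bits) { return bits < 128 ? bits : bits - 256; }

// Precondition from the comment: C1 + 1 is a power of two (8-bit arithmetic).
bool plusOneIsPowerOf2(uint8_t c1) {
  uint8_t n = static_cast<uint8_t>(c1 + 1);
  return n != 0 && (n & (n - 1)) == 0;
}

// Source pattern: (x & C1) s>= x
bool maskedSge(uint8_t x, uint8_t c1) {
  assert(plusOneIsPowerOf2(c1) && "fold is only claimed under the precondition");
  return asSigned8(static_cast<uint8_t>(x & c1)) >= asSigned8(x);
}

// Folded pattern: x s<= C1
bool sleConst(uint8_t x, uint8_t c1) { return asSigned8(x) <= asSigned8(c1); }

// Counts disagreements over every valid C1 and every x.
int countSignedMismatches() {
  int mismatches = 0;
  for (unsigned c = 0; c < 256; ++c) {
    uint8_t c1 = static_cast<uint8_t>(c);
    if (!plusOneIsPowerOf2(c1))
      continue;
    for (unsigned v = 0; v < 256; ++v) {
      uint8_t x = static_cast<uint8_t>(v);
      if (maskedSge(x, c1) != sleConst(x, c1))
        ++mismatches;
    }
  }
  return mismatches;
}
```

Under the precondition the masked value is never negative, so a negative x satisfies both forms, and a non-negative x satisfies both exactly when the mask leaves it unchanged.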

Address @spatel's review notes.

In D49179#1158760, @spatel wrote:

I'm not too worried about the missing fold in foldLogOpOfMaskedICmps(). That's a complex mess, so not surprising that it has logic holes. I've wondered if we could rewrite that using ranges to simplify that code.

That's a relief to know!
I have looked a bit more, and I'm still not sure what goes wrong,
so if one regression can slip through, that'd be great.

lib/Transforms/InstCombine/InstCombineCompares.cpp
4447–4449 ↗	(On Diff #154978)	Yep, i didn't like that long name anyway.
4452–4458 ↗	(On Diff #154978)	Ha, nice one.
4659–4660 ↗	(On Diff #154978)	Hm, good idea.

LGTM.

lib/Transforms/InstCombine/InstCombineCompares.cpp
2873 ↗	(On Diff #155015)	Nit: make the formula up here more general and then put the specific formulas within the 'switch'.

This revision is now accepted and ready to land.Jul 11 2018, 9:29 AM

In D49179#1158930, @spatel wrote:

LGTM.

Thank you for such speedy review!

I'm wondering, what about the signed case?
https://godbolt.org/g/WvWX13
https://godbolt.org/g/DM7XA4
https://rise4fun.com/Alive/Qslx (check that 25 high bits are either all-ones, or all-zeros)

Can we do this in instcombine?
Or not, given that we were disabling some bit-fiddling transforms lately?

Actually, not sure that is any better, will leave that for now.
Though the reverse transform might be a good thing for dagcombine.

It's a good question, but probably better asked on the dev list than here. I think we prefer to canonicalize to the form with fewer instructions even if that means we lose information from the eliminated ops.

Closed by commit rL336834: [InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y) (authored by lebedevri).Jul 11 2018, 12:10 PM

This revision was automatically updated to reflect the committed changes.

lebedev.ri added a child revision: D48958: [clang][ubsan] Implicit Cast Sanitizer - integer truncation - clang part.Jul 11 2018, 12:14 PM

lebedev.ri mentioned this in D49205: [InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y).Jul 11 2018, 2:38 PM

lebedev.ri added a child revision: D49205: [InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y).

hjyamauchi added inline comments.Jul 11 2018, 3:35 PM

llvm/trunk/test/Transforms/InstCombine/icmp-logical.ll
91	It seems like the simplification of this change (D49179) triggers before this original simplification triggers and the original simplification no longer triggers? My guess is that this test just means to test a plain or-case and it may make sense to use some other values like 14 and 78 (shifted left by 1 bit, instead of 7 and 39) and preserve the original intention of the test. Note the next test @masked_or_A_slightly_optimized has the same code as the after-simplification code of this function. Not sure if it is just a coincidence.

lebedev.ri added inline comments.Jul 11 2018, 3:40 PM

llvm/trunk/test/Transforms/InstCombine/icmp-logical.ll
91	"and the original simplification no longer triggers?"
In any case, the original simplification clearly does not handle this pattern.
"My guess is that this test just means to test a plain or-case and it may make sense to use some other values like 14 and 78 (shifted left by 1 bit, instead of 7 and 39) and preserve the original intention of the test."
Hmm, thanks, might be a good idea.
"Note the next test @masked_or_A_slightly_optimized has the same code as the after-simplification code of this function. Not sure if it is just a coincidence."
I have added `@masked_or_A_slightly_optimized` in rL336784 after noticing this regression, to have a test regardless.

For future reference, here is a more straightforward fold, https://rise4fun.com/Alive/XuW, that will afterwards fold back into and+icmp: https://godbolt.org/g/bm3yZu
But any such fold will clearly need dagcombine work, since the 'naive' version with shifts already seems to produce optimal assembly.
So, after all, I'm not sure I'm motivated to look into the 'signed truncation pattern' right now.

x86-signed-truncation-check.ll (9 KB)

aarch64-signed-truncation-check.ll (4 KB)

Name: signed truncation check
  %old0 = shl i16 %x, 8
  %old1 = ashr exact i16 %old0, 8
  %ret = icmp eq i16 %old1, %x
=>
  %new0 = icmp slt i16 %x, 128
  %new1 = icmp sgt i16 %x, -129
  %ret = and i1 %new1, %new0

Name: and-of-icmps
  %new0 = icmp slt i16 %x, 128
  %new1 = icmp sgt i16 %x, -129
  %ret = and i1 %new1, %new0
=>
  %x.off = add i16 %x, 128
  %ret = icmp ult i16 %x.off, 256

Let me make sure I'm seeing it - as a question of IR canonicalization, we're deciding which of these 3 forms is best? I don't see any reason to favor the 1st two options over the 3rd (shorter) one. We already convert the 2nd to the 3rd (although as noted here, that transform is not happening as expected in all cases).

If the backend can't produce optimal code from the 3rd form, that's a concern, but it doesn't necessarily have to hold up the IR improvement. We might live with that (hopefully minor and temporary) problem because the IR improvement can lead to optimizations in other passes that we don't see in any of the minimal examples.
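
The three forms being compared can be checked against each other exhaustively on i16. The following is a rough sketch with hypothetical function names; the shift pair is modeled as a sign extension of the low byte, which is what shl 8 followed by ashr 8 computes.

```cpp
#include <cassert>
#include <cstdint>

// Interprets a 16-bit pattern as a signed value.
int asSigned16(uint16_t bits) { return bits < 32768 ? bits : bits - 65536; }

// Form 1: shl 8, ashr 8, icmp eq. The shift pair sign-extends the low byte.
bool viaShifts(uint16_t x) {
  int low = ((x & 0xFF) ^ 0x80) - 0x80; // sign-extended low 8 bits
  assert(low >= -128 && low <= 127);
  return low == asSigned16(x);
}

// Form 2: (x s< 128) & (x s> -129)
bool viaTwoCompares(uint16_t x) {
  int s = asSigned16(x);
  return s < 128 && s > -129;
}

// Form 3: (x + 128) u< 256, with the add wrapping at 16 bits.
bool viaAddAndCompare(uint16_t x) {
  uint16_t off = static_cast<uint16_t>(x + 128);
  return off < 256;
}

// True when all three forms give the same answer for every i16 value.
bool allFormsAgree() {
  for (unsigned v = 0; v <= 0xFFFF; ++v) {
    uint16_t x = static_cast<uint16_t>(v);
    bool a = viaShifts(x);
    if (a != viaTwoCompares(x) || a != viaAddAndCompare(x))
      return false;
  }
  return true;
}
```

This only shows that the forms are interchangeable as values; which one the backend handles best is a separate question, as discussed above.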

In D49179#1160049, @spatel wrote:
Let me make sure I'm seeing it - as a question of IR canonicalization, we're deciding which of these 3 forms is best?

Not really. I agree that the last one is the best from IR clarity standpoint.
I was just looking for some other variants of this pattern.
(Sadly souper was of no help here, strangely.)

I don't see any reason to favor the 1st two options over the 3rd (shorter) one. We already convert the 2nd to the 3rd (although as noted here, that transform is not happening as expected in all cases).

I agree.

If the backend can't produce optimal code from the 3rd form, that's a concern, but it doesn't necessarily have to hold up the IR improvement.
We might live with that (hopefully minor and temporary) problem because the IR improvement can lead to optimizations in other passes that we don't see in any of the minimal examples.

Filed https://bugs.llvm.org/show_bug.cgi?id=38149 to track the signed part of the pattern.

Diffusion mentioned this in rL336911: [InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y).Jul 12 2018, 8:01 AM

lebedev.ri added inline comments.Jul 12 2018, 8:01 AM

llvm/trunk/test/Transforms/InstCombine/icmp-logical.ll
91	@yamauchi committed in rL336912, thanks!

lebedev.ri mentioned this in D48958: [clang][ubsan] Implicit Cast Sanitizer - integer truncation - clang part.Jul 12 2018, 11:07 AM

Revision Contents

Path	Size
llvm/trunk/include/llvm/IR/PatternMatch.h	9 lines
llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp	34 lines
llvm/trunk/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-eq-to-icmp-ule.ll	29 lines
llvm/trunk/test/Transforms/InstCombine/canonicalize-low-bit-mask-and-icmp-eq-to-icmp-ule.ll	43 lines
llvm/trunk/test/Transforms/InstCombine/icmp-logical.ll	6 lines
llvm/trunk/test/Transforms/InstCombine/icmp-mul-zext.ll	9 lines

Diff 155042

llvm/trunk/include/llvm/IR/PatternMatch.h

...
struct is_sign_mask {		struct is_sign_mask {
bool isValue(const APInt &C) { return C.isSignMask(); }		bool isValue(const APInt &C) { return C.isSignMask(); }
};		};
/// Match an integer or vector with only the sign bit(s) set.		/// Match an integer or vector with only the sign bit(s) set.
/// For vectors, this includes constants with undefined elements.		/// For vectors, this includes constants with undefined elements.
inline cst_pred_ty<is_sign_mask> m_SignMask() {		inline cst_pred_ty<is_sign_mask> m_SignMask() {
return cst_pred_ty<is_sign_mask>();		return cst_pred_ty<is_sign_mask>();
}		}

		struct is_lowbit_mask {
		bool isValue(const APInt &C) { return C.isMask(); }
		};
		/// Match an integer or vector with only the low bit(s) set.
		/// For vectors, this includes constants with undefined elements.
		inline cst_pred_ty<is_lowbit_mask> m_LowBitMask() {
		return cst_pred_ty<is_lowbit_mask>();
		}

struct is_nan {		struct is_nan {
bool isValue(const APFloat &C) { return C.isNaN(); }		bool isValue(const APFloat &C) { return C.isNaN(); }
};		};
/// Match an arbitrary NaN constant. This includes quiet and signalling nans.		/// Match an arbitrary NaN constant. This includes quiet and signalling nans.
/// For vectors, this includes constants with undefined elements.		/// For vectors, this includes constants with undefined elements.
inline cstfp_pred_ty<is_nan> m_NaN() {		inline cstfp_pred_ty<is_nan> m_NaN() {
return cstfp_pred_ty<is_nan>();		return cstfp_pred_ty<is_nan>();
}		}
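
For readers unfamiliar with the predicate behind m_LowBitMask(): a low-bit mask is a non-empty run of ones starting at bit 0. Here is a standalone sketch of that check on plain 32-bit integers. It is an assumption-laden stand-in rather than the APInt implementation, and the names isLowBitMask, shiftedOnes and allShiftedOnesAreMasks are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// True for 1, 3, 7, 15, ..., 0xFFFFFFFF: a non-empty run of ones from bit 0.
// Adding one to such a value clears every set bit, so the 'and' is zero.
bool isLowBitMask(uint32_t c) {
  return c != 0 && ((c + 1) & c) == 0;
}

// The variable form of the mask, (-1 >> y), for an in-range shift amount.
uint32_t shiftedOnes(unsigned y) {
  assert(y < 32 && "shift amount must be smaller than the bit width");
  return 0xFFFFFFFFu >> y;
}

// Every in-range logical right shift of all-ones is itself a low-bit mask.
bool allShiftedOnesAreMasks() {
  for (unsigned y = 0; y < 32; ++y)
    if (!isLowBitMask(shiftedOnes(y)))
      return false;
  return true;
}
```

This is why the constant matcher and the lshr-of-all-ones matcher can share one fold: both describe the same family of values.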

llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp

...
if (GetElementPtrInst *GEP =		if (GetElementPtrInst *GEP =
return Res;		return Res;
}		}
break;		break;
}		}

return nullptr;		return nullptr;
}		}

		/// Some comparisons can be simplified.
		/// In this case, we are looking for comparisons that look like
		/// a check for a lossy truncation.
		/// Folds:
		/// x & (-1 >> y) SrcPred x to x DstPred (-1 >> y)
		/// The Mask can be a constant, too.
		static Value *foldICmpWithLowBitMaskedVal(ICmpInst &I,
		InstCombiner::BuilderTy &Builder) {
		ICmpInst::Predicate SrcPred;
		Value *X, *M;
		auto m_Mask = m_CombineOr(m_LShr(m_AllOnes(), m_Value()), m_LowBitMask());
		if (!match(&I, m_c_ICmp(SrcPred,
		m_c_And(m_CombineAnd(m_Mask, m_Value(M)), m_Value(X)),
		m_Deferred(X))))
		return nullptr;

		ICmpInst::Predicate DstPred;
		switch (SrcPred) {
		case ICmpInst::Predicate::ICMP_EQ:
		// x & (-1 >> y) == x -> x u<= (-1 >> y)
		DstPred = ICmpInst::Predicate::ICMP_ULE;
		break;
		// TODO: more folds are possible, https://bugs.llvm.org/show_bug.cgi?id=38123
		default:
		return nullptr;
		}

		return Builder.CreateICmp(DstPred, X, M);
		}

/// Try to fold icmp (binop), X or icmp X, (binop).		/// Try to fold icmp (binop), X or icmp X, (binop).
/// TODO: A large part of this logic is duplicated in InstSimplify's		/// TODO: A large part of this logic is duplicated in InstSimplify's
/// simplifyICmpWithBinOp(). We should be able to share that and avoid the code		/// simplifyICmpWithBinOp(). We should be able to share that and avoid the code
/// duplication.		/// duplication.
Instruction *InstCombiner::foldICmpBinOp(ICmpInst &I) {		Instruction *InstCombiner::foldICmpBinOp(ICmpInst &I) {
Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);		Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

// Special logic for binary operators.		// Special logic for binary operators.
...
if (BO0) {		if (BO0) {
auto BitwiseAnd = m_c_And(m_Value(), LSubOne);		auto BitwiseAnd = m_c_And(m_Value(), LSubOne);

if (match(BO0, BitwiseAnd) && Pred == ICmpInst::ICMP_ULT) {		if (match(BO0, BitwiseAnd) && Pred == ICmpInst::ICMP_ULT) {
auto *Zero = Constant::getNullValue(BO0->getType());		auto *Zero = Constant::getNullValue(BO0->getType());
return new ICmpInst(ICmpInst::ICMP_NE, Op1, Zero);		return new ICmpInst(ICmpInst::ICMP_NE, Op1, Zero);
}		}
}		}

		if (Value *V = foldICmpWithLowBitMaskedVal(I, Builder))
		return replaceInstUsesWith(I, V);

return nullptr;		return nullptr;
}		}

/// Fold icmp Pred min|max(X, Y), X.		/// Fold icmp Pred min|max(X, Y), X.
static Instruction *foldICmpWithMinMax(ICmpInst &Cmp) {		static Instruction *foldICmpWithMinMax(ICmpInst &Cmp) {
ICmpInst::Predicate Pred = Cmp.getPredicate();		ICmpInst::Predicate Pred = Cmp.getPredicate();
Value *Op0 = Cmp.getOperand(0);		Value *Op0 = Cmp.getOperand(0);
Value *X = Cmp.getOperand(1);		Value *X = Cmp.getOperand(1);
...
if (I.getPredicate() == ICmpInst::ICMP_EQ)		if (I.getPredicate() == ICmpInst::ICMP_EQ)
// icmp X+Cst, X		// icmp X+Cst, X
if (match(Op0, m_Add(m_Value(X), m_ConstantInt(Cst))) && Op1 == X)		if (match(Op0, m_Add(m_Value(X), m_ConstantInt(Cst))) && Op1 == X)
return foldICmpAddOpConst(X, Cst, I.getPredicate());		return foldICmpAddOpConst(X, Cst, I.getPredicate());

// icmp X, X+Cst		// icmp X, X+Cst
if (match(Op1, m_Add(m_Value(X), m_ConstantInt(Cst))) && Op0 == X)		if (match(Op1, m_Add(m_Value(X), m_ConstantInt(Cst))) && Op0 == X)
return foldICmpAddOpConst(X, Cst, I.getSwappedPredicate());		return foldICmpAddOpConst(X, Cst, I.getSwappedPredicate());
}		}

return Changed ? &I : nullptr;		return Changed ? &I : nullptr;
}		}

/// Fold fcmp ([us]itofp x, cst) if possible.		/// Fold fcmp ([us]itofp x, cst) if possible.
Instruction *InstCombiner::foldFCmpIntToFPConst(FCmpInst &I, Instruction *LHSI,		Instruction *InstCombiner::foldFCmpIntToFPConst(FCmpInst &I, Instruction *LHSI,
Constant *RHSC) {		Constant *RHSC) {
if (!isa<ConstantFP>(RHSC)) return nullptr;		if (!isa<ConstantFP>(RHSC)) return nullptr;
const APFloat &RHS = cast<ConstantFP>(RHSC)->getValueAPF();		const APFloat &RHS = cast<ConstantFP>(RHSC)->getValueAPF();
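
The predicate switch in foldICmpWithLowBitMaskedVal() is written so that further cases can be added later. A standalone model of that dispatch, outside of LLVM, might look like the sketch below. The Pred enum and all function names are hypothetical, and the NE case is taken from the title of the follow-up D49205 rather than from this diff.

```cpp
#include <cassert>
#include <cstdint>

// Only the predicates needed here; the real code uses ICmpInst::Predicate.
enum class Pred { EQ, NE, ULE, UGT, ULT };

// Same shape as the switch in the patch: sets Dst and returns true when a
// fold is known for Src, returns false otherwise.
bool getFoldedPredicate(Pred Src, Pred &Dst) {
  switch (Src) {
  case Pred::EQ: Dst = Pred::ULE; return true; // x & m == x -> x u<= m
  case Pred::NE: Dst = Pred::UGT; return true; // x & m != x -> x u> m
  default: return false;                       // no fold modeled
  }
}

// Evaluates an unsigned 8-bit comparison.
bool evaluate(Pred P, uint8_t L, uint8_t R) {
  switch (P) {
  case Pred::EQ: return L == R;
  case Pred::NE: return L != R;
  case Pred::ULE: return L <= R;
  case Pred::UGT: return L > R;
  case Pred::ULT: return L < R;
  }
  assert(false && "unhandled predicate");
  return false;
}

// True when a fold exists for Src and agrees with the source pattern for
// every 8-bit x and every mask of the form (0xFF >> y).
bool foldExistsAndHolds(Pred Src) {
  Pred Dst = Src;
  if (!getFoldedPredicate(Src, Dst))
    return false;
  for (unsigned Y = 0; Y < 8; ++Y) {
    uint8_t M = static_cast<uint8_t>(0xFFu >> Y);
    for (unsigned V = 0; V < 256; ++V) {
      uint8_t X = static_cast<uint8_t>(V);
      if (evaluate(Src, static_cast<uint8_t>(X & M), X) != evaluate(Dst, X, M))
        return false;
    }
  }
  return true;
}
```

Returning "no fold" from the default case mirrors the `return nullptr` in the patch, so unsupported predicates are simply left alone.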

llvm/trunk/test/Transforms/InstCombine/canonicalize-constant-low-bit-mask-and-icmp-eq-to-icmp-ule.ll

	; Iff: isPowerOf2(C + 1)			; Iff: isPowerOf2(C + 1)

	; ============================================================================ ;			; ============================================================================ ;
	; Basic positive tests			; Basic positive tests
	; ============================================================================ ;			; ============================================================================ ;

	define i1 @p0(i8 %x) {			define i1 @p0(i8 %x) {
	; CHECK-LABEL: @p0(			; CHECK-LABEL: @p0(
	; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[X:%.*]], 3			; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i8 [[X:%.*]], 4
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP0]], [[X]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%tmp0 = and i8 %x, 3			%tmp0 = and i8 %x, 3
	%ret = icmp eq i8 %tmp0, %x			%ret = icmp eq i8 %tmp0, %x
	ret i1 %ret			ret i1 %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
	; Vector tests			; Vector tests
	; ============================================================================ ;			; ============================================================================ ;

	define <2 x i1> @p1_vec_splat(<2 x i8> %x) {			define <2 x i1> @p1_vec_splat(<2 x i8> %x) {
	; CHECK-LABEL: @p1_vec_splat(			; CHECK-LABEL: @p1_vec_splat(
	; CHECK-NEXT: [[TMP0:%.*]] = and <2 x i8> [[X:%.*]], <i8 3, i8 3>			; CHECK-NEXT: [[TMP1:%.*]] = icmp ult <2 x i8> [[X:%.*]], <i8 4, i8 4>
	; CHECK-NEXT: [[RET:%.*]] = icmp eq <2 x i8> [[TMP0]], [[X]]			; CHECK-NEXT: ret <2 x i1> [[TMP1]]
	; CHECK-NEXT: ret <2 x i1> [[RET]]
	;			;
	%tmp0 = and <2 x i8> %x, <i8 3, i8 3>			%tmp0 = and <2 x i8> %x, <i8 3, i8 3>
	%ret = icmp eq <2 x i8> %tmp0, %x			%ret = icmp eq <2 x i8> %tmp0, %x
	ret <2 x i1> %ret			ret <2 x i1> %ret
	}			}

	define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {			define <2 x i1> @p2_vec_nonsplat(<2 x i8> %x) {
	; CHECK-LABEL: @p2_vec_nonsplat(			; CHECK-LABEL: @p2_vec_nonsplat(
	; CHECK-NEXT: [[TMP0:%.*]] = and <2 x i8> [[X:%.*]], <i8 3, i8 15>			; CHECK-NEXT: [[TMP1:%.*]] = icmp ult <2 x i8> [[X:%.*]], <i8 4, i8 16>
	; CHECK-NEXT: [[RET:%.*]] = icmp eq <2 x i8> [[TMP0]], [[X]]			; CHECK-NEXT: ret <2 x i1> [[TMP1]]
	; CHECK-NEXT: ret <2 x i1> [[RET]]
	;			;
	%tmp0 = and <2 x i8> %x, <i8 3, i8 15> ; doesn't have to be splat.			%tmp0 = and <2 x i8> %x, <i8 3, i8 15> ; doesn't have to be splat.
	%ret = icmp eq <2 x i8> %tmp0, %x			%ret = icmp eq <2 x i8> %tmp0, %x
	ret <2 x i1> %ret			ret <2 x i1> %ret
	}			}

	define <3 x i1> @p3_vec_splat_undef(<3 x i8> %x) {			define <3 x i1> @p3_vec_splat_undef(<3 x i8> %x) {
	; CHECK-LABEL: @p3_vec_splat_undef(			; CHECK-LABEL: @p3_vec_splat_undef(
	; CHECK-NEXT: [[TMP0:%.*]] = and <3 x i8> [[X:%.*]], <i8 3, i8 undef, i8 3>			; CHECK-NEXT: [[TMP1:%.*]] = icmp ult <3 x i8> [[X:%.*]], <i8 4, i8 undef, i8 4>
	; CHECK-NEXT: [[RET:%.*]] = icmp eq <3 x i8> [[TMP0]], [[X]]			; CHECK-NEXT: ret <3 x i1> [[TMP1]]
	; CHECK-NEXT: ret <3 x i1> [[RET]]
	;			;
	%tmp0 = and <3 x i8> %x, <i8 3, i8 undef, i8 3>			%tmp0 = and <3 x i8> %x, <i8 3, i8 undef, i8 3>
	%ret = icmp eq <3 x i8> %tmp0, %x			%ret = icmp eq <3 x i8> %tmp0, %x
	ret <3 x i1> %ret			ret <3 x i1> %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
	; Commutativity tests.			; Commutativity tests.
	; ============================================================================ ;			; ============================================================================ ;

	declare i8 @gen8()			declare i8 @gen8()

	define i1 @c0() {			define i1 @c0() {
	; CHECK-LABEL: @c0(			; CHECK-LABEL: @c0(
	; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()			; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()
	; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[X]], 3			; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i8 [[X]], 4
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[X]], [[TMP0]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%x = call i8 @gen8()			%x = call i8 @gen8()
	%tmp0 = and i8 %x, 3			%tmp0 = and i8 %x, 3
	%ret = icmp eq i8 %x, %tmp0 ; swapped order			%ret = icmp eq i8 %x, %tmp0 ; swapped order
	ret i1 %ret			ret i1 %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
	; One-use tests. We don't care about multi-uses here.			; One-use tests. We don't care about multi-uses here.
	; ============================================================================ ;			; ============================================================================ ;

	declare void @use8(i8)			declare void @use8(i8)

	define i1 @oneuse0(i8 %x) {			define i1 @oneuse0(i8 %x) {
	; CHECK-LABEL: @oneuse0(			; CHECK-LABEL: @oneuse0(
	; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[X:%.*]], 3			; CHECK-NEXT: [[TMP0:%.*]] = and i8 [[X:%.*]], 3
	; CHECK-NEXT: call void @use8(i8 [[TMP0]])			; CHECK-NEXT: call void @use8(i8 [[TMP0]])
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP0]], [[X]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i8 [[X]], 4
	; CHECK-NEXT: ret i1 [[RET]]			; CHECK-NEXT: ret i1 [[TMP1]]
	;			;
	%tmp0 = and i8 %x, 3			%tmp0 = and i8 %x, 3
	call void @use8(i8 %tmp0)			call void @use8(i8 %tmp0)
	%ret = icmp eq i8 %tmp0, %x			%ret = icmp eq i8 %tmp0, %x
	ret i1 %ret			ret i1 %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
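
In the constant-mask tests above, the expected output is `icmp ult ... 4` rather than `icmp ule ... 3`. Presumably the `ule` produced by the fold is then rewritten into the strict form with the constant incremented; that reading is an assumption, not something stated in the review. The sketch below, with a hypothetical helper name, checks that the original pattern and the strict form agree for a given 8-bit constant.

```cpp
#include <cassert>
#include <cstdint>

// Compares (x & c) == x against x u< (c + 1) for every 8-bit x.
// c must not be all-ones, otherwise c + 1 would wrap to zero.
bool constantFormsAgree(uint8_t c) {
  assert(c != 0xFF && "c + 1 must not wrap");
  for (unsigned v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    bool source = static_cast<uint8_t>(x & c) == x;
    bool folded = x < static_cast<unsigned>(c) + 1;
    if (source != folded)
      return false;
  }
  return true;
}
```

The check succeeds for low-bit masks such as 3 and 15, and fails for a constant like 6 that is not a low-bit mask, which matches the "Iff: isPowerOf2(C + 1)" note at the top of the test file. For the non-splat vector test, the same reasoning applies lane by lane.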

llvm/trunk/test/Transforms/InstCombine/canonicalize-low-bit-mask-and-icmp-eq-to-icmp-ule.ll


	; ============================================================================ ;			; ============================================================================ ;
	; Basic positive tests			; Basic positive tests
	; ============================================================================ ;			; ============================================================================ ;

	define i1 @p0(i8 %x, i8 %y) {			define i1 @p0(i8 %x, i8 %y) {
	; CHECK-LABEL: @p0(			; CHECK-LABEL: @p0(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp uge i8 [[TMP0]], [[X:%.*]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP1]], [[X]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	%tmp1 = and i8 %tmp0, %x			%tmp1 = and i8 %tmp0, %x
	%ret = icmp eq i8 %tmp1, %x			%ret = icmp eq i8 %tmp1, %x
	ret i1 %ret			ret i1 %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
	; Vector tests			; Vector tests
	; ============================================================================ ;			; ============================================================================ ;

	define <2 x i1> @p1_vec(<2 x i8> %x, <2 x i8> %y) {			define <2 x i1> @p1_vec(<2 x i8> %x, <2 x i8> %y) {
	; CHECK-LABEL: @p1_vec(			; CHECK-LABEL: @p1_vec(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr <2 x i8> <i8 -1, i8 -1>, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr <2 x i8> <i8 -1, i8 -1>, [[Y:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = and <2 x i8> [[TMP0]], [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp uge <2 x i8> [[TMP0]], [[X:%.*]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq <2 x i8> [[TMP1]], [[X]]			; CHECK-NEXT: ret <2 x i1> [[TMP1]]
	; CHECK-NEXT: ret <2 x i1> [[RET]]
	;			;
	%tmp0 = lshr <2 x i8> <i8 -1, i8 -1>, %y			%tmp0 = lshr <2 x i8> <i8 -1, i8 -1>, %y
	%tmp1 = and <2 x i8> %tmp0, %x			%tmp1 = and <2 x i8> %tmp0, %x
	%ret = icmp eq <2 x i8> %tmp1, %x			%ret = icmp eq <2 x i8> %tmp1, %x
	ret <2 x i1> %ret			ret <2 x i1> %ret
	}			}

	define <3 x i1> @p2_vec_undef(<3 x i8> %x, <3 x i8> %y) {			define <3 x i1> @p2_vec_undef(<3 x i8> %x, <3 x i8> %y) {
	; CHECK-LABEL: @p2_vec_undef(			; CHECK-LABEL: @p2_vec_undef(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr <3 x i8> <i8 -1, i8 undef, i8 -1>, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr <3 x i8> <i8 -1, i8 undef, i8 -1>, [[Y:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = and <3 x i8> [[TMP0]], [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp uge <3 x i8> [[TMP0]], [[X:%.*]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq <3 x i8> [[TMP1]], [[X]]			; CHECK-NEXT: ret <3 x i1> [[TMP1]]
	; CHECK-NEXT: ret <3 x i1> [[RET]]
	;			;
	%tmp0 = lshr <3 x i8> <i8 -1, i8 undef, i8 -1>, %y			%tmp0 = lshr <3 x i8> <i8 -1, i8 undef, i8 -1>, %y
	%tmp1 = and <3 x i8> %tmp0, %x			%tmp1 = and <3 x i8> %tmp0, %x
	%ret = icmp eq <3 x i8> %tmp1, %x			%ret = icmp eq <3 x i8> %tmp1, %x
	ret <3 x i1> %ret			ret <3 x i1> %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
	; Commutativity tests.			; Commutativity tests.
	; ============================================================================ ;			; ============================================================================ ;

	declare i8 @gen8()			declare i8 @gen8()

	define i1 @c0(i8 %y) {			define i1 @c0(i8 %y) {
	; CHECK-LABEL: @c0(			; CHECK-LABEL: @c0(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()			; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[X]], [[TMP0]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp ule i8 [[X]], [[TMP0]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP1]], [[X]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	%x = call i8 @gen8()			%x = call i8 @gen8()
	%tmp1 = and i8 %x, %tmp0 ; swapped order			%tmp1 = and i8 %x, %tmp0 ; swapped order
	%ret = icmp eq i8 %tmp1, %x			%ret = icmp eq i8 %tmp1, %x
	ret i1 %ret			ret i1 %ret
	}			}

	define i1 @c1(i8 %y) {			define i1 @c1(i8 %y) {
	; CHECK-LABEL: @c1(			; CHECK-LABEL: @c1(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()			; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp ule i8 [[X]], [[TMP0]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[X]], [[TMP1]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	%x = call i8 @gen8()			%x = call i8 @gen8()
	%tmp1 = and i8 %tmp0, %x			%tmp1 = and i8 %tmp0, %x
	%ret = icmp eq i8 %x, %tmp1 ; swapped order			%ret = icmp eq i8 %x, %tmp1 ; swapped order
	ret i1 %ret			ret i1 %ret
	}			}

	define i1 @c2(i8 %y) {			define i1 @c2(i8 %y) {
	; CHECK-LABEL: @c2(			; CHECK-LABEL: @c2(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()			; CHECK-NEXT: [[X:%.*]] = call i8 @gen8()
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[X]], [[TMP0]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp ule i8 [[X]], [[TMP0]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[X]], [[TMP1]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	%x = call i8 @gen8()			%x = call i8 @gen8()
	%tmp1 = and i8 %x, %tmp0 ; swapped order			%tmp1 = and i8 %x, %tmp0 ; swapped order
	%ret = icmp eq i8 %x, %tmp1 ; swapped order			%ret = icmp eq i8 %x, %tmp1 ; swapped order
	ret i1 %ret			ret i1 %ret
	}			}

	; ============================================================================ ;			; ============================================================================ ;
	; One-use tests. We don't care about multi-uses here.			; One-use tests. We don't care about multi-uses here.
	; ============================================================================ ;			; ============================================================================ ;

	declare void @use8(i8)			declare void @use8(i8)

	define i1 @oneuse0(i8 %x, i8 %y) {			define i1 @oneuse0(i8 %x, i8 %y) {
	; CHECK-LABEL: @oneuse0(			; CHECK-LABEL: @oneuse0(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: call void @use8(i8 [[TMP0]])			; CHECK-NEXT: call void @use8(i8 [[TMP0]])
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp uge i8 [[TMP0]], [[X:%.*]]
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP1]], [[X]]			; CHECK-NEXT: ret i1 [[TMP1]]
	; CHECK-NEXT: ret i1 [[RET]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	call void @use8(i8 %tmp0)			call void @use8(i8 %tmp0)
	%tmp1 = and i8 %tmp0, %x			%tmp1 = and i8 %tmp0, %x
	%ret = icmp eq i8 %tmp1, %x			%ret = icmp eq i8 %tmp1, %x
	ret i1 %ret			ret i1 %ret
	}			}

	define i1 @oneuse1(i8 %x, i8 %y) {			define i1 @oneuse1(i8 %x, i8 %y) {
	; CHECK-LABEL: @oneuse1(			; CHECK-LABEL: @oneuse1(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X:%.*]]
	; CHECK-NEXT: call void @use8(i8 [[TMP1]])			; CHECK-NEXT: call void @use8(i8 [[TMP1]])
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP1]], [[X]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp uge i8 [[TMP0]], [[X]]
	; CHECK-NEXT: ret i1 [[RET]]			; CHECK-NEXT: ret i1 [[TMP1]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	%tmp1 = and i8 %tmp0, %x			%tmp1 = and i8 %tmp0, %x
	call void @use8(i8 %tmp1)			call void @use8(i8 %tmp1)
	%ret = icmp eq i8 %tmp1, %x			%ret = icmp eq i8 %tmp1, %x
	ret i1 %ret			ret i1 %ret
	}			}

	define i1 @oneuse2(i8 %x, i8 %y) {			define i1 @oneuse2(i8 %x, i8 %y) {
	; CHECK-LABEL: @oneuse2(			; CHECK-LABEL: @oneuse2(
	; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = lshr i8 -1, [[Y:%.*]]
	; CHECK-NEXT: call void @use8(i8 [[TMP0]])			; CHECK-NEXT: call void @use8(i8 [[TMP0]])
	; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = and i8 [[TMP0]], [[X:%.*]]
	; CHECK-NEXT: call void @use8(i8 [[TMP1]])			; CHECK-NEXT: call void @use8(i8 [[TMP1]])
	; CHECK-NEXT: [[RET:%.*]] = icmp eq i8 [[TMP1]], [[X]]			; CHECK-NEXT: [[TMP1:%.*]] = icmp uge i8 [[TMP0]], [[X]]
	; CHECK-NEXT: ret i1 [[RET]]			; CHECK-NEXT: ret i1 [[TMP1]]
	;			;
	%tmp0 = lshr i8 -1, %y			%tmp0 = lshr i8 -1, %y
	call void @use8(i8 %tmp0)			call void @use8(i8 %tmp0)
	%tmp1 = and i8 %tmp0, %x			%tmp1 = and i8 %tmp0, %x
	call void @use8(i8 %tmp1)			call void @use8(i8 %tmp1)
	%ret = icmp eq i8 %tmp1, %x			%ret = icmp eq i8 %tmp1, %x
	ret i1 %ret			ret i1 %ret
	}			}
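
The fold exercised by the tests above can be sanity-checked by brute force at i8 width. The following is a minimal sketch in Python, not part of the patch; the helper names (`masked_eq`, `ule_mask`, `fold_holds`) are made up for illustration, and shift amounts are limited to `0..7` on the assumption that an `lshr` by the bit width or more yields poison and is therefore out of scope.

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1  # the i8 value -1, viewed as unsigned


def masked_eq(x, y):
    """Original pattern: (x & (-1 >> y)) == x."""
    mask = ALL_ONES >> y
    return (x & mask) == x


def ule_mask(x, y):
    """Folded pattern: x u<= (-1 >> y)."""
    return x <= (ALL_ONES >> y)


def fold_holds():
    """Compare both forms for every i8 value and every in-range shift."""
    return all(
        masked_eq(x, y) == ule_mask(x, y)
        for x in range(1 << BITS)
        for y in range(BITS)
    )


if __name__ == "__main__":
    print("fold holds for all i8 inputs:", fold_holds())
```

Because the `and` and the `icmp eq` are both commutative, the operand order varied in `@c0`, `@c1`, and `@c2` does not affect this check.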

llvm/trunk/test/Transforms/InstCombine/icmp-logical.ll

%mask2 = and i32 %A, 39		%mask2 = and i32 %A, 39
%tst2 = icmp ne i32 %mask2, %A		%tst2 = icmp ne i32 %mask2, %A
%res = and i1 %tmp0, %tst2		%res = and i1 %tmp0, %tst2
ret i1 %res		ret i1 %res
}		}

define i1 @masked_or_A(i32 %A) {		define i1 @masked_or_A(i32 %A) {
; CHECK-LABEL: @masked_or_A(		; CHECK-LABEL: @masked_or_A(
; CHECK-NEXT: [[MASK2:%.*]] = and i32 [[A:%.*]], 39		; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[A:%.*]], 8
hjyamauchi (inline comment):
It seems like the simplification of this change (D49179) triggers before the original simplification does, so the original simplification no longer triggers? My guess is that this test is only meant to cover a plain or-case, so it may make sense to use other values such as 14 and 78 (7 and 39 shifted left by 1 bit) and preserve the original intention of the test. Note that the next test, @masked_or_A_slightly_optimized, has the same code as the after-simplification code of this function. Not sure if that is just a coincidence.

lebedev.ri (author, inline comment):
> and the original simplification no longer triggers?
In any case, the original simplification clearly does not handle this pattern.
> My guess is that this test just means to test a plain or-case and it may make sense to use some other values like 14 and 78 (shifted left by 1 bit, instead of 7 and 39) and preserve the original intention of the test.
Hmm, thanks, might be a good idea.
> Note the next test @masked_or_A_slightly_optimized has the same code as the after-simplification code of this function. Not sure if it is just a coincidence.
I added `@masked_or_A_slightly_optimized` in rL336784 after noticing this regression, to have a test regardless.

lebedev.ri (author, inline comment):
@yamauchi committed in rL336912, thanks!
		; CHECK-NEXT: [[MASK2:%.*]] = and i32 [[A]], 39
; CHECK-NEXT: [[TST2:%.*]] = icmp eq i32 [[MASK2]], [[A]]		; CHECK-NEXT: [[TST2:%.*]] = icmp eq i32 [[MASK2]], [[A]]
; CHECK-NEXT: ret i1 [[TST2]]		; CHECK-NEXT: [[RES:%.*]] = or i1 [[TMP1]], [[TST2]]
		; CHECK-NEXT: ret i1 [[RES]]
;		;
%mask1 = and i32 %A, 7		%mask1 = and i32 %A, 7
%tst1 = icmp eq i32 %mask1, %A		%tst1 = icmp eq i32 %mask1, %A
%mask2 = and i32 %A, 39		%mask2 = and i32 %A, 39
%tst2 = icmp eq i32 %mask2, %A		%tst2 = icmp eq i32 %mask2, %A
%res = or i1 %tst1, %tst2		%res = or i1 %tst1, %tst2
ret i1 %res		ret i1 %res
}		}
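
The suggestion to switch the constants from 7 and 39 to 14 and 78 can be illustrated with a small check. This is a sketch in Python under stated assumptions: `is_low_bit_mask` is a hypothetical helper, not the predicate InstCombine actually uses, and the loop bound of 4096 is an arbitrary sample of the i32 range. It shows that 7 is a run of low set bits (so `(A & 7) == A` matches `A u< 8`, as in the new CHECK line), while 39, 14, and 78 are not.

```python
def is_low_bit_mask(m):
    """True if m is of the form 2**k - 1 with k >= 1 (hypothetical helper)."""
    return m != 0 and (m & (m + 1)) == 0


def masked_eq(a, mask):
    """The tested pattern: (a & mask) == a."""
    return (a & mask) == a


def matches_ule(mask, limit=4096):
    """Check whether (a & mask) == a agrees with a <= mask on a sample range."""
    return all(masked_eq(a, mask) == (a <= mask) for a in range(limit))


if __name__ == "__main__":
    for mask in (7, 39, 14, 78):
        print(mask, bin(mask), is_low_bit_mask(mask), matches_ule(mask))
```

For the non-mask constants the unsigned comparison would be wrong (for example, `8 & 39` is 0 rather than 8, although 8 is below 39), which is consistent with the `and i32 [[A]], 39` half of the test staying unchanged in the diff.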

llvm/trunk/test/Transforms/InstCombine/icmp-mul-zext.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt < %s -instcombine -S | FileCheck %s			; RUN: opt < %s -instcombine -S | FileCheck %s

	define i32 @sterix(i32, i8, i64) {			define i32 @sterix(i32, i8, i64) {
	; CHECK-LABEL: @sterix(			; CHECK-LABEL: @sterix(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[CONV:%.*]] = zext i32 [[TMP0:%.*]] to i64			; CHECK-NEXT: [[CONV:%.*]] = zext i32 [[TMP0:%.*]] to i64
	; CHECK-NEXT: [[CONV1:%.*]] = sext i8 [[TMP1:%.*]] to i32			; CHECK-NEXT: [[CONV1:%.*]] = sext i8 [[TMP1:%.*]] to i32
	; CHECK-NEXT: [[MUL:%.*]] = mul i32 [[CONV1]], 1945964878			; CHECK-NEXT: [[MUL:%.*]] = mul i32 [[CONV1]], 1945964878
	; CHECK-NEXT: [[SH_PROM:%.*]] = trunc i64 [[TMP2:%.*]] to i32			; CHECK-NEXT: [[SH_PROM:%.*]] = trunc i64 [[TMP2:%.*]] to i32
	; CHECK-NEXT: [[SHR:%.*]] = lshr i32 [[MUL]], [[SH_PROM]]			; CHECK-NEXT: [[SHR:%.*]] = lshr i32 [[MUL]], [[SH_PROM]]
	; CHECK-NEXT: [[CONV2:%.*]] = zext i32 [[SHR]] to i64			; CHECK-NEXT: [[CONV2:%.*]] = zext i32 [[SHR]] to i64
	; CHECK-NEXT: [[MUL3:%.*]] = mul nuw nsw i64 [[CONV]], [[CONV2]]			; CHECK-NEXT: [[MUL3:%.*]] = mul nuw nsw i64 [[CONV]], [[CONV2]]
	; CHECK-NEXT: [[CONV6:%.*]] = and i64 [[MUL3]], 4294967295			; CHECK-NEXT: [[TMP3:%.*]] = icmp ugt i64 [[MUL3]], 4294967295
	; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i64 [[CONV6]], [[MUL3]]			; CHECK-NEXT: br i1 [[TMP3]], label [[LOR_END:%.*]], label [[LOR_RHS:%.*]]
	; CHECK-NEXT: br i1 [[TOBOOL]], label [[LOR_RHS:%.*]], label [[LOR_END:%.*]]
	; CHECK: lor.rhs:			; CHECK: lor.rhs:
	; CHECK-NEXT: [[AND:%.*]] = and i64 [[MUL3]], [[TMP2]]			; CHECK-NEXT: [[AND:%.*]] = and i64 [[MUL3]], [[TMP2]]
	; CHECK-NEXT: [[CONV4:%.*]] = trunc i64 [[AND]] to i32			; CHECK-NEXT: [[CONV4:%.*]] = trunc i64 [[AND]] to i32
	; CHECK-NEXT: [[TOBOOL7:%.*]] = icmp eq i32 [[CONV4]], 0			; CHECK-NEXT: [[TOBOOL7:%.*]] = icmp eq i32 [[CONV4]], 0
	; CHECK-NEXT: [[PHITMP:%.*]] = zext i1 [[TOBOOL7]] to i32			; CHECK-NEXT: [[PHITMP:%.*]] = zext i1 [[TOBOOL7]] to i32
	; CHECK-NEXT: br label [[LOR_END]]			; CHECK-NEXT: br label [[LOR_END]]
	; CHECK: lor.end:			; CHECK: lor.end:
	; CHECK-NEXT: [[TMP3:%.*]] = phi i32 [ 1, [[ENTRY:%.*]] ], [ [[PHITMP]], [[LOR_RHS]] ]			; CHECK-NEXT: [[TMP4:%.*]] = phi i32 [ 1, [[ENTRY:%.*]] ], [ [[PHITMP]], [[LOR_RHS]] ]
	; CHECK-NEXT: ret i32 [[TMP3]]			; CHECK-NEXT: ret i32 [[TMP4]]
	;			;
	entry:			entry:
	%conv = zext i32 %0 to i64			%conv = zext i32 %0 to i64
	%conv1 = sext i8 %1 to i32			%conv1 = sext i8 %1 to i32
	%mul = mul i32 %conv1, 1945964878			%mul = mul i32 %conv1, 1945964878
	%sh_prom = trunc i64 %2 to i32			%sh_prom = trunc i64 %2 to i32
	%shr = lshr i32 %mul, %sh_prom			%shr = lshr i32 %mul, %sh_prom
	%conv2 = zext i32 %shr to i64			%conv2 = zext i32 %shr to i64
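
In `@sterix` the updated CHECK lines replace `and` + `icmp eq` with a single `icmp ugt` and swap the two branch targets. The sketch below, in Python, models both versions of the branch on unsigned 64-bit values to show why that swap preserves behaviour. The function names and the sample values are illustrative assumptions; only the constant 4294967295 and the labels `lor.rhs` / `lor.end` come from the test.

```python
MASK32 = 4294967295  # 0xFFFFFFFF, the constant used in @sterix


def branch_before(mul3):
    """Old form: tobool = (mul3 & MASK32) == mul3; true goes to lor.rhs."""
    tobool = (mul3 & MASK32) == mul3
    return "lor.rhs" if tobool else "lor.end"


def branch_after(mul3):
    """New form: tmp3 = mul3 u> MASK32; true goes to lor.end."""
    tmp3 = mul3 > MASK32
    return "lor.end" if tmp3 else "lor.rhs"


# A few boundary values from the unsigned 64-bit range (illustrative sample).
SAMPLES = [0, 1, MASK32 - 1, MASK32, MASK32 + 1, 1 << 40, (1 << 64) - 1]


def branches_agree():
    return all(branch_before(v) == branch_after(v) for v in SAMPLES)


if __name__ == "__main__":
    print("same successor for every sample:", branches_agree())
```

The comparison flips from `ule` to its inverse `ugt`, so the true and false successors trade places and each input still reaches the same block.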