
[CVP] Teach CorrelatedValuePropagation to reduce the width of lshr instruction.
Abandoned · Public

Authored by lebedev.ri on May 19 2018, 4:45 PM.

Details

Summary

Counter-proposal to D46760.
I suppose it is a continuation of D44102.

If the second operand of udiv/urem is a power of two,
instcombine will transform it into lshr/and,
and CVP does not handle those.
https://godbolt.org/g/hhT9bc
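For context, the canonicalization referred to here can be modeled outside of LLVM; the helper names below are illustrative, not LLVM APIs:

```python
# Model of the instcombine canonicalization mentioned above: for a
# power-of-two divisor 2**k, unsigned division is a right shift and
# unsigned remainder is a bitmask.

def udiv_pow2(x: int, k: int) -> int:
    # udiv x, 2**k  ->  lshr x, k
    return x >> k

def urem_pow2(x: int, k: int) -> int:
    # urem x, 2**k  ->  and x, 2**k - 1
    return x & ((1 << k) - 1)

for x in range(0, 5000, 37):
    assert udiv_pow2(x, 6) == x // 64
    assert urem_pow2(x, 6) == x % 64
```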

Do note that teaching CVP to handle only the lshr width
reduction is already sufficient to replace D46760,
since it reduces the use count of the zext,
and thus instcombine is able to propagate it.

I have looked into teaching CVP to handle and as well,
and that will be more complicated.

https://rise4fun.com/Alive/zfP
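The Alive proof above covers the IR-level correctness; as a rough sketch (Python, illustrative names only — the real pass queries LazyValueInfo for the operand ranges), the width reduction amounts to:

```python
# Rough model of the proposed lshr width reduction: if CVP can prove
# the shifted value fits in a narrower power-of-two width, the lshr
# can be performed in that width and the result zero-extended back.
# This is a sketch, not the actual pass logic.

def narrow_lshr(x: int, shamt: int, new_bits: int) -> int:
    assert x < (1 << new_bits)             # the range fact CVP must prove
    assert shamt < new_bits                # shift must stay in range
    lhs_trunc = x & ((1 << new_bits) - 1)  # trunc iN -> iM
    narrow = lhs_trunc >> shamt            # lshr iM
    return narrow                          # zext iM -> iN preserves the value

for x in range(0, 1 << 16, 997):
    assert narrow_lshr(x, 6, 16) == x >> 6
```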

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision. May 19 2018, 4:45 PM

Rebased on top of parent differentials.

lebedev.ri edited the summary of this revision. May 20 2018, 2:54 AM

Add one more RUN line to the phase ordering test to demonstrate
how instcombine is able to clean up after CVP.


Seems reasonable to me. I dunno if this solves @bixia's problem or not, but even if it doesn't, seems reasonable...

Would like a CVP person to approve, though.

lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
604

getConstantRange doesn't do the right thing when Value is a constant?

613

I don't think this is sufficient to ensure that RHS.getZExtValue() below doesn't assert? (For example, RHS could be an int128, as I read the langref.)

Fix handling of large shifts.

@jlebar thank you for looking at this!

lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
604

Hm, it seems it does. I suppose this was a premature optimization :)

613

Nice catch!

lebedev.ri marked 4 inline comments as done.

Actually attach the updated diff this time :)

InstCombiner::visitLShr can perform the same transformation for the cases where correlated-value propagation is not needed to discover the range of the values.
However, unlike the transformation here, InstCombiner::visitLShr carefully makes sure that the transformation won't increase the total number of ZExt/Trunc instructions (the Op1.hasOneUse check). Why doesn't the transformation here need a similar check? Is it safe to remove such a check from InstCombiner::visitLShr?

> Why doesn't the transformation here need a similar check?

I would like that design criterion to be documented somewhere, too.

> Is it safe to remove such a check from InstCombiner::visitLShr?

No:

> Re: instcombine - if some transform is increasing instruction count by not checking for uses, that's a bug. I think we established that conclusively in D44266.

bixia added a comment. May 21 2018, 8:41 AM

InstCombiner::visitShl performs similar narrowing without checking the user count and can increase the total number of ZExt instructions.

> InstCombiner::visitShl performs similar narrowing without checking the user count and can increase the total number of ZExt instructions.

Sounds like a bug then, per @spatel's comment quoted in https://reviews.llvm.org/D47113#1106261?

I think this patch requires discussion on llvm-dev so everyone is clear on the direction and outcome:

  1. We're saying that CVP (a target-independent IR canonicalization pass) will always try to narrow the width of binops.
  2. It doesn't matter if that means increasing cast instruction count.
  3. It doesn't matter if the narrow type is not legal in the target's datalayout (as long as we have a power-of-2).

The fact that we're dealing with 1 opcode at a time is only because we're not considering the general pattern and consequences.
We overlooked these questions in the specific case of div/rem (D44102) because we assumed that narrower div/rem are always better for analysis and codegen, but I doubt that we can extend that reasoning to all binops on all targets. This will fight with transforms in instcombine (see canEvaluateSExtd / canEvaluateZExtd).

Minor code/style comments only. Leaving the broader discussion to engaged parties.

lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
603

minor: second operand

648

You're repeating a pattern which is already there, but we should probably introduce an iteration outside the processX functions for this.

> I think this patch requires discussion on llvm-dev so everyone is clear on the direction and outcome:

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123534.html

So yes, I ran some quick benchmarks and I believe this will cause regressions in some circumstances. In one case I looked at (which is running under our special LTO pipeline and may be a little difficult to replicate), we start off with this:

%shr = lshr i32 %sub, 6
%arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr

This is turned into:

%shr.lhs.trunc = trunc i32 %sub to i16
%shr.rhs.trunc = trunc i32 6 to i16
%shr = lshr i16 %shr.lhs.trunc, %shr.rhs.trunc
%shr.zext = zext i16 %shr to i32
%arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext

Which gets turned right back into:

%shr = lshr i32 %sub, 6
%shr.zext = and i32 %shr, 1023
%arrayidx11 = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext

I think the extra And node will, under most circumstances, be removed during isel. But here this is part of a loop, and the extra cost causes us to go over the loop unroll threshold, so the loop is no longer fully unrolled.
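The refold is a pure bit identity: truncating to i16, shifting by 6, and zero-extending keeps exactly bits 6..15, which is the same as shifting the i32 and masking with 1023. A quick check (Python, illustrative):

```python
# Why instcombine can fold the narrowed sequence right back: the
# trunc/lshr/zext chain over i16 equals the original i32 lshr
# followed by a mask of the surviving 10 bits.

def narrowed(sub: int) -> int:
    return (sub & 0xFFFF) >> 6    # trunc to i16, lshr by 6, zext

def refolded(sub: int) -> int:
    return (sub >> 6) & 1023      # lshr i32, and i32 1023

for sub in range(0, 1 << 20, 1234):
    assert narrowed(sub) == refolded(sub)
```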

Another case on v6m (thumb1only) looks more like a simple extra instruction in the final assembly. In either case the extra And 1023 seems to only be causing trouble.

I'm running some more benchmarks and will see what happens on other cores/benchmarks.

> So yes, I ran some quick benchmarks

Thanks!

> and I believe this will cause regressions in some circumstances. In one case I looked at (which is running under our special LTO pipeline and may be a little difficult to replicate), we start off with this:
>
> %shr = lshr i32 %sub, 6
> %arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr
>
> This is turned into:
>
> %shr.lhs.trunc = trunc i32 %sub to i16
> %shr.rhs.trunc = trunc i32 6 to i16
> %shr = lshr i16 %shr.lhs.trunc, %shr.rhs.trunc
>
> %shr.zext = zext i16 %shr to i32
> %arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext

Hmm, I think here we could avoid the zext.

> Which gets turned right back into:
> %shr = lshr i32 %sub, 6
> %shr.zext = and i32 %shr, 1023
> %arrayidx11 = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext
> I think the extra And node will, under most circumstances, be removed during isel. But here this is part of a loop, and the extra cost causes us to go over the loop unroll threshold, so the loop is no longer fully unrolled.
>
> Another case on v6m (thumb1only) looks more like a simple extra instruction in the final assembly. In either case the extra And 1023 seems to only be causing trouble.

Thank you!
So to not much surprise, @spatel was right in https://reviews.llvm.org/D47113#1106601

This all makes me think we should look in the third direction:

> You want to narrow multi-use sequences of code because instcombine can't do that profitably using minimal peepholes:
> ...
> The original motivation for -aggressive-instcombine (TruncInstCombine) was something almost like that - see D38313. Can you extend that? Note that the general problem isn't about udiv/urem, lshr, or any particular binop. It's about narrowing a sequence of arbitrary binops (and maybe even more than binops).

> I'm running some more benchmarks and will see what happens on other cores/benchmarks.

Once done, could you please post something to the thread, so it too would contain the knowledge?

> %shr.zext = zext i16 %shr to i32
> %arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext
>
> Hmm, I think here we could avoid the zext.

Actually, uhm, https://godbolt.org/g/jtNCSp, is the idx always canonicalized to i64?

Sorry, I forgot to mention the important fact that those results were Arm, specifically thumbv8m.baseline on a cortex-m23 (where pointers are 32-bit and only i32 is a legal type). Try this data layout for Arm code:
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"

This should show the extra And node being added (although in this case I believe it won't lead to different assembly, just larger IR; isel can remove the And).
https://godbolt.org/g/xjmSPn

The tests I ran above were embedded benchmarks on microcontroller CPUs. They are quick and easy to run as they tend to contain no noise. The ones that went down were mostly Thumb-1-only cores, but the cortex-m7 also showed decreases on the same test. The extra tests I've run now are a more standard set of Linux benchmarks on cortex-a cores for both Arm and AArch64 (SPEC, the LLVM test-suite, etc.). The only thing that didn't look like noise was drop3 from BitBench, which got better. It's doing some odd-looking bit manipulation in a loop. Playing around a bit, the score seems to go up or down depending on which parts of the loop are enabled.

As Sanjay found in http://lists.llvm.org/pipermail/llvm-dev/2018-January/120522.html, converting to illegal types can be beneficial if it leads to extra folds, but it is difficult to tell when exactly it will make things better or worse. Reducing the type width from an i64 to an i32 is almost certainly a good thing to do on Arm, but reducing an i32 to an i8 isn't so cut-and-dried. The safe option here would be to only convert to legal types (or perhaps also to reduce to types larger than the largest legal type; I'm not sure what that would do in the motivating case).

reames requested changes to this revision. Feb 4 2019, 3:56 PM

Marking as requires-changes just to remove this from the review queue, since the discussion appears to have stalled.

This revision now requires changes to proceed. Feb 4 2019, 3:56 PM
Herald added a project: Restricted Project. Feb 4 2019, 3:56 PM
lebedev.ri abandoned this revision. Jun 21 2019, 8:52 AM