This is an archive of the discontinued LLVM Phabricator instance.

Differential D130862

[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.
ClosedPublic

Authored by craig.topper on Jul 31 2022, 4:36 PM.

Details

Reviewers

RKSimon
spatel
efriedma
arsenm

Commits

rG38ffa2bb9637: [LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.

Summary

For remainder:
If (1 << (BitWidth / 2)) % Divisor == 1, we can add the high and low halves
together and use a (BitWidth / 2)-bit urem. If (BitWidth / 2) is a legal integer
type, this urem will be expanded by DAGCombiner using a multiply by a magic
constant. We do have to take into account that adding the high and low halves
together can produce a carry, making the sum a (BitWidth / 2)+1 bit number,
so we also need to add the carry from the first addition back in.
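
As a rough illustration of the remainder trick, here is a minimal standalone C++ sketch for a 64-bit value split into 32-bit halves. The function name is hypothetical and this is not the SelectionDAG code from the patch; it assumes the divisor satisfies (1 << 32) % d == 1 (for example 3, 5, 15, 17, 255, 257).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): 64-bit remainder computed with
// 32-bit arithmetic only. Precondition: (1 << 32) % d == 1.
uint32_t uremBySummingHalves(uint64_t x, uint32_t d) {
  assert(d > 1 && (uint64_t{1} << 32) % d == 1 && "divisor not supported");
  uint32_t lo = static_cast<uint32_t>(x);
  uint32_t hi = static_cast<uint32_t>(x >> 32);
  uint32_t sum = lo + hi;            // may wrap around modulo 2^32
  uint32_t carry = sum < lo ? 1 : 0; // wrapped iff the sum is below an addend
  sum += carry;                      // fold the carry back in; cannot wrap again
  return sum % d;                    // half-width urem by a constant
}
```

Adding the carry back is valid because the carry bit has weight 2^32, which is congruent to 1 modulo the divisor.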

For division:
We can use the above trick to compute the remainder, subtract that
remainder from the dividend, then multiply by the multiplicative
inverse of the Divisor modulo (1 << BitWidth).
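
The division step can be sketched the same way. The following standalone C++ example uses hypothetical names and computes the inverse with Newton's iteration, which is an assumption made for illustration; the patch itself obtains the inverse through APInt. It again assumes (1 << 32) % d == 1, which also implies that d is odd and therefore invertible modulo 2^64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: inverse of an odd d modulo 2^64 via Newton's iteration.
uint64_t inverseMod2Pow64(uint64_t d) {
  assert((d & 1) && "only odd values are invertible modulo a power of two");
  uint64_t inv = d; // correct to 3 bits, since d * d == 1 (mod 8) for odd d
  for (int i = 0; i < 5; ++i)
    inv *= 2 - d * inv; // each step doubles the number of correct low bits
  return inv;
}

// Hypothetical helper: quotient = (x - x % d) * inverse(d) modulo 2^64.
uint64_t udivViaRemainder(uint64_t x, uint32_t d) {
  assert(d > 1 && (uint64_t{1} << 32) % d == 1 && "divisor not supported");
  uint32_t lo = static_cast<uint32_t>(x);
  uint32_t hi = static_cast<uint32_t>(x >> 32);
  uint32_t sum = lo + hi;
  sum += sum < lo ? 1 : 0; // add the carry back in
  uint64_t rem = sum % d;
  // x - rem is an exact multiple of d, so the modular product is the quotient.
  return (x - rem) * inverseMod2Pow64(d);
}
```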

This is based on the section "Remainder by Summing Digits" in
Hacker's Delight.

The remainder trick is similar to a trick you may have learned for
determining if a decimal number is divisible by 3. You can add all the
digits together and see if the sum is divisible by 3. If you're not sure
if the sum is divisible by 3, you can add its digits together. This
can be repeated until you have a single decimal digit. If that digit
is 3, 6, or 9, then the original number is divisible by 3. This works
because 10 % 3 == 1.
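
For comparison, the decimal version of the trick can be written as a short C++ sketch. The function name is hypothetical, and the sketch additionally treats 0 as divisible by 3, which the description above does not need to mention.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: tests divisibility by 3 by repeatedly summing decimal
// digits until a single digit remains. Works because 10 % 3 == 1.
bool divisibleBy3ByDigitSum(uint64_t n) {
  while (n >= 10) {
    uint64_t sum = 0;
    for (; n != 0; n /= 10)
      sum += n % 10; // each digit contributes itself modulo 3
    n = sum;
  }
  return n == 0 || n == 3 || n == 6 || n == 9;
}
```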

gcc already does this same trick. There are additional tricks gcc
does for urem as well as srem, udiv, and sdiv that I plan to add in
future patches.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. Jul 31 2022, 4:36 PM

Herald added a project: Restricted Project. Jul 31 2022, 4:36 PM

Herald added subscribers: luke957, StephenFan, frasercrmck and 23 others.

craig.topper requested review of this revision. Jul 31 2022, 4:36 PM

Herald added subscribers: pcwang-thead, MaskRay, wdng.

Harbormaster completed remote builds in B178484: Diff 448896. Jul 31 2022, 5:28 PM

precommit tests?

llvm/include/llvm/CodeGen/TargetLowering.h
4729	expandUREMByConstant ?
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4474	auto *CN
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7162	assert N->getOpcode == ISD::UREM?
7195	Are we causing any one-use issues if we split here and then can't expand below?
7213	Pull out repeated getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), HiLoVT) ?

craig.topper added inline comments. Aug 1 2022, 9:23 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7195	Good point. Today it's not an issue because that code only executes on the X86 win64 path and we have ADDCARRY available. But I'll fix it.

Do you have an overall plan written up somewhere?

There are a few different ways you could extend this:

This could be extended to handle factors of two: for example, a % 10 -> ((a / 2) % 5) * 2 + (a % 2).
This could be extended to handle more factors by slicing up numbers differently. e.g. for a % 7, slice the number into three 30-bit pieces, since 2^30 mod 7 = 1.
This could possibly be used to implement division: e.g. a / 5 -> (a - (a % 5)) * inverse(5). Not sure if this is more efficient than other approaches; might depend on the target.
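
The first suggestion above (factors of two) can be illustrated with a standalone C++ sketch for a % 10 on a 64-bit value. The function name is hypothetical; the odd part (5) is handled by summing 32-bit halves, which relies on (1 << 32) % 5 == 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a % 10 computed as ((a / 2) % 5) * 2 + (a % 2),
// with the "% 5" done on the sum of the 32-bit halves of a / 2.
uint32_t uremBy10(uint64_t a) {
  uint64_t half = a >> 1; // a / 2
  uint32_t lo = static_cast<uint32_t>(half);
  uint32_t hi = static_cast<uint32_t>(half >> 32);
  uint32_t sum = lo + hi;
  sum += sum < lo ? 1 : 0; // add the carry back in
  return (sum % 5) * 2 + static_cast<uint32_t>(a & 1);
}
```

The identity holds because a = 2 * (5 * q + r) + b = 10 * q + (2 * r + b) with 2 * r + b at most 9.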

On the general topic of division by constants, see also discussion on https://github.com/llvm/llvm-project/issues/56153

In D130862#3691418, @efriedma wrote:

Do you have an overall plan written up somewhere?

There are a few different ways you could extend this:

This could be extended to handle factors of two: for example, a % 10 -> ((a / 2) % 5) * 2 + (a % 2).

Yeah. I was also going to look at that. gcc does something, but what they do can be improved.

This could be extended to handle more factors by slicing up numbers differently. e.g. for a % 7, slice the number into three 30-bit pieces, since 2^30 mod 7 = 1.

That is what gcc does and what I was going to look at doing soon.
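
A minimal C++ sketch of the 30-bit slicing idea for a 64-bit a % 7 (hypothetical function name, not code from gcc or from this patch). A 64-bit value splits into two full 30-bit pieces and a 4-bit top piece, and 2^30 % 7 == 1 lets the pieces simply be added.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a % 7 by summing 30-bit pieces, since 2^30 % 7 == 1.
uint32_t uremBy7(uint64_t a) {
  const uint32_t Mask = (uint32_t{1} << 30) - 1;
  uint32_t p0 = static_cast<uint32_t>(a) & Mask;       // bits 0..29
  uint32_t p1 = static_cast<uint32_t>(a >> 30) & Mask; // bits 30..59
  uint32_t p2 = static_cast<uint32_t>(a >> 60);        // bits 60..63
  uint32_t sum = p0 + p1 + p2; // at most 2 * (2^30 - 1) + 15, so no overflow
  return sum % 7;
}
```

Because the pieces are narrower than the 32-bit word, no carry handling is needed here.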

This could possibly be used to implement division: e.g. a / 5 -> (a - (a % 5)) * inverse(5). Not sure if this is more efficient than other approaches; might depend on the target.

That's also what gcc does. This interacts poorly with DivRemPairs. If we have both a division and a remainder, DivRemPairs rewrites the remainder in terms of the division. If we're going to use the remainder to do the division, then DivRemPairs is going in the wrong direction.

My immediate next step was looking at extending this patch to UDIV using this method for the same constant divisors.

On the general topic of division by constants, see also discussion on https://github.com/llvm/llvm-project/issues/56153

I hadn't seen that bug, but I had talked to @ndesaulniers which is what motivated this.

a / 5, and a%10 etc.

Is it possible that the processor already does these tricks internally to improve the performance of these operations?

nickdesaulniers added a subscriber: nickdesaulniers. Aug 1 2022, 11:02 AM

Can we add an additional test to llvm/test/CodeGen/ARM/div.ll that urem with a small constant does not produce a libcall to __aeabi_uldivmod? The change to llvm/lib/Target/X86/X86ISelLowering.cpp makes me wonder how many backend-specific changes are necessary here. Also, not sure if this is something we only want to do for -O2 (vs -Os)?

In D130862#3691543, @nickdesaulniers wrote:

Can we add an additional test to llvm/test/CodeGen/ARM/div.ll that urem with a small constant does not produce a libcall to __aeabi_uldivmod?

Sure. I can do that.

The change to llvm/lib/Target/X86/X86ISelLowering.cpp makes me wonder how many backend-specific changes are necessary here.

Also, not sure if this is something we only want to do for -O2 (vs -Os)?

I put in the isIntDivCheap check to catch the targets that restricted div by constant expansion for -Oz. Though maybe I need to check that for HiLoVT too since we're kind of assuming HiLoVT UREM will be expanded by DAGCombiner.

In D130862#3691501, @hiraditya wrote:

a / 5, and a%10 etc.

Is it possible that the processor already does these tricks internally to improve the performance of these operations?

We're concerned here about cases where the numerator doesn't fit into one word. x86 has an instruction for division where the numerator is two words, but no other commonly used target does. And even on x86, it's slow.

In D130862#3691456, @craig.topper wrote:

In D130862#3691418, @efriedma wrote:

This could possibly be used to implement division: e.g. a / 5 -> (a - (a % 5)) * inverse(5). Not sure if this is more efficient than other approaches; might depend on the target.

That's also what gcc does. This interacts poorly with DivRemPairs. If we have both a division and a remainder, DivRemPairs rewrites the remainder in terms of the division. If we're going to use the remainder to do the division, then DivRemPairs is going in the wrong direction.

My immediate next step was looking at extending this patch to UDIV using this method for the same constant divisors.

It might be worth comparing this to algorithm 4 from https://gmplib.org/~tege/division-paper.pdf .

RKSimon added inline comments. Aug 2 2022, 5:27 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4474	Is it possible to support constant splat vectors as well?

craig.topper added inline comments. Aug 2 2022, 8:16 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4474	This function is only called for scalar integers.

Partial update. Still need to add ARM tests and actually commit the tests. Patch has been rebased as if they had been committed.

Harbormaster completed remote builds in B178956: Diff 449560. Aug 3 2022, 12:14 AM

craig.topper added inline comments. Aug 3 2022, 12:15 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7195	Rather than trying to create the nodes only when the below succeeds, I added an ultimate fall back for targets that don't use ZeroOrOneBooleanContent.

Add ARM test and add to ARM's LowerREM function.

Harbormaster completed remote builds in B179752: Diff 450589. Aug 6 2022, 10:23 PM

Add back the RISCV test I lost.

Harbormaster completed remote builds in B179763: Diff 450601. Aug 6 2022, 11:17 PM

craig.topper retitled this revision from "[LegalizeTypes] Improve splitting for urem by constant for some constants." to "[LegalizeTypes][WIP] Improve splitting for urem by constant for some constants." Aug 7 2022, 9:54 PM

Refactor to support UDIV and UDIVREM. The latter needed to support ARM.
Support constants that have trailing zeros such as 10 and 12 using the formula suggested by Eli.
Does not support breaking into 30-bit pieces.

Need to add more tests and clean up comments. I did some local testing of a few divisors.

Harbormaster completed remote builds in B179836: Diff 450701. Aug 7 2022, 10:44 PM

craig.topper retitled this revision from "[LegalizeTypes][WIP] Improve splitting for urem by constant for some constants." to "[LegalizeTypes][WIP] Improve splitting for urem/udiv by constant for some constants." Aug 7 2022, 11:49 PM

RKSimon added inline comments. Aug 8 2022, 6:23 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7169	assert((Opcode == ISD::UREM || Opcode == ISD::UDIV || Opcode == ISD::UDIVREM) && "unsigned division expected");

Do we want to maintain codegen'ing a libcall when optimizing for code size? If so, is there an existing test that demonstrates that, or do we need a new one?

Does 32b MIPS need custom target lowering? 32b ppc?

Support constants that have trailing zeros such as 10 and 12 using the formula suggested by Eli.

Mind adding a citation via comment in source or commit message?

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7278	Should this be `LL` or `DL`? (I dunno, just asking; curious why `DL` exists).
7292	Every caller is `BUILD_PAIR`'ing the results. Would it be a better interface to simply return an `SDValue` for the remainder and one for the quotient, rather than a pair of `SDValue`s that need to be `BUILD_PAIR`ed again in the caller (even though we've done so already in the callee)?
llvm/lib/Target/ARM/ARMISelLowering.cpp
20427	I find the Result param a little confusing. Sometimes it starts with the quotient, sometimes it starts with the remainder. Should these be two distinct parameters to `expandDIVREMByConstant`? ie. one `SmallVector<SDValue>` for Div, one for Rem? Especially since we're handling 3 different opcodes, possibly 6 in the future for signed types.

In D130862#3707290, @nickdesaulniers wrote:

Do we want to maintain codegen'ing a libcall when optimizing for code size? If so, is there an existing test that demonstrates that, or do we need a new one?

Does 32b MIPS need custom target lowering? 32b ppc?

I have not checked yet.

Support constants that have trailing zeros such as 10 and 12 using the formula suggested by Eli.

Mind adding a citation via comment in source or commit message?

Yes. I need to update comments. Not sure if we want the trailing zeros case in the base patch or in its own commit. I squashed everything together after I changed the base patch interface so much to support DIV and DIVREM.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7278	LL is correct. DL is the shifted version. We need the bits that were shifted off here so we need the original. I need to improve variable naming.
7292	The caller in LegalizeIntegerTypes does not BUILD_PAIR the results.
llvm/lib/Target/ARM/ARMISelLowering.cpp
20427	Returning two vectors complicates the code in `X86TargetLowering::LowerWin64_i128OP` which would need to check the opcode to know where to get the result since it handles both DIV and REM.
20489	grr why does clang-format keep over indenting this return?

craig.topper added a parent revision: D131442: [X86][RISCV] Pre-commit tests for D130862. NFC. Aug 8 2022, 2:28 PM

https://github.com/ClangBuiltLinux/linux/issues/1635 and https://github.com/ClangBuiltLinux/linux/issues/1679 are both fixed by this series. :)

Rebase on new tests

Harbormaster completed remote builds in B181141: Diff 452485. Aug 14 2022, 1:14 AM

craig.topper mentioned this in rGb8c5420d74cb: [X86][RISCV] Pre-commit tests for D130862. NFC. Aug 14 2022, 4:31 PM

craig.topper edited the summary of this revision. Aug 14 2022, 7:21 PM

craig.topper edited the summary of this revision. Aug 14 2022, 7:24 PM

Remove the even divisor support to simplify the patch and the explanation of the concept.
Will move to a separate patch.

craig.topper retitled this revision from "[LegalizeTypes][WIP] Improve splitting for urem/udiv by constant for some constants." to "[LegalizeTypes] Improve splitting for urem/udiv by constant for some constants." Aug 14 2022, 9:32 PM

Harbormaster completed remote builds in B181217: Diff 452587. Aug 14 2022, 10:31 PM

RKSimon added inline comments. Aug 15 2022, 2:17 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7183	const APInt &Divisor ?
llvm/test/CodeGen/X86/divmod128.ll
918	pre-commit this? what about optsize - should that expand do you think?

craig.topper mentioned this in rG0d1d36cfa614: [X86] Pre-commit tests for D130862. NFC. Sep 4 2022, 9:26 PM

-Pre-commit tests.
-Make APInt const. It will need to be changed back to non-const when we support even divisors.
-Disable for optsize instead of minsize. I think that matches gcc.

Harbormaster completed remote builds in B185037: Diff 457887. Sep 4 2022, 10:43 PM

LGTM with a couple of minors (although I'm not very familiar with the algorithm) - any other comments?

llvm/include/llvm/CodeGen/TargetLowering.h
4730	InL / InH -> LL / LH ?
llvm/test/CodeGen/X86/divide-by-constant.ll
939	why did you remove this?

craig.topper added inline comments. Sep 9 2022, 9:15 AM

llvm/test/CodeGen/X86/divide-by-constant.ll
939	I didn't mean to. Not sure what happened there.

Address review comments

Harbormaster completed remote builds in B185897: Diff 459137. Sep 9 2022, 12:02 PM

LGTM cheers

This revision is now accepted and ready to land. Sep 12 2022, 5:14 AM

In D130862#3691501, @hiraditya wrote:

a / 5, and a%10 etc.

Is it possible that the processor already does these tricks internally to improve the performance of these operations?

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7237–7242	Can't you just unconditionally use getZExtOrTrunc? the booleancontents would just apply the optimization fold later

craig.topper added inline comments. Sep 12 2022, 9:22 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7237–7242	I wrote it like this because that's what was done for ISD::SUB at the end of ExpandIntRes_ADDSUB. ISD::ADD in that function seemed more complicated than it needs to be.

This revision was landed with ongoing or failed builds. Sep 12 2022, 10:42 AM

Closed by commit rG38ffa2bb9637: [LegalizeTypes] Improve splitting for urem/udiv by constant for some constants. (authored by craig.topper).

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rG38ffa2bb9637: [LegalizeTypes] Improve splitting for urem/udiv by constant for some constants.

keith.walker.arm mentioned this in D135875: [ARM] Add additional targets to divide tests. Oct 14 2022, 1:40 AM

Revision Contents

Path                                                      Size
llvm/include/llvm/CodeGen/TargetLowering.h                20 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    32 lines
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp          146 lines
llvm/lib/Target/ARM/ARMISelLowering.cpp                   26 lines
llvm/lib/Target/X86/X86ISelLowering.cpp                   6 lines
llvm/test/CodeGen/ARM/div.ll                              30 lines
llvm/test/CodeGen/RISCV/div-by-constant.ll                27 lines
llvm/test/CodeGen/RISCV/div.ll                            27 lines
llvm/test/CodeGen/RISCV/split-udiv-by-constant.ll         463 lines
llvm/test/CodeGen/RISCV/split-urem-by-constant.ll         296 lines
llvm/test/CodeGen/VE/Scalar/rem.ll                        4 lines
llvm/test/CodeGen/X86/divide-by-constant.ll               428 lines
llvm/test/CodeGen/X86/divmod128.ll                        856 lines

Diff 459515

llvm/include/llvm/CodeGen/TargetLowering.h

[...]
/// \param RL Low bits of the RHS of the MUL. See LL for meaning
/// \param RH High bits of the RHS of the MUL. See LL for meaning.
/// \returns true if the node has been expanded. false if it has not
bool expandMUL(SDNode *N, SDValue &Lo, SDValue &Hi, EVT HiLoVT,
SelectionDAG &DAG, MulExpansionKind Kind,
SDValue LL = SDValue(), SDValue LH = SDValue(),
SDValue RL = SDValue(), SDValue RH = SDValue()) const;

				/// Attempt to expand an n-bit div/rem/divrem by constant using a n/2-bit
				/// urem by constant and other arithmetic ops. The n/2-bit urem by constant
				/// will be expanded by DAGCombiner. This is not possible for all constant
				/// divisors.
				/// \param N Node to expand
				/// \param Result A vector that will be filled with the lo and high parts of
				/// the results. For *DIVREM, this will be the quotient parts followed
				/// by the remainder parts.
				/// \param HiLoVT The value type to use for the Lo and Hi parts. Should be
				/// half of VT.
				/// \param LL Low bits of the LHS of the operation. You can use this
				/// parameter if you want to control how low bits are extracted from
				/// the LHS.
				/// \param LH High bits of the LHS of the operation. See LL for meaning.
				/// \returns true if the node has been expanded, false if it has not.
				bool expandDIVREMByConstant(SDNode *N, SmallVectorImpl<SDValue> &Result,
				EVT HiLoVT, SelectionDAG &DAG,
				SDValue LL = SDValue(),
				SDValue LH = SDValue()) const;

/// Expand funnel shift.
/// \param N Node to expand
/// \returns The expansion if successful, SDValue() otherwise
SDValue expandFunnelShift(SDNode *N, SelectionDAG &DAG) const;

/// Expand rotations.
/// \param N Node to expand
/// \param AllowVectorOps expand vector rotate, this should only be performed
[...]

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

void DAGTypeLegalizer::ExpandIntRes_UDIV(SDNode *N,
[...]
SDValue Ops[2] = { N->getOperand(0), N->getOperand(1) };

if (TLI.getOperationAction(ISD::UDIVREM, VT) == TargetLowering::Custom) {
SDValue Res = DAG.getNode(ISD::UDIVREM, dl, DAG.getVTList(VT, VT), Ops);
SplitInteger(Res.getValue(0), Lo, Hi);
return;
}

		// Try to expand UDIV by constant.
		if (isa<ConstantSDNode>(N->getOperand(1))) {
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
		// Only if the new type is legal.
		if (isTypeLegal(NVT)) {
		SDValue InL, InH;
		GetExpandedInteger(N->getOperand(0), InL, InH);
		SmallVector<SDValue> Result;
		if (TLI.expandDIVREMByConstant(N, Result, NVT, DAG, InL, InH)) {
		Lo = Result[0];
		Hi = Result[1];
		return;
		}
		}
		}

RTLIB::Libcall LC = RTLIB::UNKNOWN_LIBCALL;
if (VT == MVT::i16)
LC = RTLIB::UDIV_I16;
else if (VT == MVT::i32)
LC = RTLIB::UDIV_I32;
else if (VT == MVT::i64)
LC = RTLIB::UDIV_I64;
else if (VT == MVT::i128)
[...]
void DAGTypeLegalizer::ExpandIntRes_UREM(SDNode *N,
[...]
SDValue Ops[2] = { N->getOperand(0), N->getOperand(1) };

if (TLI.getOperationAction(ISD::UDIVREM, VT) == TargetLowering::Custom) {
SDValue Res = DAG.getNode(ISD::UDIVREM, dl, DAG.getVTList(VT, VT), Ops);
SplitInteger(Res.getValue(1), Lo, Hi);
return;
}

		// Try to expand UREM by constant.
		if (isa<ConstantSDNode>(N->getOperand(1))) {
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
		// Only if the new type is legal.
		if (isTypeLegal(NVT)) {
		SDValue InL, InH;
		GetExpandedInteger(N->getOperand(0), InL, InH);
		SmallVector<SDValue> Result;
		if (TLI.expandDIVREMByConstant(N, Result, NVT, DAG, InL, InH)) {
		Lo = Result[0];
		Hi = Result[1];
		return;
		}
		}
		}

RTLIB::Libcall LC = RTLIB::UNKNOWN_LIBCALL;
if (VT == MVT::i16)
LC = RTLIB::UREM_I16;
else if (VT == MVT::i32)
LC = RTLIB::UREM_I32;
else if (VT == MVT::i64)
LC = RTLIB::UREM_I64;
else if (VT == MVT::i128)
[...]

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

bool TargetLowering::expandMUL(SDNode *N, SDValue &Lo, SDValue &Hi, EVT HiLoVT,
[...]
if (Ok) {
assert(Result.size() == 2);
Lo = Result[0];
Hi = Result[1];
}
return Ok;
}

		// Optimize unsigned division or remainder by constants for types twice as large
		// as a legal VT.
		//
		// If (1 << (BitWidth / 2)) % Constant == 1, then the remainder
		// can be computed
		// as:
		// Sum += __builtin_uadd_overflow(Lo, High, &Sum);
		// Remainder = Sum % Constant
		// This is based on "Remainder by Summing Digits" from Hacker's Delight.
		//
		// For division, we can compute the remainder, subtract it from the dividend,
		// and then multiply by the multiplicative inverse modulo (1 << BitWidth).
		bool TargetLowering::expandDIVREMByConstant(SDNode *N,
		SmallVectorImpl<SDValue> &Result,
		EVT HiLoVT, SelectionDAG &DAG,
		SDValue LL, SDValue LH) const {
		unsigned Opcode = N->getOpcode();
		EVT VT = N->getValueType(0);

		// TODO: Support signed division/remainder.
		if (Opcode == ISD::SREM || Opcode == ISD::SDIV || Opcode == ISD::SDIVREM)
		return false;
		assert(
		(Opcode == ISD::UREM || Opcode == ISD::UDIV || Opcode == ISD::UDIVREM) &&
		"Unexpected opcode");

		auto *CN = dyn_cast<ConstantSDNode>(N->getOperand(1));
		if (!CN)
		return false;

		const APInt &Divisor = CN->getAPIntValue();
		unsigned BitWidth = Divisor.getBitWidth();
		unsigned HBitWidth = BitWidth / 2;
		assert(VT.getScalarSizeInBits() == BitWidth &&
		HiLoVT.getScalarSizeInBits() == HBitWidth && "Unexpected VTs");

		// Divisor needs to be less than (1 << HBitWidth).
		APInt HalfMaxPlus1 = APInt::getOneBitSet(BitWidth, HBitWidth);
		if (Divisor.uge(HalfMaxPlus1))
		return false;

		// We depend on the UREM by constant optimization in DAGCombiner that requires
		// high multiply.
		if (!isOperationLegalOrCustom(ISD::MULHU, HiLoVT) &&
		!isOperationLegalOrCustom(ISD::UMUL_LOHI, HiLoVT))
		return false;

		// Don't expand if optimizing for size.
		if (DAG.shouldOptForSize())
		return false;

		// Early out for 0, 1 or even divisors.
		if (Divisor.ule(1) || Divisor[0] == 0)
		return false;

		SDLoc dl(N);
		SDValue Sum;

		// If (1 << HBitWidth) % divisor == 1, we can add the two halves together and
		// then add in the carry.
		// TODO: If we can't split it in half, we might be able to split into 3 or
		// more pieces using a smaller bit width.
		if (HalfMaxPlus1.urem(Divisor).isOneValue()) {
		assert(!LL == !LH && "Expected both input halves or no input halves!");
		if (!LL) {
		LL = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, HiLoVT, N->getOperand(0),
		DAG.getIntPtrConstant(0, dl));
		LH = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, HiLoVT, N->getOperand(0),
		DAG.getIntPtrConstant(1, dl));
		}

		// Use addcarry if we can, otherwise use a compare to detect overflow.
		EVT SetCCType =
		getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), HiLoVT);
		if (isOperationLegalOrCustom(ISD::ADDCARRY, HiLoVT)) {
		SDVTList VTList = DAG.getVTList(HiLoVT, SetCCType);
		Sum = DAG.getNode(ISD::UADDO, dl, VTList, LL, LH);
		Sum = DAG.getNode(ISD::ADDCARRY, dl, VTList, Sum,
		DAG.getConstant(0, dl, HiLoVT), Sum.getValue(1));
		} else {
		Sum = DAG.getNode(ISD::ADD, dl, HiLoVT, LL, LH);
		SDValue Carry = DAG.getSetCC(dl, SetCCType, Sum, LL, ISD::SETULT);
		// If the boolean for the target is 0 or 1, we can add the setcc result
		// directly.
		if (getBooleanContents(HiLoVT) ==
		TargetLoweringBase::ZeroOrOneBooleanContent)
		Carry = DAG.getZExtOrTrunc(Carry, dl, HiLoVT);
		else
		Carry = DAG.getSelect(dl, HiLoVT, Carry, DAG.getConstant(1, dl, HiLoVT),
		DAG.getConstant(0, dl, HiLoVT));
		Sum = DAG.getNode(ISD::ADD, dl, HiLoVT, Sum, Carry);
		}
		}

		// If we didn't find a sum, we can't do the expansion.
		if (!Sum)
		return false;

		// Perform a HiLoVT urem on the Sum using truncated divisor.
		SDValue RemL =
		DAG.getNode(ISD::UREM, dl, HiLoVT, Sum,
		DAG.getConstant(Divisor.trunc(HBitWidth), dl, HiLoVT));
		// High half of the remainder is 0.
		SDValue RemH = DAG.getConstant(0, dl, HiLoVT);

		// If we only want remainder, we're done.
		if (Opcode == ISD::UREM) {
		Result.push_back(RemL);
		Result.push_back(RemH);
		return true;
		}

		// Otherwise, we need to compute the quotient.

		// Join the remainder halves.
		SDValue Rem = DAG.getNode(ISD::BUILD_PAIR, dl, VT, RemL, RemH);

		// Subtract the remainder from the input.
		SDValue In = DAG.getNode(ISD::SUB, dl, VT, N->getOperand(0), Rem);

		// Multiply by the multiplicative inverse of the divisor modulo
		// (1 << BitWidth).
		APInt Mod = APInt::getSignedMinValue(BitWidth + 1);
		APInt MulFactor = Divisor.zext(BitWidth + 1);
		MulFactor = MulFactor.multiplicativeInverse(Mod);
		MulFactor = MulFactor.trunc(BitWidth);

		SDValue Quotient =
		DAG.getNode(ISD::MUL, dl, VT, In, DAG.getConstant(MulFactor, dl, VT));

		// Split the quotient into low and high parts.
		SDValue QuotL = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, HiLoVT, Quotient,
		DAG.getIntPtrConstant(0, dl));
		SDValue QuotH = DAG.getNode(ISD::EXTRACT_ELEMENT, dl, HiLoVT, Quotient,
		DAG.getIntPtrConstant(1, dl));
		Result.push_back(QuotL);
		Result.push_back(QuotH);
		// For DIVREM, also return the remainder parts.
		if (Opcode == ISD::UDIVREM) {
		Result.push_back(RemL);
		Result.push_back(RemH);
		}

		return true;
		}

// Check that (every element of) Z is undef or not an exact multiple of BW.		// Check that (every element of) Z is undef or not an exact multiple of BW.
static bool isNonZeroModBitWidthOrUndef(SDValue Z, unsigned BW) {		static bool isNonZeroModBitWidthOrUndef(SDValue Z, unsigned BW) {
return ISD::matchUnaryPredicate(		return ISD::matchUnaryPredicate(
Z,		Z,
[=](ConstantSDNode *C) { return !C || C->getAPIntValue().urem(BW) != 0; },		[=](ConstantSDNode *C) { return !C || C->getAPIntValue().urem(BW) != 0; },
true);		true);
}		}

llvm/lib/Target/ARM/ARMISelLowering.cpp

assert((Subtarget->isTargetAEABI() || Subtarget->isTargetAndroid() ||		assert((Subtarget->isTargetAEABI() || Subtarget->isTargetAndroid() ||
Subtarget->isTargetGNUAEABI() || Subtarget->isTargetMuslAEABI() ||		Subtarget->isTargetGNUAEABI() || Subtarget->isTargetMuslAEABI() ||
Subtarget->isTargetWindows()) &&		Subtarget->isTargetWindows()) &&
"Register-based DivRem lowering only");		"Register-based DivRem lowering only");
unsigned Opcode = Op->getOpcode();		unsigned Opcode = Op->getOpcode();
assert((Opcode == ISD::SDIVREM || Opcode == ISD::UDIVREM) &&		assert((Opcode == ISD::SDIVREM || Opcode == ISD::UDIVREM) &&
"Invalid opcode for Div/Rem lowering");		"Invalid opcode for Div/Rem lowering");
bool isSigned = (Opcode == ISD::SDIVREM);		bool isSigned = (Opcode == ISD::SDIVREM);
EVT VT = Op->getValueType(0);		EVT VT = Op->getValueType(0);
Type *Ty = VT.getTypeForEVT(*DAG.getContext());
SDLoc dl(Op);		SDLoc dl(Op);

		if (VT == MVT::i64 && isa<ConstantSDNode>(Op.getOperand(1))) {
		SmallVector<SDValue> Result;
		if (expandDIVREMByConstant(Op.getNode(), Result, MVT::i32, DAG)) {
		SDValue Res0 =
		DAG.getNode(ISD::BUILD_PAIR, dl, VT, Result[0], Result[1]);
		SDValue Res1 =
		nickdesaulniers: I find the `Result` param a little confusing. Sometimes it starts with the quotient, sometimes it starts with the remainder. Should these be two distinct parameters to `expandDIVREMByConstant`, i.e. one `SmallVector<SDValue>` for Div and one for Rem? Especially since we're handling 3 different opcodes, possibly 6 in the future for signed types.
		craig.topper (author): Returning two vectors complicates the code in `X86TargetLowering::LowerWin64_i128OP`, which would need to check the opcode to know where to get the result, since it handles both DIV and REM.
		DAG.getNode(ISD::BUILD_PAIR, dl, VT, Result[2], Result[3]);
		return DAG.getNode(ISD::MERGE_VALUES, dl, Op->getVTList(),
		{Res0, Res1});
		}
		}

		Type *Ty = VT.getTypeForEVT(*DAG.getContext());

// If the target has hardware divide, use divide + multiply + subtract:		// If the target has hardware divide, use divide + multiply + subtract:
// div = a / b		// div = a / b
// rem = a - b * div		// rem = a - b * div
// return {div, rem}		// return {div, rem}
// This should be lowered into UDIV/SDIV + MLS later on.		// This should be lowered into UDIV/SDIV + MLS later on.
bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivideInThumbMode()		bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivideInThumbMode()
: Subtarget->hasDivideInARMMode();		: Subtarget->hasDivideInARMMode();
if (hasDivide && Op->getValueType(0).isSimple() &&		if (hasDivide && Op->getValueType(0).isSimple() &&
Show All 32 Lines	SDValue ARMTargetLowering::LowerDivRem(SDValue Op, SelectionDAG &DAG) const {

std::pair<SDValue, SDValue> CallInfo = LowerCallTo(CLI);		std::pair<SDValue, SDValue> CallInfo = LowerCallTo(CLI);
return CallInfo.first;		return CallInfo.first;
}		}

// Lowers REM using divmod helpers		// Lowers REM using divmod helpers
// see RTABI section 4.2/4.3		// see RTABI section 4.2/4.3
SDValue ARMTargetLowering::LowerREM(SDNode *N, SelectionDAG &DAG) const {		SDValue ARMTargetLowering::LowerREM(SDNode *N, SelectionDAG &DAG) const {
		EVT VT = N->getValueType(0);

		if (VT == MVT::i64 && isa<ConstantSDNode>(N->getOperand(1))) {
		SmallVector<SDValue> Result;
		if (expandDIVREMByConstant(N, Result, MVT::i32, DAG))
		return DAG.getNode(ISD::BUILD_PAIR, SDLoc(N), N->getValueType(0),
		craig.topper (author): Grr, why does clang-format keep over-indenting this return?
		Result[0], Result[1]);
		}

// Build return types (div and rem)		// Build return types (div and rem)
std::vector<Type*> RetTyParams;		std::vector<Type*> RetTyParams;
Type *RetTyElement;		Type *RetTyElement;

switch (N->getValueType(0).getSimpleVT().SimpleTy) {		switch (VT.getSimpleVT().SimpleTy) {
default: llvm_unreachable("Unexpected request for libcall!");		default: llvm_unreachable("Unexpected request for libcall!");
case MVT::i8: RetTyElement = Type::getInt8Ty(*DAG.getContext()); break;		case MVT::i8: RetTyElement = Type::getInt8Ty(*DAG.getContext()); break;
case MVT::i16: RetTyElement = Type::getInt16Ty(*DAG.getContext()); break;		case MVT::i16: RetTyElement = Type::getInt16Ty(*DAG.getContext()); break;
case MVT::i32: RetTyElement = Type::getInt32Ty(*DAG.getContext()); break;		case MVT::i32: RetTyElement = Type::getInt32Ty(*DAG.getContext()); break;
case MVT::i64: RetTyElement = Type::getInt64Ty(*DAG.getContext()); break;		case MVT::i64: RetTyElement = Type::getInt64Ty(*DAG.getContext()); break;
}		}

RetTyParams.push_back(RetTyElement);		RetTyParams.push_back(RetTyElement);

llvm/lib/Target/X86/X86ISelLowering.cpp

	}			}

	SDValue X86TargetLowering::LowerWin64_i128OP(SDValue Op, SelectionDAG &DAG) const {			SDValue X86TargetLowering::LowerWin64_i128OP(SDValue Op, SelectionDAG &DAG) const {
	assert(Subtarget.isTargetWin64() && "Unexpected target");			assert(Subtarget.isTargetWin64() && "Unexpected target");
	EVT VT = Op.getValueType();			EVT VT = Op.getValueType();
	assert(VT.isInteger() && VT.getSizeInBits() == 128 &&			assert(VT.isInteger() && VT.getSizeInBits() == 128 &&
	"Unexpected return type for lowering");			"Unexpected return type for lowering");

				if (isa<ConstantSDNode>(Op->getOperand(1))) {
				SmallVector<SDValue> Result;
				if (expandDIVREMByConstant(Op.getNode(), Result, MVT::i64, DAG))
				return DAG.getNode(ISD::BUILD_PAIR, SDLoc(Op), VT, Result[0], Result[1]);
				}

	RTLIB::Libcall LC;			RTLIB::Libcall LC;
	bool isSigned;			bool isSigned;
	switch (Op->getOpcode()) {			switch (Op->getOpcode()) {
	default: llvm_unreachable("Unexpected request for libcall!");			default: llvm_unreachable("Unexpected request for libcall!");
	case ISD::SDIV: isSigned = true; LC = RTLIB::SDIV_I128; break;			case ISD::SDIV: isSigned = true; LC = RTLIB::SDIV_I128; break;
	case ISD::UDIV: isSigned = false; LC = RTLIB::UDIV_I128; break;			case ISD::UDIV: isSigned = false; LC = RTLIB::UDIV_I128; break;
	case ISD::SREM: isSigned = true; LC = RTLIB::SREM_I128; break;			case ISD::SREM: isSigned = true; LC = RTLIB::SREM_I128; break;
	case ISD::UREM: isSigned = false; LC = RTLIB::UREM_I128; break;			case ISD::UREM: isSigned = false; LC = RTLIB::UREM_I128; break;

llvm/test/CodeGen/ARM/div.ll

	; EABI MODE = Remainder in R2-R3, quotient in R0-R1			; EABI MODE = Remainder in R2-R3, quotient in R0-R1
	; CHECK-EABI: __aeabi_uldivmod			; CHECK-EABI: __aeabi_uldivmod
	; CHECK-EABI-NEXT: mov r0, r2			; CHECK-EABI-NEXT: mov r0, r2
	; CHECK-EABI-NEXT: mov r1, r3			; CHECK-EABI-NEXT: mov r1, r3
	%tmp1 = urem i64 %a, %b ; <i64> [#uses=1]			%tmp1 = urem i64 %a, %b ; <i64> [#uses=1]
	ret i64 %tmp1			ret i64 %tmp1
	}			}

				; Make sure we avoid a libcall for some constants.
				define i64 @f7(i64 %a) {
				; CHECK-SWDIV-LABEL: f7
				; CHECK-SWDIV: adc
				; CHECK-SWDIV: umull
				; CHECK-HWDIV-LABEL: f7
				; CHECK-HWDIV: adc
				; CHECK-HWDIV: umull
				; CHECK-EABI-LABEL: f7
				; CHECK-EABI: adc
				; CHECK-EABI: umull
				%tmp1 = urem i64 %a, 3
				ret i64 %tmp1
				}

				; Make sure we avoid a libcall for some constants.
				define i64 @f8(i64 %a) {
				; CHECK-SWDIV-LABEL: f8
				; CHECK-SWDIV: adc
				; CHECK-SWDIV: umull
				; CHECK-HWDIV-LABEL: f8
				; CHECK-HWDIV: adc
				; CHECK-HWDIV: umull
				; CHECK-EABI-LABEL: f8
				; CHECK-EABI: adc
				; CHECK-EABI: umull
				%tmp1 = udiv i64 %a, 3
				ret i64 %tmp1
				}

llvm/test/CodeGen/RISCV/div-by-constant.ll

	; RV64-NEXT: ret			; RV64-NEXT: ret
	%1 = udiv i32 %a, 7			%1 = udiv i32 %a, 7
	ret i32 %1			ret i32 %1
	}			}

	define i64 @udiv64_constant_no_add(i64 %a) nounwind {			define i64 @udiv64_constant_no_add(i64 %a) nounwind {
	; RV32-LABEL: udiv64_constant_no_add:			; RV32-LABEL: udiv64_constant_no_add:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 5			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 838861
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a4, a3, -819
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a5, a2, a4
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a6, a5, 2
				; RV32-NEXT: andi a5, a5, -4
				; RV32-NEXT: add a5, a5, a6
				; RV32-NEXT: sub a2, a2, a5
				; RV32-NEXT: sub a5, a0, a2
				; RV32-NEXT: addi a3, a3, -820
				; RV32-NEXT: mul a3, a5, a3
				; RV32-NEXT: mulhu a6, a5, a4
				; RV32-NEXT: add a3, a6, a3
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a3, a0
				; RV32-NEXT: mul a0, a5, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: udiv64_constant_no_add:			; RV64-LABEL: udiv64_constant_no_add:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: lui a1, %hi(.LCPI2_0)			; RV64-NEXT: lui a1, %hi(.LCPI2_0)
	; RV64-NEXT: ld a1, %lo(.LCPI2_0)(a1)			; RV64-NEXT: ld a1, %lo(.LCPI2_0)(a1)
	; RV64-NEXT: mulhu a0, a0, a1			; RV64-NEXT: mulhu a0, a0, a1
	; RV64-NEXT: srli a0, a0, 2			; RV64-NEXT: srli a0, a0, 2

llvm/test/CodeGen/RISCV/div.ll

	; RV32I-NEXT: li a3, 0			; RV32I-NEXT: li a3, 0
	; RV32I-NEXT: call __udivdi3@plt			; RV32I-NEXT: call __udivdi3@plt
	; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	;			;
	; RV32IM-LABEL: udiv64_constant:			; RV32IM-LABEL: udiv64_constant:
	; RV32IM: # %bb.0:			; RV32IM: # %bb.0:
	; RV32IM-NEXT: addi sp, sp, -16			; RV32IM-NEXT: add a2, a0, a1
	; RV32IM-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32IM-NEXT: sltu a3, a2, a0
	; RV32IM-NEXT: li a2, 5			; RV32IM-NEXT: add a2, a2, a3
	; RV32IM-NEXT: li a3, 0			; RV32IM-NEXT: lui a3, 838861
	; RV32IM-NEXT: call __udivdi3@plt			; RV32IM-NEXT: addi a4, a3, -819
	; RV32IM-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32IM-NEXT: mulhu a5, a2, a4
	; RV32IM-NEXT: addi sp, sp, 16			; RV32IM-NEXT: srli a6, a5, 2
				; RV32IM-NEXT: andi a5, a5, -4
				; RV32IM-NEXT: add a5, a5, a6
				; RV32IM-NEXT: sub a2, a2, a5
				; RV32IM-NEXT: sub a5, a0, a2
				; RV32IM-NEXT: addi a3, a3, -820
				; RV32IM-NEXT: mul a3, a5, a3
				; RV32IM-NEXT: mulhu a6, a5, a4
				; RV32IM-NEXT: add a3, a6, a3
				; RV32IM-NEXT: sltu a0, a0, a2
				; RV32IM-NEXT: sub a0, a1, a0
				; RV32IM-NEXT: mul a0, a0, a4
				; RV32IM-NEXT: add a1, a3, a0
				; RV32IM-NEXT: mul a0, a5, a4
	; RV32IM-NEXT: ret			; RV32IM-NEXT: ret
	;			;
	; RV64I-LABEL: udiv64_constant:			; RV64I-LABEL: udiv64_constant:
	; RV64I: # %bb.0:			; RV64I: # %bb.0:
	; RV64I-NEXT: li a1, 5			; RV64I-NEXT: li a1, 5
	; RV64I-NEXT: tail __udivdi3@plt			; RV64I-NEXT: tail __udivdi3@plt
	;			;
	; RV64IM-LABEL: udiv64_constant:			; RV64IM-LABEL: udiv64_constant:

llvm/test/CodeGen/RISCV/split-udiv-by-constant.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: sed 's/iXLen2/i64/g' %s | llc -mtriple=riscv32 -mattr=+m | \			; RUN: sed 's/iXLen2/i64/g' %s | llc -mtriple=riscv32 -mattr=+m | \
	; RUN: FileCheck %s --check-prefix=RV32			; RUN: FileCheck %s --check-prefix=RV32
	; RUN: sed 's/iXLen2/i128/g' %s | llc -mtriple=riscv64 -mattr=+m | \			; RUN: sed 's/iXLen2/i128/g' %s | llc -mtriple=riscv64 -mattr=+m | \
	; RUN: FileCheck %s --check-prefix=RV64			; RUN: FileCheck %s --check-prefix=RV64

	define iXLen2 @test_udiv_3(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_3(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_3:			; RV32-LABEL: test_udiv_3:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 3			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 699051
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a4, a3, -1365
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a5, a2, a4
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a6, a5, 1
				; RV32-NEXT: andi a5, a5, -2
				; RV32-NEXT: add a5, a5, a6
				; RV32-NEXT: sub a2, a2, a5
				; RV32-NEXT: sub a5, a0, a2
				; RV32-NEXT: addi a3, a3, -1366
				; RV32-NEXT: mul a3, a5, a3
				; RV32-NEXT: mulhu a6, a5, a4
				; RV32-NEXT: add a3, a6, a3
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a3, a0
				; RV32-NEXT: mul a0, a5, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_3:			; RV64-LABEL: test_udiv_3:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI0_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI0_0)(a2)
	; RV64-NEXT: li a2, 3			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a4, a3, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a5, a4, 1
				; RV64-NEXT: andi a4, a4, -2
				; RV64-NEXT: lui a6, %hi(.LCPI0_1)
				; RV64-NEXT: ld a6, %lo(.LCPI0_1)(a6)
				; RV64-NEXT: add a4, a4, a5
				; RV64-NEXT: sub a3, a3, a4
				; RV64-NEXT: sub a4, a0, a3
				; RV64-NEXT: mul a5, a4, a6
				; RV64-NEXT: mulhu a6, a4, a2
				; RV64-NEXT: add a5, a6, a5
				; RV64-NEXT: sltu a0, a0, a3
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a2
				; RV64-NEXT: add a1, a5, a0
				; RV64-NEXT: mul a0, a4, a2
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 3			%a = udiv iXLen2 %x, 3
	ret iXLen2 %a			ret iXLen2 %a
	}			}
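
The new RV32 check lines for `test_udiv_3` build their constants with `lui`/`addi` pairs, and it may help to see what those constants are. The sketch below decodes them and checks that, taken together as a 64-bit value, they form the multiplicative inverse of 3 modulo 2^64. The helper names `luiAddi` and `inverseFromCheckLines` are hypothetical and exist only for this illustration; the register comments reflect my reading of the check lines rather than anything stated in the review.

```cpp
#include <cassert>
#include <cstdint>

// Value produced on RV32 by `lui rd, Hi20` followed by `addi rd, rs, Lo12`.
uint32_t luiAddi(uint32_t Hi20, int32_t Lo12) {
  assert(Hi20 < (1u << 20) && Lo12 >= -2048 && Lo12 <= 2047);
  return (Hi20 << 12) + static_cast<uint32_t>(Lo12);
}

// Reassemble the 64-bit multiplier from the two 32-bit constants
// that appear in the RV32 output for test_udiv_3.
uint64_t inverseFromCheckLines() {
  uint32_t Lo = luiAddi(699051, -1365);  // lui a3, 699051 ; addi a4, a3, -1365
  uint32_t Hi = luiAddi(699051, -1366);  // addi a3, a3, -1366
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}
```

The low half, 0xAAAAAAAB, is also the usual 32-bit magic multiplier for an unsigned divide by 3 (with a shift of 1), which appears to be why the same register feeds both the `mulhu` for the half-width remainder and the final multiplies.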

	define iXLen2 @test_udiv_5(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_5(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_5:			; RV32-LABEL: test_udiv_5:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 5			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 838861
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a4, a3, -819
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a5, a2, a4
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a6, a5, 2
				; RV32-NEXT: andi a5, a5, -4
				; RV32-NEXT: add a5, a5, a6
				; RV32-NEXT: sub a2, a2, a5
				; RV32-NEXT: sub a5, a0, a2
				; RV32-NEXT: addi a3, a3, -820
				; RV32-NEXT: mul a3, a5, a3
				; RV32-NEXT: mulhu a6, a5, a4
				; RV32-NEXT: add a3, a6, a3
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a3, a0
				; RV32-NEXT: mul a0, a5, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_5:			; RV64-LABEL: test_udiv_5:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI1_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI1_0)(a2)
	; RV64-NEXT: li a2, 5			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a4, a3, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a5, a4, 2
				; RV64-NEXT: andi a4, a4, -4
				; RV64-NEXT: lui a6, %hi(.LCPI1_1)
				; RV64-NEXT: ld a6, %lo(.LCPI1_1)(a6)
				; RV64-NEXT: add a4, a4, a5
				; RV64-NEXT: sub a3, a3, a4
				; RV64-NEXT: sub a4, a0, a3
				; RV64-NEXT: mul a5, a4, a6
				; RV64-NEXT: mulhu a6, a4, a2
				; RV64-NEXT: add a5, a6, a5
				; RV64-NEXT: sltu a0, a0, a3
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a2
				; RV64-NEXT: add a1, a5, a0
				; RV64-NEXT: mul a0, a4, a2
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 5			%a = udiv iXLen2 %x, 5
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_7(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_7(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_7:			; RV32-LABEL: test_udiv_7:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 9			%a = udiv iXLen2 %x, 9
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_15(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_15(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_15:			; RV32-LABEL: test_udiv_15:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 15			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 559241
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a3, a3, -1911
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a3, a2, a3
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a3, a3, 3
				; RV32-NEXT: slli a4, a3, 4
				; RV32-NEXT: sub a3, a3, a4
				; RV32-NEXT: add a2, a2, a3
				; RV32-NEXT: sub a3, a0, a2
				; RV32-NEXT: lui a4, 978671
				; RV32-NEXT: addi a5, a4, -274
				; RV32-NEXT: mul a5, a3, a5
				; RV32-NEXT: addi a4, a4, -273
				; RV32-NEXT: mulhu a6, a3, a4
				; RV32-NEXT: add a5, a6, a5
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a5, a0
				; RV32-NEXT: mul a0, a3, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_15:			; RV64-LABEL: test_udiv_15:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI4_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI4_0)(a2)
	; RV64-NEXT: li a2, 15			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a2, a3, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a2, a2, 3
				; RV64-NEXT: slli a4, a2, 4
				; RV64-NEXT: sub a2, a2, a4
				; RV64-NEXT: lui a4, %hi(.LCPI4_1)
				; RV64-NEXT: ld a4, %lo(.LCPI4_1)(a4)
				; RV64-NEXT: lui a5, %hi(.LCPI4_2)
				; RV64-NEXT: ld a5, %lo(.LCPI4_2)(a5)
				; RV64-NEXT: add a2, a3, a2
				; RV64-NEXT: sub a3, a0, a2
				; RV64-NEXT: mul a4, a3, a4
				; RV64-NEXT: mulhu a6, a3, a5
				; RV64-NEXT: add a4, a6, a4
				; RV64-NEXT: sltu a0, a0, a2
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a5
				; RV64-NEXT: add a1, a4, a0
				; RV64-NEXT: mul a0, a3, a5
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 15			%a = udiv iXLen2 %x, 15
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_17(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_17(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_17:			; RV32-LABEL: test_udiv_17:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 17			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 986895
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a4, a3, 241
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a5, a2, a4
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a6, a5, 4
				; RV32-NEXT: andi a5, a5, -16
				; RV32-NEXT: add a5, a5, a6
				; RV32-NEXT: sub a2, a2, a5
				; RV32-NEXT: sub a5, a0, a2
				; RV32-NEXT: addi a3, a3, 240
				; RV32-NEXT: mul a3, a5, a3
				; RV32-NEXT: mulhu a6, a5, a4
				; RV32-NEXT: add a3, a6, a3
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a3, a0
				; RV32-NEXT: mul a0, a5, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_17:			; RV64-LABEL: test_udiv_17:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI5_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI5_0)(a2)
	; RV64-NEXT: li a2, 17			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a4, a3, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a5, a4, 4
				; RV64-NEXT: andi a4, a4, -16
				; RV64-NEXT: lui a6, %hi(.LCPI5_1)
				; RV64-NEXT: ld a6, %lo(.LCPI5_1)(a6)
				; RV64-NEXT: add a4, a4, a5
				; RV64-NEXT: sub a3, a3, a4
				; RV64-NEXT: sub a4, a0, a3
				; RV64-NEXT: mul a5, a4, a6
				; RV64-NEXT: mulhu a6, a4, a2
				; RV64-NEXT: add a5, a6, a5
				; RV64-NEXT: sltu a0, a0, a3
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a2
				; RV64-NEXT: add a1, a5, a0
				; RV64-NEXT: mul a0, a4, a2
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 17			%a = udiv iXLen2 %x, 17
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_255(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_255(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_255:			; RV32-LABEL: test_udiv_255:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 255			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 526344
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a3, a3, 129
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a3, a2, a3
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a3, a3, 7
				; RV32-NEXT: slli a4, a3, 8
				; RV32-NEXT: sub a3, a3, a4
				; RV32-NEXT: add a2, a2, a3
				; RV32-NEXT: sub a3, a0, a2
				; RV32-NEXT: lui a4, 1044464
				; RV32-NEXT: addi a5, a4, -258
				; RV32-NEXT: mul a5, a3, a5
				; RV32-NEXT: addi a4, a4, -257
				; RV32-NEXT: mulhu a6, a3, a4
				; RV32-NEXT: add a5, a6, a5
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a5, a0
				; RV32-NEXT: mul a0, a3, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_255:			; RV64-LABEL: test_udiv_255:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI6_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI6_0)(a2)
	; RV64-NEXT: li a2, 255			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a2, a3, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a2, a2, 7
				; RV64-NEXT: slli a4, a2, 8
				; RV64-NEXT: sub a2, a2, a4
				; RV64-NEXT: lui a4, %hi(.LCPI6_1)
				; RV64-NEXT: ld a4, %lo(.LCPI6_1)(a4)
				; RV64-NEXT: lui a5, %hi(.LCPI6_2)
				; RV64-NEXT: ld a5, %lo(.LCPI6_2)(a5)
				; RV64-NEXT: add a2, a3, a2
				; RV64-NEXT: sub a3, a0, a2
				; RV64-NEXT: mul a4, a3, a4
				; RV64-NEXT: mulhu a6, a3, a5
				; RV64-NEXT: add a4, a6, a4
				; RV64-NEXT: sltu a0, a0, a2
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a5
				; RV64-NEXT: add a1, a4, a0
				; RV64-NEXT: mul a0, a3, a5
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 255			%a = udiv iXLen2 %x, 255
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_257(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_257(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_257:			; RV32-LABEL: test_udiv_257:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: li a2, 257			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a3, 1044496
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: addi a4, a3, -255
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a5, a2, a4
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a6, a5, 8
				; RV32-NEXT: andi a5, a5, -256
				; RV32-NEXT: add a5, a5, a6
				; RV32-NEXT: sub a2, a2, a5
				; RV32-NEXT: sub a5, a0, a2
				; RV32-NEXT: addi a3, a3, -256
				; RV32-NEXT: mul a3, a5, a3
				; RV32-NEXT: mulhu a6, a5, a4
				; RV32-NEXT: add a3, a6, a3
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: mul a0, a0, a4
				; RV32-NEXT: add a1, a3, a0
				; RV32-NEXT: mul a0, a5, a4
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_257:			; RV64-LABEL: test_udiv_257:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI7_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI7_0)(a2)
	; RV64-NEXT: li a2, 257			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a4, a3, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a5, a4, 8
				; RV64-NEXT: andi a4, a4, -256
				; RV64-NEXT: lui a6, %hi(.LCPI7_1)
				; RV64-NEXT: ld a6, %lo(.LCPI7_1)(a6)
				; RV64-NEXT: add a4, a4, a5
				; RV64-NEXT: sub a3, a3, a4
				; RV64-NEXT: sub a4, a0, a3
				; RV64-NEXT: mul a5, a4, a6
				; RV64-NEXT: mulhu a6, a4, a2
				; RV64-NEXT: add a5, a6, a5
				; RV64-NEXT: sltu a0, a0, a3
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a2
				; RV64-NEXT: add a1, a5, a0
				; RV64-NEXT: mul a0, a4, a2
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 257			%a = udiv iXLen2 %x, 257
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_65535(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_65535(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_65535:			; RV32-LABEL: test_udiv_65535:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: lui a2, 16			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: addi a2, a2, -1			; RV32-NEXT: lui a3, 524296
	; RV32-NEXT: li a3, 0			; RV32-NEXT: addi a3, a3, 1
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: mulhu a3, a2, a3
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: srli a3, a3, 15
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: slli a4, a3, 16
				; RV32-NEXT: sub a3, a3, a4
				; RV32-NEXT: add a2, a2, a3
				; RV32-NEXT: sub a3, a0, a2
				; RV32-NEXT: lui a4, 1048560
				; RV32-NEXT: addi a5, a4, -2
				; RV32-NEXT: mul a5, a3, a5
				; RV32-NEXT: addi a4, a4, -1
				; RV32-NEXT: mulhu a4, a3, a4
				; RV32-NEXT: add a4, a4, a5
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: slli a1, a0, 16
				; RV32-NEXT: add a0, a1, a0
				; RV32-NEXT: sub a1, a4, a0
				; RV32-NEXT: slli a0, a3, 16
				; RV32-NEXT: add a0, a0, a3
				; RV32-NEXT: neg a0, a0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_65535:			; RV64-LABEL: test_udiv_65535:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI8_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI8_0)(a2)
	; RV64-NEXT: lui a2, 16			; RV64-NEXT: add a3, a0, a1
	; RV64-NEXT: addiw a2, a2, -1			; RV64-NEXT: sltu a4, a3, a0
	; RV64-NEXT: li a3, 0			; RV64-NEXT: add a3, a3, a4
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: mulhu a2, a3, a2
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: srli a2, a2, 15
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: slli a4, a2, 16
				; RV64-NEXT: sub a2, a2, a4
				; RV64-NEXT: add a2, a3, a2
				; RV64-NEXT: sub a3, a0, a2
				; RV64-NEXT: lui a4, 983039
				; RV64-NEXT: slli a4, a4, 4
				; RV64-NEXT: addi a4, a4, -1
				; RV64-NEXT: slli a4, a4, 16
				; RV64-NEXT: addi a5, a4, -2
				; RV64-NEXT: mul a5, a3, a5
				; RV64-NEXT: addi a4, a4, -1
				; RV64-NEXT: mulhu a6, a3, a4
				; RV64-NEXT: add a5, a6, a5
				; RV64-NEXT: sltu a0, a0, a2
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a4
				; RV64-NEXT: add a1, a5, a0
				; RV64-NEXT: mul a0, a3, a4
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 65535			%a = udiv iXLen2 %x, 65535
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_65537(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_65537(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_65537:			; RV32-LABEL: test_udiv_65537:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a2, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a3, a2, a0
	; RV32-NEXT: lui a2, 16			; RV32-NEXT: add a2, a2, a3
	; RV32-NEXT: addi a2, a2, 1			; RV32-NEXT: lui a3, 1048560
	; RV32-NEXT: li a3, 0			; RV32-NEXT: addi a4, a3, 1
	; RV32-NEXT: call __udivdi3@plt			; RV32-NEXT: mulhu a5, a2, a4
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: and a3, a5, a3
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a5, a5, 16
				; RV32-NEXT: or a3, a3, a5
				; RV32-NEXT: sub a2, a2, a3
				; RV32-NEXT: sub a3, a0, a2
				; RV32-NEXT: mulhu a4, a3, a4
				; RV32-NEXT: slli a5, a3, 16
				; RV32-NEXT: sub a4, a4, a5
				; RV32-NEXT: sltu a0, a0, a2
				; RV32-NEXT: sub a0, a1, a0
				; RV32-NEXT: slli a1, a0, 16
				; RV32-NEXT: sub a0, a0, a1
				; RV32-NEXT: add a1, a4, a0
				; RV32-NEXT: sub a0, a3, a5
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_udiv_65537:			; RV64-LABEL: test_udiv_65537:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: add a2, a0, a1
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: sltu a3, a2, a0
	; RV64-NEXT: lui a2, 16			; RV64-NEXT: add a2, a2, a3
	; RV64-NEXT: addiw a2, a2, 1			; RV64-NEXT: lui a3, 983041
	; RV64-NEXT: li a3, 0			; RV64-NEXT: slli a3, a3, 4
	; RV64-NEXT: call __udivti3@plt			; RV64-NEXT: addi a3, a3, -1
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: slli a3, a3, 16
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: addi a4, a3, 1
				; RV64-NEXT: mulhu a5, a2, a4
				; RV64-NEXT: lui a6, 1048560
				; RV64-NEXT: and a6, a5, a6
				; RV64-NEXT: srli a5, a5, 16
				; RV64-NEXT: add a5, a6, a5
				; RV64-NEXT: sub a2, a2, a5
				; RV64-NEXT: sub a5, a0, a2
				; RV64-NEXT: mul a3, a5, a3
				; RV64-NEXT: mulhu a6, a5, a4
				; RV64-NEXT: add a3, a6, a3
				; RV64-NEXT: sltu a0, a0, a2
				; RV64-NEXT: sub a0, a1, a0
				; RV64-NEXT: mul a0, a0, a4
				; RV64-NEXT: add a1, a3, a0
				; RV64-NEXT: mul a0, a5, a4
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = udiv iXLen2 %x, 65537			%a = udiv iXLen2 %x, 65537
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_udiv_12(iXLen2 %x) nounwind {			define iXLen2 @test_udiv_12(iXLen2 %x) nounwind {
	; RV32-LABEL: test_udiv_12:			; RV32-LABEL: test_udiv_12:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	[22 lines not shown]
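
The new check lines for the udiv tests above follow the division recipe from the summary: compute the remainder by summing the two halves, subtract it from the dividend, then multiply by the multiplicative inverse of the divisor modulo 2^BitWidth. The following is a minimal Python sketch of that arithmetic for a 64-bit value split into 32-bit halves. The function names are hypothetical and are not taken from the patch, and Python's `%` stands in for the half-width urem that the backend would lower with a magic constant.

```python
BITS = 64                      # width of the illegal (split) type
HALF = BITS // 2               # width of the legal half type
MASK = (1 << BITS) - 1
HALF_MASK = (1 << HALF) - 1


def urem_by_half_sum(x, d):
    """Remainder of x by d, valid only when (1 << HALF) % d == 1."""
    lo = x & HALF_MASK
    hi = x >> HALF
    s = lo + hi                        # may need HALF + 1 bits
    s = (s & HALF_MASK) + (s >> HALF)  # add the carry back in; cannot overflow again
    return s % d                       # stands in for the half-width urem


def udiv_split(x, d):
    """Quotient of x by d using the remainder plus a modular inverse."""
    assert 0 <= x <= MASK
    assert (1 << HALF) % d == 1        # precondition from the summary; implies d is odd
    r = urem_by_half_sum(x, d)
    inv = pow(d, -1, 1 << BITS)        # inverse of d modulo 2**BITS (Python 3.8+)
    # x - r is an exact multiple of d, so the wrapped product is the true quotient.
    return ((x - r) * inv) & MASK
```

For the divisor 3 the inverse modulo 2^64 is 0xAAAAAAAAAAAAAAAB, which is the constant pair that appears in the X86 udiv_i64_3 checks later in this diff.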

llvm/test/CodeGen/RISCV/split-urem-by-constant.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: sed 's/iXLen2/i64/g' %s \| llc -mtriple=riscv32 -mattr=+m \| \			; RUN: sed 's/iXLen2/i64/g' %s \| llc -mtriple=riscv32 -mattr=+m \| \
	; RUN: FileCheck %s --check-prefix=RV32			; RUN: FileCheck %s --check-prefix=RV32
	; RUN: sed 's/iXLen2/i128/g' %s \| llc -mtriple=riscv64 -mattr=+m \| \			; RUN: sed 's/iXLen2/i128/g' %s \| llc -mtriple=riscv64 -mattr=+m \| \
	; RUN: FileCheck %s --check-prefix=RV64			; RUN: FileCheck %s --check-prefix=RV64

	define iXLen2 @test_urem_3(iXLen2 %x) nounwind {			define iXLen2 @test_urem_3(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_3:			; RV32-LABEL: test_urem_3:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: li a2, 3			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a1, 699051
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: addi a1, a1, -1365
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a2, a1, 1
				; RV32-NEXT: andi a1, a1, -2
				; RV32-NEXT: add a1, a1, a2
				; RV32-NEXT: sub a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_3:			; RV64-LABEL: test_urem_3:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI0_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI0_0)(a2)
	; RV64-NEXT: li a2, 3			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a2, a1, 1
				; RV64-NEXT: andi a1, a1, -2
				; RV64-NEXT: add a1, a1, a2
				; RV64-NEXT: sub a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 3			%a = urem iXLen2 %x, 3
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_5(iXLen2 %x) nounwind {			define iXLen2 @test_urem_5(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_5:			; RV32-LABEL: test_urem_5:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: li a2, 5			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a1, 838861
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: addi a1, a1, -819
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a2, a1, 2
				; RV32-NEXT: andi a1, a1, -4
				; RV32-NEXT: add a1, a1, a2
				; RV32-NEXT: sub a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_5:			; RV64-LABEL: test_urem_5:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI1_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI1_0)(a2)
	; RV64-NEXT: li a2, 5			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a2, a1, 2
				; RV64-NEXT: andi a1, a1, -4
				; RV64-NEXT: add a1, a1, a2
				; RV64-NEXT: sub a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 5			%a = urem iXLen2 %x, 5
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_7(iXLen2 %x) nounwind {			define iXLen2 @test_urem_7(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_7:			; RV32-LABEL: test_urem_7:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	[44 lines not shown]
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 9			%a = urem iXLen2 %x, 9
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_15(iXLen2 %x) nounwind {			define iXLen2 @test_urem_15(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_15:			; RV32-LABEL: test_urem_15:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: li a2, 15			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a1, 559241
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: addi a1, a1, -1911
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a1, a1, 3
				; RV32-NEXT: slli a2, a1, 4
				; RV32-NEXT: sub a1, a1, a2
				; RV32-NEXT: add a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_15:			; RV64-LABEL: test_urem_15:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI4_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI4_0)(a2)
	; RV64-NEXT: li a2, 15			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a1, a1, 3
				; RV64-NEXT: slli a2, a1, 4
				; RV64-NEXT: sub a1, a1, a2
				; RV64-NEXT: add a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 15			%a = urem iXLen2 %x, 15
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_17(iXLen2 %x) nounwind {			define iXLen2 @test_urem_17(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_17:			; RV32-LABEL: test_urem_17:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: li a2, 17			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a1, 986895
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: addi a1, a1, 241
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a2, a1, 4
				; RV32-NEXT: andi a1, a1, -16
				; RV32-NEXT: add a1, a1, a2
				; RV32-NEXT: sub a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_17:			; RV64-LABEL: test_urem_17:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI5_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI5_0)(a2)
	; RV64-NEXT: li a2, 17			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a2, a1, 4
				; RV64-NEXT: andi a1, a1, -16
				; RV64-NEXT: add a1, a1, a2
				; RV64-NEXT: sub a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 17			%a = urem iXLen2 %x, 17
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_255(iXLen2 %x) nounwind {			define iXLen2 @test_urem_255(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_255:			; RV32-LABEL: test_urem_255:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: li a2, 255			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a1, 526344
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: addi a1, a1, 129
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a1, a1, 7
				; RV32-NEXT: slli a2, a1, 8
				; RV32-NEXT: sub a1, a1, a2
				; RV32-NEXT: add a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_255:			; RV64-LABEL: test_urem_255:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI6_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI6_0)(a2)
	; RV64-NEXT: li a2, 255			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a1, a1, 7
				; RV64-NEXT: slli a2, a1, 8
				; RV64-NEXT: sub a1, a1, a2
				; RV64-NEXT: add a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 255			%a = urem iXLen2 %x, 255
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_257(iXLen2 %x) nounwind {			define iXLen2 @test_urem_257(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_257:			; RV32-LABEL: test_urem_257:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: li a2, 257			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: li a3, 0			; RV32-NEXT: lui a1, 1044496
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: addi a1, a1, -255
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a2, a1, 8
				; RV32-NEXT: andi a1, a1, -256
				; RV32-NEXT: add a1, a1, a2
				; RV32-NEXT: sub a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_257:			; RV64-LABEL: test_urem_257:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI7_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI7_0)(a2)
	; RV64-NEXT: li a2, 257			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: li a3, 0			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: srli a2, a1, 8
				; RV64-NEXT: andi a1, a1, -256
				; RV64-NEXT: add a1, a1, a2
				; RV64-NEXT: sub a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 257			%a = urem iXLen2 %x, 257
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_65535(iXLen2 %x) nounwind {			define iXLen2 @test_urem_65535(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_65535:			; RV32-LABEL: test_urem_65535:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: lui a2, 16			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: addi a2, a2, -1			; RV32-NEXT: lui a1, 524296
	; RV32-NEXT: li a3, 0			; RV32-NEXT: addi a1, a1, 1
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: mulhu a1, a0, a1
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: srli a1, a1, 15
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: slli a2, a1, 16
				; RV32-NEXT: sub a1, a1, a2
				; RV32-NEXT: add a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_65535:			; RV64-LABEL: test_urem_65535:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: lui a2, %hi(.LCPI8_0)
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: ld a2, %lo(.LCPI8_0)(a2)
	; RV64-NEXT: lui a2, 16			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: addiw a2, a2, -1			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: li a3, 0			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: mulhu a1, a0, a2
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: srli a1, a1, 15
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: slli a2, a1, 16
				; RV64-NEXT: sub a1, a1, a2
				; RV64-NEXT: add a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 65535			%a = urem iXLen2 %x, 65535
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_65537(iXLen2 %x) nounwind {			define iXLen2 @test_urem_65537(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_65537:			; RV32-LABEL: test_urem_65537:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	; RV32-NEXT: addi sp, sp, -16			; RV32-NEXT: add a1, a0, a1
	; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill			; RV32-NEXT: sltu a0, a1, a0
	; RV32-NEXT: lui a2, 16			; RV32-NEXT: add a0, a1, a0
	; RV32-NEXT: addi a2, a2, 1			; RV32-NEXT: lui a1, 1048560
	; RV32-NEXT: li a3, 0			; RV32-NEXT: addi a2, a1, 1
	; RV32-NEXT: call __umoddi3@plt			; RV32-NEXT: mulhu a2, a0, a2
	; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload			; RV32-NEXT: and a1, a2, a1
	; RV32-NEXT: addi sp, sp, 16			; RV32-NEXT: srli a2, a2, 16
				; RV32-NEXT: or a1, a1, a2
				; RV32-NEXT: sub a0, a0, a1
				; RV32-NEXT: li a1, 0
	; RV32-NEXT: ret			; RV32-NEXT: ret
	;			;
	; RV64-LABEL: test_urem_65537:			; RV64-LABEL: test_urem_65537:
	; RV64: # %bb.0:			; RV64: # %bb.0:
	; RV64-NEXT: addi sp, sp, -16			; RV64-NEXT: add a1, a0, a1
	; RV64-NEXT: sd ra, 8(sp) # 8-byte Folded Spill			; RV64-NEXT: sltu a0, a1, a0
	; RV64-NEXT: lui a2, 16			; RV64-NEXT: add a0, a1, a0
	; RV64-NEXT: addiw a2, a2, 1			; RV64-NEXT: lui a1, 983041
	; RV64-NEXT: li a3, 0			; RV64-NEXT: slli a1, a1, 4
	; RV64-NEXT: call __umodti3@plt			; RV64-NEXT: addi a1, a1, -1
	; RV64-NEXT: ld ra, 8(sp) # 8-byte Folded Reload			; RV64-NEXT: slli a1, a1, 16
	; RV64-NEXT: addi sp, sp, 16			; RV64-NEXT: addi a1, a1, 1
				; RV64-NEXT: mulhu a1, a0, a1
				; RV64-NEXT: lui a2, 1048560
				; RV64-NEXT: and a2, a1, a2
				; RV64-NEXT: srli a1, a1, 16
				; RV64-NEXT: add a1, a2, a1
				; RV64-NEXT: sub a0, a0, a1
				; RV64-NEXT: li a1, 0
	; RV64-NEXT: ret			; RV64-NEXT: ret
	%a = urem iXLen2 %x, 65537			%a = urem iXLen2 %x, 65537
	ret iXLen2 %a			ret iXLen2 %a
	}			}

	define iXLen2 @test_urem_12(iXLen2 %x) nounwind {			define iXLen2 @test_urem_12(iXLen2 %x) nounwind {
	; RV32-LABEL: test_urem_12:			; RV32-LABEL: test_urem_12:
	; RV32: # %bb.0:			; RV32: # %bb.0:
	[23 lines not shown]
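
The rewritten urem checks above all open with the same three instructions (add, sltu, add) before the multiply by a magic constant: they sum the high and low halves and then add the carry back in. A minimal Python sketch of that step follows, assuming a 64-bit input split into 32-bit halves. The name `urem_split` is hypothetical, and `%` again stands in for the half-width urem.

```python
HALF = 32
HALF_MASK = (1 << HALF) - 1


def urem_split(x, d):
    """x % d for a 64-bit x, valid only when (1 << 32) % d == 1."""
    assert 0 <= x < (1 << 64)
    assert (1 << HALF) % d == 1   # precondition from the summary
    lo = x & HALF_MASK
    hi = x >> HALF
    s = lo + hi                   # first add; result can be 33 bits wide
    carry = s >> HALF             # what sltu recovers in the RISC-V output
    s = (s & HALF_MASK) + carry   # second add; 2**32 is congruent to 1 mod d
    return s % d                  # half-width urem


# Divisors exercised by the tests above that satisfy the precondition.
GOOD_DIVISORS = (3, 5, 15, 17, 255, 257, 65535, 65537)
```

The second add cannot overflow: a carry out of the first add implies the low 32 bits of the sum are at most 2^32 - 2.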

llvm/test/CodeGen/VE/Scalar/rem.ll

	[175 lines not shown]

	; Function Attrs: norecurse nounwind readnone			; Function Attrs: norecurse nounwind readnone
	define i128 @remu128ri(i128 %a) {			define i128 @remu128ri(i128 %a) {
	; CHECK-LABEL: remu128ri:			; CHECK-LABEL: remu128ri:
	; CHECK: .LBB{{[0-9]+}}_2:			; CHECK: .LBB{{[0-9]+}}_2:
	; CHECK-NEXT: lea %s2, __umodti3@lo			; CHECK-NEXT: lea %s2, __umodti3@lo
	; CHECK-NEXT: and %s2, %s2, (32)0			; CHECK-NEXT: and %s2, %s2, (32)0
	; CHECK-NEXT: lea.sl %s12, __umodti3@hi(, %s2)			; CHECK-NEXT: lea.sl %s12, __umodti3@hi(, %s2)
	; CHECK-NEXT: or %s2, 3, (0)1			; CHECK-NEXT: or %s2, 11, (0)1
	; CHECK-NEXT: or %s3, 0, (0)1			; CHECK-NEXT: or %s3, 0, (0)1
	; CHECK-NEXT: bsic %s10, (, %s12)			; CHECK-NEXT: bsic %s10, (, %s12)
	; CHECK-NEXT: or %s11, 0, %s9			; CHECK-NEXT: or %s11, 0, %s9
	%r = urem i128 %a, 3			%r = urem i128 %a, 11
	ret i128 %r			ret i128 %r
	}			}

	; Function Attrs: norecurse nounwind readnone			; Function Attrs: norecurse nounwind readnone
	define i64 @remu64ri(i64 %a) {			define i64 @remu64ri(i64 %a) {
	; CHECK-LABEL: remu64ri:			; CHECK-LABEL: remu64ri:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: divu.l %s1, %s0, (62)0			; CHECK-NEXT: divu.l %s1, %s0, (62)0
	[103 lines not shown]

llvm/test/CodeGen/X86/divide-by-constant.ll

[454 lines not shown]
; X64-NEXT: retq		; X64-NEXT: retq
%5 = insertvalue { i64, i32 } undef, i64 %2, 0		%5 = insertvalue { i64, i32 } undef, i64 %2, 0
%6 = insertvalue { i64, i32 } %5, i32 %4, 1		%6 = insertvalue { i64, i32 } %5, i32 %4, 1
ret { i64, i32 } %6		ret { i64, i32 } %6
}		}

define i64 @urem_i64_3(i64 %x) nounwind {		define i64 @urem_i64_3(i64 %x) nounwind {
; X32-LABEL: urem_i64_3:		; X32-LABEL: urem_i64_3:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $0		; X32-NEXT: addl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $3		; X32-NEXT: adcl $0, %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl $-1431655765, %edx # imm = 0xAAAAAAAB
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: mull %edx
; X32-NEXT: addl $28, %esp		; X32-NEXT: shrl %edx
		; X32-NEXT: leal (%edx,%edx,2), %eax
		; X32-NEXT: subl %eax, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_3:		; X64-LABEL: urem_i64_3:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-6148914691236517205, %rcx # imm = 0xAAAAAAAAAAAAAAAB		; X64-NEXT: movabsq $-6148914691236517205, %rcx # imm = 0xAAAAAAAAAAAAAAAB
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: shrq %rdx		; X64-NEXT: shrq %rdx
; X64-NEXT: leaq (%rdx,%rdx,2), %rax		; X64-NEXT: leaq (%rdx,%rdx,2), %rax
; X64-NEXT: subq %rax, %rdi		; X64-NEXT: subq %rax, %rdi
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 3		%rem = urem i64 %x, 3
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_5(i64 %x) nounwind {		define i64 @urem_i64_5(i64 %x) nounwind {
; X32-LABEL: urem_i64_5:		; X32-LABEL: urem_i64_5:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $0		; X32-NEXT: addl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $5		; X32-NEXT: adcl $0, %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl $-858993459, %edx # imm = 0xCCCCCCCD
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: mull %edx
; X32-NEXT: addl $28, %esp		; X32-NEXT: shrl $2, %edx
		; X32-NEXT: leal (%edx,%edx,4), %eax
		; X32-NEXT: subl %eax, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_5:		; X64-LABEL: urem_i64_5:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-3689348814741910323, %rcx # imm = 0xCCCCCCCCCCCCCCCD		; X64-NEXT: movabsq $-3689348814741910323, %rcx # imm = 0xCCCCCCCCCCCCCCCD
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: shrq $2, %rdx		; X64-NEXT: shrq $2, %rdx
; X64-NEXT: leaq (%rdx,%rdx,4), %rax		; X64-NEXT: leaq (%rdx,%rdx,4), %rax
; X64-NEXT: subq %rax, %rdi		; X64-NEXT: subq %rax, %rdi
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 5		%rem = urem i64 %x, 5
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_15(i64 %x) nounwind {		define i64 @urem_i64_15(i64 %x) nounwind {
; X32-LABEL: urem_i64_15:		; X32-LABEL: urem_i64_15:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $0		; X32-NEXT: addl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $15		; X32-NEXT: adcl $0, %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl $-2004318071, %edx # imm = 0x88888889
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: mull %edx
; X32-NEXT: addl $28, %esp		; X32-NEXT: shrl $3, %edx
		; X32-NEXT: leal (%edx,%edx,4), %eax
		; X32-NEXT: leal (%eax,%eax,2), %eax
		; X32-NEXT: subl %eax, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_15:		; X64-LABEL: urem_i64_15:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-8608480567731124087, %rcx # imm = 0x8888888888888889		; X64-NEXT: movabsq $-8608480567731124087, %rcx # imm = 0x8888888888888889
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: shrq $3, %rdx		; X64-NEXT: shrq $3, %rdx
; X64-NEXT: leaq (%rdx,%rdx,4), %rax		; X64-NEXT: leaq (%rdx,%rdx,4), %rax
; X64-NEXT: leaq (%rax,%rax,2), %rax		; X64-NEXT: leaq (%rax,%rax,2), %rax
; X64-NEXT: subq %rax, %rdi		; X64-NEXT: subq %rax, %rdi
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 15		%rem = urem i64 %x, 15
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_17(i64 %x) nounwind {		define i64 @urem_i64_17(i64 %x) nounwind {
; X32-LABEL: urem_i64_17:		; X32-LABEL: urem_i64_17:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $0		; X32-NEXT: addl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $17		; X32-NEXT: adcl $0, %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl $-252645135, %edx # imm = 0xF0F0F0F1
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: mull %edx
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: andl $-16, %eax
		; X32-NEXT: shrl $4, %edx
		; X32-NEXT: addl %eax, %edx
		; X32-NEXT: subl %edx, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_17:		; X64-LABEL: urem_i64_17:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-1085102592571150095, %rcx # imm = 0xF0F0F0F0F0F0F0F1		; X64-NEXT: movabsq $-1085102592571150095, %rcx # imm = 0xF0F0F0F0F0F0F0F1
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: andq $-16, %rax		; X64-NEXT: andq $-16, %rax
; X64-NEXT: shrq $4, %rdx		; X64-NEXT: shrq $4, %rdx
; X64-NEXT: addq %rax, %rdx		; X64-NEXT: addq %rax, %rdx
; X64-NEXT: subq %rdx, %rdi		; X64-NEXT: subq %rdx, %rdi
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 17		%rem = urem i64 %x, 17
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_255(i64 %x) nounwind {		define i64 @urem_i64_255(i64 %x) nounwind {
; X32-LABEL: urem_i64_255:		; X32-LABEL: urem_i64_255:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %esi
; X32-NEXT: pushl $0		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $255		; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: addl %esi, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: adcl $0, %eax
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl $-2139062143, %edx # imm = 0x80808081
		; X32-NEXT: mull %edx
		; X32-NEXT: shrl $7, %edx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: shll $8, %eax
		; X32-NEXT: subl %eax, %edx
		; X32-NEXT: addl %esi, %ecx
		; X32-NEXT: adcl %edx, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
		; X32-NEXT: popl %esi
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_255:		; X64-LABEL: urem_i64_255:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-9187201950435737471, %rcx # imm = 0x8080808080808081		; X64-NEXT: movabsq $-9187201950435737471, %rcx # imm = 0x8080808080808081
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: shrq $7, %rdx		; X64-NEXT: shrq $7, %rdx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shlq $8, %rax		; X64-NEXT: shlq $8, %rax
; X64-NEXT: subq %rax, %rdx		; X64-NEXT: subq %rax, %rdx
; X64-NEXT: leaq (%rdx,%rdi), %rax		; X64-NEXT: leaq (%rdx,%rdi), %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 255		%rem = urem i64 %x, 255
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_257(i64 %x) nounwind {		define i64 @urem_i64_257(i64 %x) nounwind {
; X32-LABEL: urem_i64_257:		; X32-LABEL: urem_i64_257:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $0		; X32-NEXT: addl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $257 # imm = 0x101		; X32-NEXT: adcl $0, %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl $-16711935, %edx # imm = 0xFF00FF01
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: mull %edx
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: andl $-256, %eax
		; X32-NEXT: shrl $8, %edx
		; X32-NEXT: addl %eax, %edx
		; X32-NEXT: subl %edx, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_257:		; X64-LABEL: urem_i64_257:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-71777214294589695, %rcx # imm = 0xFF00FF00FF00FF01		; X64-NEXT: movabsq $-71777214294589695, %rcx # imm = 0xFF00FF00FF00FF01
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: andq $-256, %rax		; X64-NEXT: andq $-256, %rax
; X64-NEXT: shrq $8, %rdx		; X64-NEXT: shrq $8, %rdx
; X64-NEXT: addq %rax, %rdx		; X64-NEXT: addq %rax, %rdx
; X64-NEXT: subq %rdx, %rdi		; X64-NEXT: subq %rdx, %rdi
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 257		%rem = urem i64 %x, 257
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_65535(i64 %x) nounwind {		define i64 @urem_i64_65535(i64 %x) nounwind {
; X32-LABEL: urem_i64_65535:		; X32-LABEL: urem_i64_65535:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %esi
; X32-NEXT: pushl $0		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $65535 # imm = 0xFFFF		; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: addl %esi, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: adcl $0, %eax
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl $-2147450879, %edx # imm = 0x80008001
		; X32-NEXT: mull %edx
		; X32-NEXT: shrl $15, %edx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: shll $16, %eax
		; X32-NEXT: subl %eax, %edx
		; X32-NEXT: addl %esi, %ecx
		; X32-NEXT: adcl %edx, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
		; X32-NEXT: popl %esi
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_65535:		; X64-LABEL: urem_i64_65535:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-9223231297218904063, %rcx # imm = 0x8000800080008001		; X64-NEXT: movabsq $-9223231297218904063, %rcx # imm = 0x8000800080008001
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: shrq $15, %rdx		; X64-NEXT: shrq $15, %rdx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shlq $16, %rax		; X64-NEXT: shlq $16, %rax
; X64-NEXT: subq %rax, %rdx		; X64-NEXT: subq %rax, %rdx
; X64-NEXT: leaq (%rdx,%rdi), %rax		; X64-NEXT: leaq (%rdx,%rdi), %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = urem i64 %x, 65535		%rem = urem i64 %x, 65535
ret i64 %rem		ret i64 %rem
}		}

define i64 @urem_i64_65537(i64 %x) nounwind {		define i64 @urem_i64_65537(i64 %x) nounwind {
; X32-LABEL: urem_i64_65537:		; X32-LABEL: urem_i64_65537:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $0		; X32-NEXT: addl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $65537 # imm = 0x10001		; X32-NEXT: adcl $0, %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl $-65535, %edx # imm = 0xFFFF0001
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: calll __umoddi3		; X32-NEXT: mull %edx
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: shrl $16, %eax
		; X32-NEXT: shldl $16, %edx, %eax
		; X32-NEXT: subl %eax, %ecx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: urem_i64_65537:		; X64-LABEL: urem_i64_65537:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movabsq $-281470681808895, %rcx # imm = 0xFFFF0000FFFF0001		; X64-NEXT: movabsq $-281470681808895, %rcx # imm = 0xFFFF0000FFFF0001
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
[34 lines not shown]
entry:		entry:
%rem = urem i64 %x, 12		%rem = urem i64 %x, 12
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_3(i64 %x) nounwind {		define i64 @udiv_i64_3(i64 %x) nounwind {
; X32-LABEL: udiv_i64_3:		; X32-LABEL: udiv_i64_3:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %ebx
; X32-NEXT: pushl $0		; X32-NEXT: pushl %edi
; X32-NEXT: pushl $3		; X32-NEXT: pushl %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
; X32-NEXT: calll __udivdi3		; X32-NEXT: movl %ecx, %esi
; X32-NEXT: addl $28, %esp		; X32-NEXT: addl %edi, %esi
		; X32-NEXT: adcl $0, %esi
		; X32-NEXT: movl $-1431655765, %ebx # imm = 0xAAAAAAAB
		; X32-NEXT: movl %esi, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: shrl %edx
		; X32-NEXT: leal (%edx,%edx,2), %eax
		; X32-NEXT: subl %eax, %esi
		; X32-NEXT: subl %esi, %ecx
		; X32-NEXT: sbbl $0, %edi
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: imull $-1431655766, %ecx, %ecx # imm = 0xAAAAAAAA
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: imull $-1431655765, %edi, %ecx # imm = 0xAAAAAAAB
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: popl %esi
		; X32-NEXT: popl %edi
		; X32-NEXT: popl %ebx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_3:		; X64-LABEL: udiv_i64_3:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-6148914691236517205, %rcx # imm = 0xAAAAAAAAAAAAAAAB		; X64-NEXT: movabsq $-6148914691236517205, %rcx # imm = 0xAAAAAAAAAAAAAAAB
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq %rax		; X64-NEXT: shrq %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 3		%rem = udiv i64 %x, 3
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_5(i64 %x) nounwind {		define i64 @udiv_i64_5(i64 %x) nounwind {
; X32-LABEL: udiv_i64_5:		; X32-LABEL: udiv_i64_5:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %ebx
; X32-NEXT: pushl $0		; X32-NEXT: pushl %edi
; X32-NEXT: pushl $5		; X32-NEXT: pushl %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
; X32-NEXT: calll __udivdi3		; X32-NEXT: movl %ecx, %esi
; X32-NEXT: addl $28, %esp		; X32-NEXT: addl %edi, %esi
		; X32-NEXT: adcl $0, %esi
		; X32-NEXT: movl $-858993459, %ebx # imm = 0xCCCCCCCD
		; X32-NEXT: movl %esi, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: shrl $2, %edx
		; X32-NEXT: leal (%edx,%edx,4), %eax
		; X32-NEXT: subl %eax, %esi
		; X32-NEXT: subl %esi, %ecx
		; X32-NEXT: sbbl $0, %edi
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: imull $-858993460, %ecx, %ecx # imm = 0xCCCCCCCC
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: imull $-858993459, %edi, %ecx # imm = 0xCCCCCCCD
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: popl %esi
		; X32-NEXT: popl %edi
		; X32-NEXT: popl %ebx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_5:		; X64-LABEL: udiv_i64_5:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-3689348814741910323, %rcx # imm = 0xCCCCCCCCCCCCCCCD		; X64-NEXT: movabsq $-3689348814741910323, %rcx # imm = 0xCCCCCCCCCCCCCCCD
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $2, %rax		; X64-NEXT: shrq $2, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 5		%rem = udiv i64 %x, 5
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_15(i64 %x) nounwind {		define i64 @udiv_i64_15(i64 %x) nounwind {
; X32-LABEL: udiv_i64_15:		; X32-LABEL: udiv_i64_15:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %edi
; X32-NEXT: pushl $0		; X32-NEXT: pushl %esi
; X32-NEXT: pushl $15		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %esi
; X32-NEXT: calll __udivdi3		; X32-NEXT: addl %edi, %esi
; X32-NEXT: addl $28, %esp		; X32-NEXT: adcl $0, %esi
		; X32-NEXT: movl $-2004318071, %edx # imm = 0x88888889
		; X32-NEXT: movl %esi, %eax
		; X32-NEXT: mull %edx
		; X32-NEXT: shrl $3, %edx
		; X32-NEXT: leal (%edx,%edx,4), %eax
		; X32-NEXT: leal (%eax,%eax,2), %eax
		; X32-NEXT: subl %eax, %esi
		; X32-NEXT: subl %esi, %ecx
		; X32-NEXT: sbbl $0, %edi
		; X32-NEXT: movl $-286331153, %edx # imm = 0xEEEEEEEF
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %edx
		; X32-NEXT: imull $-286331154, %ecx, %ecx # imm = 0xEEEEEEEE
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: imull $-286331153, %edi, %ecx # imm = 0xEEEEEEEF
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: popl %esi
		; X32-NEXT: popl %edi
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_15:		; X64-LABEL: udiv_i64_15:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-8608480567731124087, %rcx # imm = 0x8888888888888889		; X64-NEXT: movabsq $-8608480567731124087, %rcx # imm = 0x8888888888888889
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $3, %rax		; X64-NEXT: shrq $3, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 15		%rem = udiv i64 %x, 15
ret i64 %rem		ret i64 %rem
}		}
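
The second constant in each expanded division (for example `0xEEEEEEEF` / `0xEEEEEEEE` for 15, as opposed to the remainder magic `0x88888889`) looks like the low and high words of the divisor's multiplicative inverse modulo `2**64`. A small sketch to check that reading, assuming Python 3.8 or later for the three-argument `pow` with a negative exponent; the helper name is hypothetical.

```python
def inverse_mod_pow2(divisor, bits):
    """Multiplicative inverse of an odd divisor modulo 2**bits (hypothetical helper)."""
    return pow(divisor, -1, 1 << bits)


if __name__ == "__main__":
    for d in (3, 5, 15, 17, 255, 257, 65535, 65537):
        inv = inverse_mod_pow2(d, 64)
        # Print the high and low 32-bit words to compare with the imull/mull operands.
        print(d, hex(inv >> 32), hex(inv & 0xFFFFFFFF))
```

Only odd divisors have such an inverse, since an even number shares a factor with `2**64`.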

define i64 @udiv_i64_17(i64 %x) nounwind {		define i64 @udiv_i64_17(i64 %x) nounwind {
; X32-LABEL: udiv_i64_17:		; X32-LABEL: udiv_i64_17:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %ebx
; X32-NEXT: pushl $0		; X32-NEXT: pushl %edi
; X32-NEXT: pushl $17		; X32-NEXT: pushl %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
; X32-NEXT: calll __udivdi3		; X32-NEXT: movl %ecx, %esi
; X32-NEXT: addl $28, %esp		; X32-NEXT: addl %edi, %esi
		; X32-NEXT: adcl $0, %esi
		; X32-NEXT: movl $-252645135, %ebx # imm = 0xF0F0F0F1
		; X32-NEXT: movl %esi, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: andl $-16, %eax
		; X32-NEXT: shrl $4, %edx
		; X32-NEXT: addl %eax, %edx
		; X32-NEXT: subl %edx, %esi
		; X32-NEXT: subl %esi, %ecx
		; X32-NEXT: sbbl $0, %edi
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: imull $-252645136, %ecx, %ecx # imm = 0xF0F0F0F0
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: imull $-252645135, %edi, %ecx # imm = 0xF0F0F0F1
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: popl %esi
		; X32-NEXT: popl %edi
		; X32-NEXT: popl %ebx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_17:		; X64-LABEL: udiv_i64_17:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-1085102592571150095, %rcx # imm = 0xF0F0F0F0F0F0F0F1		; X64-NEXT: movabsq $-1085102592571150095, %rcx # imm = 0xF0F0F0F0F0F0F0F1
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $4, %rax		; X64-NEXT: shrq $4, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 17		%rem = udiv i64 %x, 17
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_255(i64 %x) nounwind {		define i64 @udiv_i64_255(i64 %x) nounwind {
; X32-LABEL: udiv_i64_255:		; X32-LABEL: udiv_i64_255:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %esi
; X32-NEXT: pushl $0		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $255		; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: addl %esi, %eax
; X32-NEXT: calll __udivdi3		; X32-NEXT: adcl $0, %eax
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl $-2139062143, %edx # imm = 0x80808081
		; X32-NEXT: mull %edx
		; X32-NEXT: shrl $7, %edx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: shll $8, %eax
		; X32-NEXT: subl %eax, %edx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: addl %esi, %eax
		; X32-NEXT: adcl %edx, %eax
		; X32-NEXT: subl %eax, %ecx
		; X32-NEXT: sbbl $0, %esi
		; X32-NEXT: movl $-16843009, %edx # imm = 0xFEFEFEFF
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %edx
		; X32-NEXT: imull $-16843010, %ecx, %ecx # imm = 0xFEFEFEFE
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: imull $-16843009, %esi, %ecx # imm = 0xFEFEFEFF
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: popl %esi
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_255:		; X64-LABEL: udiv_i64_255:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-9187201950435737471, %rcx # imm = 0x8080808080808081		; X64-NEXT: movabsq $-9187201950435737471, %rcx # imm = 0x8080808080808081
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $7, %rax		; X64-NEXT: shrq $7, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 255		%rem = udiv i64 %x, 255
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_257(i64 %x) nounwind {		define i64 @udiv_i64_257(i64 %x) nounwind {
; X32-LABEL: udiv_i64_257:		; X32-LABEL: udiv_i64_257:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %ebx
; X32-NEXT: pushl $0		; X32-NEXT: pushl %edi
; X32-NEXT: pushl $257 # imm = 0x101		; X32-NEXT: pushl %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
; X32-NEXT: calll __udivdi3		; X32-NEXT: movl %ecx, %esi
; X32-NEXT: addl $28, %esp		; X32-NEXT: addl %edi, %esi
		; X32-NEXT: adcl $0, %esi
		; X32-NEXT: movl $-16711935, %ebx # imm = 0xFF00FF01
		; X32-NEXT: movl %esi, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: andl $-256, %eax
		; X32-NEXT: shrl $8, %edx
		; X32-NEXT: addl %eax, %edx
		; X32-NEXT: subl %edx, %esi
		; X32-NEXT: subl %esi, %ecx
		; X32-NEXT: sbbl $0, %edi
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: imull $-16711936, %ecx, %ecx # imm = 0xFF00FF00
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: imull $-16711935, %edi, %ecx # imm = 0xFF00FF01
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: popl %esi
		; X32-NEXT: popl %edi
		; X32-NEXT: popl %ebx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_257:		; X64-LABEL: udiv_i64_257:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-71777214294589695, %rcx # imm = 0xFF00FF00FF00FF01		; X64-NEXT: movabsq $-71777214294589695, %rcx # imm = 0xFF00FF00FF00FF01
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $8, %rax		; X64-NEXT: shrq $8, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 257		%rem = udiv i64 %x, 257
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_65535(i64 %x) nounwind {		define i64 @udiv_i64_65535(i64 %x) nounwind {
; X32-LABEL: udiv_i64_65535:		; X32-LABEL: udiv_i64_65535:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %esi
; X32-NEXT: pushl $0		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl $65535 # imm = 0xFFFF		; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl %ecx, %eax
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: addl %esi, %eax
; X32-NEXT: calll __udivdi3		; X32-NEXT: adcl $0, %eax
; X32-NEXT: addl $28, %esp		; X32-NEXT: movl $-2147450879, %edx # imm = 0x80008001
		; X32-NEXT: mull %edx
		; X32-NEXT: shrl $15, %edx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: shll $16, %eax
		; X32-NEXT: subl %eax, %edx
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: addl %esi, %eax
		; X32-NEXT: adcl %edx, %eax
		; X32-NEXT: subl %eax, %ecx
		; X32-NEXT: sbbl $0, %esi
		; X32-NEXT: movl $-65537, %edx # imm = 0xFFFEFFFF
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %edx
		; X32-NEXT: imull $-65538, %ecx, %ecx # imm = 0xFFFEFFFE
		; X32-NEXT: addl %ecx, %edx
		; X32-NEXT: movl %esi, %ecx
		; X32-NEXT: shll $16, %ecx
		; X32-NEXT: addl %esi, %ecx
		; X32-NEXT: subl %ecx, %edx
		; X32-NEXT: popl %esi
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_65535:		; X64-LABEL: udiv_i64_65535:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-9223231297218904063, %rcx # imm = 0x8000800080008001		; X64-NEXT: movabsq $-9223231297218904063, %rcx # imm = 0x8000800080008001
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $15, %rax		; X64-NEXT: shrq $15, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 65535		%rem = udiv i64 %x, 65535
ret i64 %rem		ret i64 %rem
}		}

define i64 @udiv_i64_65537(i64 %x) nounwind {		define i64 @udiv_i64_65537(i64 %x) nounwind {
; X32-LABEL: udiv_i64_65537:		; X32-LABEL: udiv_i64_65537:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: pushl %ebx
; X32-NEXT: pushl $0		; X32-NEXT: pushl %edi
; X32-NEXT: pushl $65537 # imm = 0x10001		; X32-NEXT: pushl %esi
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
; X32-NEXT: calll __udivdi3		; X32-NEXT: movl %ecx, %esi
; X32-NEXT: addl $28, %esp		; X32-NEXT: addl %edi, %esi
		; X32-NEXT: adcl $0, %esi
		; X32-NEXT: movl $-65535, %ebx # imm = 0xFFFF0001
		; X32-NEXT: movl %esi, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: movl %edx, %eax
		; X32-NEXT: shrl $16, %eax
		; X32-NEXT: shldl $16, %edx, %eax
		; X32-NEXT: subl %eax, %esi
		; X32-NEXT: subl %esi, %ecx
		; X32-NEXT: sbbl $0, %edi
		; X32-NEXT: movl %ecx, %eax
		; X32-NEXT: mull %ebx
		; X32-NEXT: shll $16, %ecx
		; X32-NEXT: subl %ecx, %edx
		; X32-NEXT: movl %edi, %ecx
		; X32-NEXT: shll $16, %ecx
		; X32-NEXT: subl %ecx, %edi
		; X32-NEXT: addl %edi, %edx
		; X32-NEXT: popl %esi
		; X32-NEXT: popl %edi
		; X32-NEXT: popl %ebx
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: udiv_i64_65537:		; X64-LABEL: udiv_i64_65537:
; X64: # %bb.0: # %entry		; X64: # %bb.0: # %entry
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movabsq $-281470681808895, %rcx # imm = 0xFFFF0000FFFF0001		; X64-NEXT: movabsq $-281470681808895, %rcx # imm = 0xFFFF0000FFFF0001
; X64-NEXT: mulq %rcx		; X64-NEXT: mulq %rcx
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
[24 lines not shown]
; X64-NEXT: movq %rdx, %rax		; X64-NEXT: movq %rdx, %rax
; X64-NEXT: shrq $3, %rax		; X64-NEXT: shrq $3, %rax
; X64-NEXT: retq		; X64-NEXT: retq
entry:		entry:
%rem = udiv i64 %x, 12		%rem = udiv i64 %x, 12
ret i64 %rem		ret i64 %rem
}		}
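
The revision summary states the precondition for the new expansion as `(1 << (BitWidth / 2)) % Divisor == 1`. The sketch below simply evaluates that condition for the divisors used in these tests; it makes no claim about what the hidden check lines contain, and the function name is an assumption for illustration.

```python
def halves_sum_condition(divisor, bitwidth):
    """True if (1 << (bitwidth / 2)) % divisor == 1, the condition quoted in the summary."""
    return (1 << (bitwidth // 2)) % divisor == 1


if __name__ == "__main__":
    for d in (3, 5, 15, 17, 255, 257, 65535, 65537, 12):
        print(d, halves_sum_condition(d, 64), halves_sum_condition(d, 128))
```

Under this check, 12 does not satisfy the condition for either width, while the other listed divisors do.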

		; Make sure we don't inline expand for optsize.
define i64 @urem_i64_3_optsize(i64 %x) nounwind optsize {		define i64 @urem_i64_3_optsize(i64 %x) nounwind optsize {
; X32-LABEL: urem_i64_3_optsize:		; X32-LABEL: urem_i64_3_optsize:
RKSimon: why did you remove this?
craig.topper (author): I didn't mean to. Not sure what happened there.
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: subl $12, %esp		; X32-NEXT: subl $12, %esp
; X32-NEXT: pushl $0		; X32-NEXT: pushl $0
; X32-NEXT: pushl $3		; X32-NEXT: pushl $3
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: pushl {{[0-9]+}}(%esp)
; X32-NEXT: pushl {{[0-9]+}}(%esp)		; X32-NEXT: pushl {{[0-9]+}}(%esp)
; X32-NEXT: calll __umoddi3		; X32-NEXT: calll __umoddi3
; X32-NEXT: addl $28, %esp		; X32-NEXT: addl $28, %esp
[17 lines not shown]

llvm/test/CodeGen/X86/divmod128.ll

[91 lines not shown]
; WIN64-NEXT: retq
%1 = urem i128 %x, 11		%1 = urem i128 %x, 11
%2 = trunc i128 %1 to i64		%2 = trunc i128 %1 to i64
ret i64 %2		ret i64 %2
}		}

define i64 @udiv128(i128 %x) nounwind {		define i64 @udiv128(i128 %x) nounwind {
; X86-64-LABEL: udiv128:		; X86-64-LABEL: udiv128:
; X86-64: # %bb.0:		; X86-64: # %bb.0:
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rdi, %rsi
; X86-64-NEXT: movl $3, %edx		; X86-64-NEXT: adcq $0, %rsi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-6148914691236517205, %rcx # imm = 0xAAAAAAAAAAAAAAAB
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movq %rsi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,2), %rax
		; X86-64-NEXT: subq %rsi, %rax
		; X86-64-NEXT: addq %rdi, %rax
		; X86-64-NEXT: imulq %rcx, %rax
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv128:		; WIN64-LABEL: udiv128:
; WIN64: # %bb.0:		; WIN64: # %bb.0:
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rcx, %r8
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %r8
; WIN64-NEXT: movq $3, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-6148914691236517205, %r9 # imm = 0xAAAAAAAAAAAAAAAB
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %r8, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: mulq %r9
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: shrq %rdx
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: leaq (%rdx,%rdx,2), %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: subq %r8, %rax
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: addq %rcx, %rax
		; WIN64-NEXT: imulq %r9, %rax
; WIN64-NEXT: retq		; WIN64-NEXT: retq


%1 = udiv i128 %x, 3		%1 = udiv i128 %x, 3
%2 = trunc i128 %1 to i64		%2 = trunc i128 %1 to i64
ret i64 %2		ret i64 %2
}		}

define i128 @urem_i128_3(i128 %x) nounwind {		define i128 @urem_i128_3(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_3:		; X86-64-LABEL: urem_i128_3:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rsi, %rdi
; X86-64-NEXT: movl $3, %edx		; X86-64-NEXT: adcq $0, %rdi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-6148914691236517205, %rcx # imm = 0xAAAAAAAAAAAAAAAB
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,2), %rax
		; X86-64-NEXT: subq %rax, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_3:		; WIN64-LABEL: urem_i128_3:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-6148914691236517205, %rdx # imm = 0xAAAAAAAAAAAAAAAB
; WIN64-NEXT: movq $3, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: shrq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: leaq (%rdx,%rdx,2), %rax
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: subq %rax, %rcx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: movq %xmm0, %rdx
; WIN64-NEXT: addq $72, %rsp
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 3		%rem = urem i128 %x, 3
ret i128 %rem		ret i128 %rem
}		}
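
The `urem_i128_*` expansions follow the remainder half of the summary: add the two 64-bit halves, add the carry back in, then take a 64-bit remainder. A minimal sketch of that arithmetic, with a hypothetical function name; the final `%` stands in for the magic-constant multiply sequence the check lines show.

```python
def urem_by_halves(x, divisor, bitwidth=128):
    """Model of the split urem; requires (1 << (bitwidth // 2)) % divisor == 1."""
    half = bitwidth // 2
    mask = (1 << half) - 1
    if (1 << half) % divisor != 1:
        raise ValueError("divisor does not satisfy the halves-sum condition")
    lo, hi = x & mask, x >> half
    # addq + adcq $0: sum of halves with the carry added back, so it fits in `half` bits.
    total = lo + hi
    s = (total & mask) + (total >> half)
    # Half-width urem; the compiler lowers this further with a multiply by a magic constant.
    return s % divisor


if __name__ == "__main__":
    x = (0x0123456789ABCDEF << 64) | 0xFEDCBA9876543210
    for d in (3, 5, 15, 17, 255, 257, 65535, 65537):
        print(d, urem_by_halves(x, d), x % d)
```

The upper half of the i128 result is always zero here, which matches the `xorl %edx, %edx` in the new check lines.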

define i128 @urem_i128_5(i128 %x) nounwind {		define i128 @urem_i128_5(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_5:		; X86-64-LABEL: urem_i128_5:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rsi, %rdi
; X86-64-NEXT: movl $5, %edx		; X86-64-NEXT: adcq $0, %rdi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-3689348814741910323, %rcx # imm = 0xCCCCCCCCCCCCCCCD
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq $2, %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,4), %rax
		; X86-64-NEXT: subq %rax, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_5:		; WIN64-LABEL: urem_i128_5:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-3689348814741910323, %rdx # imm = 0xCCCCCCCCCCCCCCCD
; WIN64-NEXT: movq $5, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: shrq $2, %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: leaq (%rdx,%rdx,4), %rax
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: subq %rax, %rcx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: movq %xmm0, %rdx
; WIN64-NEXT: addq $72, %rsp
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 5		%rem = urem i128 %x, 5
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_15(i128 %x) nounwind {		define i128 @urem_i128_15(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_15:		; X86-64-LABEL: urem_i128_15:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rsi, %rdi
; X86-64-NEXT: movl $15, %edx		; X86-64-NEXT: adcq $0, %rdi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-8608480567731124087, %rcx # imm = 0x8888888888888889
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq $3, %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,4), %rax
		; X86-64-NEXT: leaq (%rax,%rax,2), %rax
		; X86-64-NEXT: subq %rax, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_15:		; WIN64-LABEL: urem_i128_15:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-8608480567731124087, %rdx # imm = 0x8888888888888889
; WIN64-NEXT: movq $15, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: shrq $3, %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: leaq (%rdx,%rdx,4), %rax
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: leaq (%rax,%rax,2), %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: subq %rax, %rcx
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: addq $72, %rsp
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 15		%rem = urem i128 %x, 15
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_17(i128 %x) nounwind {		define i128 @urem_i128_17(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_17:		; X86-64-LABEL: urem_i128_17:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rsi, %rdi
; X86-64-NEXT: movl $17, %edx		; X86-64-NEXT: adcq $0, %rdi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-1085102592571150095, %rcx # imm = 0xF0F0F0F0F0F0F0F1
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: andq $-16, %rax
		; X86-64-NEXT: shrq $4, %rdx
		; X86-64-NEXT: addq %rax, %rdx
		; X86-64-NEXT: subq %rdx, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_17:		; WIN64-LABEL: urem_i128_17:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-1085102592571150095, %rdx # imm = 0xF0F0F0F0F0F0F0F1
; WIN64-NEXT: movq $17, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: andq $-16, %rax
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: shrq $4, %rdx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: addq %rax, %rdx
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rdx, %rcx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 17		%rem = urem i128 %x, 17
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_255(i128 %x) nounwind {		define i128 @urem_i128_255(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_255:		; X86-64-LABEL: urem_i128_255:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: movl $255, %edx		; X86-64-NEXT: addq %rsi, %rax
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rax
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movabsq $-9187201950435737471, %rcx # imm = 0x8080808080808081
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq $7, %rdx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: shlq $8, %rax
		; X86-64-NEXT: subq %rax, %rdx
		; X86-64-NEXT: addq %rsi, %rdi
		; X86-64-NEXT: adcq %rdx, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_255:		; WIN64-LABEL: urem_i128_255:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rax
; WIN64-NEXT: movq $255, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-9187201950435737471, %rdx # imm = 0x8080808080808081
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: shrq $7, %rdx
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: shlq $8, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rax, %rdx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: addq %rcx, %r8
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: adcq %rdx, %r8
		; WIN64-NEXT: movq %r8, %rax
		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 255		%rem = urem i128 %x, 255
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_257(i128 %x) nounwind {		define i128 @urem_i128_257(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_257:		; X86-64-LABEL: urem_i128_257:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rsi, %rdi
; X86-64-NEXT: movl $257, %edx # imm = 0x101		; X86-64-NEXT: adcq $0, %rdi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-71777214294589695, %rcx # imm = 0xFF00FF00FF00FF01
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: andq $-256, %rax
		; X86-64-NEXT: shrq $8, %rdx
		; X86-64-NEXT: addq %rax, %rdx
		; X86-64-NEXT: subq %rdx, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_257:		; WIN64-LABEL: urem_i128_257:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-71777214294589695, %rdx # imm = 0xFF00FF00FF00FF01
; WIN64-NEXT: movq $257, {{[0-9]+}}(%rsp) # imm = 0x101		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: andq $-256, %rax
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: shrq $8, %rdx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: addq %rax, %rdx
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rdx, %rcx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 257		%rem = urem i128 %x, 257
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_65535(i128 %x) nounwind {		define i128 @urem_i128_65535(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_65535:		; X86-64-LABEL: urem_i128_65535:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: movl $65535, %edx # imm = 0xFFFF		; X86-64-NEXT: addq %rsi, %rax
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rax
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movabsq $-9223231297218904063, %rcx # imm = 0x8000800080008001
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq $15, %rdx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: shlq $16, %rax
		; X86-64-NEXT: subq %rax, %rdx
		; X86-64-NEXT: addq %rsi, %rdi
		; X86-64-NEXT: adcq %rdx, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_65535:		; WIN64-LABEL: urem_i128_65535:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rax
; WIN64-NEXT: movq $65535, {{[0-9]+}}(%rsp) # imm = 0xFFFF		; WIN64-NEXT: adcq $0, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-9223231297218904063, %rdx # imm = 0x8000800080008001
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: shrq $15, %rdx
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: shlq $16, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rax, %rdx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: addq %rcx, %r8
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: adcq %rdx, %r8
		; WIN64-NEXT: movq %r8, %rax
		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 65535		%rem = urem i128 %x, 65535
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_65537(i128 %x) nounwind {		define i128 @urem_i128_65537(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_65537:		; X86-64-LABEL: urem_i128_65537:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: addq %rsi, %rdi
; X86-64-NEXT: movl $65537, %edx # imm = 0x10001		; X86-64-NEXT: adcq $0, %rdi
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: movabsq $-281470681808895, %rcx # imm = 0xFFFF0000FFFF0001
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: andq $-65536, %rax # imm = 0xFFFF0000
		; X86-64-NEXT: shrq $16, %rdx
		; X86-64-NEXT: addq %rax, %rdx
		; X86-64-NEXT: subq %rdx, %rdi
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: xorl %edx, %edx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: urem_i128_65537:		; WIN64-LABEL: urem_i128_65537:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-281470681808895, %rdx # imm = 0xFFFF0000FFFF0001
; WIN64-NEXT: movq $65537, {{[0-9]+}}(%rsp) # imm = 0x10001		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: andq $-65536, %rax # imm = 0xFFFF0000
; WIN64-NEXT: callq __umodti3		; WIN64-NEXT: shrq $16, %rdx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: addq %rax, %rdx
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rdx, %rcx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: xorl %edx, %edx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = urem i128 %x, 65537		%rem = urem i128 %x, 65537
ret i128 %rem		ret i128 %rem
}		}

define i128 @urem_i128_12(i128 %x) nounwind {		define i128 @urem_i128_12(i128 %x) nounwind {
; X86-64-LABEL: urem_i128_12:		; X86-64-LABEL: urem_i128_12:
[23 lines not shown]
entry:		entry:
%rem = urem i128 %x, 12		%rem = urem i128 %x, 12
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_3(i128 %x) nounwind {		define i128 @udiv_i128_3(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_3:		; X86-64-LABEL: udiv_i128_3:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rcx
; X86-64-NEXT: movl $3, %edx		; X86-64-NEXT: addq %rsi, %rcx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rcx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-6148914691236517205, %r8 # imm = 0xAAAAAAAAAAAAAAAB
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: movq %rcx, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: shrq %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,2), %rax
		; X86-64-NEXT: subq %rax, %rcx
		; X86-64-NEXT: subq %rcx, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-6148914691236517206, %rcx # imm = 0xAAAAAAAAAAAAAAAA
		; X86-64-NEXT: imulq %rdi, %rcx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: addq %rcx, %rdx
		; X86-64-NEXT: imulq %rsi, %r8
		; X86-64-NEXT: addq %r8, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_3:		; WIN64-LABEL: udiv_i128_3:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %r9
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq $3, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-6148914691236517205, %r10 # imm = 0xAAAAAAAAAAAAAAAB
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: mulq %r10
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: shrq %rdx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: leaq (%rdx,%rdx,2), %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rax, %rcx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: subq %rcx, %r9
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-6148914691236517206, %rcx # imm = 0xAAAAAAAAAAAAAAAA
		; WIN64-NEXT: imulq %r9, %rcx
		; WIN64-NEXT: movq %r9, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %rcx, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 3		%rem = udiv i128 %x, 3
ret i128 %rem		ret i128 %rem
}		}
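The new udiv_i128_3 sequence above follows the shape described in the revision summary: fold the two 64-bit halves together (adding the carry back in), take a 64-bit remainder, subtract it from the dividend, then multiply by the inverse of the divisor modulo 2^128. The following is only a rough arithmetic model of that idea, not the SelectionDAG code from this patch; the function names are hypothetical and Python integers stand in for the register pairs.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def urem128_by_summing_halves(x, d):
    """Model of the remainder trick; assumes (1 << 64) % d == 1."""
    assert 0 <= x <= MASK128
    assert (1 << 64) % d == 1
    lo = x & MASK64
    hi = x >> 64
    s = lo + hi                    # may need 65 bits (addq)
    s = (s & MASK64) + (s >> 64)   # add the carry back in (adcq $0)
    return s % d                   # a 64-bit urem by a constant


def udiv128_via_inverse(x, d):
    """Model of the division trick built on the remainder above."""
    r = urem128_by_summing_halves(x, d)
    inv = pow(d, -1, 1 << 128)     # modular inverse, Python 3.8+
    # x - r is an exact multiple of d, so multiplying by the inverse
    # modulo 2**128 recovers the quotient.
    return ((x - r) * inv) & MASK128


# The low 64 bits of the inverse of 3 match the movabsq immediate above.
print(hex(pow(3, -1, 1 << 128) & MASK64))
```

The two movabsq constants in the second half of the assembly (0xAAAAAAAAAAAAAAAA and 0xAAAAAAAAAAAAAAAB) appear to be the high and low halves of that 128-bit inverse, which is why the multiply is split into one mulq and two imulq instructions.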

define i128 @udiv_i128_5(i128 %x) nounwind {		define i128 @udiv_i128_5(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_5:		; X86-64-LABEL: udiv_i128_5:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rcx
; X86-64-NEXT: movl $5, %edx		; X86-64-NEXT: addq %rsi, %rcx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rcx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-3689348814741910323, %r8 # imm = 0xCCCCCCCCCCCCCCCD
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: movq %rcx, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: shrq $2, %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,4), %rax
		; X86-64-NEXT: subq %rax, %rcx
		; X86-64-NEXT: subq %rcx, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-3689348814741910324, %rcx # imm = 0xCCCCCCCCCCCCCCCC
		; X86-64-NEXT: imulq %rdi, %rcx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: addq %rcx, %rdx
		; X86-64-NEXT: imulq %rsi, %r8
		; X86-64-NEXT: addq %r8, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_5:		; WIN64-LABEL: udiv_i128_5:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %r9
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq $5, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-3689348814741910323, %r10 # imm = 0xCCCCCCCCCCCCCCCD
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: mulq %r10
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: shrq $2, %rdx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: leaq (%rdx,%rdx,4), %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rax, %rcx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: subq %rcx, %r9
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-3689348814741910324, %rcx # imm = 0xCCCCCCCCCCCCCCCC
		; WIN64-NEXT: imulq %r9, %rcx
		; WIN64-NEXT: movq %r9, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %rcx, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 5		%rem = udiv i128 %x, 5
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_15(i128 %x) nounwind {		define i128 @udiv_i128_15(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_15:		; X86-64-LABEL: udiv_i128_15:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rcx
; X86-64-NEXT: movl $15, %edx		; X86-64-NEXT: addq %rsi, %rcx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rcx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-8608480567731124087, %rdx # imm = 0x8888888888888889
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: movq %rcx, %rax
		; X86-64-NEXT: mulq %rdx
		; X86-64-NEXT: shrq $3, %rdx
		; X86-64-NEXT: leaq (%rdx,%rdx,4), %rax
		; X86-64-NEXT: leaq (%rax,%rax,2), %rax
		; X86-64-NEXT: subq %rax, %rcx
		; X86-64-NEXT: subq %rcx, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-1229782938247303442, %r8 # imm = 0xEEEEEEEEEEEEEEEE
		; X86-64-NEXT: imulq %rdi, %r8
		; X86-64-NEXT: movabsq $-1229782938247303441, %rcx # imm = 0xEEEEEEEEEEEEEEEF
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: addq %r8, %rdx
		; X86-64-NEXT: imulq %rsi, %rcx
		; X86-64-NEXT: addq %rcx, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_15:		; WIN64-LABEL: udiv_i128_15:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %r9
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq $15, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-8608480567731124087, %rdx # imm = 0x8888888888888889
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: shrq $3, %rdx
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: leaq (%rdx,%rdx,4), %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: leaq (%rax,%rax,2), %rax
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: subq %rax, %rcx
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: subq %rcx, %r9
		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-1229782938247303442, %rcx # imm = 0xEEEEEEEEEEEEEEEE
		; WIN64-NEXT: imulq %r9, %rcx
		; WIN64-NEXT: movabsq $-1229782938247303441, %r10 # imm = 0xEEEEEEEEEEEEEEEF
		; WIN64-NEXT: movq %r9, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %rcx, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 15		%rem = udiv i128 %x, 15
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_17(i128 %x) nounwind {		define i128 @udiv_i128_17(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_17:		; X86-64-LABEL: udiv_i128_17:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rcx
; X86-64-NEXT: movl $17, %edx		; X86-64-NEXT: addq %rsi, %rcx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rcx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-1085102592571150095, %r8 # imm = 0xF0F0F0F0F0F0F0F1
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: movq %rcx, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: andq $-16, %rax
		; X86-64-NEXT: shrq $4, %rdx
		; X86-64-NEXT: addq %rax, %rdx
		; X86-64-NEXT: subq %rdx, %rcx
		; X86-64-NEXT: subq %rcx, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-1085102592571150096, %rcx # imm = 0xF0F0F0F0F0F0F0F0
		; X86-64-NEXT: imulq %rdi, %rcx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: addq %rcx, %rdx
		; X86-64-NEXT: imulq %rsi, %r8
		; X86-64-NEXT: addq %r8, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_17:		; WIN64-LABEL: udiv_i128_17:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %r9
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq $17, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-1085102592571150095, %r10 # imm = 0xF0F0F0F0F0F0F0F1
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: mulq %r10
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: andq $-16, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: shrq $4, %rdx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: addq %rax, %rdx
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: subq %rdx, %rcx
		; WIN64-NEXT: subq %rcx, %r9
		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-1085102592571150096, %rcx # imm = 0xF0F0F0F0F0F0F0F0
		; WIN64-NEXT: imulq %r9, %rcx
		; WIN64-NEXT: movq %r9, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %rcx, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 17		%rem = udiv i128 %x, 17
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_255(i128 %x) nounwind {		define i128 @udiv_i128_255(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_255:		; X86-64-LABEL: udiv_i128_255:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: movl $255, %edx		; X86-64-NEXT: addq %rsi, %rax
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rax
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-9187201950435737471, %rcx # imm = 0x8080808080808081
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq $7, %rdx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: shlq $8, %rax
		; X86-64-NEXT: subq %rax, %rdx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: addq %rsi, %rax
		; X86-64-NEXT: adcq %rdx, %rax
		; X86-64-NEXT: subq %rax, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-72340172838076674, %r8 # imm = 0xFEFEFEFEFEFEFEFE
		; X86-64-NEXT: imulq %rdi, %r8
		; X86-64-NEXT: movabsq $-72340172838076673, %rcx # imm = 0xFEFEFEFEFEFEFEFF
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: addq %r8, %rdx
		; X86-64-NEXT: imulq %rsi, %rcx
		; X86-64-NEXT: addq %rcx, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_255:		; WIN64-LABEL: udiv_i128_255:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rax
; WIN64-NEXT: movq $255, {{[0-9]+}}(%rsp)		; WIN64-NEXT: adcq $0, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-9187201950435737471, %rdx # imm = 0x8080808080808081
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: shrq $7, %rdx
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: shlq $8, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rax, %rdx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: addq %r8, %rax
		; WIN64-NEXT: adcq %rdx, %rax
		; WIN64-NEXT: subq %rax, %rcx
		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-72340172838076674, %r9 # imm = 0xFEFEFEFEFEFEFEFE
		; WIN64-NEXT: imulq %rcx, %r9
		; WIN64-NEXT: movabsq $-72340172838076673, %r10 # imm = 0xFEFEFEFEFEFEFEFF
		; WIN64-NEXT: movq %rcx, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %r9, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 255		%rem = udiv i128 %x, 255
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_257(i128 %x) nounwind {		define i128 @udiv_i128_257(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_257:		; X86-64-LABEL: udiv_i128_257:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rcx
; X86-64-NEXT: movl $257, %edx # imm = 0x101		; X86-64-NEXT: addq %rsi, %rcx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rcx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-71777214294589695, %r8 # imm = 0xFF00FF00FF00FF01
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: movq %rcx, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: andq $-256, %rax
		; X86-64-NEXT: shrq $8, %rdx
		; X86-64-NEXT: addq %rax, %rdx
		; X86-64-NEXT: subq %rdx, %rcx
		; X86-64-NEXT: subq %rcx, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-71777214294589696, %rcx # imm = 0xFF00FF00FF00FF00
		; X86-64-NEXT: imulq %rdi, %rcx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: addq %rcx, %rdx
		; X86-64-NEXT: imulq %rsi, %r8
		; X86-64-NEXT: addq %r8, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_257:		; WIN64-LABEL: udiv_i128_257:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %r9
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rcx
; WIN64-NEXT: movq $257, {{[0-9]+}}(%rsp) # imm = 0x101		; WIN64-NEXT: adcq $0, %rcx
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-71777214294589695, %r10 # imm = 0xFF00FF00FF00FF01
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: mulq %r10
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: andq $-256, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: shrq $8, %rdx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: addq %rax, %rdx
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: subq %rdx, %rcx
		; WIN64-NEXT: subq %rcx, %r9
		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-71777214294589696, %rcx # imm = 0xFF00FF00FF00FF00
		; WIN64-NEXT: imulq %r9, %rcx
		; WIN64-NEXT: movq %r9, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %rcx, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 257		%rem = udiv i128 %x, 257
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_65535(i128 %x) nounwind {		define i128 @udiv_i128_65535(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_65535:		; X86-64-LABEL: udiv_i128_65535:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: movq %rdi, %rax
; X86-64-NEXT: movl $65535, %edx # imm = 0xFFFF		; X86-64-NEXT: addq %rsi, %rax
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: adcq $0, %rax
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: movabsq $-9223231297218904063, %rcx # imm = 0x8000800080008001
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: shrq $15, %rdx
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: shlq $16, %rax
		; X86-64-NEXT: subq %rax, %rdx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: addq %rsi, %rax
		; X86-64-NEXT: adcq %rdx, %rax
		; X86-64-NEXT: subq %rax, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-281479271743490, %r8 # imm = 0xFFFEFFFEFFFEFFFE
		; X86-64-NEXT: imulq %rdi, %r8
		; X86-64-NEXT: movabsq $-281479271743489, %rcx # imm = 0xFFFEFFFEFFFEFFFF
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %rcx
		; X86-64-NEXT: addq %r8, %rdx
		; X86-64-NEXT: imulq %rsi, %rcx
		; X86-64-NEXT: addq %rcx, %rdx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_65535:		; WIN64-LABEL: udiv_i128_65535:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: movq %rdx, %r8
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: addq %rdx, %rax
; WIN64-NEXT: movq $65535, {{[0-9]+}}(%rsp) # imm = 0xFFFF		; WIN64-NEXT: adcq $0, %rax
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movabsq $-9223231297218904063, %rdx # imm = 0x8000800080008001
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: mulq %rdx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: shrq $15, %rdx
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: movq %rdx, %rax
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: shlq $16, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: subq %rax, %rdx
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %rcx, %rax
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: addq %r8, %rax
		; WIN64-NEXT: adcq %rdx, %rax
		; WIN64-NEXT: subq %rax, %rcx
		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-281479271743490, %r9 # imm = 0xFFFEFFFEFFFEFFFE
		; WIN64-NEXT: imulq %rcx, %r9
		; WIN64-NEXT: movabsq $-281479271743489, %r10 # imm = 0xFFFEFFFEFFFEFFFF
		; WIN64-NEXT: movq %rcx, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %r9, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 65535		%rem = udiv i128 %x, 65535
ret i128 %rem		ret i128 %rem
}		}

define i128 @udiv_i128_65537(i128 %x) nounwind {		define i128 @udiv_i128_65537(i128 %x) nounwind {
; X86-64-LABEL: udiv_i128_65537:		; X86-64-LABEL: udiv_i128_65537:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
		; X86-64-NEXT: movq %rdi, %rcx
		; X86-64-NEXT: addq %rsi, %rcx
		; X86-64-NEXT: adcq $0, %rcx
		; X86-64-NEXT: movabsq $-281470681808895, %r8 # imm = 0xFFFF0000FFFF0001
		; X86-64-NEXT: movq %rcx, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: movq %rdx, %rax
		; X86-64-NEXT: andq $-65536, %rax # imm = 0xFFFF0000
		; X86-64-NEXT: shrq $16, %rdx
		; X86-64-NEXT: addq %rax, %rdx
		; X86-64-NEXT: subq %rdx, %rcx
		; X86-64-NEXT: subq %rcx, %rdi
		; X86-64-NEXT: sbbq $0, %rsi
		; X86-64-NEXT: movabsq $-281470681808896, %rcx # imm = 0xFFFF0000FFFF0000
		; X86-64-NEXT: imulq %rdi, %rcx
		; X86-64-NEXT: movq %rdi, %rax
		; X86-64-NEXT: mulq %r8
		; X86-64-NEXT: addq %rcx, %rdx
		; X86-64-NEXT: imulq %rsi, %r8
		; X86-64-NEXT: addq %r8, %rdx
		; X86-64-NEXT: retq
		;
		; WIN64-LABEL: udiv_i128_65537:
		; WIN64: # %bb.0: # %entry
		; WIN64-NEXT: movq %rdx, %r8
		; WIN64-NEXT: movq %rcx, %r9
		; WIN64-NEXT: addq %rdx, %rcx
		; WIN64-NEXT: adcq $0, %rcx
		; WIN64-NEXT: movabsq $-281470681808895, %r10 # imm = 0xFFFF0000FFFF0001
		; WIN64-NEXT: movq %rcx, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: movq %rdx, %rax
		; WIN64-NEXT: andq $-65536, %rax # imm = 0xFFFF0000
		; WIN64-NEXT: shrq $16, %rdx
		; WIN64-NEXT: addq %rax, %rdx
		; WIN64-NEXT: subq %rdx, %rcx
		; WIN64-NEXT: subq %rcx, %r9
		; WIN64-NEXT: sbbq $0, %r8
		; WIN64-NEXT: movabsq $-281470681808896, %rcx # imm = 0xFFFF0000FFFF0000
		; WIN64-NEXT: imulq %r9, %rcx
		; WIN64-NEXT: movq %r9, %rax
		; WIN64-NEXT: mulq %r10
		; WIN64-NEXT: addq %rcx, %rdx
		; WIN64-NEXT: imulq %r10, %r8
		; WIN64-NEXT: addq %r8, %rdx
		; WIN64-NEXT: retq
		entry:
		%rem = udiv i128 %x, 65537
		ret i128 %rem
		}

		define i128 @udiv_i128_12(i128 %x) nounwind {
		; X86-64-LABEL: udiv_i128_12:
		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: pushq %rax
; X86-64-NEXT: movl $65537, %edx # imm = 0x10001		; X86-64-NEXT: movl $12, %edx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: xorl %ecx, %ecx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: callq __udivti3@PLT
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: popq %rcx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_65537:		; WIN64-LABEL: udiv_i128_12:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: subq $72, %rsp
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)
; WIN64-NEXT: movq $65537, {{[0-9]+}}(%rsp) # imm = 0x10001		; WIN64-NEXT: movq $12, {{[0-9]+}}(%rsp)
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx		; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: callq __udivti3
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: movq %xmm0, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %xmm0, %rdx
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: addq $72, %rsp
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 65537		%rem = udiv i128 %x, 12
ret i128 %rem		ret i128 %rem
}		}
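In this diff udiv_i128_12 still calls __udivti3 on both sides, while the other divisors are expanded inline. That is consistent with the condition quoted in the revision summary, (1 << (BitWidth / 2)) % Divisor == 1. A small sketch of that check for the divisors used in this test file follows; the helper name is hypothetical and this is not the predicate as written in TargetLowering.cpp.

```python
HALF_BITS = 64  # half of the i128 width used in these tests


def can_sum_halves(divisor, half_bits=HALF_BITS):
    """True when 2**half_bits leaves remainder 1 modulo the divisor."""
    return divisor > 1 and (1 << half_bits) % divisor == 1


for d in (3, 5, 12, 15, 17, 255, 257, 65535, 65537):
    print(d, (1 << HALF_BITS) % d, can_sum_halves(d))
```

For 12 the remainder of 2^64 is 4 rather than 1, so the halves cannot simply be added together.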

define i128 @udiv_i128_12(i128 %x) nounwind {		; Make sure we don't inline expand for minsize.
; X86-64-LABEL: udiv_i128_12:		define i128 @urem_i128_3_minsize(i128 %x) nounwind minsize {
		RKSimon (inline comment): pre-commit this? What about optsize, should that expand, do you think?
		; X86-64-LABEL: urem_i128_3_minsize:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: pushq %rax
; X86-64-NEXT: movl $12, %edx		; X86-64-NEXT: pushq $3
		; X86-64-NEXT: popq %rdx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: xorl %ecx, %ecx
; X86-64-NEXT: callq __udivti3@PLT		; X86-64-NEXT: callq __umodti3@PLT
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: popq %rcx
; X86-64-NEXT: retq		; X86-64-NEXT: retq
;		;
; WIN64-LABEL: udiv_i128_12:		; WIN64-LABEL: urem_i128_3_minsize:
; WIN64: # %bb.0: # %entry		; WIN64: # %bb.0: # %entry
; WIN64-NEXT: subq $72, %rsp		; WIN64-NEXT: subq $72, %rsp
; WIN64-NEXT: movq %rdx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rax
; WIN64-NEXT: movq %rcx, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rdx, 8(%rax)
; WIN64-NEXT: movq $12, {{[0-9]+}}(%rsp)		; WIN64-NEXT: movq %rcx, (%rax)
; WIN64-NEXT: movq $0, {{[0-9]+}}(%rsp)
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rcx
; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx		; WIN64-NEXT: leaq {{[0-9]+}}(%rsp), %rdx
; WIN64-NEXT: callq __udivti3		; WIN64-NEXT: movq $3, (%rdx)
		; WIN64-NEXT: andq $0, 8(%rdx)
		; WIN64-NEXT: movq %rax, %rcx
		; WIN64-NEXT: callq __umodti3
; WIN64-NEXT: movq %xmm0, %rax		; WIN64-NEXT: movq %xmm0, %rax
; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]		; WIN64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,2,3]
; WIN64-NEXT: movq %xmm0, %rdx		; WIN64-NEXT: movq %xmm0, %rdx
; WIN64-NEXT: addq $72, %rsp		; WIN64-NEXT: addq $72, %rsp
; WIN64-NEXT: retq		; WIN64-NEXT: retq
entry:		entry:
%rem = udiv i128 %x, 12		%rem = urem i128 %x, 3
ret i128 %rem		ret i128 %rem
}		}

		; Make sure we don't inline expand for optsize.
define i128 @urem_i128_3_optsize(i128 %x) nounwind optsize {		define i128 @urem_i128_3_optsize(i128 %x) nounwind optsize {
; X86-64-LABEL: urem_i128_3_optsize:		; X86-64-LABEL: urem_i128_3_optsize:
; X86-64: # %bb.0: # %entry		; X86-64: # %bb.0: # %entry
; X86-64-NEXT: pushq %rax		; X86-64-NEXT: pushq %rax
; X86-64-NEXT: movl $3, %edx		; X86-64-NEXT: movl $3, %edx
; X86-64-NEXT: xorl %ecx, %ecx		; X86-64-NEXT: xorl %ecx, %ecx
; X86-64-NEXT: callq __umodti3@PLT		; X86-64-NEXT: callq __umodti3@PLT
; X86-64-NEXT: popq %rcx		; X86-64-NEXT: popq %rcx
[22 lines collapsed in the diff view]

Diff 459515