This is an archive of the discontinued LLVM Phabricator instance.

Differential D71550

[Intrinsic] Add fixed point saturating division intrinsics.
Closed, Public

Authored by ebevhan on Dec 16 2019, 7:32 AM.

Details

Reviewers

bjope
leonardchan
craig.topper

Commits

rG6e561d1c94ed: [Intrinsic] Add fixed point saturating division intrinsics.

Summary

This patch adds intrinsics and ISelDAG nodes for signed
and unsigned fixed-point division:

llvm.sdiv.fix.sat.*
llvm.udiv.fix.sat.*

These intrinsics perform scaled, saturating division
on two integers or vectors of integers. They are
required for the implementation of the Embedded-C
fixed-point arithmetic in Clang.
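
The behaviour summarised above can be modelled outside LLVM. The following is a minimal sketch, not code from the patch: `sdiv_fix_sat_model` and `udiv_fix_sat_model` are hypothetical names, operands are assumed to be at most 32 bits wide so the scaled dividend fits in 64 bits, and the sketch rounds toward negative infinity even though the intrinsics leave the rounding direction unspecified.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model, not LLVM code. `width` is the operand bit
// width (assumed <= 32) and `scale` is the number of fractional bits.
int64_t sdiv_fix_sat_model(int64_t a, int64_t b, unsigned scale, unsigned width) {
    assert(b != 0);  // a zero divisor is undefined behaviour for the intrinsic
    assert(width >= 2 && width <= 32 && scale < width);
    const int64_t max = (int64_t{1} << (width - 1)) - 1;
    const int64_t min = -max - 1;
    // Pre-scale the dividend so the quotient keeps `scale` fractional bits.
    const int64_t num = a * (int64_t{1} << scale);
    int64_t q = num / b;  // C++ division truncates toward zero
    if (num % b != 0 && ((num < 0) != (b < 0)))
        --q;              // adjust so the result rounds toward -infinity
    if (q > max) return max;
    if (q < min) return min;
    return q;
}

uint64_t udiv_fix_sat_model(uint64_t a, uint64_t b, unsigned scale, unsigned width) {
    assert(b != 0);
    assert(width >= 1 && width <= 32 && scale <= width);
    const uint64_t max = (uint64_t{1} << width) - 1;
    const uint64_t q = (a << scale) / b;  // unsigned division rounds down
    return q > max ? max : q;
}
```

With `width = 4` the model agrees with the i4 examples in the LangRef text of this revision, for instance `sdiv_fix_sat_model(-8, -1, 0, 4)` clamps to 7 and `udiv_fix_sat_model(8, 2, 2, 4)` clamps to 15.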

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ebevhan created this revision. Dec 16 2019, 7:32 AM

Herald added a project: Restricted Project. Dec 16 2019, 7:32 AM

Herald added subscribers: llvm-commits, jdoerfert, hiraditya.

ebevhan added a parent revision: D70007: [Intrinsic] Add fixed point division intrinsics. Dec 16 2019, 7:33 AM

Harbormaster completed remote builds in B42556: Diff 234060. Dec 16 2019, 7:39 AM

No need to do the promotion in DAGBuilder for unsigned saturating 0-scale operations, they cannot experience overflow.
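
A small exhaustive check illustrates the claim for 8-bit operands. This is only a sketch under the assumption that the unsigned operation computes `(a << scale) / b`; `count_unsigned_saturations` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Counts the 8-bit operand pairs (a, b), b != 0, whose unsigned fixed-point
// quotient (a << scale) / b does not fit in 8 bits and would be clamped.
unsigned count_unsigned_saturations(unsigned scale) {
    unsigned count = 0;
    for (uint32_t a = 0; a <= 0xFF; ++a)
        for (uint32_t b = 1; b <= 0xFF; ++b)
            if (((a << scale) / b) > 0xFF)
                ++count;
    return count;
}
```

At scale 0 the quotient can never exceed the dividend, so the count is zero; any non-zero scale already produces pairs that need clamping.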

Harbormaster completed remote builds in B42560: Diff 234066. Dec 16 2019, 8:06 AM

Ka-Ka added a subscriber: Ka-Ka. Dec 16 2019, 11:16 PM

When promoting to a legal type, we must shift up the value for saturating divisions.
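
The reason for the shift can be sketched with an i4 operation carried out in i8. The code below is an illustrative model under stated assumptions, not the patch's implementation: all function names are hypothetical, rounding is toward negative infinity, and the arithmetic shift right is modelled as a floor division.

```cpp
#include <cassert>
#include <cstdint>

// Floor division; also used here to model an arithmetic shift right.
int64_t floor_div(int64_t n, int64_t d) {
    int64_t q = n / d;
    if (n % d != 0 && ((n < 0) != (d < 0)))
        --q;
    return q;
}

// Hypothetical model of signed saturating fixed-point division at `width` bits.
int64_t sdiv_fix_sat(int64_t a, int64_t b, unsigned scale, unsigned width) {
    const int64_t max = (int64_t{1} << (width - 1)) - 1;
    const int64_t min = -max - 1;
    const int64_t q = floor_div(a * (int64_t{1} << scale), b);
    return q > max ? max : (q < min ? min : q);
}

// i4 operands used directly in an i8 operation: the result only saturates
// at the i8 bounds, which is too late for an i4 result.
int64_t promoted_naive(int64_t a, int64_t b, unsigned scale) {
    return sdiv_fix_sat(a, b, scale, 8);
}

// Shift the dividend up by the width difference so that i8 saturation lines
// up with the i4 bounds, then shift the result back down.
int64_t promoted_shifted(int64_t a, int64_t b, unsigned scale) {
    const int64_t shift = int64_t{1} << (8 - 4);
    const int64_t wide = sdiv_fix_sat(a * shift, b, scale, 8);
    return floor_div(wide, shift);
}

// Exhaustive comparison of the shifted form against a direct i4 model.
bool shifted_matches_i4() {
    for (int64_t a = -8; a <= 7; ++a)
        for (int64_t b = -8; b <= 7; ++b)
            for (unsigned scale = 0; scale < 4; ++scale)
                if (b != 0 &&
                    promoted_shifted(a, b, scale) != sdiv_fix_sat(a, b, scale, 4))
                    return false;
    return true;
}
```

For `-8 / -1` at scale 0, the naive form returns 8, which is not a valid i4 value, while the shifted form returns the saturated 7.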

Harbormaster completed remote builds in B42721: Diff 234498. Dec 18 2019, 3:26 AM

Ping.

Rebased.

Harbormaster completed remote builds in B43503: Diff 236803. Jan 8 2020, 5:57 AM

LGTM for the code so far. Just want to confirm on my side that the codegen works as intended also.

bjope added inline comments. Jan 14 2020, 12:08 AM

llvm/docs/LangRef.rst
14368	Typo: divison/division

Fixed typo.

Harbormaster completed remote builds in B44578: Diff 239558. Jan 22 2020, 6:05 AM

ebevhan marked an inline comment as done. Jan 22 2020, 6:08 AM

It seems that I run into LLVM ERROR: Unsupported library call operation! if I call llvm.sdiv.fix.sat.i64 with a scale of 63:

declare i64 @llvm.sdiv.fix.sat.i64(i64, i64, i32)

define i64 @sdiv_sat_i64_s63(i64 %x, i64 %y) {
  %tmp = call i64 @llvm.sdiv.fix.sat.i64(i64 %x, i64 %y, i32 63)
  ret i64 %tmp
}

This seems to be the only issue I could find from running the code.

Rebased.

Harbormaster completed remote builds in B44675: Diff 239792. Jan 23 2020, 12:39 AM

In D71550#1835003, @leonardchan wrote:
It seems that I run into LLVM ERROR: Unsupported library call operation! if I call llvm.sdiv.fix.sat.i64 with a scale of 63:
define i64 @sdiv_sat_i64_s63(i64 %x, i64 %y) {
  %tmp = call i64 @llvm.sdiv.fix.sat.i64(i64 %x, i64 %y, i32 63)
  ret i64 %tmp;
}
This seems to be the only issue I could find from running the code.

Ah, that's unfortunate... It's a consequence of being forced to handle the integer division overflow case gracefully.

With an i64, we get a sext+shl and an i65 after building. During the initial promotion we try to promote to i128. That gives us 63 bits of redundant sign (128-65), but in order to handle the case of MSB / ALL1 (MIN / -EPS) without invoking undefined behavior in the integer division, we require an extra bit. Since we don't have that bit to spare, we end up not being able to do the operation in i128. Then we get an i256 sdiv and there's no codegen for that.

I'm unsure what to do to make this look good. It would be convenient to skip the operation altogether in the overflow case, but we can't generate control flow here. I could check for the overflow case and fudge the inputs/result to prevent the overflow, but that feels sort of wrong. Might be the only viable option, though.

Ah I see. Could you clarify though on how handling MIN / -EPS invokes undefined behavior?

Also, if handling this UB requires an extra bit, it seems odd that we would add one a second time. I could be wrong, but isn't the reason for adding the 1st extra bit (i64 -> i65) just to force the operands to be promoted? If so, then by the time we check LHSLead + RHSTrail < Scale + (unsigned)(Saturating && Signed) in expandFixedPointDiv, we would've already gotten the extra bit from the initial promotion. I feel like we should only need to promote once when doing fixed point division.

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
897	nit: align the arguments

Ah I see. Could you clarify though on how handling MIN / -EPS invokes undefined behavior?

It's the same as if we had MIN / -1 in integer arithmetic. The result is MAX+1, so we get overflow, which is UB. The same case happens for our lowering of scaled division, since if we have MIN / -EPS (-EPS is -1 in the integer representation), during widening/promotion we may potentially shift up the LHS by the amount widened, giving us MIN / -EPS again, which causes overflow when we do the integer division.

This doesn't matter for non-saturating division since the operation overflows there as well, but for saturating division we need to check the result and saturate, but the div instruction will cause an exception so we never get to that stage.
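
The hazard has a direct analogue in plain C++. The sketch below is an illustration of the idea only, not the patch's lowering: `sdiv_sat_widened` is a hypothetical helper that performs the division in a type with spare range, so that `MIN / -1` yields a representable quotient which can then be clamped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// INT32_MIN / -1 is undefined in C++ because the true quotient, 2^31, is not
// representable in int32_t. Dividing in a wider type makes the quotient
// representable, so the saturation check can actually run.
int32_t sdiv_sat_widened(int32_t n, int32_t d) {
    assert(d != 0);
    const int64_t q = static_cast<int64_t>(n) / static_cast<int64_t>(d);
    const int64_t lo = std::numeric_limits<int32_t>::min();
    const int64_t hi = std::numeric_limits<int32_t>::max();
    return static_cast<int32_t>(std::min(std::max(q, lo), hi));
}
```

The clamp uses only min/max operations, which matches the constraint mentioned in the thread that the expansion cannot introduce control flow.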

Also, if handling this UB requires an extra bit, it seems odd that we would add one a second time. I could be wrong, but isn't the reason for adding the 1st extra bit (i64 -> i65) just to force the operands to be promoted? If so, then by the time we check LHSLead + RHSTrail < Scale + (unsigned)(Saturating && Signed) in expandFixedPointDiv, we would've already gotten the extra bit from the initial promotion.

Yes, we add the extra bit to force promotion, but we have to shift up and consume that bit in order to preserve the saturating behavior (SelectionDAGBuilder.cpp:5477) so we don't win any leading sign bits there. We only get those when we do the promotion, but we are one bit short there because of the pre-promotion, so we end up having to widen again. expandFixedPointDiv (during promotion) and earlyExpandDIVFIX do not need to do this extra shifting since they handle saturation directly.

The premature promotion really makes fixed-point division overly complicated.

I feel like we should only need to promote once when doing fixed point division.

You are right in that the pre-promotion causes us to lose out on a bit. It would maybe be possible to skip the aforementioned one-bit shift, but then we have to tell promotion that we need one extra bit of saturation, and I don't know how to effectively propagate that information.
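
The bit accounting in this exchange can be written down as a small model. This is a sketch under assumptions: the function names are hypothetical, the comparison mirrors the `LHSLead + RHSTrail < Scale + (unsigned)(Saturating && Signed)` expression quoted in the thread, and the divisor is assumed to have no known trailing zero bits.

```cpp
#include <cassert>

// Hypothetical model of the headroom test quoted in the thread: the division
// fits in the current type only if the known redundant bits cover the scale,
// plus one extra bit for the signed saturating case.
bool fitsWithoutWidening(unsigned lhsLead, unsigned rhsTrail, unsigned scale,
                         bool isSigned, bool saturating) {
    const unsigned needed = scale + ((saturating && isSigned) ? 1u : 0u);
    return lhsLead + rhsTrail >= needed;
}

// Redundant sign bits after promoting a value of `valueBits` bits to
// `promotedBits` bits, following the thread's "128-65" arithmetic.
unsigned redundantSignBits(unsigned promotedBits, unsigned valueBits) {
    return promotedBits - valueBits;
}
```

For the reported case (i64 operands, scale 63) the i65 value promoted to i128 leaves 63 redundant bits, while the signed saturating form needs 64, so the model reports that i128 is not enough; a scale of 62, or the non-saturating form at scale 63, would fit.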

In D71550#1880727, @ebevhan wrote:

Ah I see. Could you clarify though on how handling MIN / -EPS invokes undefined behavior?

It's the same as if we had MIN / -1 in integer arithmetic. The result is MAX+1, so we get overflow, which is UB. The same case happens for our lowering of scaled division, since if we have MIN / -EPS (-EPS is -1 in the integer representation), during widening/promotion we may potentially shift up the LHS by the amount widened, giving us MIN / -EPS again, which causes overflow when we do the integer division.

This doesn't matter for non-saturating division since the operation overflows there as well, but for saturating division we need to check the result and saturate, but the div instruction will cause an exception so we never get to that stage.

Also, if handling this UB requires an extra bit, it seems odd that we would add one a second time. I could be wrong, but isn't the reason for adding the 1st extra bit (i64 -> i65) just to force the operands to be promoted? If so, then by the time we check LHSLead + RHSTrail < Scale + (unsigned)(Saturating && Signed) in expandFixedPointDiv, we would've already gotten the extra bit from the initial promotion.

Yes, we add the extra bit to force promotion, but we have to shift up and consume that bit in order to preserve the saturating behavior (SelectionDAGBuilder.cpp:5477) so we don't win any leading sign bits there. We only get those when we do the promotion, but we are one bit short there because of the pre-promotion, so we end up having to widen again. expandFixedPointDiv (during promotion) and earlyExpandDIVFIX do not need to do this extra shifting since they handle saturation directly.

The premature promotion really makes fixed-point division overly complicated.

I feel like we should only need to promote once when doing fixed point division.

You are right in that the pre-promotion causes us to lose out on a bit. It would maybe be possible to skip the aforementioned one-bit shift, but then we have to tell promotion that we need one extra bit of saturation, and I don't know how to effectively propagate that information.

Thanks for the clarification. Yeah, this is unfortunate. Not being able to do s.63 saturating division kind of leaves a bad taste in my mouth, but for the purposes of implementing the standard, *I don't think* there's an instance where division for the proposed types can result in this (at least on a typical desktop processor). The main issues I can see down the road are frontend related, like if we perhaps will want to add something like long long _Fract that has a signed 64 bit width, or other frontends want to have types of varying scale and width. For now though, this still seems to work for our purposes and I also can't think of anything better sooo...LGTM.

This revision is now accepted and ready to land. Feb 20 2020, 11:57 AM

In D71550#1885169, @leonardchan wrote:

Not being able to do s.63 saturating division kind of leaves a bad taste in my mouth, but for the purposes of implementing the standard, *I don't think* there's an instance where division for the proposed types can result in this (at least on a typical desktop processor). The main issues I can see down the road are frontend related, like if we perhaps will want to add something like long long _Fract that has a signed 64 bit width, or other frontends want to have types of varying scale and width.

Yes, it's pretty unfortunate that it's not as general as it could be. The largest (default) scale that Clang will output for the existing fixed-point types is 32, so I don't think it will hit the limit there. The long long _Fract case is compelling but we'll see if we ever get there.

Closed by commit rG6e561d1c94ed: [Intrinsic] Add fixed point saturating division intrinsics. (authored by ebevhan). Feb 24 2020, 1:56 AM

This revision was automatically updated to reflect the committed changes.

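Revision Contents

Path                                                      Size
llvm/docs/LangRef.rst                                     130 lines
llvm/include/llvm/CodeGen/ISDOpcodes.h                    5 lines
llvm/include/llvm/CodeGen/TargetLowering.h                4 lines
llvm/include/llvm/IR/Intrinsics.td                        8 lines
llvm/include/llvm/Target/TargetSelectionDAG.td            2 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp             6 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    130 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp       17 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp     4 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp     27 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp      2 lines
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp          19 lines
llvm/lib/CodeGen/TargetLoweringBase.cpp                   2 lines
llvm/lib/IR/Verifier.cpp                                  6 lines
llvm/test/CodeGen/X86/sdiv_fix_sat.ll                     1411 lines
llvm/test/CodeGen/X86/udiv_fix_sat.ll                     528 lines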

Diff 246158

llvm/docs/LangRef.rst

@@ .. code-block:: llvm
       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
       %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
       %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
 
       ; The result in the following could be rounded up to 1 or down to 0.5
       %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
 
 
+'``llvm.sdiv.fix.sat.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
+on any integer bit width or vectors of integers.
+
+::
+
+      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
+      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
+      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
+      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
+
+Overview
+"""""""""
+
+The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
+fixed point saturation division on 2 arguments of the same scale.
+
+Arguments
+""""""""""
+
+The arguments (%a and %b) and the result may be of integer types of any bit
+width, but they must have the same bit width. ``%a`` and ``%b`` are the two
+values that will undergo signed fixed point division. The argument
+``%scale`` represents the scale of both operands, and must be a constant
+integer.
+
+Semantics:
+""""""""""
+
+This operation performs fixed point division on the 2 arguments of a
+specified scale. The result will also be returned in the same scale specified
+in the third argument.
+
+If the result value cannot be precisely represented in the given scale, the
+value is rounded up or down to the closest representable value. The rounding
+direction is unspecified.
+
+The maximum value this operation can clamp to is the largest signed value
+representable by the bit width of the first 2 arguments. The minimum value is the
+smallest signed value representable by this bit width.
+
+It is undefined behavior if the second argument is zero.
+
+
+Examples
+"""""""""
+
+.. code-block:: llvm
+
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
+
+      ; The result in the following could be rounded up to 1 or down to 0.5
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
+
+      ; Saturation
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0) ; %res = 7 (-8 / -1 = 8 => 7)
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2) ; %res = 7 (1 / 0.5 = 2 => 1.75)
+      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2) ; %res = -8 (-1 / 0.25 = -4 => -2)
+
+
+'``llvm.udiv.fix.sat.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
+on any integer bit width or vectors of integers.
+
+::
+
+      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
+      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
+      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
+      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
+
+Overview
+"""""""""
+
+The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
+fixed point saturation division on 2 arguments of the same scale.
+
+Arguments
+""""""""""
+
+The arguments (%a and %b) and the result may be of integer types of any bit
+width, but they must have the same bit width. ``%a`` and ``%b`` are the two
+values that will undergo unsigned fixed point division. The argument
+``%scale`` represents the scale of both operands, and must be a constant
+integer.
+
+Semantics:
+""""""""""
+
+This operation performs fixed point division on the 2 arguments of a
+specified scale. The result will also be returned in the same scale specified
+in the third argument.
+
+If the result value cannot be precisely represented in the given scale, the
+value is rounded up or down to the closest representable value. The rounding
+direction is unspecified.
+
+The maximum value this operation can clamp to is the largest unsigned value
+representable by the bit width of the first 2 arguments. The minimum value is the
+smallest unsigned value representable by this bit width (zero).
+
+It is undefined behavior if the second argument is zero.
+
+Examples
+"""""""""
+
+.. code-block:: llvm
+
+      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
+      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
+
+      ; The result in the following could be rounded down to 0.5 or up to 1
+      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 1 (or 2) (1.5 / 2 = 0.75)
+
+      ; Saturation
+      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2) ; %res = 15 (2 / 0.5 = 4 => 3.75)
+
+
 Specialised Arithmetic Intrinsics
 ---------------------------------
 
 .. _i_intr_llvm_canonicalize:
 
 '``llvm.canonicalize.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

llvm/include/llvm/CodeGen/ISDOpcodes.h

@@ enum NodeType {
     SMULFIXSAT, UMULFIXSAT,
 
     /// RESULT = [US]DIVFIX(LHS, RHS, SCALE) - Perform fixed point division on
     /// 2 integers with the same width and scale. SCALE represents the scale
     /// of both operands as fixed point numbers. This SCALE parameter must be a
     /// constant integer.
     SDIVFIX, UDIVFIX,
 
+    /// Same as the corresponding unsaturated fixed point instructions, but the
+    /// result is clamped between the min and max values representable by the
+    /// bits of the first 2 operands.
+    SDIVFIXSAT, UDIVFIXSAT,
+
     /// Simple binary floating point operators.
     FADD, FSUB, FMUL, FDIV, FREM,
 
     /// Constrained versions of the binary floating point operators.
     /// These will be lowered to the simple operators before final selection.
     /// They are used to limit optimizations while the DAG is being
     /// optimized.
     STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,

llvm/include/llvm/CodeGen/TargetLowering.h

@@ LegalizeAction getFixedPointOperationAction(unsigned Op, EVT VT,
     switch (Op) {
     default:
       llvm_unreachable("Unexpected fixed point operation.");
     case ISD::SMULFIX:
     case ISD::SMULFIXSAT:
     case ISD::UMULFIX:
     case ISD::UMULFIXSAT:
     case ISD::SDIVFIX:
+    case ISD::SDIVFIXSAT:
     case ISD::UDIVFIX:
+    case ISD::UDIVFIXSAT:
       Supported = isSupportedFixedPointOperation(Op, VT, Scale);
       break;
     }
 
     return Supported ? Action : Expand;
   }
 
   // If Op is a strict floating-point operation, return the result
@@
   /// Method for building the DAG expansion of ISD::[US][ADD|SUB]SAT. This
   /// method accepts integers as its arguments.
   SDValue expandAddSubSat(SDNode *Node, SelectionDAG &DAG) const;
 
   /// Method for building the DAG expansion of ISD::[U|S]MULFIX[SAT]. This
   /// method accepts integers as its arguments.
   SDValue expandFixedPointMul(SDNode *Node, SelectionDAG &DAG) const;
 
-  /// Method for building the DAG expansion of ISD::[US]DIVFIX. This
+  /// Method for building the DAG expansion of ISD::[US]DIVFIX[SAT]. This
   /// method accepts integers as its arguments.
   /// Note: This method may fail if the division could not be performed
   /// within the type. Clients must retry with a wider type if this happens.
   SDValue expandFixedPointDiv(unsigned Opcode, const SDLoc &dl,
                               SDValue LHS, SDValue RHS,
                               unsigned Scale, SelectionDAG &DAG) const;
 
   /// Method for building the DAG expansion of ISD::U(ADD|SUB)O. Expansion

llvm/include/llvm/IR/Intrinsics.td

@@
 //
 def int_smul_fix_sat : Intrinsic<[llvm_anyint_ty],
                                  [LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
                                  [IntrNoMem, IntrSpeculatable, IntrWillReturn, Commutative, ImmArg<2>]>;
 def int_umul_fix_sat : Intrinsic<[llvm_anyint_ty],
                                  [LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
                                  [IntrNoMem, IntrSpeculatable, IntrWillReturn, Commutative, ImmArg<2>]>;
 
+def int_sdiv_fix_sat : Intrinsic<[llvm_anyint_ty],
+                                 [LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
+                                 [IntrNoMem, ImmArg<2>]>;
+
+def int_udiv_fix_sat : Intrinsic<[llvm_anyint_ty],
+                                 [LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
+                                 [IntrNoMem, ImmArg<2>]>;
+
 //===------------------------- Memory Use Markers -------------------------===//
 //
 def int_lifetime_start : Intrinsic<[],
                                    [llvm_i64_ty, llvm_anyptr_ty],
                                    [IntrArgMemOnly, IntrWillReturn, NoCapture<1>, ImmArg<0>]>;
 def int_lifetime_end : Intrinsic<[],
                                  [llvm_i64_ty, llvm_anyptr_ty],
                                  [IntrArgMemOnly, IntrWillReturn, NoCapture<1>, ImmArg<0>]>;

llvm/include/llvm/Target/TargetSelectionDAG.td

@@
 def ssubsat : SDNode<"ISD::SSUBSAT" , SDTIntBinOp>;
 def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;
 
 def smulfix : SDNode<"ISD::SMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
 def smulfixsat : SDNode<"ISD::SMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;
 def umulfix : SDNode<"ISD::UMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
 def umulfixsat : SDNode<"ISD::UMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;
 def sdivfix : SDNode<"ISD::SDIVFIX" , SDTIntScaledBinOp>;
+def sdivfixsat : SDNode<"ISD::SDIVFIXSAT", SDTIntScaledBinOp>;
 def udivfix : SDNode<"ISD::UDIVFIX" , SDTIntScaledBinOp>;
+def udivfixsat : SDNode<"ISD::UDIVFIXSAT", SDTIntScaledBinOp>;
 
 def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;
 def sext_invec : SDNode<"ISD::SIGN_EXTEND_VECTOR_INREG", SDTExtInvec>;
 def zext_invec : SDNode<"ISD::ZERO_EXTEND_VECTOR_INREG", SDTExtInvec>;
 
 def abs : SDNode<"ISD::ABS" , SDTIntUnaryOp>;
 def bitreverse : SDNode<"ISD::BITREVERSE" , SDTIntUnaryOp>;
 def bswap : SDNode<"ISD::BSWAP" , SDTIntUnaryOp>;

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

@@ case ISD::USUBSAT: {
     Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
     break;
   }
   case ISD::SMULFIX:
   case ISD::SMULFIXSAT:
   case ISD::UMULFIX:
   case ISD::UMULFIXSAT:
   case ISD::SDIVFIX:
-  case ISD::UDIVFIX: {
+  case ISD::SDIVFIXSAT:
+  case ISD::UDIVFIX:
+  case ISD::UDIVFIXSAT: {
     unsigned Scale = Node->getConstantOperandVal(2);
     Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
                                               Node->getValueType(0), Scale);
     break;
   }
   case ISD::MSCATTER:
     Action = TLI.getOperationAction(Node->getOpcode(),
                     cast<MaskedScatterSDNode>(Node)->getValue().getValueType());
@@ case ISD::USUBSAT:
     break;
   case ISD::SMULFIX:
   case ISD::SMULFIXSAT:
   case ISD::UMULFIX:
   case ISD::UMULFIXSAT:
     Results.push_back(TLI.expandFixedPointMul(Node, DAG));
     break;
   case ISD::SDIVFIX:
+  case ISD::SDIVFIXSAT:
   case ISD::UDIVFIX:
+  case ISD::UDIVFIXSAT:
     if (SDValue V = TLI.expandFixedPointDiv(Node->getOpcode(), SDLoc(Node),
                                             Node->getOperand(0),
                                             Node->getOperand(1),
                                             Node->getConstantOperandVal(2),
                                             DAG)) {
       Results.push_back(V);
       break;
     }

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

Show First 20 Lines • Show All 156 Lines • ▼ Show 20 Lines	#endif
case ISD::USUBSAT: Res = PromoteIntRes_ADDSUBSAT(N); break;		case ISD::USUBSAT: Res = PromoteIntRes_ADDSUBSAT(N); break;

case ISD::SMULFIX:		case ISD::SMULFIX:
case ISD::SMULFIXSAT:		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
case ISD::UMULFIXSAT: Res = PromoteIntRes_MULFIX(N); break;		case ISD::UMULFIXSAT: Res = PromoteIntRes_MULFIX(N); break;

case ISD::SDIVFIX:		case ISD::SDIVFIX:
case ISD::UDIVFIX: Res = PromoteIntRes_DIVFIX(N); break;		case ISD::SDIVFIXSAT:
		case ISD::UDIVFIX:
		case ISD::UDIVFIXSAT: Res = PromoteIntRes_DIVFIX(N); break;

case ISD::ABS: Res = PromoteIntRes_ABS(N); break;		case ISD::ABS: Res = PromoteIntRes_ABS(N); break;

case ISD::ATOMIC_LOAD:		case ISD::ATOMIC_LOAD:
Res = PromoteIntRes_Atomic0(cast<AtomicSDNode>(N)); break;		Res = PromoteIntRes_Atomic0(cast<AtomicSDNode>(N)); break;

case ISD::ATOMIC_LOAD_ADD:		case ISD::ATOMIC_LOAD_ADD:
case ISD::ATOMIC_LOAD_SUB:		case ISD::ATOMIC_LOAD_SUB:
▲ Show 20 Lines • Show All 605 Lines • ▼ Show 20 Lines	if (Saturating) {
unsigned ShiftOp = Signed ? ISD::SRA : ISD::SRL;		unsigned ShiftOp = Signed ? ISD::SRA : ISD::SRL;
return DAG.getNode(ShiftOp, dl, PromotedType, Result,		return DAG.getNode(ShiftOp, dl, PromotedType, Result,
DAG.getConstant(DiffSize, dl, ShiftTy));		DAG.getConstant(DiffSize, dl, ShiftTy));
}		}
return DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted, Op2Promoted,		return DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted, Op2Promoted,
N->getOperand(2));		N->getOperand(2));
}		}

		static SDValue SaturateWidenedDIVFIX(SDValue V, SDLoc &dl,
		unsigned SatW, bool Signed,
		const TargetLowering &TLI,
		SelectionDAG &DAG) {
		EVT VT = V.getValueType();
		unsigned VTW = VT.getScalarSizeInBits();

		if (!Signed) {
		// Saturate to the unsigned maximum by getting the minimum of V and the
		// maximum.
		return DAG.getNode(ISD::UMIN, dl, VT, V,
		DAG.getConstant(APInt::getLowBitsSet(VTW, SatW),
		dl, VT));
		}

		// Saturate to the signed maximum (the low SatW - 1 bits) by taking the
		// signed minimum of it and V.
		V = DAG.getNode(ISD::SMIN, dl, VT, V,
		DAG.getConstant(APInt::getLowBitsSet(VTW, SatW - 1),
		dl, VT));
		// Saturate to the signed minimum (the high SatW + 1 bits) by taking the
		// signed maximum of it and V.
		V = DAG.getNode(ISD::SMAX, dl, VT, V,
		DAG.getConstant(APInt::getHighBitsSet(VTW, VTW - SatW + 1),
		dl, VT));
		return V;
		}

static SDValue earlyExpandDIVFIX(SDNode *N, SDValue LHS, SDValue RHS,		static SDValue earlyExpandDIVFIX(SDNode *N, SDValue LHS, SDValue RHS,
unsigned Scale, const TargetLowering &TLI,		unsigned Scale, const TargetLowering &TLI,
SelectionDAG &DAG) {		SelectionDAG &DAG, unsigned SatW = 0) {
EVT VT = LHS.getValueType();		EVT VT = LHS.getValueType();
bool Signed = N->getOpcode() == ISD::SDIVFIX;		unsigned VTSize = VT.getScalarSizeInBits();
		bool Signed = N->getOpcode() == ISD::SDIVFIX ||
		N->getOpcode() == ISD::SDIVFIXSAT;
		bool Saturating = N->getOpcode() == ISD::SDIVFIXSAT ||
		N->getOpcode() == ISD::UDIVFIXSAT;

SDLoc dl(N);		SDLoc dl(N);
// See if we can perform the division in this type without widening.		// Widen the types by a factor of two. This is guaranteed to expand, since it
if (SDValue V = TLI.expandFixedPointDiv(N->getOpcode(), dl, LHS, RHS, Scale,		// will always have enough high bits in the LHS to shift into.
DAG))		EVT WideVT = EVT::getIntegerVT(*DAG.getContext(), VTSize * 2);
return V;		if (VT.isVector())
		WideVT = EVT::getVectorVT(*DAG.getContext(), WideVT,
// If that didn't work, double the type width and try again. That must work,		VT.getVectorElementCount());
// or something is wrong.
EVT WideVT = EVT::getIntegerVT(*DAG.getContext(),
VT.getScalarSizeInBits() * 2);
if (Signed) {		if (Signed) {
LHS = DAG.getSExtOrTrunc(LHS, dl, WideVT);		LHS = DAG.getSExtOrTrunc(LHS, dl, WideVT);
RHS = DAG.getSExtOrTrunc(RHS, dl, WideVT);		RHS = DAG.getSExtOrTrunc(RHS, dl, WideVT);
} else {		} else {
LHS = DAG.getZExtOrTrunc(LHS, dl, WideVT);		LHS = DAG.getZExtOrTrunc(LHS, dl, WideVT);
RHS = DAG.getZExtOrTrunc(RHS, dl, WideVT);		RHS = DAG.getZExtOrTrunc(RHS, dl, WideVT);
}		}

// TODO: Saturation.

SDValue Res = TLI.expandFixedPointDiv(N->getOpcode(), dl, LHS, RHS, Scale,		SDValue Res = TLI.expandFixedPointDiv(N->getOpcode(), dl, LHS, RHS, Scale,
DAG);		DAG);
assert(Res && "Expanding DIVFIX with wide type failed?");		assert(Res && "Expanding DIVFIX with wide type failed?");
		if (Saturating) {
		// If the caller has told us to saturate at something less, use that width
		// instead of the type before doubling. However, it cannot be more than
		// what we just widened!
		assert(SatW <= VTSize &&
		"Tried to saturate to more than the original type?");
		Res = SaturateWidenedDIVFIX(Res, dl, SatW == 0 ? VTSize : SatW, Signed,
		TLI, DAG);
		}
return DAG.getZExtOrTrunc(Res, dl, VT);		return DAG.getZExtOrTrunc(Res, dl, VT);
}		}

SDValue DAGTypeLegalizer::PromoteIntRes_DIVFIX(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_DIVFIX(SDNode *N) {
SDLoc dl(N);		SDLoc dl(N);
SDValue Op1Promoted, Op2Promoted;		SDValue Op1Promoted, Op2Promoted;
bool Signed = N->getOpcode() == ISD::SDIVFIX;		bool Signed = N->getOpcode() == ISD::SDIVFIX ||
		N->getOpcode() == ISD::SDIVFIXSAT;
		bool Saturating = N->getOpcode() == ISD::SDIVFIXSAT ||
		N->getOpcode() == ISD::UDIVFIXSAT;
if (Signed) {		if (Signed) {
Op1Promoted = SExtPromotedInteger(N->getOperand(0));		Op1Promoted = SExtPromotedInteger(N->getOperand(0));
Op2Promoted = SExtPromotedInteger(N->getOperand(1));		Op2Promoted = SExtPromotedInteger(N->getOperand(1));
} else {		} else {
Op1Promoted = ZExtPromotedInteger(N->getOperand(0));		Op1Promoted = ZExtPromotedInteger(N->getOperand(0));
Op2Promoted = ZExtPromotedInteger(N->getOperand(1));		Op2Promoted = ZExtPromotedInteger(N->getOperand(1));
}		}
EVT PromotedType = Op1Promoted.getValueType();		EVT PromotedType = Op1Promoted.getValueType();
unsigned Scale = N->getConstantOperandVal(2);		unsigned Scale = N->getConstantOperandVal(2);

SDValue Res;
// If the type is already legal and the operation is legal in that type, we		// If the type is already legal and the operation is legal in that type, we
// should not early expand.		// should not early expand.
if (TLI.isTypeLegal(PromotedType)) {		if (TLI.isTypeLegal(PromotedType)) {
TargetLowering::LegalizeAction Action =		TargetLowering::LegalizeAction Action =
TLI.getFixedPointOperationAction(N->getOpcode(), PromotedType, Scale);		TLI.getFixedPointOperationAction(N->getOpcode(), PromotedType, Scale);
if (Action == TargetLowering::Legal || Action == TargetLowering::Custom)		if (Action == TargetLowering::Legal || Action == TargetLowering::Custom) {
Res = DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted,		EVT ShiftTy = TLI.getShiftAmountTy(PromotedType, DAG.getDataLayout());
		unsigned Diff = PromotedType.getScalarSizeInBits() -
		N->getValueType(0).getScalarSizeInBits();
		if (Saturating)
		Op1Promoted = DAG.getNode(ISD::SHL, dl, PromotedType, Op1Promoted,
		DAG.getConstant(Diff, dl, ShiftTy));
		SDValue Res = DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted,
Op2Promoted, N->getOperand(2));		Op2Promoted, N->getOperand(2));
		if (Saturating)
		Res = DAG.getNode(Signed ? ISD::SRA : ISD::SRL, dl, PromotedType, Res,
		DAG.getConstant(Diff, dl, ShiftTy));
		return Res;
		}
}		}

if (!Res)		// See if we can perform the division in this type without expanding.
Res = earlyExpandDIVFIX(N, Op1Promoted, Op2Promoted, Scale, TLI, DAG);		if (SDValue Res = TLI.expandFixedPointDiv(N->getOpcode(), dl, Op1Promoted,
		Op2Promoted, Scale, DAG)) {
		leonardchan: nit: align the arguments
// TODO: Saturation.		if (Saturating)
		Res = SaturateWidenedDIVFIX(Res, dl,
		N->getValueType(0).getScalarSizeInBits(),
		Signed, TLI, DAG);
return Res;		return Res;
}		}
		// If we cannot, expand it to twice the type width. If we are saturating, give
		// it the original width as a saturating width so we don't need to emit
		// two saturations.
		return earlyExpandDIVFIX(N, Op1Promoted, Op2Promoted, Scale, TLI, DAG,
		N->getValueType(0).getScalarSizeInBits());
		}

SDValue DAGTypeLegalizer::PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo) {		SDValue DAGTypeLegalizer::PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo) {
if (ResNo == 1)		if (ResNo == 1)
return PromoteIntRes_Overflow(N);		return PromoteIntRes_Overflow(N);

// The operation overflowed iff the result in the larger type is not the		// The operation overflowed iff the result in the larger type is not the
// sign extension of its truncation to the original type.		// sign extension of its truncation to the original type.
SDValue LHS = SExtPromotedInteger(N->getOperand(0));		SDValue LHS = SExtPromotedInteger(N->getOperand(0));
▲ Show 20 Lines • Show All 451 Lines • ▼ Show 20 Lines	bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {

case ISD::PREFETCH: Res = PromoteIntOp_PREFETCH(N, OpNo); break;		case ISD::PREFETCH: Res = PromoteIntOp_PREFETCH(N, OpNo); break;

case ISD::SMULFIX:		case ISD::SMULFIX:
case ISD::SMULFIXSAT:		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
case ISD::UMULFIXSAT:		case ISD::UMULFIXSAT:
case ISD::SDIVFIX:		case ISD::SDIVFIX:
case ISD::UDIVFIX: Res = PromoteIntOp_FIX(N); break;		case ISD::SDIVFIXSAT:
		case ISD::UDIVFIX:
		case ISD::UDIVFIXSAT: Res = PromoteIntOp_FIX(N); break;

case ISD::FPOWI: Res = PromoteIntOp_FPOWI(N); break;		case ISD::FPOWI: Res = PromoteIntOp_FPOWI(N); break;

case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
▲ Show 20 Lines • Show All 591 Lines • ▼ Show 20 Lines	#endif
case ISD::USUBSAT: ExpandIntRes_ADDSUBSAT(N, Lo, Hi); break;		case ISD::USUBSAT: ExpandIntRes_ADDSUBSAT(N, Lo, Hi); break;

case ISD::SMULFIX:		case ISD::SMULFIX:
case ISD::SMULFIXSAT:		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
case ISD::UMULFIXSAT: ExpandIntRes_MULFIX(N, Lo, Hi); break;		case ISD::UMULFIXSAT: ExpandIntRes_MULFIX(N, Lo, Hi); break;

case ISD::SDIVFIX:		case ISD::SDIVFIX:
case ISD::UDIVFIX: ExpandIntRes_DIVFIX(N, Lo, Hi); break;		case ISD::SDIVFIXSAT:
		case ISD::UDIVFIX:
		case ISD::UDIVFIXSAT: ExpandIntRes_DIVFIX(N, Lo, Hi); break;

case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
case ISD::VECREDUCE_SMAX:		case ISD::VECREDUCE_SMAX:
case ISD::VECREDUCE_SMIN:		case ISD::VECREDUCE_SMIN:
▲ Show 20 Lines • Show All 1,313 Lines • ▼ Show 20 Lines	void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,
// Saturate to signed minimum.		// Saturate to signed minimum.
APInt MinHi = APInt::getSignedMinValue(NVTSize);		APInt MinHi = APInt::getSignedMinValue(NVTSize);
Hi = DAG.getSelect(dl, NVT, SatMin, DAG.getConstant(MinHi, dl, NVT), Hi);		Hi = DAG.getSelect(dl, NVT, SatMin, DAG.getConstant(MinHi, dl, NVT), Hi);
Lo = DAG.getSelect(dl, NVT, SatMin, NVTZero, Lo);		Lo = DAG.getSelect(dl, NVT, SatMin, NVTZero, Lo);
}		}

void DAGTypeLegalizer::ExpandIntRes_DIVFIX(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::ExpandIntRes_DIVFIX(SDNode *N, SDValue &Lo,
SDValue &Hi) {		SDValue &Hi) {
SDValue Res = earlyExpandDIVFIX(N, N->getOperand(0), N->getOperand(1),		SDLoc dl(N);
		// Try expanding in the existing type first.
		SDValue Res = TLI.expandFixedPointDiv(N->getOpcode(), dl, N->getOperand(0),
		N->getOperand(1),
		N->getConstantOperandVal(2), DAG);

		if (!Res)
		Res = earlyExpandDIVFIX(N, N->getOperand(0), N->getOperand(1),
N->getConstantOperandVal(2), TLI, DAG);		N->getConstantOperandVal(2), TLI, DAG);
SplitInteger(Res, Lo, Hi);		SplitInteger(Res, Lo, Hi);
}		}

void DAGTypeLegalizer::ExpandIntRes_SADDSUBO(SDNode *Node,		void DAGTypeLegalizer::ExpandIntRes_SADDSUBO(SDNode *Node,
SDValue &Lo, SDValue &Hi) {		SDValue &Lo, SDValue &Hi) {
SDValue LHS = Node->getOperand(0);		SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);		SDValue RHS = Node->getOperand(1);
SDLoc dl(Node);		SDLoc dl(Node);
▲ Show 20 Lines • Show All 1,201 Lines • Show Last 20 Lines
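
The clamping that SaturateWidenedDIVFIX applies to a widened result can be illustrated outside of SelectionDAG. The following is only a sketch under stated assumptions: a 64-bit host integer stands in for the widened type, and the function names saturateSigned and saturateUnsigned are hypothetical, not part of the patch. It mirrors the UMIN pattern for the unsigned case and the SMIN followed by SMAX pattern for the signed case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: clamp a value held in a 64-bit "wide" type to the
// range of a signed SatW-bit integer (SMIN against the maximum, then SMAX
// against the minimum).
std::int64_t saturateSigned(std::int64_t V, unsigned SatW) {
  assert(SatW >= 1 && SatW < 64 && "saturation width must fit the wide type");
  // Signed maximum: the low SatW - 1 bits set.
  std::int64_t Max = (std::int64_t{1} << (SatW - 1)) - 1;
  // Signed minimum: in two's complement, the high 64 - SatW + 1 bits set.
  std::int64_t Min = -Max - 1;
  return std::max(std::min(V, Max), Min);
}

// Hypothetical helper: clamp a value held in a 64-bit "wide" type to the
// range of an unsigned SatW-bit integer (UMIN against the maximum).
std::uint64_t saturateUnsigned(std::uint64_t V, unsigned SatW) {
  assert(SatW >= 1 && SatW < 64 && "saturation width must fit the wide type");
  // Unsigned maximum: the low SatW bits set.
  std::uint64_t Max = (std::uint64_t{1} << SatW) - 1;
  return std::min(V, Max);
}
```

For an 8-bit saturation width, saturateSigned maps any wide value above 127 to 127 and any value below -128 to -128, and leaves in-range values untouched.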

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

Show First 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	class VectorLegalizer {
std::pair<SDValue, SDValue> ExpandLoad(SDNode *N);		std::pair<SDValue, SDValue> ExpandLoad(SDNode *N);
SDValue ExpandStore(SDNode *N);		SDValue ExpandStore(SDNode *N);
SDValue ExpandFNEG(SDNode *Node);		SDValue ExpandFNEG(SDNode *Node);
void ExpandFSUB(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void ExpandFSUB(SDNode *Node, SmallVectorImpl<SDValue> &Results);
void ExpandBITREVERSE(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void ExpandBITREVERSE(SDNode *Node, SmallVectorImpl<SDValue> &Results);
void ExpandUADDSUBO(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void ExpandUADDSUBO(SDNode *Node, SmallVectorImpl<SDValue> &Results);
void ExpandSADDSUBO(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void ExpandSADDSUBO(SDNode *Node, SmallVectorImpl<SDValue> &Results);
void ExpandMULO(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void ExpandMULO(SDNode *Node, SmallVectorImpl<SDValue> &Results);
SDValue ExpandFixedPointDiv(SDNode *Node);		void ExpandFixedPointDiv(SDNode *Node, SmallVectorImpl<SDValue> &Results);
SDValue ExpandStrictFPOp(SDNode *Node);		SDValue ExpandStrictFPOp(SDNode *Node);
void ExpandStrictFPOp(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void ExpandStrictFPOp(SDNode *Node, SmallVectorImpl<SDValue> &Results);

void UnrollStrictFPOp(SDNode *Node, SmallVectorImpl<SDValue> &Results);		void UnrollStrictFPOp(SDNode *Node, SmallVectorImpl<SDValue> &Results);

/// Implements vector promotion.		/// Implements vector promotion.
///		///
/// This is essentially just bitcasting the operands to a different type and		/// This is essentially just bitcasting the operands to a different type and
▲ Show 20 Lines • Show All 304 Lines • ▼ Show 20 Lines	#include "llvm/IR/ConstrainedOps.def"
case ISD::USUBSAT:		case ISD::USUBSAT:
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
case ISD::SMULFIXSAT:		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
case ISD::UMULFIXSAT:		case ISD::UMULFIXSAT:
case ISD::SDIVFIX:		case ISD::SDIVFIX:
case ISD::UDIVFIX: {		case ISD::SDIVFIXSAT:
		case ISD::UDIVFIX:
		case ISD::UDIVFIXSAT: {
unsigned Scale = Node->getConstantOperandVal(2);		unsigned Scale = Node->getConstantOperandVal(2);
Action = TLI.getFixedPointOperationAction(Node->getOpcode(),		Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
Node->getValueType(0), Scale);		Node->getValueType(0), Scale);
break;		break;
}		}
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
▲ Show 20 Lines • Show All 488 Lines • ▼ Show 20 Lines	void VectorLegalizer::Expand(SDNode *Node, SmallVectorImpl<SDValue> &Results) {
case ISD::UMULFIXSAT:		case ISD::UMULFIXSAT:
// FIXME: We do not expand SMULFIXSAT/UMULFIXSAT here yet, not sure exactly		// FIXME: We do not expand SMULFIXSAT/UMULFIXSAT here yet, not sure exactly
// why. Maybe it results in worse codegen compared to the unroll for some		// why. Maybe it results in worse codegen compared to the unroll for some
// targets? This should probably be investigated. And if we still prefer to		// targets? This should probably be investigated. And if we still prefer to
// unroll an explanation could be helpful.		// unroll an explanation could be helpful.
break;		break;
case ISD::SDIVFIX:		case ISD::SDIVFIX:
case ISD::UDIVFIX:		case ISD::UDIVFIX:
Results.push_back(ExpandFixedPointDiv(Node));		ExpandFixedPointDiv(Node, Results);
return;		return;
		case ISD::SDIVFIXSAT:
		case ISD::UDIVFIXSAT:
		break;
#define DAG_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN) \		#define DAG_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN) \
case ISD::STRICT_##DAGN:		case ISD::STRICT_##DAGN:
#include "llvm/IR/ConstrainedOps.def"		#include "llvm/IR/ConstrainedOps.def"
ExpandStrictFPOp(Node, Results);		ExpandStrictFPOp(Node, Results);
return;		return;
case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
▲ Show 20 Lines • Show All 468 Lines • ▼ Show 20 Lines	void VectorLegalizer::ExpandMULO(SDNode *Node,
SDValue Result, Overflow;		SDValue Result, Overflow;
if (!TLI.expandMULO(Node, Result, Overflow, DAG))		if (!TLI.expandMULO(Node, Result, Overflow, DAG))
std::tie(Result, Overflow) = DAG.UnrollVectorOverflowOp(Node);		std::tie(Result, Overflow) = DAG.UnrollVectorOverflowOp(Node);

Results.push_back(Result);		Results.push_back(Result);
Results.push_back(Overflow);		Results.push_back(Overflow);
}		}

SDValue VectorLegalizer::ExpandFixedPointDiv(SDNode *Node) {		void VectorLegalizer::ExpandFixedPointDiv(SDNode *Node,
		SmallVectorImpl<SDValue> &Results) {
SDNode *N = Node;		SDNode *N = Node;
if (SDValue Expanded = TLI.expandFixedPointDiv(N->getOpcode(), SDLoc(N),		if (SDValue Expanded = TLI.expandFixedPointDiv(N->getOpcode(), SDLoc(N),
N->getOperand(0), N->getOperand(1), N->getConstantOperandVal(2), DAG))		N->getOperand(0), N->getOperand(1), N->getConstantOperandVal(2), DAG))
return Expanded;		Results.push_back(Expanded);
return DAG.UnrollVectorOp(N);
}		}

void VectorLegalizer::ExpandStrictFPOp(SDNode *Node,		void VectorLegalizer::ExpandStrictFPOp(SDNode *Node,
SmallVectorImpl<SDValue> &Results) {		SmallVectorImpl<SDValue> &Results) {
if (Node->getOpcode() == ISD::STRICT_UINT_TO_FP) {		if (Node->getOpcode() == ISD::STRICT_UINT_TO_FP) {
ExpandUINT_TO_FLOAT(Node, Results);		ExpandUINT_TO_FLOAT(Node, Results);
return;		return;
}		}
▲ Show 20 Lines • Show All 99 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Show First 20 Lines • Show All 160 Lines • ▼ Show 20 Lines	#include "llvm/IR/ConstrainedOps.def"
case ISD::SMULO:		case ISD::SMULO:
R = ScalarizeVecRes_OverflowOp(N, ResNo);		R = ScalarizeVecRes_OverflowOp(N, ResNo);
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
case ISD::SMULFIXSAT:		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
case ISD::UMULFIXSAT:		case ISD::UMULFIXSAT:
case ISD::SDIVFIX:		case ISD::SDIVFIX:
		case ISD::SDIVFIXSAT:
case ISD::UDIVFIX:		case ISD::UDIVFIX:
		case ISD::UDIVFIXSAT:
R = ScalarizeVecRes_FIX(N);		R = ScalarizeVecRes_FIX(N);
break;		break;
}		}

// If R is null, the sub-method took care of registering the result.		// If R is null, the sub-method took care of registering the result.
if (R.getNode())		if (R.getNode())
SetScalarizedVector(SDValue(N, ResNo), R);		SetScalarizedVector(SDValue(N, ResNo), R);
}		}
▲ Show 20 Lines • Show All 773 Lines • ▼ Show 20 Lines	#include "llvm/IR/ConstrainedOps.def"
case ISD::SMULO:		case ISD::SMULO:
SplitVecRes_OverflowOp(N, ResNo, Lo, Hi);		SplitVecRes_OverflowOp(N, ResNo, Lo, Hi);
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
case ISD::SMULFIXSAT:		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
case ISD::UMULFIXSAT:		case ISD::UMULFIXSAT:
case ISD::SDIVFIX:		case ISD::SDIVFIX:
		case ISD::SDIVFIXSAT:
case ISD::UDIVFIX:		case ISD::UDIVFIX:
		case ISD::UDIVFIXSAT:
SplitVecRes_FIX(N, Lo, Hi);		SplitVecRes_FIX(N, Lo, Hi);
break;		break;
}		}

// If Lo/Hi is null, the sub-method took care of registering results etc.		// If Lo/Hi is null, the sub-method took care of registering results etc.
if (Lo.getNode())		if (Lo.getNode())
SetSplitVector(SDValue(N, ResNo), Lo, Hi);		SetSplitVector(SDValue(N, ResNo), Lo, Hi);
}		}
▲ Show 20 Lines • Show All 4,212 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 5,445 Lines • ▼ Show 20 Lines	static SDValue ExpandPowI(const SDLoc &DL, SDValue LHS, SDValue RHS,
// Otherwise, expand to a libcall.		// Otherwise, expand to a libcall.
return DAG.getNode(ISD::FPOWI, DL, LHS.getValueType(), LHS, RHS);		return DAG.getNode(ISD::FPOWI, DL, LHS.getValueType(), LHS, RHS);
}		}

static SDValue expandDivFix(unsigned Opcode, const SDLoc &DL,		static SDValue expandDivFix(unsigned Opcode, const SDLoc &DL,
SDValue LHS, SDValue RHS, SDValue Scale,		SDValue LHS, SDValue RHS, SDValue Scale,
SelectionDAG &DAG, const TargetLowering &TLI) {		SelectionDAG &DAG, const TargetLowering &TLI) {
EVT VT = LHS.getValueType();		EVT VT = LHS.getValueType();
bool Signed = Opcode == ISD::SDIVFIX;		bool Signed = Opcode == ISD::SDIVFIX || Opcode == ISD::SDIVFIXSAT;
		bool Saturating = Opcode == ISD::SDIVFIXSAT || Opcode == ISD::UDIVFIXSAT;
LLVMContext &Ctx = *DAG.getContext();		LLVMContext &Ctx = *DAG.getContext();

// If the type is legal but the operation isn't, this node might survive all		// If the type is legal but the operation isn't, this node might survive all
// the way to operation legalization. If we end up there and we do not have		// the way to operation legalization. If we end up there and we do not have
// the ability to widen the type (if VT*2 is not legal), we cannot expand the		// the ability to widen the type (if VT*2 is not legal), we cannot expand the
// node.		// node.

// Coax the legalizer into expanding the node during type legalization instead		// Coax the legalizer into expanding the node during type legalization instead
// by bumping the size by one bit. This will force it to Promote, enabling the		// by bumping the size by one bit. This will force it to Promote, enabling the
// early expansion and avoiding the need to expand later.		// early expansion and avoiding the need to expand later.

// We don't have to do this if Scale is 0; that can always be expanded.		// We don't have to do this if Scale is 0; that can always be expanded, unless
		// it's a saturating signed operation. Those can experience true integer
		// division overflow, a case which we must avoid.

// FIXME: We wouldn't have to do this (or any of the early		// FIXME: We wouldn't have to do this (or any of the early
// expansion/promotion) if it was possible to expand a libcall of an		// expansion/promotion) if it was possible to expand a libcall of an
// illegal type during operation legalization. But it's not, so things		// illegal type during operation legalization. But it's not, so things
// get a bit hacky.		// get a bit hacky.
unsigned ScaleInt = cast<ConstantSDNode>(Scale)->getZExtValue();		unsigned ScaleInt = cast<ConstantSDNode>(Scale)->getZExtValue();
if (ScaleInt > 0 &&		if ((ScaleInt > 0 || (Saturating && Signed)) &&
(TLI.isTypeLegal(VT) ||		(TLI.isTypeLegal(VT) ||
(VT.isVector() && TLI.isTypeLegal(VT.getVectorElementType())))) {		(VT.isVector() && TLI.isTypeLegal(VT.getVectorElementType())))) {
TargetLowering::LegalizeAction Action = TLI.getFixedPointOperationAction(		TargetLowering::LegalizeAction Action = TLI.getFixedPointOperationAction(
Opcode, VT, ScaleInt);		Opcode, VT, ScaleInt);
if (Action != TargetLowering::Legal && Action != TargetLowering::Custom) {		if (Action != TargetLowering::Legal && Action != TargetLowering::Custom) {
EVT PromVT;		EVT PromVT;
if (VT.isScalarInteger())		if (VT.isScalarInteger())
PromVT = EVT::getIntegerVT(Ctx, VT.getSizeInBits() + 1);		PromVT = EVT::getIntegerVT(Ctx, VT.getSizeInBits() + 1);
else if (VT.isVector()) {		else if (VT.isVector()) {
PromVT = VT.getVectorElementType();		PromVT = VT.getVectorElementType();
PromVT = EVT::getIntegerVT(Ctx, PromVT.getSizeInBits() + 1);		PromVT = EVT::getIntegerVT(Ctx, PromVT.getSizeInBits() + 1);
PromVT = EVT::getVectorVT(Ctx, PromVT, VT.getVectorElementCount());		PromVT = EVT::getVectorVT(Ctx, PromVT, VT.getVectorElementCount());
} else		} else
llvm_unreachable("Wrong VT for DIVFIX?");		llvm_unreachable("Wrong VT for DIVFIX?");
if (Signed) {		if (Signed) {
LHS = DAG.getSExtOrTrunc(LHS, DL, PromVT);		LHS = DAG.getSExtOrTrunc(LHS, DL, PromVT);
RHS = DAG.getSExtOrTrunc(RHS, DL, PromVT);		RHS = DAG.getSExtOrTrunc(RHS, DL, PromVT);
} else {		} else {
LHS = DAG.getZExtOrTrunc(LHS, DL, PromVT);		LHS = DAG.getZExtOrTrunc(LHS, DL, PromVT);
RHS = DAG.getZExtOrTrunc(RHS, DL, PromVT);		RHS = DAG.getZExtOrTrunc(RHS, DL, PromVT);
}		}
// TODO: Saturation.		EVT ShiftTy = TLI.getShiftAmountTy(PromVT, DAG.getDataLayout());
		// For saturating operations, we need to shift up the LHS to get the
		// proper saturation width, and then shift down again afterwards.
		if (Saturating)
		LHS = DAG.getNode(ISD::SHL, DL, PromVT, LHS,
		DAG.getConstant(1, DL, ShiftTy));
SDValue Res = DAG.getNode(Opcode, DL, PromVT, LHS, RHS, Scale);		SDValue Res = DAG.getNode(Opcode, DL, PromVT, LHS, RHS, Scale);
		if (Saturating)
		Res = DAG.getNode(Signed ? ISD::SRA : ISD::SRL, DL, PromVT, Res,
		DAG.getConstant(1, DL, ShiftTy));
return DAG.getZExtOrTrunc(Res, DL, VT);		return DAG.getZExtOrTrunc(Res, DL, VT);
}		}
}		}

return DAG.getNode(Opcode, DL, VT, LHS, RHS, Scale);		return DAG.getNode(Opcode, DL, VT, LHS, RHS, Scale);
}		}

// getUnderlyingArgRegs - Find underlying registers used for a truncated,		// getUnderlyingArgRegs - Find underlying registers used for a truncated,
▲ Show 20 Lines • Show All 247 Lines • ▼ Show 20 Lines	static unsigned FixedPointIntrinsicToOpcode(unsigned Intrinsic) {
case Intrinsic::smul_fix_sat:		case Intrinsic::smul_fix_sat:
return ISD::SMULFIXSAT;		return ISD::SMULFIXSAT;
case Intrinsic::umul_fix_sat:		case Intrinsic::umul_fix_sat:
return ISD::UMULFIXSAT;		return ISD::UMULFIXSAT;
case Intrinsic::sdiv_fix:		case Intrinsic::sdiv_fix:
return ISD::SDIVFIX;		return ISD::SDIVFIX;
case Intrinsic::udiv_fix:		case Intrinsic::udiv_fix:
return ISD::UDIVFIX;		return ISD::UDIVFIX;
		case Intrinsic::sdiv_fix_sat:
		return ISD::SDIVFIXSAT;
		case Intrinsic::udiv_fix_sat:
		return ISD::UDIVFIXSAT;
default:		default:
llvm_unreachable("Unhandled fixed point intrinsic");		llvm_unreachable("Unhandled fixed point intrinsic");
}		}
}		}

void SelectionDAGBuilder::lowerCallToExternalSymbol(const CallInst &I,		void SelectionDAGBuilder::lowerCallToExternalSymbol(const CallInst &I,
const char *FunctionName) {		const char *FunctionName) {
assert(FunctionName && "FunctionName must not be nullptr");		assert(FunctionName && "FunctionName must not be nullptr");
▲ Show 20 Lines • Show All 687 Lines • ▼ Show 20 Lines	case Intrinsic::umul_fix_sat: {
SDValue Op1 = getValue(I.getArgOperand(0));		SDValue Op1 = getValue(I.getArgOperand(0));
SDValue Op2 = getValue(I.getArgOperand(1));		SDValue Op2 = getValue(I.getArgOperand(1));
SDValue Op3 = getValue(I.getArgOperand(2));		SDValue Op3 = getValue(I.getArgOperand(2));
setValue(&I, DAG.getNode(FixedPointIntrinsicToOpcode(Intrinsic), sdl,		setValue(&I, DAG.getNode(FixedPointIntrinsicToOpcode(Intrinsic), sdl,
Op1.getValueType(), Op1, Op2, Op3));		Op1.getValueType(), Op1, Op2, Op3));
return;		return;
}		}
case Intrinsic::sdiv_fix:		case Intrinsic::sdiv_fix:
case Intrinsic::udiv_fix: {		case Intrinsic::udiv_fix:
		case Intrinsic::sdiv_fix_sat:
		case Intrinsic::udiv_fix_sat: {
SDValue Op1 = getValue(I.getArgOperand(0));		SDValue Op1 = getValue(I.getArgOperand(0));
SDValue Op2 = getValue(I.getArgOperand(1));		SDValue Op2 = getValue(I.getArgOperand(1));
SDValue Op3 = getValue(I.getArgOperand(2));		SDValue Op3 = getValue(I.getArgOperand(2));
setValue(&I, expandDivFix(FixedPointIntrinsicToOpcode(Intrinsic), sdl,		setValue(&I, expandDivFix(FixedPointIntrinsicToOpcode(Intrinsic), sdl,
Op1, Op2, Op3, DAG, TLI));		Op1, Op2, Op3, DAG, TLI));
return;		return;
}		}
case Intrinsic::stacksave: {		case Intrinsic::stacksave: {
▲ Show 20 Lines • Show All 4,213 Lines • Show Last 20 Lines
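
The "shift up the LHS, then shift down again" step used when a saturating division is promoted to a wider type can be checked with ordinary integers. This is a minimal sketch, not the patch's implementation: the names floorDiv, sdivFixSat and sdivFixSatPromoted are hypothetical, 64-bit host arithmetic stands in for the DAG nodes, and rounding toward negative infinity is assumed because the signed expansion in TargetLowering.cpp subtracts one from a negative quotient with a nonzero remainder.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Division rounding toward negative infinity (assumed rounding mode).
std::int64_t floorDiv(std::int64_t L, std::int64_t R) {
  std::int64_t Q = L / R;
  if (L % R != 0 && ((L < 0) != (R < 0)))
    --Q;
  return Q;
}

// Hypothetical reference model: signed saturating fixed-point division
// whose result saturates to W bits.
std::int64_t sdivFixSat(std::int64_t L, std::int64_t R, unsigned Scale,
                        unsigned W) {
  assert(R != 0 && W >= 2 && W <= 32 && Scale < W);
  std::int64_t Max = (std::int64_t{1} << (W - 1)) - 1;
  std::int64_t Q = floorDiv(L * (std::int64_t{1} << Scale), R);
  return std::max(std::min(Q, Max), -Max - 1);
}

// The same W-bit operation carried out in a wider P-bit type: shift the LHS
// up by P - W so that saturation at P bits lines up with saturation at W
// bits, then shift the result back down (arithmetic shift, i.e. floor).
std::int64_t sdivFixSatPromoted(std::int64_t L, std::int64_t R, unsigned Scale,
                                unsigned W, unsigned P) {
  assert(P > W && P <= 32);
  unsigned Diff = P - W;
  std::int64_t Wide = sdivFixSat(L * (std::int64_t{1} << Diff), R, Scale, P);
  return floorDiv(Wide, std::int64_t{1} << Diff);
}
```

With W = 8 and P = 9 this corresponds to the one-bit bump done in expandDivFix. With P = 16 it corresponds to promotion to a legal type in PromoteIntRes_DIVFIX. In both cases the promoted result is expected to equal the result of sdivFixSat evaluated directly at 8 bits.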

llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 308 Lines • ▼ Show 20 Lines	#endif
case ISD::USUBSAT: return "usubsat";		case ISD::USUBSAT: return "usubsat";

case ISD::SMULFIX: return "smulfix";		case ISD::SMULFIX: return "smulfix";
case ISD::SMULFIXSAT: return "smulfixsat";		case ISD::SMULFIXSAT: return "smulfixsat";
case ISD::UMULFIX: return "umulfix";		case ISD::UMULFIX: return "umulfix";
case ISD::UMULFIXSAT: return "umulfixsat";		case ISD::UMULFIXSAT: return "umulfixsat";

case ISD::SDIVFIX: return "sdivfix";		case ISD::SDIVFIX: return "sdivfix";
		case ISD::SDIVFIXSAT: return "sdivfixsat";
case ISD::UDIVFIX: return "udivfix";		case ISD::UDIVFIX: return "udivfix";
		case ISD::UDIVFIXSAT: return "udivfixsat";

// Conversion operators.		// Conversion operators.
case ISD::SIGN_EXTEND: return "sign_extend";		case ISD::SIGN_EXTEND: return "sign_extend";
case ISD::ZERO_EXTEND: return "zero_extend";		case ISD::ZERO_EXTEND: return "zero_extend";
case ISD::ANY_EXTEND: return "any_extend";		case ISD::ANY_EXTEND: return "any_extend";
case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";		case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";
case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";		case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";
case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";		case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";
▲ Show 20 Lines • Show All 661 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

Show First 20 Lines • Show All 7,326 Lines • ▼ Show 20 Lines	TargetLowering::expandFixedPointMul(SDNode *Node, SelectionDAG &DAG) const {
Result = DAG.getSelectCC(dl, Hi, HighMask, SatMin, Result, ISD::SETLT);		Result = DAG.getSelectCC(dl, Hi, HighMask, SatMin, Result, ISD::SETLT);
return Result;		return Result;
}		}

SDValue		SDValue
TargetLowering::expandFixedPointDiv(unsigned Opcode, const SDLoc &dl,		TargetLowering::expandFixedPointDiv(unsigned Opcode, const SDLoc &dl,
SDValue LHS, SDValue RHS,		SDValue LHS, SDValue RHS,
unsigned Scale, SelectionDAG &DAG) const {		unsigned Scale, SelectionDAG &DAG) const {
assert((Opcode == ISD::SDIVFIX ||		assert((Opcode == ISD::SDIVFIX || Opcode == ISD::SDIVFIXSAT ||
Opcode == ISD::UDIVFIX) &&		Opcode == ISD::UDIVFIX || Opcode == ISD::UDIVFIXSAT) &&
"Expected a fixed point division opcode");		"Expected a fixed point division opcode");

EVT VT = LHS.getValueType();		EVT VT = LHS.getValueType();
bool Signed = Opcode == ISD::SDIVFIX;		bool Signed = Opcode == ISD::SDIVFIX || Opcode == ISD::SDIVFIXSAT;
		bool Saturating = Opcode == ISD::SDIVFIXSAT || Opcode == ISD::UDIVFIXSAT;
EVT BoolVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);		EVT BoolVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);

// If there is enough room in the type to upscale the LHS or downscale the		// If there is enough room in the type to upscale the LHS or downscale the
// RHS before the division, we can perform it in this type without having to		// RHS before the division, we can perform it in this type without having to
// resize. For signed operations, the LHS headroom is the number of		// resize. For signed operations, the LHS headroom is the number of
// redundant sign bits, and for unsigned ones it is the number of zeroes.		// redundant sign bits, and for unsigned ones it is the number of zeroes.
// The headroom for the RHS is the number of trailing zeroes.		// The headroom for the RHS is the number of trailing zeroes.
unsigned LHSLead = Signed ? DAG.ComputeNumSignBits(LHS) - 1		unsigned LHSLead = Signed ? DAG.ComputeNumSignBits(LHS) - 1
: DAG.computeKnownBits(LHS).countMinLeadingZeros();		: DAG.computeKnownBits(LHS).countMinLeadingZeros();
unsigned RHSTrail = DAG.computeKnownBits(RHS).countMinTrailingZeros();		unsigned RHSTrail = DAG.computeKnownBits(RHS).countMinTrailingZeros();

if (LHSLead + RHSTrail < Scale)		// For signed saturating operations, we need to be able to detect true integer
		// division overflow; that is, when you have MIN / -EPS. However, this
		// is undefined behavior and if we emit divisions that could take such
		// values it may cause undesired behavior (arithmetic exceptions on x86, for
		// example).
		// Avoid this by requiring an extra bit so that we never get this case.
		// FIXME: This is a bit unfortunate as it means that for an 8-bit 7-scale
		// signed saturating division, we need to emit a whopping 32-bit division.
		if (LHSLead + RHSTrail < Scale + (unsigned)(Saturating && Signed))
return SDValue();		return SDValue();

unsigned LHSShift = std::min(LHSLead, Scale);		unsigned LHSShift = std::min(LHSLead, Scale);
unsigned RHSShift = Scale - LHSShift;		unsigned RHSShift = Scale - LHSShift;

// At this point, we know that if we shift the LHS up by LHSShift and the		// At this point, we know that if we shift the LHS up by LHSShift and the
// RHS down by RHSShift, we can emit a regular division with a final scaling		// RHS down by RHSShift, we can emit a regular division with a final scaling
// factor of Scale.		// factor of Scale.
Show All 37 Lines	SDValue Sub1 = DAG.getNode(ISD::SUB, dl, VT, Quot,
DAG.getConstant(1, dl, VT));		DAG.getConstant(1, dl, VT));
Quot = DAG.getSelect(dl, VT,		Quot = DAG.getSelect(dl, VT,
DAG.getNode(ISD::AND, dl, BoolVT, RemNonZero, QuotNeg),		DAG.getNode(ISD::AND, dl, BoolVT, RemNonZero, QuotNeg),
Sub1, Quot);		Sub1, Quot);
} else		} else
Quot = DAG.getNode(ISD::UDIV, dl, VT,		Quot = DAG.getNode(ISD::UDIV, dl, VT,
LHS, RHS);		LHS, RHS);

// TODO: Saturation.

return Quot;		return Quot;
}		}

void TargetLowering::expandUADDSUBO(		void TargetLowering::expandUADDSUBO(
SDNode *Node, SDValue &Result, SDValue &Overflow, SelectionDAG &DAG) const {		SDNode *Node, SDValue &Result, SDValue &Overflow, SelectionDAG &DAG) const {
SDLoc dl(Node);		SDLoc dl(Node);
SDValue LHS = Node->getOperand(0);		SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);		SDValue RHS = Node->getOperand(1);
▲ Show 20 Lines • Show All 265 Lines • Show Last 20 Lines
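
The headroom test in expandFixedPointDiv, including the extra bit required for signed saturating division, can be sketched for concrete 32-bit operands. This is an illustration under assumptions: the patch works on known-bits analysis of DAG values rather than on concrete numbers, and the names redundantSignBits, trailingZeros, ShiftPlan and planShifts are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: number of redundant sign bits in V, i.e. the count of
// leading bits equal to the sign bit, minus one.
unsigned redundantSignBits(std::int32_t V) {
  std::uint32_t U = static_cast<std::uint32_t>(V < 0 ? ~V : V);
  unsigned LeadingZeros = 0;
  for (std::uint32_t Mask = 0x80000000u; Mask != 0 && (U & Mask) == 0;
       Mask >>= 1)
    ++LeadingZeros;
  return LeadingZeros - 1; // the top bit of U is always clear
}

// Hypothetical helper: number of trailing zero bits (32 for zero).
unsigned trailingZeros(std::uint32_t U) {
  if (U == 0)
    return 32;
  unsigned Count = 0;
  while ((U & 1u) == 0) {
    U >>= 1;
    ++Count;
  }
  return Count;
}

struct ShiftPlan {
  bool Fits;         // can the division be done without widening?
  unsigned LHSShift; // how far to shift the LHS up
  unsigned RHSShift; // how far to shift the RHS down
};

// Decide whether a signed division with the given scale fits in the type,
// demanding one extra bit of headroom for the signed saturating case.
ShiftPlan planShifts(std::int32_t LHS, std::int32_t RHS, unsigned Scale,
                     bool SignedSaturating) {
  unsigned LHSLead = redundantSignBits(LHS);
  unsigned RHSTrail = trailingZeros(static_cast<std::uint32_t>(RHS));
  if (LHSLead + RHSTrail < Scale + (SignedSaturating ? 1u : 0u))
    return {false, 0, 0};
  unsigned LHSShift = std::min(LHSLead, Scale);
  return {true, LHSShift, Scale - LHSShift};
}
```

When the available headroom exactly equals the scale, the non-saturating form fits but the signed saturating form does not, which is the situation that forces the wider division mentioned in the FIXME.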

llvm/lib/CodeGen/TargetLoweringBase.cpp

Show First 20 Lines • Show All 654 Lines • ▼ Show 20 Lines	for (MVT VT : MVT::all_valuetypes()) {
setOperationAction(ISD::UADDSAT, VT, Expand);		setOperationAction(ISD::UADDSAT, VT, Expand);
setOperationAction(ISD::SSUBSAT, VT, Expand);		setOperationAction(ISD::SSUBSAT, VT, Expand);
setOperationAction(ISD::USUBSAT, VT, Expand);		setOperationAction(ISD::USUBSAT, VT, Expand);
setOperationAction(ISD::SMULFIX, VT, Expand);		setOperationAction(ISD::SMULFIX, VT, Expand);
setOperationAction(ISD::SMULFIXSAT, VT, Expand);		setOperationAction(ISD::SMULFIXSAT, VT, Expand);
setOperationAction(ISD::UMULFIX, VT, Expand);		setOperationAction(ISD::UMULFIX, VT, Expand);
setOperationAction(ISD::UMULFIXSAT, VT, Expand);		setOperationAction(ISD::UMULFIXSAT, VT, Expand);
setOperationAction(ISD::SDIVFIX, VT, Expand);		setOperationAction(ISD::SDIVFIX, VT, Expand);
		setOperationAction(ISD::SDIVFIXSAT, VT, Expand);
setOperationAction(ISD::UDIVFIX, VT, Expand);		setOperationAction(ISD::UDIVFIX, VT, Expand);
		setOperationAction(ISD::UDIVFIXSAT, VT, Expand);

// Overflow operations default to expand		// Overflow operations default to expand
setOperationAction(ISD::SADDO, VT, Expand);		setOperationAction(ISD::SADDO, VT, Expand);
setOperationAction(ISD::SSUBO, VT, Expand);		setOperationAction(ISD::SSUBO, VT, Expand);
setOperationAction(ISD::UADDO, VT, Expand);		setOperationAction(ISD::UADDO, VT, Expand);
setOperationAction(ISD::USUBO, VT, Expand);		setOperationAction(ISD::USUBO, VT, Expand);
setOperationAction(ISD::SMULO, VT, Expand);		setOperationAction(ISD::SMULO, VT, Expand);
setOperationAction(ISD::UMULO, VT, Expand);		setOperationAction(ISD::UMULO, VT, Expand);
[1,401 lines not shown]

llvm/lib/IR/Verifier.cpp

[4,721 lines not shown]
		Assert(Op2->getType()->isIntOrIntVectorTy(),
"of ints");		"of ints");
break;		break;
}		}
case Intrinsic::smul_fix:		case Intrinsic::smul_fix:
case Intrinsic::smul_fix_sat:		case Intrinsic::smul_fix_sat:
case Intrinsic::umul_fix:		case Intrinsic::umul_fix:
case Intrinsic::umul_fix_sat:		case Intrinsic::umul_fix_sat:
case Intrinsic::sdiv_fix:		case Intrinsic::sdiv_fix:
case Intrinsic::udiv_fix: {		case Intrinsic::sdiv_fix_sat:
		case Intrinsic::udiv_fix:
		case Intrinsic::udiv_fix_sat: {
Value *Op1 = Call.getArgOperand(0);		Value *Op1 = Call.getArgOperand(0);
Value *Op2 = Call.getArgOperand(1);		Value *Op2 = Call.getArgOperand(1);
Assert(Op1->getType()->isIntOrIntVectorTy(),		Assert(Op1->getType()->isIntOrIntVectorTy(),
"first operand of [us][mul|div]_fix[_sat] must be an int type or "		"first operand of [us][mul|div]_fix[_sat] must be an int type or "
"vector of ints");		"vector of ints");
Assert(Op2->getType()->isIntOrIntVectorTy(),		Assert(Op2->getType()->isIntOrIntVectorTy(),
"second operand of [us][mul|div]_fix[_sat] must be an int type or "		"second operand of [us][mul|div]_fix[_sat] must be an int type or "
"vector of ints");		"vector of ints");

auto *Op3 = cast<ConstantInt>(Call.getArgOperand(2));		auto *Op3 = cast<ConstantInt>(Call.getArgOperand(2));
Assert(Op3->getType()->getBitWidth() <= 32,		Assert(Op3->getType()->getBitWidth() <= 32,
"third argument of [us][mul|div]_fix[_sat] must fit within 32 bits");		"third argument of [us][mul|div]_fix[_sat] must fit within 32 bits");

if (ID == Intrinsic::smul_fix || ID == Intrinsic::smul_fix_sat ||		if (ID == Intrinsic::smul_fix || ID == Intrinsic::smul_fix_sat ||
ID == Intrinsic::sdiv_fix) {		ID == Intrinsic::sdiv_fix || ID == Intrinsic::sdiv_fix_sat) {
Assert(		Assert(
Op3->getZExtValue() < Op1->getType()->getScalarSizeInBits(),		Op3->getZExtValue() < Op1->getType()->getScalarSizeInBits(),
"the scale of s[mul|div]_fix[_sat] must be less than the width of "		"the scale of s[mul|div]_fix[_sat] must be less than the width of "
"the operands");		"the operands");
} else {		} else {
Assert(Op3->getZExtValue() <= Op1->getType()->getScalarSizeInBits(),		Assert(Op3->getZExtValue() <= Op1->getType()->getScalarSizeInBits(),
"the scale of u[mul|div]_fix[_sat] must be less than or equal "		"the scale of u[mul|div]_fix[_sat] must be less than or equal "
"to the width of the operands");		"to the width of the operands");
[827 lines not shown]
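The scale constraint enforced by the two `Assert` branches above can be restated as a small predicate. This is only a sketch for illustration: `isValidFixScale` is a hypothetical helper, not part of the Verifier, and it covers only the scale-versus-width check, not the operand type checks.

```cpp
#include <cstdint>

// Hypothetical restatement of the verifier rule shown above:
//   signed   ([s]mul/div_fix[_sat]): scale <  operand width
//   unsigned ([u]mul/div_fix[_sat]): scale <= operand width
bool isValidFixScale(bool isSigned, uint64_t scale, unsigned width) {
  return isSigned ? scale < width : scale <= width;
}
```

For an `i16` operand, a scale of 16 is accepted for `llvm.udiv.fix.sat` but rejected for `llvm.sdiv.fix.sat`, presumably because the signed format has to reserve one bit for the sign.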

llvm/test/CodeGen/X86/sdiv_fix_sat.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=X64
				; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefix=X86

				declare i4 @llvm.sdiv.fix.sat.i4 (i4, i4, i32)
				declare i15 @llvm.sdiv.fix.sat.i15 (i15, i15, i32)
				declare i16 @llvm.sdiv.fix.sat.i16 (i16, i16, i32)
				declare i18 @llvm.sdiv.fix.sat.i18 (i18, i18, i32)
				declare i64 @llvm.sdiv.fix.sat.i64 (i64, i64, i32)
				declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32>, <4 x i32>, i32)

				define i16 @func(i16 %x, i16 %y) nounwind {
				;
				; X64-LABEL: func:
				; X64: # %bb.0:
				; X64-NEXT: movswl %si, %esi
				; X64-NEXT: movswl %di, %ecx
				; X64-NEXT: shll $8, %ecx
				; X64-NEXT: movl %ecx, %eax
				; X64-NEXT: cltd
				; X64-NEXT: idivl %esi
				; X64-NEXT: # kill: def $eax killed $eax def $rax
				; X64-NEXT: leal -1(%rax), %edi
				; X64-NEXT: testl %esi, %esi
				; X64-NEXT: sets %sil
				; X64-NEXT: testl %ecx, %ecx
				; X64-NEXT: sets %cl
				; X64-NEXT: xorb %sil, %cl
				; X64-NEXT: testl %edx, %edx
				; X64-NEXT: setne %dl
				; X64-NEXT: testb %cl, %dl
				; X64-NEXT: cmovel %eax, %edi
				; X64-NEXT: cmpl $65535, %edi # imm = 0xFFFF
				; X64-NEXT: movl $65535, %ecx # imm = 0xFFFF
				; X64-NEXT: cmovll %edi, %ecx
				; X64-NEXT: cmpl $-65536, %ecx # imm = 0xFFFF0000
				; X64-NEXT: movl $-65536, %eax # imm = 0xFFFF0000
				; X64-NEXT: cmovgl %ecx, %eax
				; X64-NEXT: shrl %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movswl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movswl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: shll $8, %ecx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: cltd
				; X86-NEXT: idivl %esi
				; X86-NEXT: leal -1(%eax), %edi
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: sets %bl
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: sets %cl
				; X86-NEXT: xorb %bl, %cl
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: setne %dl
				; X86-NEXT: testb %cl, %dl
				; X86-NEXT: cmovel %eax, %edi
				; X86-NEXT: cmpl $65535, %edi # imm = 0xFFFF
				; X86-NEXT: movl $65535, %ecx # imm = 0xFFFF
				; X86-NEXT: cmovll %edi, %ecx
				; X86-NEXT: cmpl $-65536, %ecx # imm = 0xFFFF0000
				; X86-NEXT: movl $-65536, %eax # imm = 0xFFFF0000
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: shrl %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: retl
				%tmp = call i16 @llvm.sdiv.fix.sat.i16(i16 %x, i16 %y, i32 7)
				ret i16 %tmp
				}
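As a reading aid for the checks in `func`, here is a minimal scalar model of what a call to `llvm.sdiv.fix.sat.i16` with scale 7 is expected to compute. It is a sketch under stated assumptions, not the lowering itself: the name `sdiv_fix_sat_i16` is hypothetical, the quotient is rounded toward negative infinity to mirror the "subtract 1 if the remainder is non-zero and the signs differ" select in the expansion, and a zero divisor is simply excluded by an assertion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical reference model; not LLVM code.
int16_t sdiv_fix_sat_i16(int16_t x, int16_t y, unsigned scale) {
  assert(y != 0 && scale < 16);  // assumption: division by zero is not modelled
  // Scale the dividend up in a wider type so nothing is lost before dividing.
  int64_t num = static_cast<int64_t>(x) * (int64_t{1} << scale);
  int64_t quot = num / y;  // C++ division truncates toward zero
  int64_t rem = num % y;
  // Round toward negative infinity: adjust when the result is inexact and
  // the operand signs differ.
  if (rem != 0 && ((num < 0) != (y < 0)))
    --quot;
  // Saturate to the range of the 16-bit result type.
  quot = std::max<int64_t>(quot, INT16_MIN);
  quot = std::min<int64_t>(quot, INT16_MAX);
  return static_cast<int16_t>(quot);
}
```

In Q7 terms, dividing 1.0 (raw 128) by 0.5 (raw 64) yields 2.0 (raw 256), while dividing the raw value 32767 by raw 1 overflows and clamps to 32767.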

				define i16 @func2(i8 %x, i8 %y) nounwind {
				;
				; X64-LABEL: func2:
				; X64: # %bb.0:
				; X64-NEXT: movsbl %dil, %eax
				; X64-NEXT: movsbl %sil, %ecx
				; X64-NEXT: movswl %cx, %esi
				; X64-NEXT: movswl %ax, %ecx
				; X64-NEXT: shll $14, %ecx
				; X64-NEXT: movl %ecx, %eax
				; X64-NEXT: cltd
				; X64-NEXT: idivl %esi
				; X64-NEXT: # kill: def $eax killed $eax def $rax
				; X64-NEXT: leal -1(%rax), %edi
				; X64-NEXT: testl %esi, %esi
				; X64-NEXT: sets %sil
				; X64-NEXT: testl %ecx, %ecx
				; X64-NEXT: sets %cl
				; X64-NEXT: xorb %sil, %cl
				; X64-NEXT: testl %edx, %edx
				; X64-NEXT: setne %dl
				; X64-NEXT: testb %cl, %dl
				; X64-NEXT: cmovel %eax, %edi
				; X64-NEXT: cmpl $16383, %edi # imm = 0x3FFF
				; X64-NEXT: movl $16383, %ecx # imm = 0x3FFF
				; X64-NEXT: cmovll %edi, %ecx
				; X64-NEXT: cmpl $-16384, %ecx # imm = 0xC000
				; X64-NEXT: movl $-16384, %eax # imm = 0xC000
				; X64-NEXT: cmovgl %ecx, %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func2:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movsbl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movsbl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: shll $14, %ecx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: cltd
				; X86-NEXT: idivl %esi
				; X86-NEXT: leal -1(%eax), %edi
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: sets %bl
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: sets %cl
				; X86-NEXT: xorb %bl, %cl
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: setne %dl
				; X86-NEXT: testb %cl, %dl
				; X86-NEXT: cmovel %eax, %edi
				; X86-NEXT: cmpl $16383, %edi # imm = 0x3FFF
				; X86-NEXT: movl $16383, %ecx # imm = 0x3FFF
				; X86-NEXT: cmovll %edi, %ecx
				; X86-NEXT: cmpl $-16384, %ecx # imm = 0xC000
				; X86-NEXT: movl $-16384, %eax # imm = 0xC000
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: retl
				%x2 = sext i8 %x to i15
				%y2 = sext i8 %y to i15
				%tmp = call i15 @llvm.sdiv.fix.sat.i15(i15 %x2, i15 %y2, i32 14)
				%tmp2 = sext i15 %tmp to i16
				ret i16 %tmp2
				}

				define i16 @func3(i15 %x, i8 %y) nounwind {
				;
				; X64-LABEL: func3:
				; X64: # %bb.0:
				; X64-NEXT: shll $8, %esi
				; X64-NEXT: movswl %si, %ecx
				; X64-NEXT: addl %edi, %edi
				; X64-NEXT: shrl $4, %ecx
				; X64-NEXT: movl %edi, %eax
				; X64-NEXT: cwtd
				; X64-NEXT: idivw %cx
				; X64-NEXT: # kill: def $ax killed $ax def $rax
				; X64-NEXT: leal -1(%rax), %esi
				; X64-NEXT: testw %di, %di
				; X64-NEXT: sets %dil
				; X64-NEXT: testw %cx, %cx
				; X64-NEXT: sets %cl
				; X64-NEXT: xorb %dil, %cl
				; X64-NEXT: testw %dx, %dx
				; X64-NEXT: setne %dl
				; X64-NEXT: testb %cl, %dl
				; X64-NEXT: cmovel %eax, %esi
				; X64-NEXT: movswl %si, %eax
				; X64-NEXT: cmpl $16383, %eax # imm = 0x3FFF
				; X64-NEXT: movl $16383, %ecx # imm = 0x3FFF
				; X64-NEXT: cmovll %esi, %ecx
				; X64-NEXT: movswl %cx, %eax
				; X64-NEXT: cmpl $-16384, %eax # imm = 0xC000
				; X64-NEXT: movl $49152, %eax # imm = 0xC000
				; X64-NEXT: cmovgl %ecx, %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func3:
				; X86: # %bb.0:
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: shll $8, %eax
				; X86-NEXT: movswl %ax, %esi
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: shrl $4, %esi
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: cwtd
				; X86-NEXT: idivw %si
				; X86-NEXT: # kill: def $ax killed $ax def $eax
				; X86-NEXT: leal -1(%eax), %edi
				; X86-NEXT: testw %cx, %cx
				; X86-NEXT: sets %cl
				; X86-NEXT: testw %si, %si
				; X86-NEXT: sets %ch
				; X86-NEXT: xorb %cl, %ch
				; X86-NEXT: testw %dx, %dx
				; X86-NEXT: setne %cl
				; X86-NEXT: testb %ch, %cl
				; X86-NEXT: cmovel %eax, %edi
				; X86-NEXT: movswl %di, %eax
				; X86-NEXT: cmpl $16383, %eax # imm = 0x3FFF
				; X86-NEXT: movl $16383, %ecx # imm = 0x3FFF
				; X86-NEXT: cmovll %edi, %ecx
				; X86-NEXT: movswl %cx, %eax
				; X86-NEXT: cmpl $-16384, %eax # imm = 0xC000
				; X86-NEXT: movl $49152, %eax # imm = 0xC000
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: retl
				%y2 = sext i8 %y to i15
				%y3 = shl i15 %y2, 7
				%tmp = call i15 @llvm.sdiv.fix.sat.i15(i15 %x, i15 %y3, i32 4)
				%tmp2 = sext i15 %tmp to i16
				ret i16 %tmp2
				}

				define i4 @func4(i4 %x, i4 %y) nounwind {
				;
				; X64-LABEL: func4:
				; X64: # %bb.0:
				; X64-NEXT: pushq %rbx
				; X64-NEXT: shlb $4, %sil
				; X64-NEXT: sarb $4, %sil
				; X64-NEXT: shlb $4, %dil
				; X64-NEXT: sarb $4, %dil
				; X64-NEXT: shlb $2, %dil
				; X64-NEXT: movsbl %dil, %ecx
				; X64-NEXT: movl %ecx, %eax
				; X64-NEXT: idivb %sil
				; X64-NEXT: movsbl %ah, %ebx
				; X64-NEXT: movzbl %al, %eax
				; X64-NEXT: leal -1(%rax), %edi
				; X64-NEXT: movzbl %dil, %edi
				; X64-NEXT: testb %sil, %sil
				; X64-NEXT: sets %dl
				; X64-NEXT: testb %cl, %cl
				; X64-NEXT: sets %cl
				; X64-NEXT: xorb %dl, %cl
				; X64-NEXT: testb %bl, %bl
				; X64-NEXT: setne %dl
				; X64-NEXT: testb %cl, %dl
				; X64-NEXT: cmovel %eax, %edi
				; X64-NEXT: cmpb $7, %dil
				; X64-NEXT: movl $7, %ecx
				; X64-NEXT: cmovll %edi, %ecx
				; X64-NEXT: cmpb $-8, %cl
				; X64-NEXT: movl $248, %eax
				; X64-NEXT: cmovgl %ecx, %eax
				; X64-NEXT: # kill: def $al killed $al killed $eax
				; X64-NEXT: popq %rbx
				; X64-NEXT: retq
				;
				; X86-LABEL: func4:
				; X86: # %bb.0:
				; X86-NEXT: pushl %esi
				; X86-NEXT: movb {{[0-9]+}}(%esp), %dl
				; X86-NEXT: shlb $4, %dl
				; X86-NEXT: sarb $4, %dl
				; X86-NEXT: movb {{[0-9]+}}(%esp), %dh
				; X86-NEXT: shlb $4, %dh
				; X86-NEXT: sarb $4, %dh
				; X86-NEXT: shlb $2, %dh
				; X86-NEXT: movsbl %dh, %eax
				; X86-NEXT: idivb %dl
				; X86-NEXT: movsbl %ah, %ecx
				; X86-NEXT: movzbl %al, %esi
				; X86-NEXT: decb %al
				; X86-NEXT: movzbl %al, %eax
				; X86-NEXT: testb %dl, %dl
				; X86-NEXT: sets %dl
				; X86-NEXT: testb %dh, %dh
				; X86-NEXT: sets %dh
				; X86-NEXT: xorb %dl, %dh
				; X86-NEXT: testb %cl, %cl
				; X86-NEXT: setne %cl
				; X86-NEXT: testb %dh, %cl
				; X86-NEXT: cmovel %esi, %eax
				; X86-NEXT: cmpb $7, %al
				; X86-NEXT: movl $7, %ecx
				; X86-NEXT: cmovll %eax, %ecx
				; X86-NEXT: cmpb $-8, %cl
				; X86-NEXT: movl $248, %eax
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: # kill: def $al killed $al killed $eax
				; X86-NEXT: popl %esi
				; X86-NEXT: retl
				%tmp = call i4 @llvm.sdiv.fix.sat.i4(i4 %x, i4 %y, i32 2)
				ret i4 %tmp
				}

				define i64 @func5(i64 %x, i64 %y) nounwind {
				;
				; X64-LABEL: func5:
				; X64: # %bb.0:
				; X64-NEXT: pushq %rbp
				; X64-NEXT: pushq %r15
				; X64-NEXT: pushq %r14
				; X64-NEXT: pushq %r13
				; X64-NEXT: pushq %r12
				; X64-NEXT: pushq %rbx
				; X64-NEXT: subq $40, %rsp
				; X64-NEXT: movq %rdi, %r15
				; X64-NEXT: leaq (%rdi,%rdi), %rax
				; X64-NEXT: shrq $33, %rax
				; X64-NEXT: movq %rdi, %r12
				; X64-NEXT: sarq $63, %r12
				; X64-NEXT: shlq $31, %r12
				; X64-NEXT: orq %rax, %r12
				; X64-NEXT: sets {{[-0-9]+}}(%r{{[sb]}}p) # 1-byte Folded Spill
				; X64-NEXT: shlq $32, %r15
				; X64-NEXT: movq %rsi, %rdx
				; X64-NEXT: movq %rsi, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rsi, %r13
				; X64-NEXT: sarq $63, %r13
				; X64-NEXT: movq %r15, %rdi
				; X64-NEXT: movq %r12, %rsi
				; X64-NEXT: movq %r13, %rcx
				; X64-NEXT: callq __divti3
				; X64-NEXT: movq %rax, %rbx
				; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %rbp
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: subq $1, %rbx
				; X64-NEXT: sbbq $0, %rbp
				; X64-NEXT: testq %r13, %r13
				; X64-NEXT: sets %r14b
				; X64-NEXT: xorb {{[-0-9]+}}(%r{{[sb]}}p), %r14b # 1-byte Folded Reload
				; X64-NEXT: movq %r15, %rdi
				; X64-NEXT: movq %r12, %rsi
				; X64-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %rdx # 8-byte Reload
				; X64-NEXT: movq %r13, %rcx
				; X64-NEXT: callq __modti3
				; X64-NEXT: orq %rax, %rdx
				; X64-NEXT: setne %al
				; X64-NEXT: testb %r14b, %al
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %rbp # 8-byte Folded Reload
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %rbx # 8-byte Folded Reload
				; X64-NEXT: cmpq $-1, %rbx
				; X64-NEXT: movq $-1, %rax
				; X64-NEXT: movq $-1, %rcx
				; X64-NEXT: cmovbq %rbx, %rcx
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: testq %rbp, %rbp
				; X64-NEXT: cmovnsq %rax, %rbx
				; X64-NEXT: cmoveq %rcx, %rbx
				; X64-NEXT: cmovnsq %rdx, %rbp
				; X64-NEXT: testq %rbx, %rbx
				; X64-NEXT: movl $0, %ecx
				; X64-NEXT: cmovaq %rbx, %rcx
				; X64-NEXT: testq %rbp, %rbp
				; X64-NEXT: cmovnsq %rbp, %rax
				; X64-NEXT: cmovsq %rdx, %rbx
				; X64-NEXT: cmpq $-1, %rbp
				; X64-NEXT: cmoveq %rcx, %rbx
				; X64-NEXT: shrdq $1, %rax, %rbx
				; X64-NEXT: movq %rbx, %rax
				; X64-NEXT: addq $40, %rsp
				; X64-NEXT: popq %rbx
				; X64-NEXT: popq %r12
				; X64-NEXT: popq %r13
				; X64-NEXT: popq %r14
				; X64-NEXT: popq %r15
				; X64-NEXT: popq %rbp
				; X64-NEXT: retq
				;
				; X86-LABEL: func5:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: andl $-8, %esp
				; X86-NEXT: subl $88, %esp
				; X86-NEXT: movl 8(%ebp), %ecx
				; X86-NEXT: movl 12(%ebp), %eax
				; X86-NEXT: movl 20(%ebp), %ebx
				; X86-NEXT: sarl $31, %ebx
				; X86-NEXT: movl %eax, %edi
				; X86-NEXT: sarl $31, %edi
				; X86-NEXT: movl %edi, %edx
				; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: shldl $31, %eax, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: shldl $31, %ecx, %eax
				; X86-NEXT: movl %eax, %esi
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: shll $31, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl 20(%ebp)
				; X86-NEXT: pushl 16(%ebp)
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __divti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: subl $1, %esi
				; X86-NEXT: sbbl $0, %edi
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %ebx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: sets %al
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: sets %ah
				; X86-NEXT: xorb %al, %ah
				; X86-NEXT: movb %ah, {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Spill
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl 20(%ebp)
				; X86-NEXT: pushl 16(%ebp)
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __modti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: orl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: orl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: orl %eax, %ecx
				; X86-NEXT: setne %al
				; X86-NEXT: testb %al, {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Reload
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Folded Reload
				; X86-NEXT: testl %ebx, %ebx
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmovsl %ebx, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovsl %esi, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $2147483647, %ecx # imm = 0x7FFFFFFF
				; X86-NEXT: cmovsl %edi, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl %ebx, %edx
				; X86-NEXT: sarl $31, %edx
				; X86-NEXT: andl %eax, %edx
				; X86-NEXT: testl %ebx, %ebx
				; X86-NEXT: cmovel %ebx, %edx
				; X86-NEXT: cmpl $-1, %esi
				; X86-NEXT: movl $-1, %eax
				; X86-NEXT: cmovbl %esi, %eax
				; X86-NEXT: cmpl $2147483647, %edi # imm = 0x7FFFFFFF
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovael %ecx, %esi
				; X86-NEXT: cmovel %eax, %esi
				; X86-NEXT: movl $2147483647, %eax # imm = 0x7FFFFFFF
				; X86-NEXT: cmovael %eax, %edi
				; X86-NEXT: orl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Folded Reload
				; X86-NEXT: cmovnel {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
				; X86-NEXT: cmovnel {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: movl $0, %eax
				; X86-NEXT: cmoval %esi, %eax
				; X86-NEXT: cmpl $-2147483648, %edi # imm = 0x80000000
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmoval %esi, %ecx
				; X86-NEXT: cmovel %eax, %ecx
				; X86-NEXT: movl $-2147483648, %eax # imm = 0x80000000
				; X86-NEXT: cmoval %edi, %eax
				; X86-NEXT: cmpl $0, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: movl $-2147483648, %ebx # imm = 0x80000000
				; X86-NEXT: cmovsl %ebx, %edi
				; X86-NEXT: movl $0, %ebx
				; X86-NEXT: cmovsl %ebx, %esi
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: cmpl $-1, %edx
				; X86-NEXT: cmovel %ecx, %esi
				; X86-NEXT: cmovel %eax, %edi
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: movl %edi, %edx
				; X86-NEXT: leal -12(%ebp), %esp
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl
				%tmp = call i64 @llvm.sdiv.fix.sat.i64(i64 %x, i64 %y, i32 31)
				ret i64 %tmp
				}

				define i18 @func6(i16 %x, i16 %y) nounwind {
				;
				; X64-LABEL: func6:
				; X64: # %bb.0:
				; X64-NEXT: movswl %di, %ecx
				; X64-NEXT: movswl %si, %esi
				; X64-NEXT: shll $7, %ecx
				; X64-NEXT: movl %ecx, %eax
				; X64-NEXT: cltd
				; X64-NEXT: idivl %esi
				; X64-NEXT: # kill: def $eax killed $eax def $rax
				; X64-NEXT: leal -1(%rax), %edi
				; X64-NEXT: testl %esi, %esi
				; X64-NEXT: sets %sil
				; X64-NEXT: testl %ecx, %ecx
				; X64-NEXT: sets %cl
				; X64-NEXT: xorb %sil, %cl
				; X64-NEXT: testl %edx, %edx
				; X64-NEXT: setne %dl
				; X64-NEXT: testb %cl, %dl
				; X64-NEXT: cmovel %eax, %edi
				; X64-NEXT: cmpl $131071, %edi # imm = 0x1FFFF
				; X64-NEXT: movl $131071, %ecx # imm = 0x1FFFF
				; X64-NEXT: cmovll %edi, %ecx
				; X64-NEXT: cmpl $-131072, %ecx # imm = 0xFFFE0000
				; X64-NEXT: movl $-131072, %eax # imm = 0xFFFE0000
				; X64-NEXT: cmovgl %ecx, %eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func6:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movswl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movswl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: shll $7, %ecx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: cltd
				; X86-NEXT: idivl %esi
				; X86-NEXT: leal -1(%eax), %edi
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: sets %bl
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: sets %cl
				; X86-NEXT: xorb %bl, %cl
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: setne %dl
				; X86-NEXT: testb %cl, %dl
				; X86-NEXT: cmovel %eax, %edi
				; X86-NEXT: cmpl $131071, %edi # imm = 0x1FFFF
				; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF
				; X86-NEXT: cmovll %edi, %ecx
				; X86-NEXT: cmpl $-131072, %ecx # imm = 0xFFFE0000
				; X86-NEXT: movl $-131072, %eax # imm = 0xFFFE0000
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: retl
				%x2 = sext i16 %x to i18
				%y2 = sext i16 %y to i18
				%tmp = call i18 @llvm.sdiv.fix.sat.i18(i18 %x2, i18 %y2, i32 7)
				ret i18 %tmp
				}

				define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
				;
				; X64-LABEL: vec:
				; X64: # %bb.0:
				; X64-NEXT: pushq %rbp
				; X64-NEXT: pushq %r15
				; X64-NEXT: pushq %r14
				; X64-NEXT: pushq %r13
				; X64-NEXT: pushq %r12
				; X64-NEXT: pushq %rbx
				; X64-NEXT: subq $104, %rsp
				; X64-NEXT: movdqa %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: pxor %xmm2, %xmm2
				; X64-NEXT: pcmpgtd %xmm0, %xmm2
				; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
				; X64-NEXT: paddq %xmm0, %xmm0
				; X64-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: movq %xmm0, %rbp
				; X64-NEXT: movq %rbp, %r12
				; X64-NEXT: shrq $33, %r12
				; X64-NEXT: movq %rbp, %r14
				; X64-NEXT: sarq $63, %r14
				; X64-NEXT: shlq $31, %r14
				; X64-NEXT: orq %r14, %r12
				; X64-NEXT: pxor %xmm0, %xmm0
				; X64-NEXT: pcmpgtd %xmm1, %xmm0
				; X64-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
				; X64-NEXT: movdqa %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: movq %xmm1, %rdx
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %rbx
				; X64-NEXT: sarq $63, %rbx
				; X64-NEXT: shlq $31, %rbp
				; X64-NEXT: movq %rbp, %rdi
				; X64-NEXT: movq %r12, %rsi
				; X64-NEXT: movq %rbx, %rcx
				; X64-NEXT: callq __divti3
				; X64-NEXT: movq %rax, %r13
				; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %r15
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: subq $1, %r13
				; X64-NEXT: sbbq $0, %r15
				; X64-NEXT: shrq $63, %r14
				; X64-NEXT: xorl %ebx, %r14d
				; X64-NEXT: movq %rbp, %rdi
				; X64-NEXT: movq %r12, %rsi
				; X64-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %rdx # 8-byte Reload
				; X64-NEXT: movq %rbx, %rcx
				; X64-NEXT: callq __modti3
				; X64-NEXT: orq %rax, %rdx
				; X64-NEXT: setne %al
				; X64-NEXT: testb %r14b, %al
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r15 # 8-byte Folded Reload
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r13 # 8-byte Folded Reload
				; X64-NEXT: movl $4294967295, %edx # imm = 0xFFFFFFFF
				; X64-NEXT: cmpq %rdx, %r13
				; X64-NEXT: movl $4294967295, %eax # imm = 0xFFFFFFFF
				; X64-NEXT: cmovbq %r13, %rax
				; X64-NEXT: xorl %ecx, %ecx
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovnsq %rdx, %r13
				; X64-NEXT: cmoveq %rax, %r13
				; X64-NEXT: cmovnsq %rcx, %r15
				; X64-NEXT: movabsq $-4294967296, %rcx # imm = 0xFFFFFFFF00000000
				; X64-NEXT: cmpq %rcx, %r13
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: cmovaq %r13, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovsq %rcx, %r13
				; X64-NEXT: cmpq $-1, %r15
				; X64-NEXT: cmoveq %rax, %r13
				; X64-NEXT: movq %r13, %xmm0
				; X64-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: pshufd $78, {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; X64-NEXT: # xmm0 = mem[2,3,0,1]
				; X64-NEXT: movq %xmm0, %r13
				; X64-NEXT: movq %r13, %rbx
				; X64-NEXT: shrq $33, %rbx
				; X64-NEXT: movq %r13, %r14
				; X64-NEXT: sarq $63, %r14
				; X64-NEXT: shlq $31, %r14
				; X64-NEXT: orq %r14, %rbx
				; X64-NEXT: pshufd $78, {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; X64-NEXT: # xmm0 = mem[2,3,0,1]
				; X64-NEXT: movq %xmm0, %rdx
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %rbp
				; X64-NEXT: sarq $63, %rbp
				; X64-NEXT: shlq $31, %r13
				; X64-NEXT: movq %r13, %rdi
				; X64-NEXT: movq %rbx, %rsi
				; X64-NEXT: movq %rbp, %rcx
				; X64-NEXT: callq __divti3
				; X64-NEXT: movq %rax, %r12
				; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %r15
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: subq $1, %r12
				; X64-NEXT: sbbq $0, %r15
				; X64-NEXT: shrq $63, %r14
				; X64-NEXT: xorl %ebp, %r14d
				; X64-NEXT: movq %r13, %rdi
				; X64-NEXT: movq %rbx, %rsi
				; X64-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %rdx # 8-byte Reload
				; X64-NEXT: movq %rbp, %rcx
				; X64-NEXT: callq __modti3
				; X64-NEXT: orq %rax, %rdx
				; X64-NEXT: setne %al
				; X64-NEXT: testb %r14b, %al
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r15 # 8-byte Folded Reload
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r12 # 8-byte Folded Reload
				; X64-NEXT: movl $4294967295, %ecx # imm = 0xFFFFFFFF
				; X64-NEXT: cmpq %rcx, %r12
				; X64-NEXT: movl $4294967295, %eax # imm = 0xFFFFFFFF
				; X64-NEXT: cmovbq %r12, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovnsq %rcx, %r12
				; X64-NEXT: cmoveq %rax, %r12
				; X64-NEXT: movl $0, %eax
				; X64-NEXT: cmovnsq %rax, %r15
				; X64-NEXT: movabsq $-4294967296, %rcx # imm = 0xFFFFFFFF00000000
				; X64-NEXT: cmpq %rcx, %r12
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: cmovaq %r12, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovsq %rcx, %r12
				; X64-NEXT: cmpq $-1, %r15
				; X64-NEXT: cmoveq %rax, %r12
				; X64-NEXT: movq %r12, %xmm0
				; X64-NEXT: movdqa {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Reload
				; X64-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm0[0]
				; X64-NEXT: psrlq $1, %xmm1
				; X64-NEXT: movdqa %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: pshufd $78, {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Folded Reload
				; X64-NEXT: # xmm1 = mem[2,3,0,1]
				; X64-NEXT: pxor %xmm0, %xmm0
				; X64-NEXT: pcmpgtd %xmm1, %xmm0
				; X64-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
				; X64-NEXT: paddq %xmm1, %xmm1
				; X64-NEXT: movdqa %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: movq %xmm1, %r12
				; X64-NEXT: movq %r12, %rbx
				; X64-NEXT: shrq $33, %rbx
				; X64-NEXT: movq %r12, %r14
				; X64-NEXT: sarq $63, %r14
				; X64-NEXT: shlq $31, %r14
				; X64-NEXT: orq %r14, %rbx
				; X64-NEXT: pshufd $78, {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Folded Reload
				; X64-NEXT: # xmm1 = mem[2,3,0,1]
				; X64-NEXT: pxor %xmm0, %xmm0
				; X64-NEXT: pcmpgtd %xmm1, %xmm0
				; X64-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
				; X64-NEXT: movdqa %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: movq %xmm1, %rdx
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %rbp
				; X64-NEXT: sarq $63, %rbp
				; X64-NEXT: shlq $31, %r12
				; X64-NEXT: movq %r12, %rdi
				; X64-NEXT: movq %rbx, %rsi
				; X64-NEXT: movq %rbp, %rcx
				; X64-NEXT: callq __divti3
				; X64-NEXT: movq %rax, %r13
				; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %r15
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: subq $1, %r13
				; X64-NEXT: sbbq $0, %r15
				; X64-NEXT: shrq $63, %r14
				; X64-NEXT: xorl %ebp, %r14d
				; X64-NEXT: movq %r12, %rdi
				; X64-NEXT: movq %rbx, %rsi
				; X64-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %rdx # 8-byte Reload
				; X64-NEXT: movq %rbp, %rcx
				; X64-NEXT: callq __modti3
				; X64-NEXT: orq %rax, %rdx
				; X64-NEXT: setne %al
				; X64-NEXT: testb %r14b, %al
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r15 # 8-byte Folded Reload
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r13 # 8-byte Folded Reload
				; X64-NEXT: movl $4294967295, %ecx # imm = 0xFFFFFFFF
				; X64-NEXT: cmpq %rcx, %r13
				; X64-NEXT: movl $4294967295, %eax # imm = 0xFFFFFFFF
				; X64-NEXT: cmovbq %r13, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovnsq %rcx, %r13
				; X64-NEXT: cmoveq %rax, %r13
				; X64-NEXT: movl $0, %eax
				; X64-NEXT: cmovnsq %rax, %r15
				; X64-NEXT: movabsq $-4294967296, %rcx # imm = 0xFFFFFFFF00000000
				; X64-NEXT: cmpq %rcx, %r13
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: cmovaq %r13, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovsq %rcx, %r13
				; X64-NEXT: cmpq $-1, %r15
				; X64-NEXT: cmoveq %rax, %r13
				; X64-NEXT: movq %r13, %xmm0
				; X64-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; X64-NEXT: pshufd $78, {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; X64-NEXT: # xmm0 = mem[2,3,0,1]
				; X64-NEXT: movq %xmm0, %r13
				; X64-NEXT: movq %r13, %rbx
				; X64-NEXT: shrq $33, %rbx
				; X64-NEXT: movq %r13, %r14
				; X64-NEXT: sarq $63, %r14
				; X64-NEXT: shlq $31, %r14
				; X64-NEXT: orq %r14, %rbx
				; X64-NEXT: pshufd $78, {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; X64-NEXT: # xmm0 = mem[2,3,0,1]
				; X64-NEXT: movq %xmm0, %rdx
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %rbp
				; X64-NEXT: sarq $63, %rbp
				; X64-NEXT: shlq $31, %r13
				; X64-NEXT: movq %r13, %rdi
				; X64-NEXT: movq %rbx, %rsi
				; X64-NEXT: movq %rbp, %rcx
				; X64-NEXT: callq __divti3
				; X64-NEXT: movq %rax, %r12
				; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: movq %rdx, %r15
				; X64-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
				; X64-NEXT: subq $1, %r12
				; X64-NEXT: sbbq $0, %r15
				; X64-NEXT: shrq $63, %r14
				; X64-NEXT: xorl %ebp, %r14d
				; X64-NEXT: movq %r13, %rdi
				; X64-NEXT: movq %rbx, %rsi
				; X64-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %rdx # 8-byte Reload
				; X64-NEXT: movq %rbp, %rcx
				; X64-NEXT: callq __modti3
				; X64-NEXT: orq %rax, %rdx
				; X64-NEXT: setne %al
				; X64-NEXT: testb %r14b, %al
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r15 # 8-byte Folded Reload
				; X64-NEXT: cmoveq {{[-0-9]+}}(%r{{[sb]}}p), %r12 # 8-byte Folded Reload
				; X64-NEXT: movl $4294967295, %ecx # imm = 0xFFFFFFFF
				; X64-NEXT: cmpq %rcx, %r12
				; X64-NEXT: movl $4294967295, %eax # imm = 0xFFFFFFFF
				; X64-NEXT: cmovbq %r12, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovnsq %rcx, %r12
				; X64-NEXT: cmoveq %rax, %r12
				; X64-NEXT: movl $0, %eax
				; X64-NEXT: cmovnsq %rax, %r15
				; X64-NEXT: movabsq $-4294967296, %rcx # imm = 0xFFFFFFFF00000000
				; X64-NEXT: cmpq %rcx, %r12
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: cmovaq %r12, %rax
				; X64-NEXT: testq %r15, %r15
				; X64-NEXT: cmovsq %rcx, %r12
				; X64-NEXT: cmpq $-1, %r15
				; X64-NEXT: cmoveq %rax, %r12
				; X64-NEXT: movq %r12, %xmm0
				; X64-NEXT: movdqa {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Reload
				; X64-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm0[0]
				; X64-NEXT: psrlq $1, %xmm1
				; X64-NEXT: movaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Reload
				; X64-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]
				; X64-NEXT: addq $104, %rsp
				; X64-NEXT: popq %rbx
				; X64-NEXT: popq %r12
				; X64-NEXT: popq %r13
				; X64-NEXT: popq %r14
				; X64-NEXT: popq %r15
				; X64-NEXT: popq %rbp
				; X64-NEXT: retq
				;
				; X86-LABEL: vec:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: andl $-8, %esp
				; X86-NEXT: subl $256, %esp # imm = 0x100
				; X86-NEXT: movl 24(%ebp), %ecx
				; X86-NEXT: movl 40(%ebp), %ebx
				; X86-NEXT: movl %ebx, %edx
				; X86-NEXT: sarl $31, %edx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: sarl $31, %eax
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: adcl %eax, %eax
				; X86-NEXT: movl %ecx, %esi
				; X86-NEXT: andl $1, %eax
				; X86-NEXT: movl %eax, %edi
				; X86-NEXT: shll $31, %eax
				; X86-NEXT: shrl %ecx
				; X86-NEXT: subl %eax, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: shll $31, %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: negl %edi
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %ebx
				; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __modti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl 36(%ebp), %edi
				; X86-NEXT: movl %edi, %edx
				; X86-NEXT: sarl $31, %edx
				; X86-NEXT: movl 20(%ebp), %ecx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: sarl $31, %eax
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: adcl %eax, %eax
				; X86-NEXT: movl %ecx, %esi
				; X86-NEXT: andl $1, %eax
				; X86-NEXT: movl %eax, %ebx
				; X86-NEXT: shll $31, %eax
				; X86-NEXT: shrl %ecx
				; X86-NEXT: subl %eax, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: shll $31, %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: negl %ebx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edi
				; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __modti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl 28(%ebp), %edi
				; X86-NEXT: movl %edi, %ebx
				; X86-NEXT: sarl $31, %ebx
				; X86-NEXT: movl 12(%ebp), %ecx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: sarl $31, %eax
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: adcl %eax, %eax
				; X86-NEXT: movl %ecx, %edx
				; X86-NEXT: andl $1, %eax
				; X86-NEXT: movl %eax, %esi
				; X86-NEXT: shll $31, %eax
				; X86-NEXT: shrl %ecx
				; X86-NEXT: subl %eax, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: shll $31, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: negl %esi
				; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __divti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl 32(%ebp), %edx
				; X86-NEXT: movl %edx, %edi
				; X86-NEXT: sarl $31, %edi
				; X86-NEXT: movl 16(%ebp), %ecx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: sarl $31, %eax
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: adcl %eax, %eax
				; X86-NEXT: movl %ecx, %ebx
				; X86-NEXT: andl $1, %eax
				; X86-NEXT: movl %eax, %esi
				; X86-NEXT: shll $31, %eax
				; X86-NEXT: shrl %ecx
				; X86-NEXT: subl %eax, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: shll $31, %ebx
				; X86-NEXT: negl %esi
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __modti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl 32(%ebp)
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl %esi
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __divti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl 40(%ebp)
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __divti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl 36(%ebp)
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: movl %ecx, %ebx
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __divti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: subl $1, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %esi, %ecx
				; X86-NEXT: sbbl $0, %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: testl %ebx, %ebx
				; X86-NEXT: sets %bl
				; X86-NEXT: cmpl $0, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: sets %bh
				; X86-NEXT: xorb %bl, %bh
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: orl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: orl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: orl %eax, %edx
				; X86-NEXT: setne %al
				; X86-NEXT: testb %bh, %al
				; X86-NEXT: cmovel %esi, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: subl $1, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl %edx, %ecx
				; X86-NEXT: sbbl $0, %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: cmpl $0, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: sets %bl
				; X86-NEXT: testl %edi, %edi
				; X86-NEXT: sets %bh
				; X86-NEXT: xorb %bl, %bh
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: orl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: orl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: orl %edi, %eax
				; X86-NEXT: setne %al
				; X86-NEXT: testb %bh, %al
				; X86-NEXT: cmovel %edx, %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel %esi, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: subl $1, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, %edi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: sets %al
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: sets %bl
				; X86-NEXT: xorb %al, %bl
				; X86-NEXT: leal {{[0-9]+}}(%esp), %eax
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl 28(%ebp)
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: pushl %eax
				; X86-NEXT: calll __modti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: orl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: orl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: orl %eax, %ecx
				; X86-NEXT: setne %al
				; X86-NEXT: testb %bl, %al
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel %esi, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
				; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: subl $1, %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl %edi, %eax
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: sbbl $0, %ecx
				; X86-NEXT: cmpl $0, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: sets %bl
				; X86-NEXT: cmpl $0, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Folded Reload
				; X86-NEXT: sets %bh
				; X86-NEXT: xorb %bl, %bh
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: orl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: orl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: orl %eax, %esi
				; X86-NEXT: setne %al
				; X86-NEXT: testb %bh, %al
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel %edi, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: movl $0, %eax
				; X86-NEXT: cmovsl %ecx, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $-1, %eax
				; X86-NEXT: cmovsl %edx, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: movl %eax, %edx
				; X86-NEXT: sarl $31, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl %edx, %esi
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: cmovel %eax, %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $0, %edx
				; X86-NEXT: cmovsl %eax, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $-1, %eax
				; X86-NEXT: cmovsl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: movl %eax, %ebx
				; X86-NEXT: sarl $31, %ebx
				; X86-NEXT: movl %ebx, %edx
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: cmovel %eax, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $0, %edx
				; X86-NEXT: cmovsl %eax, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $-1, %eax
				; X86-NEXT: cmovsl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: movl %eax, %esi
				; X86-NEXT: sarl $31, %esi
				; X86-NEXT: movl %esi, %edx
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: cmovel %eax, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $0, %edi
				; X86-NEXT: cmovsl %eax, %edi
				; X86-NEXT: movl $-1, %eax
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: cmovsl %edx, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: sarl $31, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: testl %ecx, %ecx
				; X86-NEXT: cmovel %ecx, %eax
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl %edx, %ecx
				; X86-NEXT: cmpl $-1, %edx
				; X86-NEXT: movl $-1, %eax
				; X86-NEXT: cmovael %eax, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $0, %eax
				; X86-NEXT: sbbl %eax, %eax
				; X86-NEXT: notl %eax
				; X86-NEXT: orl %ecx, %eax
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmovbl %edx, %ecx
				; X86-NEXT: andl %edx, %esi
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: orl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: cmovel %ecx, %esi
				; X86-NEXT: cmovnel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmoval %eax, %ecx
				; X86-NEXT: cmpl $-1, %esi
				; X86-NEXT: movl $0, %edx
				; X86-NEXT: cmovnel %edx, %ecx
				; X86-NEXT: testl %edi, %edi
				; X86-NEXT: movl $-1, %edx
				; X86-NEXT: cmovsl %edx, %esi
				; X86-NEXT: movl $0, %edx
				; X86-NEXT: cmovsl %edx, %eax
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Folded Reload
				; X86-NEXT: cmpl $-1, %edi
				; X86-NEXT: cmovel %ecx, %eax
				; X86-NEXT: cmovnel %esi, %edi
				; X86-NEXT: shldl $31, %eax, %edi
				; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovael %ecx, %eax
				; X86-NEXT: movl %eax, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
				; X86-NEXT: cmpl $1, %esi
				; X86-NEXT: movl $0, %eax
				; X86-NEXT: sbbl %eax, %eax
				; X86-NEXT: notl %eax
				; X86-NEXT: orl %ecx, %eax
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmovbl %esi, %ecx
				; X86-NEXT: andl %esi, %ebx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: orl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: cmovel %ecx, %ebx
				; X86-NEXT: cmovnel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmoval %eax, %ecx
				; X86-NEXT: cmpl $-1, %ebx
				; X86-NEXT: movl $0, %edi
				; X86-NEXT: cmovnel %edi, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: movl $-1, %edx
				; X86-NEXT: cmovsl %edx, %ebx
				; X86-NEXT: cmovsl %edi, %eax
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
				; X86-NEXT: cmpl $-1, %esi
				; X86-NEXT: cmovel %ecx, %eax
				; X86-NEXT: cmovnel %ebx, %esi
				; X86-NEXT: shldl $31, %eax, %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: cmovael %edx, %eax
				; X86-NEXT: movl $-1, %ebx
				; X86-NEXT: movl %eax, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $0, %eax
				; X86-NEXT: sbbl %eax, %eax
				; X86-NEXT: notl %eax
				; X86-NEXT: orl %ecx, %eax
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmovbl %edx, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
				; X86-NEXT: andl %edx, %edi
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: orl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: cmovel %ecx, %edi
				; X86-NEXT: cmovnel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmoval %eax, %ecx
				; X86-NEXT: cmpl $-1, %edi
				; X86-NEXT: movl $0, %edx
				; X86-NEXT: cmovnel %edx, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: cmovsl %ebx, %edi
				; X86-NEXT: cmovsl %edx, %eax
				; X86-NEXT: andl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Folded Reload
				; X86-NEXT: cmpl $-1, %esi
				; X86-NEXT: cmovel %ecx, %eax
				; X86-NEXT: cmovnel %edi, %esi
				; X86-NEXT: shldl $31, %eax, %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: cmovael %ebx, %eax
				; X86-NEXT: movl $-1, %esi
				; X86-NEXT: movl %eax, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $0, %eax
				; X86-NEXT: sbbl %eax, %eax
				; X86-NEXT: notl %eax
				; X86-NEXT: orl %ecx, %eax
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmovbl %edx, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
				; X86-NEXT: andl %edx, %ebx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: orl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload
				; X86-NEXT: cmovel %ecx, %ebx
				; X86-NEXT: cmovnel {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Folded Reload
				; X86-NEXT: testl %eax, %eax
				; X86-NEXT: movl $0, %ecx
				; X86-NEXT: cmoval %eax, %ecx
				; X86-NEXT: cmpl $-1, %ebx
				; X86-NEXT: movl $0, %edi
				; X86-NEXT: cmovnel %edi, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: cmovsl %esi, %ebx
				; X86-NEXT: movl %ebx, %esi
				; X86-NEXT: cmovsl %edi, %eax
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
				; X86-NEXT: andl %edx, %ebx
				; X86-NEXT: cmpl $-1, %ebx
				; X86-NEXT: cmovel %ecx, %eax
				; X86-NEXT: cmovnel %esi, %ebx
				; X86-NEXT: shldl $31, %eax, %ebx
				; X86-NEXT: movl 8(%ebp), %eax
				; X86-NEXT: movl %ebx, 12(%eax)
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: movl %ecx, 8(%eax)
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: movl %ecx, 4(%eax)
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: movl %ecx, (%eax)
				; X86-NEXT: leal -12(%ebp), %esp
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl $4
				%tmp = call <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %x, <4 x i32> %y, i32 31)
				ret <4 x i32> %tmp
				}
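
For readers checking the expected values behind the `vec` test above, here is a minimal reference model of a signed, saturating fixed-point division. It is a sketch, not code from this patch: the name `sdiv_fix_sat` and the `bits` parameter are hypothetical. It assumes the quotient is rounded toward negative infinity, which is what the X64 checks above appear to do (one is subtracted from the `__divti3` result when the `__modti3` remainder is nonzero and the operand signs differ). It also assumes that a zero divisor is undefined for the intrinsic, so the model simply raises.

```python
def sdiv_fix_sat(x: int, y: int, scale: int, bits: int) -> int:
    """Hypothetical reference model of signed saturating fixed-point division.

    x and y are the raw integer representations of two fixed-point values
    that share the same scale (number of fractional bits); bits is the width
    of the integer type. This is an illustration only, not LLVM code.
    """
    if y == 0:
        # Assumption: a zero divisor is undefined for the intrinsic.
        raise ZeroDivisionError("zero divisor")

    lo = -(1 << (bits - 1))      # smallest representable value
    hi = (1 << (bits - 1)) - 1   # largest representable value

    # Scale the dividend up first so the quotient keeps `scale` fractional
    # bits. Python integers are unbounded, so no widening step is needed.
    # Python's // rounds toward negative infinity.
    quotient = (x << scale) // y

    # Saturate to the range of the result type.
    return max(lo, min(hi, quotient))


# 0.25 / 0.5 in a 32-bit type with scale 31 gives 0.5.
print(sdiv_fix_sat(1 << 29, 1 << 30, 31, 32) == 1 << 30)
# 0.5 / 0.5 would be 1.0, which is not representable, so it saturates.
print(sdiv_fix_sat(1 << 30, 1 << 30, 31, 32) == (1 << 31) - 1)
```

The widening that Python does implicitly is the reason the generated code for 32-bit elements with scale 31 falls back to 128-bit library calls: the shifted dividend no longer fits in 64 bits once the extra bit reserved for saturation is included.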

llvm/test/CodeGen/X86/udiv_fix_sat.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=X64
				; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefix=X86

				declare i4 @llvm.udiv.fix.sat.i4 (i4, i4, i32)
				declare i15 @llvm.udiv.fix.sat.i15 (i15, i15, i32)
				declare i16 @llvm.udiv.fix.sat.i16 (i16, i16, i32)
				declare i18 @llvm.udiv.fix.sat.i18 (i18, i18, i32)
				declare i64 @llvm.udiv.fix.sat.i64 (i64, i64, i32)
				declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32>, <4 x i32>, i32)

				define i16 @func(i16 %x, i16 %y) nounwind {
				; X64-LABEL: func:
				; X64: # %bb.0:
				; X64-NEXT: movzwl %si, %ecx
				; X64-NEXT: movzwl %di, %eax
				; X64-NEXT: shll $8, %eax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divl %ecx
				; X64-NEXT: cmpl $131071, %eax # imm = 0x1FFFF
				; X64-NEXT: movl $131071, %ecx # imm = 0x1FFFF
				; X64-NEXT: cmovael %ecx, %eax
				; X64-NEXT: shrl %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func:
				; X86: # %bb.0:
				; X86-NEXT: movzwl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movzwl %ax, %eax
				; X86-NEXT: shll $8, %eax
				; X86-NEXT: xorl %edx, %edx
				; X86-NEXT: divl %ecx
				; X86-NEXT: cmpl $131071, %eax # imm = 0x1FFFF
				; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF
				; X86-NEXT: cmovael %ecx, %eax
				; X86-NEXT: shrl %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: retl
				%tmp = call i16 @llvm.udiv.fix.sat.i16(i16 %x, i16 %y, i32 7)
				ret i16 %tmp
				}

				define i16 @func2(i8 %x, i8 %y) nounwind {
				; X64-LABEL: func2:
				; X64: # %bb.0:
				; X64-NEXT: movsbl %dil, %eax
				; X64-NEXT: andl $32767, %eax # imm = 0x7FFF
				; X64-NEXT: movsbl %sil, %ecx
				; X64-NEXT: andl $32767, %ecx # imm = 0x7FFF
				; X64-NEXT: shll $14, %eax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divl %ecx
				; X64-NEXT: cmpl $32767, %eax # imm = 0x7FFF
				; X64-NEXT: movl $32767, %ecx # imm = 0x7FFF
				; X64-NEXT: cmovbl %eax, %ecx
				; X64-NEXT: addl %ecx, %ecx
				; X64-NEXT: movswl %cx, %eax
				; X64-NEXT: shrl %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func2:
				; X86: # %bb.0:
				; X86-NEXT: movsbl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: andl $32767, %ecx # imm = 0x7FFF
				; X86-NEXT: movsbl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: andl $32767, %eax # imm = 0x7FFF
				; X86-NEXT: shll $14, %eax
				; X86-NEXT: xorl %edx, %edx
				; X86-NEXT: divl %ecx
				; X86-NEXT: cmpl $32767, %eax # imm = 0x7FFF
				; X86-NEXT: movl $32767, %ecx # imm = 0x7FFF
				; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: movswl %cx, %eax
				; X86-NEXT: shrl %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: retl
				%x2 = sext i8 %x to i15
				%y2 = sext i8 %y to i15
				%tmp = call i15 @llvm.udiv.fix.sat.i15(i15 %x2, i15 %y2, i32 14)
				%tmp2 = sext i15 %tmp to i16
				ret i16 %tmp2
				}

				define i16 @func3(i15 %x, i8 %y) nounwind {
				; X64-LABEL: func3:
				; X64: # %bb.0:
				; X64-NEXT: # kill: def $edi killed $edi def $rdi
				; X64-NEXT: leal (%rdi,%rdi), %eax
				; X64-NEXT: movzbl %sil, %ecx
				; X64-NEXT: shll $4, %ecx
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divw %cx
				; X64-NEXT: # kill: def $ax killed $ax def $eax
				; X64-NEXT: movzwl %ax, %ecx
				; X64-NEXT: cmpl $32767, %ecx # imm = 0x7FFF
				; X64-NEXT: movl $32767, %ecx # imm = 0x7FFF
				; X64-NEXT: cmovbl %eax, %ecx
				; X64-NEXT: addl %ecx, %ecx
				; X64-NEXT: movswl %cx, %eax
				; X64-NEXT: shrl %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func3:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: addl %eax, %eax
				; X86-NEXT: movzbl %cl, %ecx
				; X86-NEXT: shll $4, %ecx
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: xorl %edx, %edx
				; X86-NEXT: divw %cx
				; X86-NEXT: # kill: def $ax killed $ax def $eax
				; X86-NEXT: movzwl %ax, %ecx
				; X86-NEXT: cmpl $32767, %ecx # imm = 0x7FFF
				; X86-NEXT: movl $32767, %ecx # imm = 0x7FFF
				; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: movswl %cx, %eax
				; X86-NEXT: shrl %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: retl
				%y2 = sext i8 %y to i15
				%y3 = shl i15 %y2, 7
				%tmp = call i15 @llvm.udiv.fix.sat.i15(i15 %x, i15 %y3, i32 4)
				%tmp2 = sext i15 %tmp to i16
				ret i16 %tmp2
				}

				define i4 @func4(i4 %x, i4 %y) nounwind {
				; X64-LABEL: func4:
				; X64: # %bb.0:
				; X64-NEXT: andb $15, %sil
				; X64-NEXT: andb $15, %dil
				; X64-NEXT: shlb $2, %dil
				; X64-NEXT: movzbl %dil, %eax
				; X64-NEXT: divb %sil
				; X64-NEXT: movzbl %al, %ecx
				; X64-NEXT: cmpb $15, %cl
				; X64-NEXT: movl $15, %eax
				; X64-NEXT: cmovbl %ecx, %eax
				; X64-NEXT: # kill: def $al killed $al killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func4:
				; X86: # %bb.0:
				; X86-NEXT: movb {{[0-9]+}}(%esp), %cl
				; X86-NEXT: andb $15, %cl
				; X86-NEXT: movb {{[0-9]+}}(%esp), %al
				; X86-NEXT: andb $15, %al
				; X86-NEXT: shlb $2, %al
				; X86-NEXT: movzbl %al, %eax
				; X86-NEXT: divb %cl
				; X86-NEXT: movzbl %al, %ecx
				; X86-NEXT: cmpb $15, %al
				; X86-NEXT: movl $15, %eax
				; X86-NEXT: cmovbl %ecx, %eax
				; X86-NEXT: # kill: def $al killed $al killed $eax
				; X86-NEXT: retl
				%tmp = call i4 @llvm.udiv.fix.sat.i4(i4 %x, i4 %y, i32 2)
				ret i4 %tmp
				}

				define i64 @func5(i64 %x, i64 %y) nounwind {
				; X64-LABEL: func5:
				; X64: # %bb.0:
				; X64-NEXT: pushq %rbx
				; X64-NEXT: movq %rsi, %rdx
				; X64-NEXT: leaq (%rdi,%rdi), %rsi
				; X64-NEXT: shrq $33, %rsi
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: shrq $32, %rax
				; X64-NEXT: andl $-2147483648, %eax # imm = 0x80000000
				; X64-NEXT: orq %rax, %rsi
				; X64-NEXT: shlq $32, %rdi
				; X64-NEXT: xorl %ebx, %ebx
				; X64-NEXT: xorl %ecx, %ecx
				; X64-NEXT: callq __udivti3
				; X64-NEXT: cmpq $-1, %rax
				; X64-NEXT: movq $-1, %rcx
				; X64-NEXT: cmovbq %rax, %rcx
				; X64-NEXT: cmpq $1, %rdx
				; X64-NEXT: movl $1, %esi
				; X64-NEXT: cmovbq %rdx, %rsi
				; X64-NEXT: sbbq %rbx, %rbx
				; X64-NEXT: notq %rbx
				; X64-NEXT: orq %rax, %rbx
				; X64-NEXT: cmpq $1, %rdx
				; X64-NEXT: cmoveq %rcx, %rbx
				; X64-NEXT: shrdq $1, %rsi, %rbx
				; X64-NEXT: movq %rbx, %rax
				; X64-NEXT: popq %rbx
				; X64-NEXT: retq
				;
				; X86-LABEL: func5:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
				; X86-NEXT: pushl %esi
				; X86-NEXT: andl $-8, %esp
				; X86-NEXT: subl $24, %esp
				; X86-NEXT: movl 8(%ebp), %eax
				; X86-NEXT: movl 12(%ebp), %ecx
				; X86-NEXT: movl %ecx, %edx
				; X86-NEXT: shrl %edx
				; X86-NEXT: shldl $31, %eax, %ecx
				; X86-NEXT: shll $31, %eax
				; X86-NEXT: movl %esp, %esi
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl 20(%ebp)
				; X86-NEXT: pushl 16(%ebp)
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %eax
				; X86-NEXT: pushl %esi
				; X86-NEXT: calll __udivti3
				; X86-NEXT: addl $32, %esp
				; X86-NEXT: movl (%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: movl $-1, %esi
				; X86-NEXT: cmovbl %eax, %esi
				; X86-NEXT: cmpl $-1, %edx
				; X86-NEXT: cmovel %edx, %eax
				; X86-NEXT: cmovel %esi, %eax
				; X86-NEXT: cmovael %ecx, %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: orl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: cmovnel %ecx, %edx
				; X86-NEXT: cmovnel %ecx, %eax
				; X86-NEXT: leal -4(%ebp), %esp
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl
				%tmp = call i64 @llvm.udiv.fix.sat.i64(i64 %x, i64 %y, i32 31)
				ret i64 %tmp
				}

				define i18 @func6(i16 %x, i16 %y) nounwind {
				; X64-LABEL: func6:
				; X64: # %bb.0:
				; X64-NEXT: movswl %di, %eax
				; X64-NEXT: andl $262143, %eax # imm = 0x3FFFF
				; X64-NEXT: movswl %si, %ecx
				; X64-NEXT: andl $262143, %ecx # imm = 0x3FFFF
				; X64-NEXT: shll $7, %eax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divl %ecx
				; X64-NEXT: cmpl $262143, %eax # imm = 0x3FFFF
				; X64-NEXT: movl $262143, %ecx # imm = 0x3FFFF
				; X64-NEXT: cmovael %ecx, %eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func6:
				; X86: # %bb.0:
				; X86-NEXT: movswl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: andl $262143, %ecx # imm = 0x3FFFF
				; X86-NEXT: movswl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: andl $262143, %eax # imm = 0x3FFFF
				; X86-NEXT: shll $7, %eax
				; X86-NEXT: xorl %edx, %edx
				; X86-NEXT: divl %ecx
				; X86-NEXT: cmpl $262143, %eax # imm = 0x3FFFF
				; X86-NEXT: movl $262143, %ecx # imm = 0x3FFFF
				; X86-NEXT: cmovael %ecx, %eax
				; X86-NEXT: retl
				%x2 = sext i16 %x to i18
				%y2 = sext i16 %y to i18
				%tmp = call i18 @llvm.udiv.fix.sat.i18(i18 %x2, i18 %y2, i32 7)
				ret i18 %tmp
				}

				define i16 @func7(i16 %x, i16 %y) nounwind {
				; X64-LABEL: func7:
				; X64: # %bb.0:
				; X64-NEXT: movzwl %si, %ecx
				; X64-NEXT: movzwl %di, %eax
				; X64-NEXT: addl %eax, %eax
				; X64-NEXT: shlq $16, %rax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divq %rcx
				; X64-NEXT: cmpq $131071, %rax # imm = 0x1FFFF
				; X64-NEXT: movl $131071, %ecx # imm = 0x1FFFF
				; X64-NEXT: cmovaeq %rcx, %rax
				; X64-NEXT: shrl %eax
				; X64-NEXT: # kill: def $ax killed $ax killed $rax
				; X64-NEXT: retq
				;
				; X86-LABEL: func7:
				; X86: # %bb.0:
				; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movzwl %cx, %ecx
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: movl %ecx, %edx
				; X86-NEXT: shrl $16, %edx
				; X86-NEXT: shll $16, %ecx
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl %eax
				; X86-NEXT: pushl %edx
				; X86-NEXT: pushl %ecx
				; X86-NEXT: calll __udivdi3
				; X86-NEXT: addl $16, %esp
				; X86-NEXT: cmpl $131071, %eax # imm = 0x1FFFF
				; X86-NEXT: movl $131071, %ecx # imm = 0x1FFFF
				; X86-NEXT: cmovael %ecx, %eax
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: cmovnel %ecx, %eax
				; X86-NEXT: shrl %eax
				; X86-NEXT: # kill: def $ax killed $ax killed $eax
				; X86-NEXT: retl
				%tmp = call i16 @llvm.udiv.fix.sat.i16(i16 %x, i16 %y, i32 16)
				ret i16 %tmp
				}

				define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
				; X64-LABEL: vec:
				; X64: # %bb.0:
				; X64-NEXT: pxor %xmm8, %xmm8
				; X64-NEXT: movdqa %xmm1, %xmm2
				; X64-NEXT: punpckhdq {{.*#+}} xmm2 = xmm2[2],xmm8[2],xmm2[3],xmm8[3]
				; X64-NEXT: movq %xmm2, %rcx
				; X64-NEXT: movdqa %xmm0, %xmm4
				; X64-NEXT: punpckhdq {{.*#+}} xmm4 = xmm4[2],xmm8[2],xmm4[3],xmm8[3]
				; X64-NEXT: paddq %xmm4, %xmm4
				; X64-NEXT: psllq $31, %xmm4
				; X64-NEXT: movq %xmm4, %rax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divq %rcx
				; X64-NEXT: movq %rax, %xmm7
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm2[2,3,0,1]
				; X64-NEXT: movq %xmm2, %rcx
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm4[2,3,0,1]
				; X64-NEXT: movq %xmm2, %rax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divq %rcx
				; X64-NEXT: movq %rax, %xmm2
				; X64-NEXT: punpcklqdq {{.*#+}} xmm7 = xmm7[0],xmm2[0]
				; X64-NEXT: movdqa {{.*#+}} xmm4 = [9223372039002259456,9223372039002259456]
				; X64-NEXT: movdqa %xmm7, %xmm2
				; X64-NEXT: pxor %xmm4, %xmm2
				; X64-NEXT: movdqa {{.*#+}} xmm9 = [9223372043297226751,9223372043297226751]
				; X64-NEXT: movdqa %xmm9, %xmm6
				; X64-NEXT: pcmpgtd %xmm2, %xmm6
				; X64-NEXT: pshufd {{.*#+}} xmm3 = xmm6[0,0,2,2]
				; X64-NEXT: pcmpeqd %xmm9, %xmm2
				; X64-NEXT: pshufd {{.*#+}} xmm5 = xmm2[1,1,3,3]
				; X64-NEXT: pand %xmm3, %xmm5
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm6[1,1,3,3]
				; X64-NEXT: por %xmm5, %xmm2
				; X64-NEXT: movdqa {{.*#+}} xmm6 = [8589934591,8589934591]
				; X64-NEXT: pand %xmm2, %xmm7
				; X64-NEXT: pandn %xmm6, %xmm2
				; X64-NEXT: por %xmm7, %xmm2
				; X64-NEXT: psrlq $1, %xmm2
				; X64-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm8[0],xmm1[1],xmm8[1]
				; X64-NEXT: movq %xmm1, %rcx
				; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm8[0],xmm0[1],xmm8[1]
				; X64-NEXT: paddq %xmm0, %xmm0
				; X64-NEXT: psllq $31, %xmm0
				; X64-NEXT: movq %xmm0, %rax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divq %rcx
				; X64-NEXT: movq %rax, %xmm3
				; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[2,3,0,1]
				; X64-NEXT: movq %xmm1, %rcx
				; X64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[2,3,0,1]
				; X64-NEXT: movq %xmm0, %rax
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: divq %rcx
				; X64-NEXT: movq %rax, %xmm0
				; X64-NEXT: punpcklqdq {{.*#+}} xmm3 = xmm3[0],xmm0[0]
				; X64-NEXT: pxor %xmm3, %xmm4
				; X64-NEXT: movdqa %xmm9, %xmm0
				; X64-NEXT: pcmpgtd %xmm4, %xmm0
				; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm0[0,0,2,2]
				; X64-NEXT: pcmpeqd %xmm9, %xmm4
				; X64-NEXT: pshufd {{.*#+}} xmm4 = xmm4[1,1,3,3]
				; X64-NEXT: pand %xmm1, %xmm4
				; X64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,3,3]
				; X64-NEXT: por %xmm4, %xmm0
				; X64-NEXT: pand %xmm0, %xmm3
				; X64-NEXT: pandn %xmm6, %xmm0
				; X64-NEXT: por %xmm3, %xmm0
				; X64-NEXT: psrlq $1, %xmm0
				; X64-NEXT: shufps {{.*#+}} xmm0 = xmm0[0,2],xmm2[0,2]
				; X64-NEXT: retq
				;
				; X86-LABEL: vec:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: subl $16, %esp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: xorl %eax, %eax
				; X86-NEXT: addl %ecx, %ecx
				; X86-NEXT: setb %al
				; X86-NEXT: shldl $31, %ecx, %eax
				; X86-NEXT: shll $31, %ecx
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl {{[0-9]+}}(%esp)
				; X86-NEXT: pushl %eax
				; X86-NEXT: pushl %ecx
				; X86-NEXT: calll __udivdi3
				; X86-NEXT: addl $16, %esp
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $0, %edi
				; X86-NEXT: sbbl %edi, %edi
				; X86-NEXT: notl %edi
				; X86-NEXT: orl %eax, %edi
				; X86-NEXT: movl %edi, %ebx
				; X86-NEXT: xorl %eax, %eax
				; X86-NEXT: addl %esi, %esi
				; X86-NEXT: setb %al
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: cmovel %ecx, %ebx
				; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $1, %ecx
				; X86-NEXT: cmovael %ecx, %edx
				; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: shldl $31, %esi, %eax
				; X86-NEXT: shll $31, %esi
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl {{[0-9]+}}(%esp)
				; X86-NEXT: pushl %eax
				; X86-NEXT: pushl %esi
				; X86-NEXT: calll __udivdi3
				; X86-NEXT: addl $16, %esp
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $1, %esi
				; X86-NEXT: cmovbl %edx, %esi
				; X86-NEXT: movl %esi, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl $0, %esi
				; X86-NEXT: sbbl %esi, %esi
				; X86-NEXT: notl %esi
				; X86-NEXT: orl %eax, %esi
				; X86-NEXT: xorl %eax, %eax
				; X86-NEXT: addl %edi, %edi
				; X86-NEXT: setb %al
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp
				; X86-NEXT: cmovel %ecx, %esi
				; X86-NEXT: shldl $31, %edi, %eax
				; X86-NEXT: shll $31, %edi
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl {{[0-9]+}}(%esp)
				; X86-NEXT: pushl %eax
				; X86-NEXT: pushl %edi
				; X86-NEXT: calll __udivdi3
				; X86-NEXT: addl $16, %esp
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: movl $-1, %ebx
				; X86-NEXT: cmovbl %eax, %ebx
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $0, %edi
				; X86-NEXT: sbbl %edi, %edi
				; X86-NEXT: notl %edi
				; X86-NEXT: orl %eax, %edi
				; X86-NEXT: xorl %ecx, %ecx
				; X86-NEXT: addl %ebp, %ebp
				; X86-NEXT: setb %cl
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl %edx, %eax
				; X86-NEXT: movl $1, %edx
				; X86-NEXT: cmovael %edx, %eax
				; X86-NEXT: movl %eax, (%esp) # 4-byte Spill
				; X86-NEXT: cmovel %ebx, %edi
				; X86-NEXT: shldl $31, %ebp, %ecx
				; X86-NEXT: shll $31, %ebp
				; X86-NEXT: pushl $0
				; X86-NEXT: pushl {{[0-9]+}}(%esp)
				; X86-NEXT: pushl %ecx
				; X86-NEXT: pushl %ebp
				; X86-NEXT: calll __udivdi3
				; X86-NEXT: addl $16, %esp
				; X86-NEXT: cmpl $-1, %eax
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $1, %ebx
				; X86-NEXT: cmovbl %edx, %ebx
				; X86-NEXT: movl $0, %ebp
				; X86-NEXT: sbbl %ebp, %ebp
				; X86-NEXT: notl %ebp
				; X86-NEXT: orl %eax, %ebp
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: cmovel %ecx, %ebp
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: shrdl $1, %eax, %ecx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %eax # 4-byte Reload
				; X86-NEXT: shrdl $1, %eax, %esi
				; X86-NEXT: movl (%esp), %eax # 4-byte Reload
				; X86-NEXT: shrdl $1, %eax, %edi
				; X86-NEXT: shrdl $1, %ebx, %ebp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %ebp, 12(%eax)
				; X86-NEXT: movl %edi, 8(%eax)
				; X86-NEXT: movl %esi, 4(%eax)
				; X86-NEXT: movl %ecx, (%eax)
				; X86-NEXT: addl $16, %esp
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl $4
				%tmp = call <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %x, <4 x i32> %y, i32 31)
				ret <4 x i32> %tmp
				}
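
For readers checking the expected results of the tests above by hand, the following is a minimal reference sketch of the "scaled, saturating division" that the patch summary describes for `llvm.udiv.fix.sat.*`. It is not code from the patch: the function name `udiv_fix_sat` and the explicit `bits` parameter are hypothetical, and truncating the quotient is an assumption made here for simplicity.

```python
def udiv_fix_sat(x: int, y: int, scale: int, bits: int) -> int:
    """Illustrative model of an unsigned saturating fixed-point division.

    x and y are treated as unsigned `bits`-wide integers that carry
    `scale` fractional bits. This is a sketch, not LLVM's implementation.
    """
    mask = (1 << bits) - 1
    x &= mask
    y &= mask
    if y == 0:
        # Assumption: the sketch rejects a zero divisor outright.
        raise ZeroDivisionError("divisor must be non-zero")
    # Pre-shift the dividend so the quotient keeps `scale` fractional bits.
    # Python integers are unbounded, so no wider type is needed here.
    quotient = (x << scale) // y
    # Saturate to the largest unsigned value of the given width.
    return min(quotient, mask)


# Same type and scale as func4 (i4, scale 2): 0.75 / 0.5 = 1.5, encoded as 6.
print(udiv_fix_sat(3, 2, 2, 4))
# Same type and scale as func4, but the quotient overflows and saturates to 15.
print(udiv_fix_sat(15, 1, 2, 4))
```

Because the pre-shifted dividend needs up to `bits + scale` bits, a lowering for fixed-width hardware has to widen the operation, which is consistent with the `__udivti3` call in `func5` and the `__udivdi3` calls in `func7` and `vec`.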


Diff 246158
