This is an archive of the discontinued LLVM Phabricator instance.

Differential D55720

[Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
ClosedPublic

Authored by leonardchan on Dec 14 2018, 2:19 PM.

Details

Reviewers

ebevhan
bjope
craig.topper
RKSimon

Commits

rZORG1579f571b3d1: [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
rG1579f571b3d1: [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
rG0bada7ce6c12: [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
rL361289: [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic

Summary

Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.
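
As a rough illustration of the semantics described in the summary, the following C++ sketch models the operation on plain integers. It is not code from the patch: the name `smul_fix_sat_model` and the explicit `bits` parameter are assumptions made for this example, and it rounds toward negative infinity even though the intrinsic leaves the rounding direction unspecified.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Reference model (not LLVM code) of a signed saturating fixed-point multiply.
// `bits` is the operand width (2..32) and `scale` the number of fractional bits.
int64_t smul_fix_sat_model(int64_t a, int64_t b, unsigned scale, unsigned bits) {
  assert(bits >= 2 && bits <= 32 && scale < bits);
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;  // e.g. 7 for i4
  const int64_t lo = -hi - 1;                         // e.g. -8 for i4
  assert(a >= lo && a <= hi && b >= lo && b <= hi);

  // Both operands fit in 32 bits, so the full product fits in 64 bits.
  const int64_t product = a * b;

  // Drop the extra `scale` fractional bits, rounding toward negative infinity.
  // Written as a floor division so it does not rely on signed right shift.
  const int64_t divisor = int64_t{1} << scale;
  int64_t scaled = product / divisor;
  if (product % divisor != 0 && product < 0)
    --scaled;

  // Saturate to the range of the original operand width.
  return std::max(lo, std::min(scaled, hi));
}
```

With `bits = 4`, `smul_fix_sat_model(3, 2, 1, 4)` yields 3 (1.5 x 1 = 1.5), and `smul_fix_sat_model(2, 4, 0, 4)` clamps the product 8 to 7.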

Diff Detail

Repository: rL LLVM

Event Timeline

leonardchan created this revision. Dec 14 2018, 2:19 PM

Herald added a subscriber: hiraditya. Dec 14 2018, 2:19 PM

Nothing sticks out to me, so I think it looks good. Hard to tell if there are any sneaky edge cases in the lowering steps, though.

Maybe you should rebase this on top of the unsigned patch, since they're touching all the same places. Or are you waiting for it to land?

In D55720#1336355, @ebevhan wrote:

Nothing sticks out to me, so I think it looks good. Hard to tell if there are any sneaky edge cases in the lowering steps, though.

Maybe you should rebase this on top of the unsigned patch, since they're touching all the same places. Or are you waiting for it to land?

Yeah, I figure it would be better to submit these as separate patches since they're still technically independent of each other, which ideally makes them easier to review.

@bjope @craig.topper @RKSimon Any comments on this patch?

RKSimon added inline comments. Jan 9 2019, 2:05 PM

llvm/test/CodeGen/X86/smul_fix_sat.ll
44 ↗	(On Diff #178285)	nounwind

leonardchan updated this revision to Diff 182085. Jan 16 2019, 9:52 AM

leonardchan marked an inline comment as done.

Should https://reviews.llvm.org/D56987 be a parent for this? Then you'd need to rebase getExpandedFixedPointMultiplication since that has changed into converting into MUL when scale is zero (that is not valid for saturation).
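
To make the point about scale zero concrete, here is a small C++ sketch comparing a plain wrapping 8-bit multiply with a saturating one. The helper names are made up for this example and nothing here is taken from the patch; it only shows why substituting a plain MUL is fine for the non-saturating form at scale 0 but gives a different answer once saturation is required.

```cpp
#include <algorithm>
#include <cassert>

// Wrap an arbitrary int into the signed 8-bit range, as a plain i8 MUL would.
int wrap_to_i8(int v) {
  const int m = ((v % 256) + 256) % 256;  // reduce to 0..255
  return m >= 128 ? m - 256 : m;
}

// Plain multiplication on i8 values: the result wraps on overflow.
int mul_i8(int a, int b) {
  assert(a >= -128 && a <= 127 && b >= -128 && b <= 127);
  return wrap_to_i8(a * b);
}

// Saturating multiplication on i8 values with scale 0: the result is clamped.
int smul_fix_sat_scale0_i8(int a, int b) {
  assert(a >= -128 && a <= 127 && b >= -128 && b <= 127);
  return std::max(-128, std::min(a * b, 127));
}
```

For operands whose product fits in 8 bits the two agree, but `mul_i8(100, 2)` wraps to -56 while `smul_fix_sat_scale0_i8(100, 2)` clamps to 127.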

leonardchan added a parent revision: D56987: [Intrinsic] Expand SMULFIX to MUL, MULH[US], or [US]MUL_LOHI on vector arguments. Jan 24 2019, 8:16 AM

leonardchan removed a parent revision: D56987: [Intrinsic] Expand SMULFIX to MUL, MULH[US], or [US]MUL_LOHI on vector arguments.

leonardchan added a child revision: D56987: [Intrinsic] Expand SMULFIX to MUL, MULH[US], or [US]MUL_LOHI on vector arguments.

In D55720#1364944, @bjope wrote:

Should https://reviews.llvm.org/D56987 be a parent for this? Then you'd need to rebase getExpandedFixedPointMultiplication since that has changed into converting into MUL when scale is zero (that is not valid for saturation).

Rebased

RKSimon added inline comments. Jan 30 2019, 1:08 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5424 ↗	(On Diff #183329)	(style) Do an early out to reduce indentation: `if (!Saturating) return Result;`

Updated and rebased

Please rebase after D55625 lands

Herald added a project: Restricted Project. Feb 1 2019, 5:42 AM

Updated and rebased

RKSimon added inline comments. Feb 4 2019, 10:39 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5447 ↗	(On Diff #185084)	You've changed the logic to let non-vector cases fall through, which leads to UNDEFs for scale == 0 cases.

leonardchan updated this revision to Diff 185101. Feb 4 2019, 11:18 AM

leonardchan marked an inline comment as done.

leonardchan added a parent revision: D57836: [Intrinsic] Unsigned Fixed Point Saturation Multiplication Intrinsic. Feb 6 2019, 12:26 PM

bjope added inline comments. Feb 7 2019, 12:02 AM

llvm/lib/CodeGen/TargetLoweringBase.cpp
627 ↗	(On Diff #185101)	I'm not sure how to do this when overriding for a specific target. In our case we want it to be legal, but only when the scale is 15 (and VT is i16 or i24) or the scale is 31 (and VT is i32 or i40). Is there some easy solution for that? Setting it to legal/custom for any scale might be seen as an indication for optimizers that it is OK to introduce these operations for any scale. This is however a general comment, also for the already pushed non-saturating versions. So it isn't anything that you need to deal with in this patch. But we might need a better solution in the long term.

ebevhan added inline comments. Feb 7 2019, 12:33 AM

llvm/lib/CodeGen/TargetLoweringBase.cpp
627 ↗	(On Diff #185101)	That's what the `isSupportedFixedPointOperation` hook in TargetLowering is for.
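
The following self-contained C++ mock sketches how such a hook could gate the legalize action by scale, using the example policy from the question (scale 15 for i16/i24, scale 31 for i32/i40). The enums and free functions are stand-ins invented for this illustration; in LLVM the hook is a member of TargetLowering and takes an `EVT` rather than a bit count, so treat this as a simplified model and not the real interface.

```cpp
#include <cassert>

// Stand-ins for the LLVM enums; the names mirror the diff but this is a mock.
enum class Op { SMULFIX, SMULFIXSAT, UMULFIX };
enum class LegalizeAction { Legal, Custom, Expand };

// Hypothetical target policy: the operation is supported only for
// scale 15 on 16- or 24-bit values and scale 31 on 32- or 40-bit values.
bool isSupportedFixedPointOperation(Op /*op*/, unsigned bits, unsigned scale) {
  if (scale == 15)
    return bits == 16 || bits == 24;
  if (scale == 31)
    return bits == 32 || bits == 40;
  return false;
}

// Mirrors the dispatch in the TargetLowering.h hunk: use the configured
// action when the hook accepts the scale, otherwise fall back to Expand.
LegalizeAction getFixedPointOperationAction(Op op, unsigned bits, unsigned scale,
                                            LegalizeAction configured) {
  assert(scale <= bits && "scale cannot exceed the bit width");
  return isSupportedFixedPointOperation(op, bits, scale)
             ? configured
             : LegalizeAction::Expand;
}
```

Under this policy a 16-bit SMULFIXSAT with scale 15 keeps its configured action, while the same operation with scale 7 is expanded, so marking the opcode legal does not by itself claim support for every scale.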

bjope added inline comments. Feb 7 2019, 12:40 AM

llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	The expansion is quite complicated (the splitting in four parts and detecting overflow etc). Isn't there a risk that X86 is a typical target that will try to find more optimal solutions and maybe also make SMULFIXSAT legal? Then this test case might not really verify the expand code any longer? On the other hand, these test cases are just gibberish to me anyway. I can't tell from looking at the checks that DAGTypeLegalizer::ExpandIntRes_MULFIX is doing the right thing. And it would not really help if using another target. Are there perhaps other ways to test DAGTypeLegalizer, such as unit tests? One thing that probably can be done quite easily is to add a bunch of tests using constant operands, verifying that DAGCombiner will constant fold to the expected result after having expanded into legal operations (somehow making sure that DAGCombiner does not constant fold the SMULFIXSAT before it has been expanded, I guess someone will add such DAGCombines sooner or later). That way you might be able to get coverage for all paths through DAGTypeLegalizer::ExpandIntRes_MULFIX. Maybe this test should be in a separate test file.

bjope added inline comments. Feb 7 2019, 1:13 AM

llvm/lib/CodeGen/TargetLoweringBase.cpp
627 ↗	(On Diff #185101)	Ah, yes! And that is updated in this patch. Just me being blind (in combination with some amnesia).

RKSimon added inline comments. Feb 7 2019, 4:54 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5420 ↗	(On Diff #185101)	Why drop the assert? Why not just add ISD::SMULFIXSAT tests?
5428 ↗	(On Diff #185101)	I think you need something like this here (please double check my logic):
	if (VT.isVector() && !isOperationLegalOrCustom(ISD::SMULO, VT) &&
	    !(!Saturating && isOperationLegalOrCustom(ISD::MUL, VT)))
	  return SDValue(); // unroll
And that will let you avoid the return SDValue() below by always defaulting to a scalar ISD::MUL/ISD::SMULO that legalization can handle.

RKSimon mentioned this in rL353546: [TargetLowering] Use ISD::FSHR in expandFixedPointMul. Feb 8 2019, 10:57 AM

RKSimon mentioned this in rGeb6a47a46274: [TargetLowering] Use ISD::FSHR in expandFixedPointMul.

RKSimon added inline comments. Feb 8 2019, 11:05 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5428 ↗	(On Diff #185101)	I've committed rL353546 which /should/ mean that the scale==0 case is now safe to drop through.

leonardchan updated this revision to Diff 186055. Feb 8 2019, 3:16 PM

leonardchan marked 6 inline comments as done.

Fixed mistake related to saturation during expansion. When we promote the operand and result type widths, this also changes the saturation width and affects the min/max values we compare against. This is easily solved by also shifting one of the operands after extension.
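
One way to read this fix is sketched below in C++ on plain integers. It is an illustrative model and not the code from the patch: the function names are invented, values are held in `int64_t`, and rounding is toward negative infinity. The idea is that shifting one operand left by the width difference moves the value into the top bits of the wide type, so the wide clamp coincides with the narrow limits once the result is shifted back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Floor division by 2^shift, written without relying on signed right shift.
static int64_t floor_shr(int64_t v, unsigned shift) {
  const int64_t d = int64_t{1} << shift;
  int64_t q = v / d;
  if (v % d != 0 && v < 0)
    --q;
  return q;
}

// Model of a saturating fixed-point multiply on `bits`-wide signed values.
static int64_t sat_mul_fix(int64_t a, int64_t b, unsigned scale, unsigned bits) {
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;
  const int64_t lo = -hi - 1;
  return std::max(lo, std::min(floor_shr(a * b, scale), hi));
}

// Naive promotion: run the operation in the wide type and keep the result.
// The clamp now uses the wide type's limits, which is the mistake described.
int64_t promoted_naive(int64_t a, int64_t b, unsigned scale, unsigned wide) {
  return sat_mul_fix(a, b, scale, wide);
}

// Promotion with the extra shift: shift one operand up by the width
// difference, saturate in the wide type, then shift the result back down.
int64_t promoted_shifted(int64_t a, int64_t b, unsigned scale,
                         unsigned narrow, unsigned wide) {
  assert(narrow < wide && wide <= 32);
  const unsigned diff = wide - narrow;
  const int64_t a_shifted = a * (int64_t{1} << diff);
  return floor_shr(sat_mul_fix(a_shifted, b, scale, wide), diff);
}

// Exhaustively compare against the narrow model for i4 promoted to i8.
bool shifted_promotion_matches_i4() {
  for (int a = -8; a <= 7; ++a)
    for (int b = -8; b <= 7; ++b)
      for (unsigned scale = 0; scale < 4; ++scale)
        if (promoted_shifted(a, b, scale, 4, 8) != sat_mul_fix(a, b, scale, 4))
          return false;
  return true;
}
```

For example, multiplying the i4 values 7 and 2 at scale 0 in an i8 without the shift gives 14, which is outside the i4 range, while the shifted variant gives the expected saturated value 7.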

Oh I forgot to submit these inline comments

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5428 ↗	(On Diff #185101)	I thought we still want to allow vectors to pass to calls to `MUL` and `SMULO`? Wouldn't this scalarize when we disallow vectors even if `MUL` and `SMULO` are legal?
llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	Yeah I can see how these tests are hard to read. I wasn't aware of other ways this could be tested other than making sure the codegen is the same each time. I have my own scripts with different cases to verify the output is correct, but wasn't sure of any existing widely used method of "taking my IR, running it, and verifying the results". Testing with constant operands seems to produce better looking tests for non-saturating multiplication:

define i4 @func() {
; X64-LABEL: func:
; X64: # %bb.0:
; X64-NEXT: movb $3, %al
; X64-NEXT: retq
%tmp = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)
ret i4 %tmp
}

where we can immediately tell the result is 3, but there's still branching in the saturating case:

define i4 @func2() {
; X64-LABEL: func2:
; X64: # %bb.0:
; X64-NEXT: xorl %eax, %eax
; X64-NEXT: testb %al, %al
; X64-NEXT: movb $127, %cl
; X64-NEXT: jg .LBB1_2
; X64-NEXT: # %bb.1:
; X64-NEXT: movb $3, %cl
; X64-NEXT: .LBB1_2:
; X64-NEXT: movb $-1, %al
; X64-NEXT: negb %al
; X64-NEXT: movb $-128, %al
; X64-NEXT: jl .LBB1_4
; X64-NEXT: # %bb.3:
; X64-NEXT: movl %ecx, %eax
; X64-NEXT: .LBB1_4:
; X64-NEXT: retq
%tmp = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)
ret i4 %tmp
}

so we can't get something as straightforward as with non-saturating.

Herald added a subscriber: jdoerfert. Feb 20 2019, 5:58 PM

leonardchan updated this revision to Diff 187955. Feb 22 2019, 11:26 AM

leonardchan marked an inline comment as done.

leonardchan added inline comments.

llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	@bjope I added another test file that covers the saturation branches in ExpandIntRes_MULFIX using constant operands, although this doesn't seem to produce anything more readable than with variable operands.

bjope added inline comments. Feb 22 2019, 4:27 PM

llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	Maybe it doesn't fold due to lack of constant folding for SMUL_LOHI (at least not for x86). What a pity. I tried running the test using -mtriple=x86_64--, which at least produces code that is easier to map to the expansion. I also tried some other targets:
-mtriple=ppc32 => looks like we get some constant folding here
-mtriple=ppc64 => asserts in llvm::SelectionDAG::transferDbgValues
-mtriple=hexagon => asserts in llvm::SelectionDAG::transferDbgValues
-mtriple=systemz => asserts in llvm::SelectionDAG::transferDbgValues
-mtriple=sparc => LLVM ERROR: Cannot select: t42: i32,i32 = addcarry t41:1, Constant:i32<0>, t90:1
(FWIW, no idea if the asserts and LLVM ERROR actually are related to your patch)

*ping*

llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	Updated the test to use `-mtriple=x86_64-linux` and it looks a lot more readable.

bjope added inline comments. Mar 7 2019, 8:10 AM

llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	Have you looked at the problem with asserts in llvm::SelectionDAG::transferDbgValues? It happens when expanding smulfixsat, so something seems to be broken regarding the legalization (depending on target used).

leonardchan added a parent revision: D59119: [SelectionDAG] Check legality for ADDCARRY in expandMUL_LOHI. Mar 7 2019, 5:03 PM

leonardchan removed a parent revision: D59119: [SelectionDAG] Check legality for ADDCARRY in expandMUL_LOHI.

leonardchan added a child revision: D59119: [SelectionDAG] Check legality for ADDCARRY in expandMUL_LOHI.

leonardchan updated this revision to Diff 189803. Mar 7 2019, 5:11 PM

leonardchan marked an inline comment as done.

leonardchan added inline comments.

llvm/test/CodeGen/X86/smul_fix_sat.ll
2 ↗	(On Diff #185101)	For `addcarry`, the problem seems to be that `ISD::ADDCARRY` is not supported on some 32 bit targets. The fix for this is just a check in `expandMUL_LOHI` to see if this operation is legal (https://reviews.llvm.org/D59119). For `llvm::SelectionDAG::transferDbgValues`, this is because `expandFixedPointMul` returns an empty `SDValue()` to indicate this function failed due to some unsupported operation (most likely `ISD::SMULO`). I imagine the simplest solution for this is to just `report_fatal_error` since we do not have other operations we can use to perform saturation multiplication.

Sorry for the holdup.

@bjope D61411 addresses the LLVM_ERROR from expanding ADDCARRY, so now we can compile the test for all triples you brought up before. Also updated the tests to reflect these changes, and am still able to confirm on my end that the intrinsic produces the correct results.

leonardchan marked 2 inline comments as done. May 9 2019, 4:05 PM

leonardchan added inline comments.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5428 ↗	(On Diff #185101)	Dropped and can confirm this works for my tests

bjope added inline comments. May 10 2019, 12:07 PM

llvm/docs/LangRef.rst
13355 ↗	(On Diff #198933)	I think this part about unspecified rounding direction either needs to be explained more thoroughly somewhere, or we need to find a way to make it possible to specify it. Since the same problem already exists for smul.fix and umul.fix this shouldn't necessarily be a stopper for this patch. But I think it poses some problems that there sometimes are two correct results. Are we for example supposed to prohibit constant folding (allowing backend targets to implement a specific rounding scheme)? Then I guess we can't implement the promotion/legalization part either, at least not without specifying which rounding scheme the legalization will use. It would be weird to prohibit constant folding in the first place, while at the same time legalization into generic ISD operations might result in DAG combiner actually ending up constant folding the expression.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4448 ↗	(On Diff #198933)	Is this newline by mistake? Seems unrelated to the patch.

Ka-Ka added a subscriber: Ka-Ka. May 13 2019, 1:56 PM

leonardchan marked 4 inline comments as done. May 14 2019, 12:51 PM

leonardchan added inline comments.

llvm/docs/LangRef.rst
13355 ↗	(On Diff #198933)	Hmm. I'm starting to regret not specifying an argument for rounding when making these intrinsics. Would there be any large consequences for changing the intrinsics to accept a 4th argument for rounding?
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4448 ↗	(On Diff #198933)	Accidental newline

leonardchan updated this revision to Diff 199498. May 14 2019, 12:52 PM

leonardchan marked an inline comment as done.

bjope added inline comments. May 15 2019, 7:35 AM

llvm/docs/LangRef.rst
13355 ↗	(On Diff #198933)	Had a short discussion with @ebevhan about this (offline). Adding the 4th argument for rounding would:
- make things more clear (avoiding "unspecified")
- make things more complicated (how many rounding modes should be supported? do we need to support folding/promotion/lowering etc. for all different kinds of rounding modes? how do we verify all those modes?)
So despite my comment above, we think that the way forward is to keep the solution with "unspecified" for now (we already got it for the non-saturating intrinsics). But to avoid confusion when people are reading the LangRef and looking at the code etc., we probably want to describe what "rounding direction is unspecified" means somewhere (for example in the introduction about "Fixed Point Arithmetic Intrinsics"). Explaining things like:
- different optimizations (and legalization) in the pipeline are free to do the rounding in whatever direction they want to (but I think the accuracy of the result still should be well-defined so we need to say something about that?)
- KnownBits/ValueTracking can't assume the direction of the rounding.
13305 ↗	(On Diff #199498)	Do we need a new "chapter" for this? Maybe we can just continue the "Fixed Point Arithmetic Intrinsics" chapter here, and skip the general description below that only refers to "Fixed Point Arithmetic Intrinsics" and "Saturation Arithmetic". If needed, we can add the "(see Saturation Arithmetic)" somewhere in the semantic description of llvm.smul.fix.sat.* instead.
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2713 ↗	(On Diff #199498)	Even if rounding is unspecified, I believe this code is implementing some kind of rounding scheme. Should we perhaps say something about this in the function header? It can be of help when looking at the code in the future, both to understand what the intention was with the original algorithm and to understand the expected result when looking at test results etc. It could also help some target understand why a "legal"/"custom" lowering gives a different result compared to "expand".
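
To illustrate what an unspecified rounding direction with a bounded error can look like, here is a small C++ sketch. It is a model on plain integers, not code from the patch; the function names are invented, saturation is ignored, and the accuracy bound used is the one stated in the LangRef text of this diff (error less than 1/2^scale).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Fixed-point product rounded toward negative infinity.
int64_t mul_fix_floor(int64_t a, int64_t b, unsigned scale) {
  const int64_t d = int64_t{1} << scale;
  const int64_t p = a * b;
  int64_t q = p / d;  // truncates toward zero
  if (p % d != 0 && p < 0)
    --q;
  return q;
}

// Fixed-point product rounded toward positive infinity.
int64_t mul_fix_ceil(int64_t a, int64_t b, unsigned scale) {
  const int64_t d = int64_t{1} << scale;
  const int64_t p = a * b;
  int64_t q = p / d;  // truncates toward zero
  if (p % d != 0 && p > 0)
    ++q;
  return q;
}

// True if `result` is less than one unit (1/2^scale) away from the exact
// product. Exact value: a*b / 2^(2*scale); returned value: result / 2^scale.
// Scaling both by 2^(2*scale) gives |result * 2^scale - a*b| < 2^scale.
bool within_one_unit(int64_t a, int64_t b, unsigned scale, int64_t result) {
  assert(scale < 31);
  const int64_t d = int64_t{1} << scale;
  return std::abs(result * d - a * b) < d;
}
```

For the operands 3 and -3 at scale 1 (1.5 x -1.5 = -2.25), rounding down gives -5 and rounding up gives -4; both pass `within_one_unit`, while -6 and -3 do not.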

leonardchan updated this revision to Diff 199655. May 15 2019, 12:18 PM

leonardchan marked 5 inline comments as done.

leonardchan added inline comments.

llvm/docs/LangRef.rst
13355 ↗	(On Diff #198933)	I added more detail to the overview explaining the default expansion for multiplication and rounding to say that targets should specify their own hooks if they care about rounding and optimizations/legalizations should be performed based off that hook. Let me know if there's something else important that should be added.
13305 ↗	(On Diff #199498)	We probably don't need this. Especially since the fixed point section comes after the saturation section.
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2713 ↗	(On Diff #199498)	Added

LGTM! (if all comments from other reviewers have been taken care of) (maybe you should wait another day to see if anyone else objects, but I think this patch has been open for a long time so there has been plenty of time for comments already)

Btw, I'm still having a hard time reviewing the X86 test cases with very long results, and a little bit worried that those just will give annoying churn when doing unrelated patches in the future, rather than help out detecting problems related to smul.fix. I currently have no more ideas on how to improve that.

Hopefully I'll be able to run some runtime comparison tests between X86 and our OOT target when this has landed (and when I've adapted our target to use these new intrinsics using "legal" lowering and not "expand").

This revision is now accepted and ready to land. May 17 2019, 8:31 AM

In D55720#1506687, @bjope wrote:

LGTM! (if all comments from other reviewers have been taken care of) (maybe you should wait another day to see if anyone else objects, but I think this patch has been open for a long time so there has been plenty of time for comments already)

Btw, I'm still having a hard time reviewing the X86 test cases with very long results, and a little bit worried that those just will give annoying churn when doing unrelated patches in the future, rather than help out detecting problems related to smul.fix. I currently have no more ideas on how to improve that.

Hopefully I'll be able to run some runtime comparison tests between X86 and our OOT target when this has landed (and when I've adapted our target to use these new intrinsics using "legal" lowering and not "expand").

Thanks. Unless anyone has other comments, I'll attempt to commit this at the start of next week.

Closed by commit rL361289: [Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic (authored by leonardchan). May 21 2019, 12:14 PM

This revision was automatically updated to reflect the committed changes.

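Revision Contents

Path	Size
llvm/trunk/docs/LangRef.rst	95 lines
llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h	5 lines
llvm/trunk/include/llvm/CodeGen/TargetLowering.h	1 line
llvm/trunk/include/llvm/IR/Intrinsics.td	6 lines
llvm/trunk/include/llvm/Target/TargetSelectionDAG.td	1 line
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	2 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp	142 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp	1 line
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	2 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	8 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	2 lines
llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp	56 lines
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp	1 line
llvm/trunk/lib/IR/Verifier.cpp	13 lines
llvm/trunk/test/CodeGen/X86/smul_fix_sat.ll	739 lines
llvm/trunk/test/CodeGen/X86/smul_fix_sat_constants.ll	101 lines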

Diff 200565

llvm/trunk/docs/LangRef.rst

[13,272 lines not shown]

 A fixed point number represents a real data type for a number that has a fixed
 number of digits after a radix point (equivalent to the decimal point '.').
 The number of digits after the radix point is referred as the ``scale``. These
 are useful for representing fractional values to a specific precision. The
 following intrinsics perform fixed point arithmetic operations on 2 operands
 of the same scale, specified as the third argument.

+The `llvm.*mul.fix` family of intrinsic functions represents a multiplication
+of fixed point numbers through scaled integers. Therefore, fixed point
+multplication can be represented as
+
+::
+%result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)
+=>
+%a2 = sext i4 %a to i8
+%b2 = sext i4 %b to i8
+%mul = mul nsw nuw i8 %a, %b
+%scale2 = trunc i32 %scale to i8
+%r = ashr i8 %mul, i8 %scale2 ; this is for a target rounding down towards negative infinity
+%result = trunc i8 %r to i4
+
+For each of these functions, if the result cannot be represented exactly with
+the provided scale, the result is rounded. Rounding is unspecified since
+preferred rounding may vary for different targets. Rounding is specified
+through a target hook. Different pipelines should legalize or optimize this
+using the rounding specified by this hook if it is provided. Operations like
+constant folding, instruction combining, KnownBits, and ValueTracking should
+also use this hook, if provided, and not assume the direction of rounding. A
+rounded result must always be within one unit of precision from the true
+result. That is, the error between the returned result and the true result must
+be less than 1/2^(scale).
+

 '``llvm.smul.fix.*``' Intrinsics
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Syntax
 """""""

 This is an overloaded intrinsic. You can use ``llvm.smul.fix``
[104 lines not shown; context: .. code-block:: llvm]

 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)

 ; The result in the following could be rounded down to 3.5 or up to 4
 %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1) ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)


+'``llvm.smul.fix.sat.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
+on any integer bit width or vectors of integers.
+
+::
+
+declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
+declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
+declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
+declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
+
+Overview
+"""""""""
+
+The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
+fixed point saturation multiplication on 2 arguments of the same scale.
+
+Arguments
+""""""""""
+
+The arguments (%a and %b) and the result may be of integer types of any bit
+width, but they must have the same bit width. ``%a`` and ``%b`` are the two
+values that will undergo signed fixed point multiplication. The argument
+``%scale`` represents the scale of both operands, and must be a constant
+integer.
+
+Semantics:
+""""""""""
+
+This operation performs fixed point multiplication on the 2 arguments of a
+specified scale. The result will also be returned in the same scale specified
+in the third argument.
+
+If the result value cannot be precisely represented in the given scale, the
+value is rounded up or down to the closest representable value. The rounding
+direction is unspecified.
+
+The maximum value this operation can clamp to is the largest signed value
+representable by the bit width of the first 2 arguments. The minimum value is the
+smallest signed value representable by this bit width.
+
+
+Examples
+"""""""""
+
+.. code-block:: llvm
+
+%res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
+%res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
+%res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
+
+; The result in the following could be rounded up to -2 or down to -2.5
+%res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
+
+; Saturation
+%res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0) ; %res = 7
+%res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 2) ; %res = 7
+%res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 2, i32 2) ; %res = -8
+%res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 2) ; %res = 7
+
+; Scale can affect the saturation result
+%res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 7 (2 x 4 -> clamped to 7)
+%res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)
+

 Specialised Arithmetic Intrinsics
 ---------------------------------

 '``llvm.canonicalize.*``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 Syntax:
 """""""
[3,630 lines not shown]

llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h

[272 lines not shown; context: enum NodeType {]

 /// RESULT = [US]MULFIX(LHS, RHS, SCALE) - Perform fixed point multiplication on
 /// 2 integers with the same width and scale. SCALE represents the scale of
 /// both operands as fixed point numbers. This SCALE parameter must be a
 /// constant integer. A scale of zero is effectively performing
 /// multiplication on 2 integers.
 SMULFIX, UMULFIX,

+/// Same as the corresponding unsaturated fixed point instructions, but the
+/// result is clamped between the min and max values representable by the
+/// bits of the first 2 operands.
+SMULFIXSAT,
+
 /// Simple binary floating point operators.
 FADD, FSUB, FMUL, FDIV, FREM,

 /// Constrained versions of the binary floating point operators.
 /// These will be lowered to the simple operators before final selection.
 /// They are used to limit optimizations while the DAG is being
 /// optimized.
 STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,
[781 lines not shown]

llvm/trunk/include/llvm/CodeGen/TargetLowering.h

[849 lines not shown; context: LegalizeAction getFixedPointOperationAction(unsigned Op, EVT VT,]

 // This operation is supported in this type but may only work on specific
 // scales.
 bool Supported;
 switch (Op) {
 default:
 llvm_unreachable("Unexpected fixed point operation.");
 case ISD::SMULFIX:
+case ISD::SMULFIXSAT:
 case ISD::UMULFIX:
 Supported = isSupportedFixedPointOperation(Op, VT, Scale);
 break;
 }

 return Supported ? Action : Expand;
 }

[3,193 lines not shown]

llvm/trunk/include/llvm/IR/Intrinsics.td

[868 lines not shown]
 def int_smul_fix : Intrinsic<[llvm_anyint_ty],
 [LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
 [IntrNoMem, IntrSpeculatable, Commutative, ImmArg<2>]>;

 def int_umul_fix : Intrinsic<[llvm_anyint_ty],
 [LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
 [IntrNoMem, IntrSpeculatable, Commutative, ImmArg<2>]>;

+//===------------------- Fixed Point Saturation Arithmetic Intrinsics ----------------===//
+//
+def int_smul_fix_sat : Intrinsic<[llvm_anyint_ty],
+[LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
+[IntrNoMem, IntrSpeculatable, Commutative, ImmArg<2>]>;
+
 //===------------------------- Memory Use Markers -------------------------===//
 //
 def int_lifetime_start : Intrinsic<[],
 [llvm_i64_ty, llvm_anyptr_ty],
 [IntrArgMemOnly, NoCapture<1>, ImmArg<0>]>;
 def int_lifetime_end : Intrinsic<[],
 [llvm_i64_ty, llvm_anyptr_ty],
 [IntrArgMemOnly, NoCapture<1>, ImmArg<0>]>;
[313 lines not shown]

llvm/trunk/include/llvm/Target/TargetSelectionDAG.td

Show First 20 Lines • Show All 385 Lines • ▼ Show 20 Lines	def umax : SDNode<"ISD::UMAX" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;		[SDNPCommutative, SDNPAssociative]>;

def saddsat : SDNode<"ISD::SADDSAT" , SDTIntBinOp, [SDNPCommutative]>;		def saddsat : SDNode<"ISD::SADDSAT" , SDTIntBinOp, [SDNPCommutative]>;
def uaddsat : SDNode<"ISD::UADDSAT" , SDTIntBinOp, [SDNPCommutative]>;		def uaddsat : SDNode<"ISD::UADDSAT" , SDTIntBinOp, [SDNPCommutative]>;
def ssubsat : SDNode<"ISD::SSUBSAT" , SDTIntBinOp>;		def ssubsat : SDNode<"ISD::SSUBSAT" , SDTIntBinOp>;
def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;		def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;

def smulfix : SDNode<"ISD::SMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;		def smulfix : SDNode<"ISD::SMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;
		def smulfixsat : SDNode<"ISD::SMULFIXSAT", SDTIntScaledBinOp, [SDNPCommutative]>;
def umulfix : SDNode<"ISD::UMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;		def umulfix : SDNode<"ISD::UMULFIX" , SDTIntScaledBinOp, [SDNPCommutative]>;

def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;		def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;
def sext_invec : SDNode<"ISD::SIGN_EXTEND_VECTOR_INREG", SDTExtInvec>;		def sext_invec : SDNode<"ISD::SIGN_EXTEND_VECTOR_INREG", SDTExtInvec>;
def zext_invec : SDNode<"ISD::ZERO_EXTEND_VECTOR_INREG", SDTExtInvec>;		def zext_invec : SDNode<"ISD::ZERO_EXTEND_VECTOR_INREG", SDTExtInvec>;

def abs : SDNode<"ISD::ABS" , SDTIntUnaryOp>;		def abs : SDNode<"ISD::ABS" , SDTIntUnaryOp>;
def bitreverse : SDNode<"ISD::BITREVERSE" , SDTIntUnaryOp>;		def bitreverse : SDNode<"ISD::BITREVERSE" , SDTIntUnaryOp>;

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Show First 20 Lines • Show All 1,134 Lines • ▼ Show 20 Lines	#endif
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT: {		case ISD::USUBSAT: {
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
}		}
case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX: {		case ISD::UMULFIX: {
unsigned Scale = Node->getConstantOperandVal(2);		unsigned Scale = Node->getConstantOperandVal(2);
Action = TLI.getFixedPointOperationAction(Node->getOpcode(),		Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
Node->getValueType(0), Scale);		Node->getValueType(0), Scale);
break;		break;
}		}
case ISD::MSCATTER:		case ISD::MSCATTER:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
▲ Show 20 Lines • Show All 2,178 Lines • ▼ Show 20 Lines	case ISD::ROTR:
break;		break;
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT:		case ISD::USUBSAT:
Results.push_back(TLI.expandAddSubSat(Node, DAG));		Results.push_back(TLI.expandAddSubSat(Node, DAG));
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
Results.push_back(TLI.expandFixedPointMul(Node, DAG));		Results.push_back(TLI.expandFixedPointMul(Node, DAG));
break;		break;
case ISD::ADDCARRY:		case ISD::ADDCARRY:
case ISD::SUBCARRY: {		case ISD::SUBCARRY: {
SDValue LHS = Node->getOperand(0);		SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);		SDValue RHS = Node->getOperand(1);
SDValue Carry = Node->getOperand(2);		SDValue Carry = Node->getOperand(2);

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

Show First 20 Lines • Show All 143 Lines • ▼ Show 20 Lines	#endif
case ISD::ADDCARRY:		case ISD::ADDCARRY:
case ISD::SUBCARRY: Res = PromoteIntRes_ADDSUBCARRY(N, ResNo); break;		case ISD::SUBCARRY: Res = PromoteIntRes_ADDSUBCARRY(N, ResNo); break;

case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT: Res = PromoteIntRes_ADDSUBSAT(N); break;		case ISD::USUBSAT: Res = PromoteIntRes_ADDSUBSAT(N); break;
case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX: Res = PromoteIntRes_MULFIX(N); break;		case ISD::UMULFIX: Res = PromoteIntRes_MULFIX(N); break;
case ISD::ABS: Res = PromoteIntRes_ABS(N); break;		case ISD::ABS: Res = PromoteIntRes_ABS(N); break;

case ISD::ATOMIC_LOAD:		case ISD::ATOMIC_LOAD:
Res = PromoteIntRes_Atomic0(cast<AtomicSDNode>(N)); break;		Res = PromoteIntRes_Atomic0(cast<AtomicSDNode>(N)); break;

case ISD::ATOMIC_LOAD_ADD:		case ISD::ATOMIC_LOAD_ADD:
case ISD::ATOMIC_LOAD_SUB:		case ISD::ATOMIC_LOAD_SUB:
▲ Show 20 Lines • Show All 505 Lines • ▼ Show 20 Lines	SDValue Result =
DAG.getNode(Opcode, dl, PromotedType, Op1Promoted, Op2Promoted);		DAG.getNode(Opcode, dl, PromotedType, Op1Promoted, Op2Promoted);
return DAG.getNode(ShiftOp, dl, PromotedType, Result, ShiftAmount);		return DAG.getNode(ShiftOp, dl, PromotedType, Result, ShiftAmount);
}		}

SDValue DAGTypeLegalizer::PromoteIntRes_MULFIX(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_MULFIX(SDNode *N) {
// Can just promote the operands then continue with operation.		// Can just promote the operands then continue with operation.
SDLoc dl(N);		SDLoc dl(N);
SDValue Op1Promoted, Op2Promoted;		SDValue Op1Promoted, Op2Promoted;
if (N->getOpcode() == ISD::SMULFIX) {		bool Signed =
		N->getOpcode() == ISD::SMULFIX || N->getOpcode() == ISD::SMULFIXSAT;
		if (Signed) {
Op1Promoted = SExtPromotedInteger(N->getOperand(0));		Op1Promoted = SExtPromotedInteger(N->getOperand(0));
Op2Promoted = SExtPromotedInteger(N->getOperand(1));		Op2Promoted = SExtPromotedInteger(N->getOperand(1));
} else {		} else {
Op1Promoted = ZExtPromotedInteger(N->getOperand(0));		Op1Promoted = ZExtPromotedInteger(N->getOperand(0));
Op2Promoted = ZExtPromotedInteger(N->getOperand(1));		Op2Promoted = ZExtPromotedInteger(N->getOperand(1));
}		}
		EVT OldType = N->getOperand(0).getValueType();
EVT PromotedType = Op1Promoted.getValueType();		EVT PromotedType = Op1Promoted.getValueType();
		unsigned DiffSize =
		PromotedType.getScalarSizeInBits() - OldType.getScalarSizeInBits();

		bool Saturating = N->getOpcode() == ISD::SMULFIXSAT;
		if (Saturating) {
		// Promoting the operand and result values changes the saturation width,
		// which extends the values that we clamp to on saturation. This could be
		// resolved by shifting one of the operands the same amount, which would
		// also shift the result we compare against, then shifting back.
		EVT ShiftTy = TLI.getShiftAmountTy(PromotedType, DAG.getDataLayout());
		Op1Promoted = DAG.getNode(ISD::SHL, dl, PromotedType, Op1Promoted,
		DAG.getConstant(DiffSize, dl, ShiftTy));
		SDValue Result = DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted,
		Op2Promoted, N->getOperand(2));
		unsigned ShiftOp = Signed ? ISD::SRA : ISD::SRL;
		return DAG.getNode(ShiftOp, dl, PromotedType, Result,
		DAG.getConstant(DiffSize, dl, ShiftTy));
		}
return DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted, Op2Promoted,		return DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted, Op2Promoted,
N->getOperand(2));		N->getOperand(2));
}		}
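
The shift trick in the saturating branch of PromoteIntRes_MULFIX can be checked on plain integers. The following is a minimal sketch, not code from this patch: `floorShift`, `smulFixSat`, `smulFixSatPromoted` and `promotionMatchesReference` are hypothetical helper names, widths are assumed to be at most 32 bits so products fit in `int64_t`, and the shifts are modelled with floor division. It suggests that shifting one operand left by DiffSize, saturating at the promoted width, and shifting back gives the same value as saturating at the original width.

```cpp
#include <cassert>
#include <cstdint>

// Floor division by 2^shift; models an arithmetic shift right (SRA).
inline int64_t floorShift(int64_t v, unsigned shift) {
  int64_t d = int64_t(1) << shift;
  int64_t q = v / d;
  if (v % d != 0 && v < 0)
    --q; // C++ division truncates toward zero, so adjust negative values
  return q;
}

// Reference model: signed saturating fixed point multiply at `width` bits.
// Assumes width <= 32 so the exact product fits in int64_t.
inline int64_t smulFixSat(int64_t a, int64_t b, unsigned width,
                          unsigned scale) {
  int64_t maxV = (int64_t(1) << (width - 1)) - 1;
  int64_t minV = -(int64_t(1) << (width - 1));
  int64_t r = floorShift(a * b, scale);
  if (r > maxV)
    return maxV;
  if (r < minV)
    return minV;
  return r;
}

// The same operation carried out at a wider (promoted) width.
inline int64_t smulFixSatPromoted(int64_t a, int64_t b, unsigned oldWidth,
                                  unsigned newWidth, unsigned scale) {
  unsigned diff = newWidth - oldWidth;        // DiffSize in the diff
  int64_t shifted = a * (int64_t(1) << diff); // SHL of the first operand
  int64_t wide = smulFixSat(shifted, b, newWidth, scale);
  return floorShift(wide, diff);              // SRA back to the old range
}

// Exhaustively compares i4 promoted to i8 against the i4 reference.
inline bool promotionMatchesReference() {
  for (unsigned s = 0; s < 4; ++s)
    for (int a = -8; a <= 7; ++a)
      for (int b = -8; b <= 7; ++b)
        if (smulFixSatPromoted(a, b, 4, 8, s) != smulFixSat(a, b, 4, s))
          return false;
  return true;
}
```

Without the left shift, an i4 product such as 7 * 7 would be computed as 49 at i8 and never reach the i8 clamp, which is the situation the comment in the diff describes.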

SDValue DAGTypeLegalizer::PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo) {		SDValue DAGTypeLegalizer::PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo) {
if (ResNo == 1)		if (ResNo == 1)
return PromoteIntRes_Overflow(N);		return PromoteIntRes_Overflow(N);

▲ Show 20 Lines • Show All 431 Lines • ▼ Show 20 Lines	bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {
case ISD::SUBCARRY: Res = PromoteIntOp_ADDSUBCARRY(N, OpNo); break;		case ISD::SUBCARRY: Res = PromoteIntOp_ADDSUBCARRY(N, OpNo); break;

case ISD::FRAMEADDR:		case ISD::FRAMEADDR:
case ISD::RETURNADDR: Res = PromoteIntOp_FRAMERETURNADDR(N); break;		case ISD::RETURNADDR: Res = PromoteIntOp_FRAMERETURNADDR(N); break;

case ISD::PREFETCH: Res = PromoteIntOp_PREFETCH(N, OpNo); break;		case ISD::PREFETCH: Res = PromoteIntOp_PREFETCH(N, OpNo); break;

case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX: Res = PromoteIntOp_MULFIX(N); break;		case ISD::UMULFIX: Res = PromoteIntOp_MULFIX(N); break;

case ISD::FPOWI: Res = PromoteIntOp_FPOWI(N); break;		case ISD::FPOWI: Res = PromoteIntOp_FPOWI(N); break;

case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
▲ Show 20 Lines • Show All 547 Lines • ▼ Show 20 Lines	#endif
case ISD::USUBO: ExpandIntRes_UADDSUBO(N, Lo, Hi); break;		case ISD::USUBO: ExpandIntRes_UADDSUBO(N, Lo, Hi); break;
case ISD::UMULO:		case ISD::UMULO:
case ISD::SMULO: ExpandIntRes_XMULO(N, Lo, Hi); break;		case ISD::SMULO: ExpandIntRes_XMULO(N, Lo, Hi); break;

case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT: ExpandIntRes_ADDSUBSAT(N, Lo, Hi); break;		case ISD::USUBSAT: ExpandIntRes_ADDSUBSAT(N, Lo, Hi); break;

case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX: ExpandIntRes_MULFIX(N, Lo, Hi); break;		case ISD::UMULFIX: ExpandIntRes_MULFIX(N, Lo, Hi); break;

case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
case ISD::VECREDUCE_SMAX:		case ISD::VECREDUCE_SMAX:
▲ Show 20 Lines • Show All 1,007 Lines • ▼ Show 20 Lines
}		}

void DAGTypeLegalizer::ExpandIntRes_ADDSUBSAT(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::ExpandIntRes_ADDSUBSAT(SDNode *N, SDValue &Lo,
SDValue &Hi) {		SDValue &Hi) {
SDValue Result = TLI.expandAddSubSat(N, DAG);		SDValue Result = TLI.expandAddSubSat(N, DAG);
SplitInteger(Result, Lo, Hi);		SplitInteger(Result, Lo, Hi);
}		}

		/// This performs an expansion of the integer result for a fixed point
		/// multiplication. The default expansion performs rounding down towards
		/// negative infinity, though targets that do care about rounding should specify
		/// a target hook for rounding and provide their own expansion or lowering of
		/// fixed point multiplication to be consistent with rounding.
void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::ExpandIntRes_MULFIX(SDNode *N, SDValue &Lo,
SDValue &Hi) {		SDValue &Hi) {
assert(
(N->getOpcode() == ISD::SMULFIX || N->getOpcode() == ISD::UMULFIX) &&
"Expected operand to be signed or unsigned fixed point multiplication");

SDLoc dl(N);		SDLoc dl(N);
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
		unsigned VTSize = VT.getScalarSizeInBits();
SDValue LHS = N->getOperand(0);		SDValue LHS = N->getOperand(0);
SDValue RHS = N->getOperand(1);		SDValue RHS = N->getOperand(1);
uint64_t Scale = N->getConstantOperandVal(2);		uint64_t Scale = N->getConstantOperandVal(2);
		bool Saturating = N->getOpcode() == ISD::SMULFIXSAT;
		EVT BoolVT =
		TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
		SDValue Zero = DAG.getConstant(0, dl, VT);
if (!Scale) {		if (!Scale) {
SDValue Result = DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);		SDValue Result;
		if (!Saturating) {
		Result = DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);
		} else {
		Result = DAG.getNode(ISD::SMULO, dl, DAG.getVTList(VT, BoolVT), LHS, RHS);
		SDValue Product = Result.getValue(0);
		SDValue Overflow = Result.getValue(1);

		APInt MinVal = APInt::getSignedMinValue(VTSize);
		APInt MaxVal = APInt::getSignedMaxValue(VTSize);
		SDValue SatMin = DAG.getConstant(MinVal, dl, VT);
		SDValue SatMax = DAG.getConstant(MaxVal, dl, VT);
		SDValue ProdNeg = DAG.getSetCC(dl, BoolVT, Product, Zero, ISD::SETLT);
		Result = DAG.getSelect(dl, VT, ProdNeg, SatMax, SatMin);
		Result = DAG.getSelect(dl, VT, Overflow, Result, Product);
		}
SplitInteger(Result, Lo, Hi);		SplitInteger(Result, Lo, Hi);
return;		return;
}		}

EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
SDValue LL, LH, RL, RH;		SDValue LL, LH, RL, RH;
GetExpandedInteger(LHS, LL, LH);		GetExpandedInteger(LHS, LL, LH);
GetExpandedInteger(RHS, RL, RH);		GetExpandedInteger(RHS, RL, RH);
SmallVector<SDValue, 4> Result;		SmallVector<SDValue, 4> Result;

bool Signed = N->getOpcode() == ISD::SMULFIX;		bool Signed = (N->getOpcode() == ISD::SMULFIX ||
		N->getOpcode() == ISD::SMULFIXSAT);
unsigned LoHiOp = Signed ? ISD::SMUL_LOHI : ISD::UMUL_LOHI;		unsigned LoHiOp = Signed ? ISD::SMUL_LOHI : ISD::UMUL_LOHI;
if (!TLI.expandMUL_LOHI(LoHiOp, VT, dl, LHS, RHS, Result, NVT, DAG,		if (!TLI.expandMUL_LOHI(LoHiOp, VT, dl, LHS, RHS, Result, NVT, DAG,
TargetLowering::MulExpansionKind::OnlyLegalOrCustom,		TargetLowering::MulExpansionKind::OnlyLegalOrCustom,
LL, LH, RL, RH)) {		LL, LH, RL, RH)) {
report_fatal_error("Unable to expand MUL_FIX using MUL_LOHI.");		report_fatal_error("Unable to expand MUL_FIX using MUL_LOHI.");
return;		return;
}		}

unsigned VTSize = VT.getScalarSizeInBits();
unsigned NVTSize = NVT.getScalarSizeInBits();		unsigned NVTSize = NVT.getScalarSizeInBits();
		assert((VTSize == NVTSize * 2) && "Expected the new value type to be half "
		"the size of the current value type");
EVT ShiftTy = TLI.getShiftAmountTy(NVT, DAG.getDataLayout());		EVT ShiftTy = TLI.getShiftAmountTy(NVT, DAG.getDataLayout());

// Shift whole amount by scale.		// Shift whole amount by scale.
SDValue ResultLL = Result[0];		SDValue ResultLL = Result[0];
SDValue ResultLH = Result[1];		SDValue ResultLH = Result[1];
SDValue ResultHL = Result[2];		SDValue ResultHL = Result[2];
SDValue ResultHH = Result[3];		SDValue ResultHH = Result[3];

		SDValue SatMax, SatMin;
		SDValue NVTZero = DAG.getConstant(0, dl, NVT);
		SDValue NVTNeg1 = DAG.getConstant(-1, dl, NVT);
		EVT BoolNVT =
		TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), NVT);

// After getting the multiplication result in 4 parts, we need to perform a		// After getting the multiplication result in 4 parts, we need to perform a
// shift right by the amount of the scale to get the result in that scale.		// shift right by the amount of the scale to get the result in that scale.
// Let's say we multiply 2 64 bit numbers. The resulting value can be held in		// Let's say we multiply 2 64 bit numbers. The resulting value can be held in
// 128 bits that are cut into 4 32-bit parts:		// 128 bits that are cut into 4 32-bit parts:
//		//
// HH HL LH LL		// HH HL LH LL
// |---32---|---32---|---32---|---32---|		// |---32---|---32---|---32---|---32---|
// 128 96 64 32 0		// 128 96 64 32 0
Show All 12 Lines	if (Scale < NVTSize) {
SDValue SRLAmnt = DAG.getConstant(Scale, dl, ShiftTy);		SDValue SRLAmnt = DAG.getConstant(Scale, dl, ShiftTy);
SDValue SHLAmnt = DAG.getConstant(NVTSize - Scale, dl, ShiftTy);		SDValue SHLAmnt = DAG.getConstant(NVTSize - Scale, dl, ShiftTy);
Lo = DAG.getNode(ISD::SRL, dl, NVT, ResultLL, SRLAmnt);		Lo = DAG.getNode(ISD::SRL, dl, NVT, ResultLL, SRLAmnt);
Lo = DAG.getNode(ISD::OR, dl, NVT, Lo,		Lo = DAG.getNode(ISD::OR, dl, NVT, Lo,
DAG.getNode(ISD::SHL, dl, NVT, ResultLH, SHLAmnt));		DAG.getNode(ISD::SHL, dl, NVT, ResultLH, SHLAmnt));
Hi = DAG.getNode(ISD::SRL, dl, NVT, ResultLH, SRLAmnt);		Hi = DAG.getNode(ISD::SRL, dl, NVT, ResultLH, SRLAmnt);
Hi = DAG.getNode(ISD::OR, dl, NVT, Hi,		Hi = DAG.getNode(ISD::OR, dl, NVT, Hi,
DAG.getNode(ISD::SHL, dl, NVT, ResultHL, SHLAmnt));		DAG.getNode(ISD::SHL, dl, NVT, ResultHL, SHLAmnt));

		// We cannot overflow past HH when multiplying 2 ints of size VTSize, so the
		// highest bit of HH determines saturation direction in the event of
		// saturation.
		// The number of overflow bits we can check are VTSize - Scale + 1 (we
		// include the sign bit). If these top bits are > 0, then we overflowed past
		// the max value. If these top bits are < -1, then we overflowed past the
		// min value. Otherwise, we did not overflow.
		if (Saturating) {
		unsigned OverflowBits = VTSize - Scale + 1;
		assert(OverflowBits <= VTSize && OverflowBits > NVTSize &&
		"Extent of overflow bits must start within HL");
		SDValue HLHiMask = DAG.getConstant(
		APInt::getHighBitsSet(NVTSize, OverflowBits - NVTSize), dl, NVT);
		SDValue HLLoMask = DAG.getConstant(
		APInt::getLowBitsSet(NVTSize, VTSize - OverflowBits), dl, NVT);

		// HH > 0 or HH == 0 && HL > HLLoMask
		SDValue HHPos = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTZero, ISD::SETGT);
		SDValue HHZero = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTZero, ISD::SETEQ);
		SDValue HLPos =
		DAG.getSetCC(dl, BoolNVT, ResultHL, HLLoMask, ISD::SETUGT);
		SatMax = DAG.getNode(ISD::OR, dl, BoolNVT, HHPos,
		DAG.getNode(ISD::AND, dl, BoolNVT, HHZero, HLPos));

		// HH < -1 or HH == -1 && HL < HLHiMask
		SDValue HHNeg = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTNeg1, ISD::SETLT);
		SDValue HHNeg1 = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTNeg1, ISD::SETEQ);
		SDValue HLNeg =
		DAG.getSetCC(dl, BoolNVT, ResultHL, HLHiMask, ISD::SETULT);
		SatMin = DAG.getNode(ISD::OR, dl, BoolNVT, HHNeg,
		DAG.getNode(ISD::AND, dl, BoolNVT, HHNeg1, HLNeg));
		}
} else if (Scale == NVTSize) {		} else if (Scale == NVTSize) {
// If the scales are equal, Lo and Hi are ResultLH and ResultHL,		// If the scales are equal, Lo and Hi are ResultLH and ResultHL,
// respectively. Avoid shifting to prevent undefined behavior.		// respectively. Avoid shifting to prevent undefined behavior.
Lo = ResultLH;		Lo = ResultLH;
Hi = ResultHL;		Hi = ResultHL;

		// We overflow max if HH > 0 or HH == 0 && HL sign is negative.
		// We overflow min if HH < -1 or HH == -1 && HL sign is 0.
		if (Saturating) {
		SDValue HHPos = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTZero, ISD::SETGT);
		SDValue HHZero = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTZero, ISD::SETEQ);
		SDValue HLNeg = DAG.getSetCC(dl, BoolNVT, ResultHL, NVTZero, ISD::SETLT);
		SatMax = DAG.getNode(ISD::OR, dl, BoolNVT, HHPos,
		DAG.getNode(ISD::AND, dl, BoolNVT, HHZero, HLNeg));

		SDValue HHNeg = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTNeg1, ISD::SETLT);
		SDValue HHNeg1 = DAG.getSetCC(dl, BoolNVT, ResultHH, NVTNeg1, ISD::SETEQ);
		SDValue HLPos = DAG.getSetCC(dl, BoolNVT, ResultHL, NVTZero, ISD::SETGT);
		SatMin = DAG.getNode(ISD::OR, dl, BoolNVT, HHNeg,
		DAG.getNode(ISD::AND, dl, BoolNVT, HHNeg1, HLPos));
		}
} else if (Scale < VTSize) {		} else if (Scale < VTSize) {
// If the scale is instead less than the old VT size, but greater than or		// If the scale is instead less than the old VT size, but greater than or
// equal to the expanded VT size, the first part of the result (ResultLL) is		// equal to the expanded VT size, the first part of the result (ResultLL) is
// no longer a part of Lo because it would be scaled out anyway. Instead we		// no longer a part of Lo because it would be scaled out anyway. Instead we
// can start shifting right from the fourth part (ResultHH) to the second		// can start shifting right from the fourth part (ResultHH) to the second
// part (ResultLH), and Result LH will be the new Lo.		// part (ResultLH), and Result LH will be the new Lo.
SDValue SRLAmnt = DAG.getConstant(Scale - NVTSize, dl, ShiftTy);		SDValue SRLAmnt = DAG.getConstant(Scale - NVTSize, dl, ShiftTy);
SDValue SHLAmnt = DAG.getConstant(VTSize - Scale, dl, ShiftTy);		SDValue SHLAmnt = DAG.getConstant(VTSize - Scale, dl, ShiftTy);
Lo = DAG.getNode(ISD::SRL, dl, NVT, ResultLH, SRLAmnt);		Lo = DAG.getNode(ISD::SRL, dl, NVT, ResultLH, SRLAmnt);
Lo = DAG.getNode(ISD::OR, dl, NVT, Lo,		Lo = DAG.getNode(ISD::OR, dl, NVT, Lo,
DAG.getNode(ISD::SHL, dl, NVT, ResultHL, SHLAmnt));		DAG.getNode(ISD::SHL, dl, NVT, ResultHL, SHLAmnt));
Hi = DAG.getNode(ISD::SRL, dl, NVT, ResultHL, SRLAmnt);		Hi = DAG.getNode(ISD::SRL, dl, NVT, ResultHL, SRLAmnt);
Hi = DAG.getNode(ISD::OR, dl, NVT, Hi,		Hi = DAG.getNode(ISD::OR, dl, NVT, Hi,
DAG.getNode(ISD::SHL, dl, NVT, ResultHH, SHLAmnt));		DAG.getNode(ISD::SHL, dl, NVT, ResultHH, SHLAmnt));

		// This is similar to the case when we saturate if Scale < NVTSize, but we
		// only need to check HH.
		if (Saturating) {
		unsigned OverflowBits = VTSize - Scale + 1;
		SDValue HHHiMask = DAG.getConstant(
		APInt::getHighBitsSet(NVTSize, OverflowBits), dl, NVT);
		SDValue HHLoMask = DAG.getConstant(
		APInt::getLowBitsSet(NVTSize, NVTSize - OverflowBits), dl, NVT);

		SatMax = DAG.getSetCC(dl, BoolNVT, ResultHH, HHLoMask, ISD::SETGT);
		SatMin = DAG.getSetCC(dl, BoolNVT, ResultHH, HHHiMask, ISD::SETLT);
		}
} else if (Scale == VTSize) {		} else if (Scale == VTSize) {
assert(		assert(
!Signed &&		!Signed &&
"Only unsigned types can have a scale equal to the operand bit width");		"Only unsigned types can have a scale equal to the operand bit width");

Lo = ResultHL;		Lo = ResultHL;
Hi = ResultHH;		Hi = ResultHH;
} else {		} else {
llvm_unreachable("Expected the scale to be less than or equal to the width "		llvm_unreachable("Expected the scale to be less than or equal to the width "
"of the operands");		"of the operands");
}		}

		if (Saturating) {
		APInt LHMax = APInt::getSignedMaxValue(NVTSize);
		APInt LLMax = APInt::getAllOnesValue(NVTSize);
		APInt LHMin = APInt::getSignedMinValue(NVTSize);
		Hi = DAG.getSelect(dl, NVT, SatMax, DAG.getConstant(LHMax, dl, NVT), Hi);
		Hi = DAG.getSelect(dl, NVT, SatMin, DAG.getConstant(LHMin, dl, NVT), Hi);
		Lo = DAG.getSelect(dl, NVT, SatMax, DAG.getConstant(LLMax, dl, NVT), Lo);
		Lo = DAG.getSelect(dl, NVT, SatMin, NVTZero, Lo);
		}
}		}
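
For the Scale == 0 path, the saturation direction has to be chosen once SMULO reports overflow. The following is a minimal sketch on 8-bit integers, not code from this patch: `wrappedLow8` and `smulSat8` are hypothetical names. The sketch picks the direction from the operand signs rather than from the sign of the wrapped product as the diff does, under the assumption that the wrapped product is not always a reliable indicator: 64 * 4 is 256, whose low 8 bits are 0.

```cpp
#include <cassert>
#include <cstdint>

// Low 8 bits of the exact product, i.e. what a wrapping i8 multiply yields.
inline uint8_t wrappedLow8(int8_t a, int8_t b) {
  return uint8_t(unsigned(int(a) * int(b)));
}

// Signed saturating multiply at 8 bits (the Scale == 0 case).
inline int8_t smulSat8(int8_t a, int8_t b) {
  int wide = int(a) * int(b); // exact product
  bool overflow = wide < INT8_MIN || wide > INT8_MAX;
  if (!overflow)
    return int8_t(wide);
  // On overflow the exact product is negative iff the operand signs differ.
  bool negative = (a < 0) != (b < 0);
  return negative ? INT8_MIN : INT8_MAX;
}
```

For example, `smulSat8(64, 4)` clamps to 127 even though `wrappedLow8(64, 4)` is 0, and `smulSat8(-64, 4)` clamps to -128.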

void DAGTypeLegalizer::ExpandIntRes_SADDSUBO(SDNode *Node,		void DAGTypeLegalizer::ExpandIntRes_SADDSUBO(SDNode *Node,
SDValue &Lo, SDValue &Hi) {		SDValue &Lo, SDValue &Hi) {
SDValue LHS = Node->getOperand(0);		SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);		SDValue RHS = Node->getOperand(1);
SDLoc dl(Node);		SDLoc dl(Node);


llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

Show First 20 Lines • Show All 432 Lines • ▼ Show 20 Lines	SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::FCANONICALIZE:		case ISD::FCANONICALIZE:
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT:		case ISD::USUBSAT:
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX: {		case ISD::UMULFIX: {
unsigned Scale = Node->getConstantOperandVal(2);		unsigned Scale = Node->getConstantOperandVal(2);
Action = TLI.getFixedPointOperationAction(Node->getOpcode(),		Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
Node->getValueType(0), Scale);		Node->getValueType(0), Scale);
break;		break;
}		}
case ISD::FP_ROUND_INREG:		case ISD::FP_ROUND_INREG:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Show First 20 Lines • Show All 177 Lines • ▼ Show 20 Lines	#endif
case ISD::SADDO:		case ISD::SADDO:
case ISD::USUBO:		case ISD::USUBO:
case ISD::SSUBO:		case ISD::SSUBO:
case ISD::UMULO:		case ISD::UMULO:
case ISD::SMULO:		case ISD::SMULO:
R = ScalarizeVecRes_OverflowOp(N, ResNo);		R = ScalarizeVecRes_OverflowOp(N, ResNo);
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
R = ScalarizeVecRes_MULFIX(N);		R = ScalarizeVecRes_MULFIX(N);
break;		break;
}		}

// If R is null, the sub-method took care of registering the result.		// If R is null, the sub-method took care of registering the result.
if (R.getNode())		if (R.getNode())
SetScalarizedVector(SDValue(N, ResNo), R);		SetScalarizedVector(SDValue(N, ResNo), R);
▲ Show 20 Lines • Show All 772 Lines • ▼ Show 20 Lines	#endif
case ISD::SADDO:		case ISD::SADDO:
case ISD::USUBO:		case ISD::USUBO:
case ISD::SSUBO:		case ISD::SSUBO:
case ISD::UMULO:		case ISD::UMULO:
case ISD::SMULO:		case ISD::SMULO:
SplitVecRes_OverflowOp(N, ResNo, Lo, Hi);		SplitVecRes_OverflowOp(N, ResNo, Lo, Hi);
break;		break;
case ISD::SMULFIX:		case ISD::SMULFIX:
		case ISD::SMULFIXSAT:
case ISD::UMULFIX:		case ISD::UMULFIX:
SplitVecRes_MULFIX(N, Lo, Hi);		SplitVecRes_MULFIX(N, Lo, Hi);
break;		break;
}		}

// If Lo/Hi is null, the sub-method took care of registering results etc.		// If Lo/Hi is null, the sub-method took care of registering results etc.
if (Lo.getNode())		if (Lo.getNode())
SetSplitVector(SDValue(N, ResNo), Lo, Hi);		SetSplitVector(SDValue(N, ResNo), Lo, Hi);

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


Show First 20 Lines • Show All 6,292 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
case Intrinsic::umul_fix: {		case Intrinsic::umul_fix: {
SDValue Op1 = getValue(I.getArgOperand(0));		SDValue Op1 = getValue(I.getArgOperand(0));
SDValue Op2 = getValue(I.getArgOperand(1));		SDValue Op2 = getValue(I.getArgOperand(1));
SDValue Op3 = getValue(I.getArgOperand(2));		SDValue Op3 = getValue(I.getArgOperand(2));
setValue(&I, DAG.getNode(FixedPointIntrinsicToOpcode(Intrinsic), sdl,		setValue(&I, DAG.getNode(FixedPointIntrinsicToOpcode(Intrinsic), sdl,
Op1.getValueType(), Op1, Op2, Op3));		Op1.getValueType(), Op1, Op2, Op3));
return;		return;
}		}
		case Intrinsic::smul_fix_sat: {
		SDValue Op1 = getValue(I.getArgOperand(0));
		SDValue Op2 = getValue(I.getArgOperand(1));
		SDValue Op3 = getValue(I.getArgOperand(2));
		setValue(&I, DAG.getNode(ISD::SMULFIXSAT, sdl, Op1.getValueType(), Op1, Op2,
		Op3));
		return;
		}
case Intrinsic::stacksave: {		case Intrinsic::stacksave: {
SDValue Op = getRoot();		SDValue Op = getRoot();
Res = DAG.getNode(		Res = DAG.getNode(
ISD::STACKSAVE, sdl,		ISD::STACKSAVE, sdl,
DAG.getVTList(TLI.getPointerTy(DAG.getDataLayout()), MVT::Other), Op);		DAG.getVTList(TLI.getPointerTy(DAG.getDataLayout()), MVT::Other), Op);
setValue(&I, Res);		setValue(&I, Res);
DAG.setRoot(Res.getValue(1));		DAG.setRoot(Res.getValue(1));
return;		return;

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 295 Lines • ▼ Show 20 Lines	#endif
case ISD::SHL_PARTS: return "shl_parts";		case ISD::SHL_PARTS: return "shl_parts";
case ISD::SRA_PARTS: return "sra_parts";		case ISD::SRA_PARTS: return "sra_parts";
case ISD::SRL_PARTS: return "srl_parts";		case ISD::SRL_PARTS: return "srl_parts";

case ISD::SADDSAT: return "saddsat";		case ISD::SADDSAT: return "saddsat";
case ISD::UADDSAT: return "uaddsat";		case ISD::UADDSAT: return "uaddsat";
case ISD::SSUBSAT: return "ssubsat";		case ISD::SSUBSAT: return "ssubsat";
case ISD::USUBSAT: return "usubsat";		case ISD::USUBSAT: return "usubsat";

case ISD::SMULFIX: return "smulfix";		case ISD::SMULFIX: return "smulfix";
		case ISD::SMULFIXSAT: return "smulfixsat";
case ISD::UMULFIX: return "umulfix";		case ISD::UMULFIX: return "umulfix";

// Conversion operators.		// Conversion operators.
case ISD::SIGN_EXTEND: return "sign_extend";		case ISD::SIGN_EXTEND: return "sign_extend";
case ISD::ZERO_EXTEND: return "zero_extend";		case ISD::ZERO_EXTEND: return "zero_extend";
case ISD::ANY_EXTEND: return "any_extend";		case ISD::ANY_EXTEND: return "any_extend";
case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";		case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";
case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";		case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";

llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp

Show First 20 Lines • Show All 5,689 Lines • ▼ Show 20 Lines	if (Opcode == ISD::UADDSAT) {
Result = DAG.getSelect(dl, VT, SumNeg, SatMax, SatMin);		Result = DAG.getSelect(dl, VT, SumNeg, SatMax, SatMin);
return DAG.getSelect(dl, VT, Overflow, Result, SumDiff);		return DAG.getSelect(dl, VT, Overflow, Result, SumDiff);
}		}
}		}

SDValue		SDValue
TargetLowering::expandFixedPointMul(SDNode *Node, SelectionDAG &DAG) const {		TargetLowering::expandFixedPointMul(SDNode *Node, SelectionDAG &DAG) const {
assert((Node->getOpcode() == ISD::SMULFIX ||		assert((Node->getOpcode() == ISD::SMULFIX ||
Node->getOpcode() == ISD::UMULFIX) &&		Node->getOpcode() == ISD::UMULFIX ||
"Expected opcode to be SMULFIX or UMULFIX.");		Node->getOpcode() == ISD::SMULFIXSAT) &&
		"Expected a fixed point multiplication opcode");

SDLoc dl(Node);		SDLoc dl(Node);
SDValue LHS = Node->getOperand(0);		SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);		SDValue RHS = Node->getOperand(1);
EVT VT = LHS.getValueType();		EVT VT = LHS.getValueType();
unsigned Scale = Node->getConstantOperandVal(2);		unsigned Scale = Node->getConstantOperandVal(2);
		bool Saturating = Node->getOpcode() == ISD::SMULFIXSAT;
		EVT BoolVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
		unsigned VTSize = VT.getScalarSizeInBits();

// [us]mul.fix(a, b, 0) -> mul(a, b)
if (!Scale) {		if (!Scale) {
if (VT.isVector() && !isOperationLegalOrCustom(ISD::MUL, VT))		// [us]mul.fix(a, b, 0) -> mul(a, b)
return SDValue();		if (!Saturating && isOperationLegalOrCustom(ISD::MUL, VT)) {
return DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);		return DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);
}		} else if (Saturating && isOperationLegalOrCustom(ISD::SMULO, VT)) {
		SDValue Result =
		DAG.getNode(ISD::SMULO, dl, DAG.getVTList(VT, BoolVT), LHS, RHS);
		SDValue Product = Result.getValue(0);
		SDValue Overflow = Result.getValue(1);
		SDValue Zero = DAG.getConstant(0, dl, VT);

unsigned VTSize = VT.getScalarSizeInBits();		APInt MinVal = APInt::getSignedMinValue(VTSize);
bool Signed = Node->getOpcode() == ISD::SMULFIX;		APInt MaxVal = APInt::getSignedMaxValue(VTSize);
		SDValue SatMin = DAG.getConstant(MinVal, dl, VT);
		SDValue SatMax = DAG.getConstant(MaxVal, dl, VT);
		SDValue ProdNeg = DAG.getSetCC(dl, BoolVT, Product, Zero, ISD::SETLT);
		Result = DAG.getSelect(dl, VT, ProdNeg, SatMax, SatMin);
		return DAG.getSelect(dl, VT, Overflow, Result, Product);
		}
		}

		bool Signed =
		Node->getOpcode() == ISD::SMULFIX || Node->getOpcode() == ISD::SMULFIXSAT;
assert(((Signed && Scale < VTSize) || (!Signed && Scale <= VTSize)) &&		assert(((Signed && Scale < VTSize) || (!Signed && Scale <= VTSize)) &&
"Expected scale to be less than the number of bits if signed or at "		"Expected scale to be less than the number of bits if signed or at "
"most the number of bits if unsigned.");		"most the number of bits if unsigned.");
assert(LHS.getValueType() == RHS.getValueType() &&		assert(LHS.getValueType() == RHS.getValueType() &&
"Expected both operands to be the same type");		"Expected both operands to be the same type");

// Get the upper and lower bits of the result.		// Get the upper and lower bits of the result.
SDValue Lo, Hi;		SDValue Lo, Hi;
Show All 16 Lines	if (Scale == VTSize)
// Result is just the top half since we'd be shifting by the width of the		// Result is just the top half since we'd be shifting by the width of the
// operand.		// operand.
return Hi;		return Hi;

// The result will need to be shifted right by the scale since both operands		// The result will need to be shifted right by the scale since both operands
// are scaled. The result is given to us in 2 halves, so we only want part of		// are scaled. The result is given to us in 2 halves, so we only want part of
// both in the result.		// both in the result.
EVT ShiftTy = getShiftAmountTy(VT, DAG.getDataLayout());		EVT ShiftTy = getShiftAmountTy(VT, DAG.getDataLayout());
return DAG.getNode(ISD::FSHR, dl, VT, Hi, Lo,		SDValue Result = DAG.getNode(ISD::FSHR, dl, VT, Hi, Lo,
DAG.getConstant(Scale, dl, ShiftTy));		DAG.getConstant(Scale, dl, ShiftTy));
		if (!Saturating)
		return Result;

		unsigned OverflowBits = VTSize - Scale + 1; // +1 for the sign
		SDValue HiMask =
		DAG.getConstant(APInt::getHighBitsSet(VTSize, OverflowBits), dl, VT);
		SDValue LoMask = DAG.getConstant(
		APInt::getLowBitsSet(VTSize, VTSize - OverflowBits), dl, VT);
		APInt MaxVal = APInt::getSignedMaxValue(VTSize);
		APInt MinVal = APInt::getSignedMinValue(VTSize);

		Result = DAG.getSelectCC(dl, Hi, LoMask,
		DAG.getConstant(MaxVal, dl, VT), Result,
		ISD::SETGT);
		return DAG.getSelectCC(dl, Hi, HiMask,
		DAG.getConstant(MinVal, dl, VT), Result,
		ISD::SETLT);
}		}

void TargetLowering::expandUADDSUBO(		void TargetLowering::expandUADDSUBO(
SDNode *Node, SDValue &Result, SDValue &Overflow, SelectionDAG &DAG) const {		SDNode *Node, SDValue &Result, SDValue &Overflow, SelectionDAG &DAG) const {
SDLoc dl(Node);		SDLoc dl(Node);
SDValue LHS = Node->getOperand(0);		SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);		SDValue RHS = Node->getOperand(1);
bool IsAdd = Node->getOpcode() == ISD::UADDO;		bool IsAdd = Node->getOpcode() == ISD::UADDO;
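The saturating path added to the fixed-point multiply expansion above can be modelled outside of SelectionDAG. The following is a minimal sketch for a 32-bit type, not LLVM code: the function names `smulFixSatRef` and `smulFixSatHiLo` are hypothetical. The first function states the intended semantics (widen, multiply, shift, clamp); the second follows the shape of the expansion, comparing the high half of the product against the `LoMask` and `HiMask` thresholds. It assumes that `>>` on a negative signed value is an arithmetic shift and that unsigned-to-signed conversion wraps, both of which are guaranteed only since C++20.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Reference model (illustrative only): widen, multiply, shift right by the
// scale, then clamp to the range of the 32-bit result type.
int32_t smulFixSatRef(int32_t LHS, int32_t RHS, unsigned Scale) {
  assert(Scale < 32 && "signed scale must be less than the bit width");
  int64_t Product = static_cast<int64_t>(LHS) * static_cast<int64_t>(RHS);
  // Assumption: arithmetic shift for negative values.
  int64_t Shifted = Product >> Scale;
  if (Shifted > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (Shifted < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(Shifted);
}

// Hi/Lo model that follows the shape of the expansion for 0 < Scale < 32.
// Scale == 0 is handled by a separate path in the patch and is excluded here.
int32_t smulFixSatHiLo(int32_t LHS, int32_t RHS, unsigned Scale) {
  assert(Scale > 0 && Scale < 32 && "Scale == 0 takes a separate path");
  int64_t Product = static_cast<int64_t>(LHS) * static_cast<int64_t>(RHS);
  uint32_t Lo = static_cast<uint32_t>(Product);      // low half of the product
  int32_t Hi = static_cast<int32_t>(Product >> 32);  // high half, signed

  // Funnel shift right: the low Scale bits of Hi become the top bits.
  uint32_t Bits = (Lo >> Scale) | (static_cast<uint32_t>(Hi) << (32 - Scale));

  // OverflowBits = 32 - Scale + 1, so LoMask has the low (Scale - 1) bits set
  // and HiMask has every bit above those set.
  int32_t LoMask = (int32_t(1) << (Scale - 1)) - 1;
  int32_t HiMask = -(int32_t(1) << (Scale - 1));

  if (Hi > LoMask)  // product too large for the result type
    return std::numeric_limits<int32_t>::max();
  if (Hi < HiMask)  // product too small for the result type
    return std::numeric_limits<int32_t>::min();
  // Assumption: conversion wraps modulo 2^32.
  return static_cast<int32_t>(Bits);
}
```

With a scale of 2, `smulFixSatRef(6, 8, 2)` multiplies 1.5 by 2.0 and yields 12, which is 3.0 in the same format. Both functions should agree for every input with a non-zero scale.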

llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp

for (MVT VT : MVT::all_valuetypes()) {		for (MVT VT : MVT::all_valuetypes()) {
setOperationAction(ISD::ABS, VT, Expand);		setOperationAction(ISD::ABS, VT, Expand);
setOperationAction(ISD::FSHL, VT, Expand);		setOperationAction(ISD::FSHL, VT, Expand);
setOperationAction(ISD::FSHR, VT, Expand);		setOperationAction(ISD::FSHR, VT, Expand);
setOperationAction(ISD::SADDSAT, VT, Expand);		setOperationAction(ISD::SADDSAT, VT, Expand);
setOperationAction(ISD::UADDSAT, VT, Expand);		setOperationAction(ISD::UADDSAT, VT, Expand);
setOperationAction(ISD::SSUBSAT, VT, Expand);		setOperationAction(ISD::SSUBSAT, VT, Expand);
setOperationAction(ISD::USUBSAT, VT, Expand);		setOperationAction(ISD::USUBSAT, VT, Expand);
setOperationAction(ISD::SMULFIX, VT, Expand);		setOperationAction(ISD::SMULFIX, VT, Expand);
		setOperationAction(ISD::SMULFIXSAT, VT, Expand);
setOperationAction(ISD::UMULFIX, VT, Expand);		setOperationAction(ISD::UMULFIX, VT, Expand);

// Overflow operations default to expand		// Overflow operations default to expand
setOperationAction(ISD::SADDO, VT, Expand);		setOperationAction(ISD::SADDO, VT, Expand);
setOperationAction(ISD::SSUBO, VT, Expand);		setOperationAction(ISD::SSUBO, VT, Expand);
setOperationAction(ISD::UADDO, VT, Expand);		setOperationAction(ISD::UADDO, VT, Expand);
setOperationAction(ISD::USUBO, VT, Expand);		setOperationAction(ISD::USUBO, VT, Expand);
setOperationAction(ISD::SMULO, VT, Expand);		setOperationAction(ISD::SMULO, VT, Expand);

llvm/trunk/lib/IR/Verifier.cpp

Assert(Op1->getType()->isIntOrIntVectorTy(),		Assert(Op1->getType()->isIntOrIntVectorTy(),
"first operand of [us][add|sub]_sat must be an int type or vector "		"first operand of [us][add|sub]_sat must be an int type or vector "
"of ints");		"of ints");
Assert(Op2->getType()->isIntOrIntVectorTy(),		Assert(Op2->getType()->isIntOrIntVectorTy(),
"second operand of [us][add|sub]_sat must be an int type or vector "		"second operand of [us][add|sub]_sat must be an int type or vector "
"of ints");		"of ints");
break;		break;
}		}
case Intrinsic::smul_fix:		case Intrinsic::smul_fix:
		case Intrinsic::smul_fix_sat:
case Intrinsic::umul_fix: {		case Intrinsic::umul_fix: {
Value *Op1 = Call.getArgOperand(0);		Value *Op1 = Call.getArgOperand(0);
Value *Op2 = Call.getArgOperand(1);		Value *Op2 = Call.getArgOperand(1);
Assert(Op1->getType()->isIntOrIntVectorTy(),		Assert(Op1->getType()->isIntOrIntVectorTy(),
"first operand of [us]mul_fix must be an int type or vector "		"first operand of [us]mul_fix[_sat] must be an int type or vector "
"of ints");		"of ints");
Assert(Op2->getType()->isIntOrIntVectorTy(),		Assert(Op2->getType()->isIntOrIntVectorTy(),
"second operand of [us]mul_fix must be an int type or vector "		"second operand of [us]mul_fix[_sat] must be an int type or vector "
"of ints");		"of ints");

auto *Op3 = cast<ConstantInt>(Call.getArgOperand(2));		auto *Op3 = cast<ConstantInt>(Call.getArgOperand(2));
Assert(Op3->getType()->getBitWidth() <= 32,		Assert(Op3->getType()->getBitWidth() <= 32,
"third argument of [us]mul_fix must fit within 32 bits");		"third argument of [us]mul_fix[_sat] must fit within 32 bits");

if (ID == Intrinsic::smul_fix) {		if (ID == Intrinsic::smul_fix || ID == Intrinsic::smul_fix_sat) {
Assert(		Assert(
Op3->getZExtValue() < Op1->getType()->getScalarSizeInBits(),		Op3->getZExtValue() < Op1->getType()->getScalarSizeInBits(),
"the scale of smul_fix must be less than the width of the operands");		"the scale of smul_fix[_sat] must be less than the width of the operands");
} else {		} else {
Assert(Op3->getZExtValue() <= Op1->getType()->getScalarSizeInBits(),		Assert(Op3->getZExtValue() <= Op1->getType()->getScalarSizeInBits(),
"the scale of umul_fix must be less than or equal to the width of "		"the scale of umul_fix[_sat] must be less than or equal to the width of "
"the operands");		"the operands");
}		}
break;		break;
}		}
case Intrinsic::lround:		case Intrinsic::lround:
case Intrinsic::llround: {		case Intrinsic::llround: {
Type *ValTy = Call.getArgOperand(0)->getType();		Type *ValTy = Call.getArgOperand(0)->getType();
Type *ResultTy = Call.getType();		Type *ResultTy = Call.getType();
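The scale rule enforced by the verifier hunk above is small enough to state as a standalone predicate. This is a sketch only: `isValidMulFixScale` is a hypothetical helper, not part of the patch or of LLVM. It restates the two assertions: a signed multiply needs a scale strictly less than the operand width (one bit is the sign), while an unsigned multiply allows a scale equal to the width.

```cpp
#include <cstdint>

// Hypothetical helper restating the verifier rule for [us]mul_fix[_sat]:
// signed scales must be < BitWidth, unsigned scales must be <= BitWidth.
bool isValidMulFixScale(bool IsSigned, uint64_t Scale, unsigned BitWidth) {
  return IsSigned ? Scale < BitWidth : Scale <= BitWidth;
}
```

For example, a scale of 32 on `i32` operands would be rejected for `smul_fix_sat` but accepted for `umul_fix`.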

llvm/trunk/test/CodeGen/X86/smul_fix_sat.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=X64
				; RUN: llc < %s -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefix=X86

				declare i4 @llvm.smul.fix.sat.i4 (i4, i4, i32)
				declare i32 @llvm.smul.fix.sat.i32 (i32, i32, i32)
				declare i64 @llvm.smul.fix.sat.i64 (i64, i64, i32)
				declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32>, <4 x i32>, i32)

				define i32 @func(i32 %x, i32 %y) nounwind {
				; X64-LABEL: func:
				; X64: # %bb.0:
				; X64-NEXT: movslq %esi, %rax
				; X64-NEXT: movslq %edi, %rcx
				; X64-NEXT: imulq %rax, %rcx
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: shrq $32, %rax
				; X64-NEXT: shrdl $2, %eax, %ecx
				; X64-NEXT: cmpl $1, %eax
				; X64-NEXT: movl $2147483647, %edx # imm = 0x7FFFFFFF
				; X64-NEXT: cmovlel %ecx, %edx
				; X64-NEXT: cmpl $-2, %eax
				; X64-NEXT: movl $-2147483648, %eax # imm = 0x80000000
				; X64-NEXT: cmovgel %edx, %eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func:
				; X86: # %bb.0:
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: shrdl $2, %edx, %eax
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $2147483647, %ecx # imm = 0x7FFFFFFF
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: cmpl $-2, %edx
				; X86-NEXT: movl $-2147483648, %ecx # imm = 0x80000000
				; X86-NEXT: cmovll %ecx, %eax
				; X86-NEXT: retl
				%tmp = call i32 @llvm.smul.fix.sat.i32(i32 %x, i32 %y, i32 2);
				ret i32 %tmp;
				}

				define i64 @func2(i64 %x, i64 %y) nounwind {
				; X64-LABEL: func2:
				; X64: # %bb.0:
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: imulq %rsi
				; X64-NEXT: shrdq $2, %rdx, %rax
				; X64-NEXT: cmpq $1, %rdx
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: cmpq $-2, %rdx
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				;
				; X86-LABEL: func2:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: subl $8, %esp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull %esi
				; X86-NEXT: movl %edx, %edi
				; X86-NEXT: movl %eax, %ebx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X86-NEXT: movl %edx, %ebp
				; X86-NEXT: addl %ebx, %ebp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-NEXT: adcl $0, %edi
				; X86-NEXT: movl %ebx, %eax
				; X86-NEXT: imull %esi
				; X86-NEXT: movl %edx, %ecx
				; X86-NEXT: movl %eax, %esi
				; X86-NEXT: movl %ebx, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: addl %ebp, %eax
				; X86-NEXT: adcl %edi, %edx
				; X86-NEXT: adcl $0, %ecx
				; X86-NEXT: addl %esi, %edx
				; X86-NEXT: adcl $0, %ecx
				; X86-NEXT: movl %edx, %esi
				; X86-NEXT: subl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %ecx, %edi
				; X86-NEXT: sbbl $0, %edi
				; X86-NEXT: testl %ebx, %ebx
				; X86-NEXT: cmovnsl %ecx, %edi
				; X86-NEXT: cmovnsl %edx, %esi
				; X86-NEXT: movl %esi, %ecx
				; X86-NEXT: subl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl %edi, %ebp
				; X86-NEXT: sbbl $0, %ebp
				; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)
				; X86-NEXT: cmovnsl %edi, %ebp
				; X86-NEXT: cmovnsl %esi, %ecx
				; X86-NEXT: testl %ebp, %ebp
				; X86-NEXT: setg %bh
				; X86-NEXT: sete {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Folded Spill
				; X86-NEXT: cmpl $1, %ecx
				; X86-NEXT: seta %bl
				; X86-NEXT: movl %ecx, %edx
				; X86-NEXT: shldl $30, %eax, %edx
				; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %esi # 4-byte Reload
				; X86-NEXT: shldl $30, %esi, %eax
				; X86-NEXT: andb {{[-0-9]+}}(%e{{[sb]}}p), %bl # 1-byte Folded Reload
				; X86-NEXT: orb %bh, %bl
				; X86-NEXT: testb %bl, %bl
				; X86-NEXT: movl $2147483647, %esi # imm = 0x7FFFFFFF
				; X86-NEXT: cmovnel %esi, %edx
				; X86-NEXT: movl $-1, %esi
				; X86-NEXT: cmovnel %esi, %eax
				; X86-NEXT: cmpl $-1, %ebp
				; X86-NEXT: setl %bl
				; X86-NEXT: sete %bh
				; X86-NEXT: cmpl $-2, %ecx
				; X86-NEXT: setb %cl
				; X86-NEXT: andb %bh, %cl
				; X86-NEXT: xorl %esi, %esi
				; X86-NEXT: orb %bl, %cl
				; X86-NEXT: cmovnel %esi, %eax
				; X86-NEXT: movl $-2147483648, %ecx # imm = 0x80000000
				; X86-NEXT: cmovnel %ecx, %edx
				; X86-NEXT: addl $8, %esp
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 %x, i64 %y, i32 2);
				ret i64 %tmp;
				}

				define i4 @func3(i4 %x, i4 %y) nounwind {
				; X64-LABEL: func3:
				; X64: # %bb.0:
				; X64-NEXT: shlb $4, %sil
				; X64-NEXT: sarb $4, %sil
				; X64-NEXT: shlb $4, %dil
				; X64-NEXT: movsbl %dil, %eax
				; X64-NEXT: movsbl %sil, %ecx
				; X64-NEXT: imull %eax, %ecx
				; X64-NEXT: movl %ecx, %eax
				; X64-NEXT: shrb $2, %al
				; X64-NEXT: shrl $8, %ecx
				; X64-NEXT: movl %ecx, %edx
				; X64-NEXT: shlb $6, %dl
				; X64-NEXT: orb %al, %dl
				; X64-NEXT: movzbl %dl, %eax
				; X64-NEXT: cmpb $1, %cl
				; X64-NEXT: movl $127, %edx
				; X64-NEXT: cmovlel %eax, %edx
				; X64-NEXT: cmpb $-2, %cl
				; X64-NEXT: movl $128, %eax
				; X64-NEXT: cmovgel %edx, %eax
				; X64-NEXT: sarb $4, %al
				; X64-NEXT: # kill: def $al killed $al killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func3:
				; X86: # %bb.0:
				; X86-NEXT: movb {{[0-9]+}}(%esp), %al
				; X86-NEXT: shlb $4, %al
				; X86-NEXT: sarb $4, %al
				; X86-NEXT: movb {{[0-9]+}}(%esp), %cl
				; X86-NEXT: shlb $4, %cl
				; X86-NEXT: movsbl %cl, %ecx
				; X86-NEXT: movsbl %al, %eax
				; X86-NEXT: imull %ecx, %eax
				; X86-NEXT: movb %ah, %cl
				; X86-NEXT: shlb $6, %cl
				; X86-NEXT: shrb $2, %al
				; X86-NEXT: orb %cl, %al
				; X86-NEXT: movzbl %al, %ecx
				; X86-NEXT: cmpb $1, %ah
				; X86-NEXT: movl $127, %edx
				; X86-NEXT: cmovlel %ecx, %edx
				; X86-NEXT: cmpb $-2, %ah
				; X86-NEXT: movl $128, %eax
				; X86-NEXT: cmovgel %edx, %eax
				; X86-NEXT: sarb $4, %al
				; X86-NEXT: # kill: def $al killed $al killed $eax
				; X86-NEXT: retl
				%tmp = call i4 @llvm.smul.fix.sat.i4(i4 %x, i4 %y, i32 2);
				ret i4 %tmp;
				}

				define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
				; X64-LABEL: vec:
				; X64: # %bb.0:
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm1[3,1,2,3]
				; X64-NEXT: movd %xmm2, %eax
				; X64-NEXT: cltq
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm0[3,1,2,3]
				; X64-NEXT: movd %xmm2, %ecx
				; X64-NEXT: movslq %ecx, %rdx
				; X64-NEXT: imulq %rax, %rdx
				; X64-NEXT: movq %rdx, %rcx
				; X64-NEXT: shrq $32, %rcx
				; X64-NEXT: shrdl $2, %ecx, %edx
				; X64-NEXT: cmpl $1, %ecx
				; X64-NEXT: movl $2147483647, %eax # imm = 0x7FFFFFFF
				; X64-NEXT: cmovgl %eax, %edx
				; X64-NEXT: cmpl $-2, %ecx
				; X64-NEXT: movl $-2147483648, %ecx # imm = 0x80000000
				; X64-NEXT: cmovll %ecx, %edx
				; X64-NEXT: movd %edx, %xmm2
				; X64-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
				; X64-NEXT: movd %xmm3, %edx
				; X64-NEXT: movslq %edx, %rdx
				; X64-NEXT: pshufd {{.*#+}} xmm3 = xmm0[2,3,0,1]
				; X64-NEXT: movd %xmm3, %esi
				; X64-NEXT: movslq %esi, %rsi
				; X64-NEXT: imulq %rdx, %rsi
				; X64-NEXT: movq %rsi, %rdx
				; X64-NEXT: shrq $32, %rdx
				; X64-NEXT: shrdl $2, %edx, %esi
				; X64-NEXT: cmpl $1, %edx
				; X64-NEXT: cmovgl %eax, %esi
				; X64-NEXT: cmpl $-2, %edx
				; X64-NEXT: cmovll %ecx, %esi
				; X64-NEXT: movd %esi, %xmm3
				; X64-NEXT: punpckldq {{.*#+}} xmm3 = xmm3[0],xmm2[0],xmm3[1],xmm2[1]
				; X64-NEXT: movd %xmm1, %edx
				; X64-NEXT: movslq %edx, %rdx
				; X64-NEXT: movd %xmm0, %esi
				; X64-NEXT: movslq %esi, %rsi
				; X64-NEXT: imulq %rdx, %rsi
				; X64-NEXT: movq %rsi, %rdx
				; X64-NEXT: shrq $32, %rdx
				; X64-NEXT: shrdl $2, %edx, %esi
				; X64-NEXT: cmpl $1, %edx
				; X64-NEXT: cmovgl %eax, %esi
				; X64-NEXT: cmpl $-2, %edx
				; X64-NEXT: cmovll %ecx, %esi
				; X64-NEXT: movd %esi, %xmm2
				; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,2,3]
				; X64-NEXT: movd %xmm1, %edx
				; X64-NEXT: movslq %edx, %rdx
				; X64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,2,3]
				; X64-NEXT: movd %xmm0, %esi
				; X64-NEXT: movslq %esi, %rsi
				; X64-NEXT: imulq %rdx, %rsi
				; X64-NEXT: movq %rsi, %rdx
				; X64-NEXT: shrq $32, %rdx
				; X64-NEXT: shrdl $2, %edx, %esi
				; X64-NEXT: cmpl $1, %edx
				; X64-NEXT: cmovgl %eax, %esi
				; X64-NEXT: cmpl $-2, %edx
				; X64-NEXT: cmovll %ecx, %esi
				; X64-NEXT: movd %esi, %xmm0
				; X64-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm0[0],xmm2[1],xmm0[1]
				; X64-NEXT: punpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0]
				; X64-NEXT: movdqa %xmm2, %xmm0
				; X64-NEXT: retq
				;
				; X86-LABEL: vec:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %eax, %ecx
				; X86-NEXT: shrdl $2, %edx, %ecx
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: movl $2147483647, %ebp # imm = 0x7FFFFFFF
				; X86-NEXT: cmovgl %ebp, %ecx
				; X86-NEXT: cmpl $-2, %edx
				; X86-NEXT: movl $-2147483648, %esi # imm = 0x80000000
				; X86-NEXT: cmovll %esi, %ecx
				; X86-NEXT: movl %edi, %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %eax, %edi
				; X86-NEXT: shrdl $2, %edx, %edi
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: cmovgl %ebp, %edi
				; X86-NEXT: cmpl $-2, %edx
				; X86-NEXT: cmovll %esi, %edi
				; X86-NEXT: movl %ebx, %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %eax, %ebx
				; X86-NEXT: shrdl $2, %edx, %ebx
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: cmovgl %ebp, %ebx
				; X86-NEXT: cmpl $-2, %edx
				; X86-NEXT: cmovll %esi, %ebx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: shrdl $2, %edx, %eax
				; X86-NEXT: cmpl $1, %edx
				; X86-NEXT: cmovgl %ebp, %eax
				; X86-NEXT: cmpl $-2, %edx
				; X86-NEXT: cmovll %esi, %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl %eax, 12(%edx)
				; X86-NEXT: movl %ebx, 8(%edx)
				; X86-NEXT: movl %edi, 4(%edx)
				; X86-NEXT: movl %ecx, (%edx)
				; X86-NEXT: movl %edx, %eax
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl $4
				%tmp = call <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %x, <4 x i32> %y, i32 2);
				ret <4 x i32> %tmp;
				}

				; These result in regular integer multiplication
				define i32 @func4(i32 %x, i32 %y) nounwind {
				; X64-LABEL: func4:
				; X64: # %bb.0:
				; X64-NEXT: movl %edi, %ecx
				; X64-NEXT: imull %esi, %ecx
				; X64-NEXT: xorl %eax, %eax
				; X64-NEXT: testl %ecx, %ecx
				; X64-NEXT: setns %al
				; X64-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
				; X64-NEXT: imull %esi, %edi
				; X64-NEXT: cmovnol %edi, %eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func4:
				; X86: # %bb.0:
				; X86-NEXT: pushl %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl %eax, %esi
				; X86-NEXT: imull %edx, %esi
				; X86-NEXT: xorl %ecx, %ecx
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: setns %cl
				; X86-NEXT: addl $2147483647, %ecx # imm = 0x7FFFFFFF
				; X86-NEXT: imull %edx, %eax
				; X86-NEXT: cmovol %ecx, %eax
				; X86-NEXT: popl %esi
				; X86-NEXT: retl
				%tmp = call i32 @llvm.smul.fix.sat.i32(i32 %x, i32 %y, i32 0);
				ret i32 %tmp;
				}

				define i64 @func5(i64 %x, i64 %y) {
				; X64-LABEL: func5:
				; X64: # %bb.0:
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: imulq %rsi, %rax
				; X64-NEXT: xorl %ecx, %ecx
				; X64-NEXT: testq %rax, %rax
				; X64-NEXT: setns %cl
				; X64-NEXT: movabsq $9223372036854775807, %rax # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: addq %rcx, %rax
				; X64-NEXT: imulq %rsi, %rdi
				; X64-NEXT: cmovnoq %rdi, %rax
				; X64-NEXT: retq
				;
				; X86-LABEL: func5:
				; X86: # %bb.0:
				; X86-NEXT: pushl %edi
				; X86-NEXT: .cfi_def_cfa_offset 8
				; X86-NEXT: pushl %esi
				; X86-NEXT: .cfi_def_cfa_offset 12
				; X86-NEXT: pushl %eax
				; X86-NEXT: .cfi_def_cfa_offset 16
				; X86-NEXT: .cfi_offset %esi, -12
				; X86-NEXT: .cfi_offset %edi, -8
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl $0, (%esp)
				; X86-NEXT: movl %esp, %edi
				; X86-NEXT: pushl %edi
				; X86-NEXT: .cfi_adjust_cfa_offset 4
				; X86-NEXT: pushl %esi
				; X86-NEXT: .cfi_adjust_cfa_offset 4
				; X86-NEXT: pushl %edx
				; X86-NEXT: .cfi_adjust_cfa_offset 4
				; X86-NEXT: pushl %ecx
				; X86-NEXT: .cfi_adjust_cfa_offset 4
				; X86-NEXT: pushl %eax
				; X86-NEXT: .cfi_adjust_cfa_offset 4
				; X86-NEXT: calll __mulodi4
				; X86-NEXT: addl $20, %esp
				; X86-NEXT: .cfi_adjust_cfa_offset -20
				; X86-NEXT: xorl %ecx, %ecx
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: setns %cl
				; X86-NEXT: addl $2147483647, %ecx # imm = 0x7FFFFFFF
				; X86-NEXT: movl %edx, %esi
				; X86-NEXT: sarl $31, %esi
				; X86-NEXT: cmpl $0, (%esp)
				; X86-NEXT: cmovnel %esi, %eax
				; X86-NEXT: cmovnel %ecx, %edx
				; X86-NEXT: addl $4, %esp
				; X86-NEXT: .cfi_def_cfa_offset 12
				; X86-NEXT: popl %esi
				; X86-NEXT: .cfi_def_cfa_offset 8
				; X86-NEXT: popl %edi
				; X86-NEXT: .cfi_def_cfa_offset 4
				; X86-NEXT: retl
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 %x, i64 %y, i32 0);
				ret i64 %tmp;
				}

				define i4 @func6(i4 %x, i4 %y) nounwind {
				; X64-LABEL: func6:
				; X64: # %bb.0:
				; X64-NEXT: movl %edi, %eax
				; X64-NEXT: shlb $4, %sil
				; X64-NEXT: sarb $4, %sil
				; X64-NEXT: shlb $4, %al
				; X64-NEXT: # kill: def $al killed $al killed $eax
				; X64-NEXT: imulb %sil
				; X64-NEXT: seto %cl
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: testb %al, %al
				; X64-NEXT: setns %dl
				; X64-NEXT: addl $127, %edx
				; X64-NEXT: movzbl %al, %eax
				; X64-NEXT: testb %cl, %cl
				; X64-NEXT: cmovnel %edx, %eax
				; X64-NEXT: sarb $4, %al
				; X64-NEXT: # kill: def $al killed $al killed $eax
				; X64-NEXT: retq
				;
				; X86-LABEL: func6:
				; X86: # %bb.0:
				; X86-NEXT: movb {{[0-9]+}}(%esp), %cl
				; X86-NEXT: shlb $4, %cl
				; X86-NEXT: sarb $4, %cl
				; X86-NEXT: movb {{[0-9]+}}(%esp), %al
				; X86-NEXT: shlb $4, %al
				; X86-NEXT: imulb %cl
				; X86-NEXT: seto %dl
				; X86-NEXT: xorl %ecx, %ecx
				; X86-NEXT: testb %al, %al
				; X86-NEXT: setns %cl
				; X86-NEXT: addl $127, %ecx
				; X86-NEXT: movzbl %al, %eax
				; X86-NEXT: testb %dl, %dl
				; X86-NEXT: cmovnel %ecx, %eax
				; X86-NEXT: sarb $4, %al
				; X86-NEXT: # kill: def $al killed $al killed $eax
				; X86-NEXT: retl
				%tmp = call i4 @llvm.smul.fix.sat.i4(i4 %x, i4 %y, i32 0);
				ret i4 %tmp;
				}

				define <4 x i32> @vec2(<4 x i32> %x, <4 x i32> %y) nounwind {
				; X64-LABEL: vec2:
				; X64: # %bb.0:
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm1[1,1,2,3]
				; X64-NEXT: movd %xmm2, %ecx
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm0[1,1,2,3]
				; X64-NEXT: movd %xmm2, %r8d
				; X64-NEXT: movl %r8d, %edx
				; X64-NEXT: imull %ecx, %edx
				; X64-NEXT: xorl %esi, %esi
				; X64-NEXT: testl %edx, %edx
				; X64-NEXT: setns %sil
				; X64-NEXT: addl $2147483647, %esi # imm = 0x7FFFFFFF
				; X64-NEXT: imull %ecx, %r8d
				; X64-NEXT: cmovol %esi, %r8d
				; X64-NEXT: movd %xmm1, %edx
				; X64-NEXT: movd %xmm0, %ecx
				; X64-NEXT: movl %ecx, %esi
				; X64-NEXT: imull %edx, %esi
				; X64-NEXT: xorl %edi, %edi
				; X64-NEXT: testl %esi, %esi
				; X64-NEXT: setns %dil
				; X64-NEXT: addl $2147483647, %edi # imm = 0x7FFFFFFF
				; X64-NEXT: imull %edx, %ecx
				; X64-NEXT: cmovol %edi, %ecx
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm1[2,3,0,1]
				; X64-NEXT: movd %xmm2, %edx
				; X64-NEXT: pshufd {{.*#+}} xmm2 = xmm0[2,3,0,1]
				; X64-NEXT: movd %xmm2, %esi
				; X64-NEXT: movl %esi, %edi
				; X64-NEXT: imull %edx, %edi
				; X64-NEXT: xorl %eax, %eax
				; X64-NEXT: testl %edi, %edi
				; X64-NEXT: setns %al
				; X64-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
				; X64-NEXT: imull %edx, %esi
				; X64-NEXT: cmovol %eax, %esi
				; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[3,1,2,3]
				; X64-NEXT: movd %xmm1, %r9d
				; X64-NEXT: pshufd {{.*#+}} xmm0 = xmm0[3,1,2,3]
				; X64-NEXT: movd %xmm0, %edx
				; X64-NEXT: movl %edx, %edi
				; X64-NEXT: imull %r9d, %edi
				; X64-NEXT: xorl %eax, %eax
				; X64-NEXT: testl %edi, %edi
				; X64-NEXT: setns %al
				; X64-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
				; X64-NEXT: imull %r9d, %edx
				; X64-NEXT: cmovol %eax, %edx
				; X64-NEXT: movd %edx, %xmm0
				; X64-NEXT: movd %esi, %xmm1
				; X64-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
				; X64-NEXT: movd %ecx, %xmm0
				; X64-NEXT: movd %r8d, %xmm2
				; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
				; X64-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; X64-NEXT: retq
				;
				; X86-LABEL: vec2:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl %ecx, %esi
				; X86-NEXT: imull %edx, %esi
				; X86-NEXT: xorl %eax, %eax
				; X86-NEXT: testl %esi, %esi
				; X86-NEXT: setns %al
				; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
				; X86-NEXT: imull %edx, %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: cmovol %eax, %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %edx, %edi
				; X86-NEXT: imull %esi, %edi
				; X86-NEXT: xorl %eax, %eax
				; X86-NEXT: testl %edi, %edi
				; X86-NEXT: setns %al
				; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
				; X86-NEXT: imull %esi, %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: cmovol %eax, %edx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl %esi, %ebx
				; X86-NEXT: imull %edi, %ebx
				; X86-NEXT: xorl %eax, %eax
				; X86-NEXT: testl %ebx, %ebx
				; X86-NEXT: setns %al
				; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
				; X86-NEXT: imull %edi, %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: cmovol %eax, %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: movl %edi, %ebp
				; X86-NEXT: imull %eax, %ebp
				; X86-NEXT: xorl %ebx, %ebx
				; X86-NEXT: testl %ebp, %ebp
				; X86-NEXT: setns %bl
				; X86-NEXT: addl $2147483647, %ebx # imm = 0x7FFFFFFF
				; X86-NEXT: imull %eax, %edi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X86-NEXT: cmovol %ebx, %edi
				; X86-NEXT: movl %ecx, 12(%eax)
				; X86-NEXT: movl %edx, 8(%eax)
				; X86-NEXT: movl %esi, 4(%eax)
				; X86-NEXT: movl %edi, (%eax)
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl $4
				%tmp = call <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %x, <4 x i32> %y, i32 0);
				ret <4 x i32> %tmp;
				}

				define i64 @func7(i64 %x, i64 %y) nounwind {
				; X64-LABEL: func7:
				; X64: # %bb.0:
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: imulq %rsi
				; X64-NEXT: shrdq $32, %rdx, %rax
				; X64-NEXT: cmpq $2147483647, %rdx # imm = 0x7FFFFFFF
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: cmpq $-2147483648, %rdx # imm = 0x80000000
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				;
				; X86-LABEL: func7:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %edx, %edi
				; X86-NEXT: movl %eax, %ebx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: addl %edx, %ebx
				; X86-NEXT: adcl $0, %edi
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %edx, %ebp
				; X86-NEXT: movl %eax, %ecx
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: addl %ebx, %eax
				; X86-NEXT: adcl %edi, %edx
				; X86-NEXT: adcl $0, %ebp
				; X86-NEXT: addl %ecx, %edx
				; X86-NEXT: adcl $0, %ebp
				; X86-NEXT: movl %edx, %ecx
				; X86-NEXT: subl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl %ebp, %esi
				; X86-NEXT: sbbl $0, %esi
				; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)
				; X86-NEXT: cmovnsl %ebp, %esi
				; X86-NEXT: cmovnsl %edx, %ecx
				; X86-NEXT: movl %ecx, %edx
				; X86-NEXT: subl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: movl %esi, %edi
				; X86-NEXT: sbbl $0, %edi
				; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)
				; X86-NEXT: cmovnsl %esi, %edi
				; X86-NEXT: cmovnsl %ecx, %edx
				; X86-NEXT: testl %edx, %edx
				; X86-NEXT: setg %cl
				; X86-NEXT: sets %ch
				; X86-NEXT: testl %edi, %edi
				; X86-NEXT: setg %bl
				; X86-NEXT: sete %bh
				; X86-NEXT: andb %ch, %bh
				; X86-NEXT: orb %bl, %bh
				; X86-NEXT: movl $2147483647, %esi # imm = 0x7FFFFFFF
				; X86-NEXT: cmovnel %esi, %edx
				; X86-NEXT: movl $-1, %esi
				; X86-NEXT: cmovnel %esi, %eax
				; X86-NEXT: cmpl $-1, %edi
				; X86-NEXT: setl %ch
				; X86-NEXT: sete %bl
				; X86-NEXT: andb %cl, %bl
				; X86-NEXT: xorl %esi, %esi
				; X86-NEXT: orb %ch, %bl
				; X86-NEXT: cmovnel %esi, %eax
				; X86-NEXT: movl $-2147483648, %ecx # imm = 0x80000000
				; X86-NEXT: cmovnel %ecx, %edx
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 %x, i64 %y, i32 32);
				ret i64 %tmp;
				}

				define i64 @func8(i64 %x, i64 %y) nounwind {
				; X64-LABEL: func8:
				; X64: # %bb.0:
				; X64-NEXT: movq %rdi, %rax
				; X64-NEXT: imulq %rsi
				; X64-NEXT: shrdq $63, %rdx, %rax
				; X64-NEXT: movabsq $4611686018427387903, %rcx # imm = 0x3FFFFFFFFFFFFFFF
				; X64-NEXT: cmpq %rcx, %rdx
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: movabsq $-4611686018427387904, %rcx # imm = 0xC000000000000000
				; X64-NEXT: cmpq %rcx, %rdx
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				;
				; X86-LABEL: func8:
				; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: pushl %ebx
				; X86-NEXT: pushl %edi
				; X86-NEXT: pushl %esi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %edx, %edi
				; X86-NEXT: movl %eax, %ebx
				; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %edx, %ebp
				; X86-NEXT: addl %ebx, %ebp
				; X86-NEXT: adcl $0, %edi
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: imull {{[0-9]+}}(%esp)
				; X86-NEXT: movl %edx, %ebx
				; X86-NEXT: movl %eax, %ecx
				; X86-NEXT: movl %esi, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
				; X86-NEXT: addl %ebp, %eax
				; X86-NEXT: adcl %edi, %edx
				; X86-NEXT: adcl $0, %ebx
				; X86-NEXT: addl %ecx, %edx
				; X86-NEXT: adcl $0, %ebx
				; X86-NEXT: movl %edx, %ecx
				; X86-NEXT: subl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl %ebx, %esi
				; X86-NEXT: sbbl $0, %esi
				; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)
				; X86-NEXT: cmovnsl %ebx, %esi
				; X86-NEXT: cmovnsl %edx, %ecx
				; X86-NEXT: movl %ecx, %edi
				; X86-NEXT: subl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl %esi, %ebx
				; X86-NEXT: sbbl $0, %ebx
				; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)
				; X86-NEXT: cmovnsl %esi, %ebx
				; X86-NEXT: cmovnsl %ecx, %edi
				; X86-NEXT: movl %ebx, %edx
				; X86-NEXT: shldl $1, %edi, %edx
				; X86-NEXT: shrdl $31, %edi, %eax
				; X86-NEXT: cmpl $1073741823, %ebx # imm = 0x3FFFFFFF
				; X86-NEXT: movl $2147483647, %ecx # imm = 0x7FFFFFFF
				; X86-NEXT: cmovgl %ecx, %edx
				; X86-NEXT: movl $-1, %ecx
				; X86-NEXT: cmovgl %ecx, %eax
				; X86-NEXT: xorl %ecx, %ecx
				; X86-NEXT: cmpl $-1073741824, %ebx # imm = 0xC0000000
				; X86-NEXT: cmovll %ecx, %eax
				; X86-NEXT: movl $-2147483648, %ecx # imm = 0x80000000
				; X86-NEXT: cmovll %ecx, %edx
				; X86-NEXT: popl %esi
				; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebx
				; X86-NEXT: popl %ebp
				; X86-NEXT: retl
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 %x, i64 %y, i32 63);
				ret i64 %tmp;
				}
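The comparison immediates in the checks above (`$1` and `$-2` for a scale of 2, `0x7FFFFFFF` and `-2147483648` for a scale of 32, `0x3FFFFFFFFFFFFFFF` and `0xC000000000000000` for a scale of 63) appear to follow directly from the `LoMask` and `HiMask` constants in the expansion. The sketch below recomputes them; `SatThresholds` and `satThresholds` are hypothetical names introduced for illustration, and the code covers only non-zero scales on types of at most 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Bounds on the high half of the product beyond which the result saturates.
struct SatThresholds {
  int64_t MaxHi; // saturate to the maximum value when Hi > MaxHi (LoMask)
  int64_t MinHi; // saturate to the minimum value when Hi < MinHi (HiMask)
};

// Hypothetical helper: with OverflowBits = BitWidth - Scale + 1, LoMask has
// the low (Scale - 1) bits set and HiMask, read as a signed value, is the
// negation of 2^(Scale - 1).
SatThresholds satThresholds(unsigned BitWidth, unsigned Scale) {
  assert(BitWidth <= 64 && Scale > 0 && Scale < BitWidth);
  int64_t Bound = int64_t(1) << (Scale - 1);
  return {Bound - 1, -Bound};
}
```

`satThresholds(32, 2)` gives 1 and -2, matching `cmpl $1` and `cmpl $-2` in `func`; `satThresholds(64, 32)` and `satThresholds(64, 63)` give the immediates seen in `func7` and `func8`.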

llvm/trunk/test/CodeGen/X86/smul_fix_sat_constants.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=X64

				; Verify expansion by using constant values. We just want to cover all the paths laid out by ExpandIntRes_MULFIX.

				declare i4 @llvm.smul.fix.sat.i4 (i4, i4, i32)
				declare i32 @llvm.smul.fix.sat.i32 (i32, i32, i32)
				declare i64 @llvm.smul.fix.sat.i64 (i64, i64, i32)
				declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32>, <4 x i32>, i32)
				declare { i64, i1 } @llvm.smul.with.overflow.i64(i64, i64)

				define i64 @func() nounwind {
				; X64-LABEL: func:
				; X64: # %bb.0:
				; X64-NEXT: movl $2, %ecx
				; X64-NEXT: movl $3, %eax
				; X64-NEXT: imulq %rcx
				; X64-NEXT: shrdq $2, %rdx, %rax
				; X64-NEXT: cmpq $1, %rdx
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: cmpq $-2, %rdx
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 3, i64 2, i32 2);
				ret i64 %tmp;
				}

				define i64 @func2() nounwind {
				; X64-LABEL: func2:
				; X64: # %bb.0:
				; X64-NEXT: movl $3, %eax
				; X64-NEXT: imulq $2, %rax, %rcx
				; X64-NEXT: xorl %edx, %edx
				; X64-NEXT: testq %rcx, %rcx
				; X64-NEXT: setns %dl
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: addq %rdx, %rcx
				; X64-NEXT: imulq $2, %rax, %rax
				; X64-NEXT: cmovoq %rcx, %rax
				; X64-NEXT: retq
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 3, i64 2, i32 0);
				ret i64 %tmp;
				}

				define i64 @func3() nounwind {
				; X64-LABEL: func3:
				; X64: # %bb.0:
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: movl $2, %edx
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: imulq %rdx
				; X64-NEXT: shrdq $2, %rdx, %rax
				; X64-NEXT: cmpq $1, %rdx
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: cmpq $-2, %rdx
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 9223372036854775807, i64 2, i32 2);
				ret i64 %tmp;
				}

				define i64 @func4() nounwind {
				; X64-LABEL: func4:
				; X64: # %bb.0:
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: movl $2, %edx
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: imulq %rdx
				; X64-NEXT: shrdq $32, %rdx, %rax
				; X64-NEXT: cmpq $2147483647, %rdx # imm = 0x7FFFFFFF
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: cmpq $-2147483648, %rdx # imm = 0x80000000
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 9223372036854775807, i64 2, i32 32);
				ret i64 %tmp;
				}

				define i64 @func5() nounwind {
				; X64-LABEL: func5:
				; X64: # %bb.0:
				; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
				; X64-NEXT: movl $2, %edx
				; X64-NEXT: movq %rcx, %rax
				; X64-NEXT: imulq %rdx
				; X64-NEXT: shrdq $63, %rdx, %rax
				; X64-NEXT: movabsq $4611686018427387903, %rsi # imm = 0x3FFFFFFFFFFFFFFF
				; X64-NEXT: cmpq %rsi, %rdx
				; X64-NEXT: cmovgq %rcx, %rax
				; X64-NEXT: movabsq $-4611686018427387904, %rcx # imm = 0xC000000000000000
				; X64-NEXT: cmpq %rcx, %rdx
				; X64-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000
				; X64-NEXT: cmovlq %rcx, %rax
				; X64-NEXT: retq
				%tmp = call i64 @llvm.smul.fix.sat.i64(i64 9223372036854775807, i64 2, i32 63);
				ret i64 %tmp;
				}
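The constant tests above can be cross-checked with a small reference model. The following Python sketch is not part of the patch: `smul_fix_sat` and `high_word_bounds` are hypothetical helper names, and the model assumes the scaled product is rounded toward negative infinity (an arithmetic right shift, consistent with the `shrdq` in the CHECK lines), although the intrinsic itself leaves the rounding direction unspecified. For the positive operands used in these tests both rounding directions give the same value.

```python
def smul_fix_sat(x, y, scale, bits=64):
    """Reference model (assumption, not LLVM code) of llvm.smul.fix.sat."""
    product = x * y                    # exact: Python integers are unbounded
    scaled = product >> scale          # arithmetic shift, rounds toward -inf
    max_val = (1 << (bits - 1)) - 1    # largest signed value of this width
    min_val = -(1 << (bits - 1))       # smallest signed value of this width
    return max(min_val, min(max_val, scaled))


def high_word_bounds(scale):
    """Inclusive range of the high word of the double-width product for
    which the scaled result still fits; only meaningful for scale >= 1."""
    return -(1 << (scale - 1)), (1 << (scale - 1)) - 1


INT64_MAX = (1 << 63) - 1

# Inputs taken from func, func2, func3, func4 and func5 above.
results = [
    smul_fix_sat(3, 2, 2),
    smul_fix_sat(3, 2, 0),
    smul_fix_sat(INT64_MAX, 2, 2),
    smul_fix_sat(INT64_MAX, 2, 32),
    smul_fix_sat(INT64_MAX, 2, 63),
]
print(results)
print([high_word_bounds(s) for s in (2, 32, 63)])
```

Under these assumptions the five calls evaluate to 1, 6, 4611686018427387903, 4294967295 and 1, so none of the scaled cases actually saturates. The bounds returned by `high_word_bounds` for scales 2, 32 and 63 are the immediates that `%rdx` is compared against in func, func4 and func5 (for example `cmpq $1, %rdx` and `cmpq $-2, %rdx` for scale 2).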

Revision Contents

Diff 200565

llvm/trunk/docs/LangRef.rst
llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
llvm/trunk/include/llvm/CodeGen/TargetLowering.h
llvm/trunk/include/llvm/IR/Intrinsics.td
llvm/trunk/include/llvm/Target/TargetSelectionDAG.td
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp
llvm/trunk/lib/IR/Verifier.cpp
llvm/trunk/test/CodeGen/X86/smul_fix_sat.ll
llvm/trunk/test/CodeGen/X86/smul_fix_sat_constants.ll
