This is an archive of the discontinued LLVM Phabricator instance.

Differential D54719

[Intrinsic] Signed Fixed Point Multiplication Intrinsic
Closed, Public

Authored by leonardchan on Nov 19 2018, 12:28 PM.

Details

Reviewers

ebevhan
bjope
craig.topper
RKSimon
eli.friedman

Commits

rG118e53fd6370: [Intrinsic] Signed Fixed Point Multiplication Intrinsic
rL348912: [Intrinsic] Signed Fixed Point Multiplication Intrinsic

Summary

Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them.

This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.

Diff Detail

Repository: rL LLVM

Event Timeline

leonardchan created this revision. Nov 19 2018, 12:28 PM

Herald added a subscriber: hiraditya. Nov 19 2018, 12:28 PM

Need to update LangRef.rst which I think we also missed for the saturating intrinsics.

llvm/include/llvm/CodeGen/ISDOpcodes.h
278	This says it saturates, but I didn't see that implemented in the expansion function.
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
625	Get rid of Op1 and Op2 and just do GetPromotedInteger(N->getOperand(0))
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5102	Shift amount constants should get their type from getShiftAmountTy.
5105	This is really an OR isn't it? DAG combiner will turn it into that so might as well just use OR.
5106	No need for an else after a return.
llvm/lib/IR/Verifier.cpp
4543	argumenr->argument
4544	I think you need to check that Op3 is a ConstantInt as well. And that it fits in 32 bits.

In D54719#1303553, @craig.topper wrote:

Need to update LangRef.rst which I think we also missed for the saturating intrinsics.

Funny, I actually just uploaded a patch for the LangRef docs on the other intrinsics (https://reviews.llvm.org/D54729). Also added the docs for this intrinsic in this patch.

llvm/include/llvm/CodeGen/ISDOpcodes.h
278	My bad, this intrinsic doesn't perform saturation.

bjope added inline comments. Nov 20 2018, 4:29 AM

llvm/docs/LangRef.rst
12604	I'm not sure if/how the auto-vectorizers are aware that only the first two operands should be vectorized and that the third operand should stay scalar. So I guess it is unlikely that we ever get any vector operands at the moment (except for handwritten lit tests). So I guess this should work for now (after all it is simpler than having a vector of scales). But perhaps we need to allow the scale to be given as a vector as well in the future to support vectorization (`shl` is for example similar, but afaik it will get the shift counts as a vector when being vectorized).
llvm/include/llvm/CodeGen/TargetLowering.h
801	Shouldn't this say "scale" instead of "saturation bit width"?
819	Same as above, isn't the check about "scales" and not "saturation widths"?
llvm/include/llvm/IR/Intrinsics.td
742	nit: In the LangRef you are using "Saturation Arithmetic Intrinsics" and "Fixed Point Arithmetic Intrinsics" as two separate chapters, but here we put all of them into one category, "Fixed Point Intrinsics". I'm not sure if the saturating arithmetic intrinsics should be in a fixed point category. Anyway, when adding an int_smul_fix_sat later I guess it will be in the "Fixed Point Arithmetic Intrinsics" chapter in the LangRef (even if it is also saturating). So maybe it is better not to have this split into two categories after all.

RKSimon added inline comments. Nov 20 2018, 5:33 AM

llvm/test/CodeGen/X86/smul_fix.ll
3	You should be able to drop -mcpu, also please can you use -check-prefix=X64 and -check-prefix=X86 ?
9	Add nounwind to all the tests to reduce stack codegen?

ebevhan added inline comments. Nov 20 2018, 6:31 AM

llvm/docs/LangRef.rst
12628	typo "up"
12629	This doc doesn't say what happens on overflow. Truncation/wraparound? Or should we consider it undefined? If it says that the rounding is dependent on the target, will the rounding mode be target-configurable and exposed through TLI/TTI? We essentially can't touch these intrinsics (constant folding, optimization, even legalization) otherwise. Looking at the legalization sequence, it will definitely lower these into a round-down form. If a target has some legal operations which round up and some non-legal operations, then the legal ones will round up and the non-legal ones will round down. That's sort of messy. It might be safer to say that the rounding is indeterminate, but that's even worse for optimization.
llvm/include/llvm/CodeGen/ISDOpcodes.h
278	The intrinsic docs in the LangRef mentions that the last value must be constant, but this comment doesn't.
llvm/include/llvm/CodeGen/TargetLowering.h
3785	This comment clashes with the assertions in the function. It doesn't take vectors.
llvm/include/llvm/Target/TargetSelectionDAG.td
383	It's marked SDTIntBinOp, but is supposed to have three input operands. I think these nodes might need a new SDT. Also, multiplication is obviously commutative, but I don't know if SDNPCommutative works on nodes that have anything except two operands. It might not have an effect at all. Someone who knows more about DAG might have more info on that.
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2543	Maybe I'm missing something here, but isn't this just a normal, expanded multiplication?
2544	If we hit this return, doesn't this mean that legalization failed? Do we want to catch this here?
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5101	Could use a couple comments explaining what we're doing with the values/SRL/SHL. Does this work if MULHS in VT is of dubious legality?

craig.topper added inline comments. Nov 20 2018, 10:16 AM

llvm/include/llvm/Target/TargetSelectionDAG.td
383	Tablegen should use the first two operands as the commutative ones. It used to be an error or an assertion. But I changed it sometime in the last year or so to make it work for FMA in X86.

craig.topper added inline comments. Nov 20 2018, 10:18 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5101	Ideally we'd use MUL_LOHI if the target supports it. That should allow X86 to use a single multiply instruction in the test cases.

leonardchan updated this revision to Diff 174858. Nov 20 2018, 5:19 PM

leonardchan marked 15 inline comments as done.

leonardchan added inline comments.

llvm/docs/LangRef.rst
12629	Overflow I was going to leave as undefined behavior. Added this into the docs. By target dependent, I meant that whatever legal operation a target could replace this intrinsic with has the choice of rounding up or down. I'm not sure how bad this would be for optimization, but I imagine rounding isn't something that needs to be touched immediately since with this intrinsic, I'm just attempting to match what the standard says. I think one of the conclusions we also came up to in the long thread on llvm-dev was that, for now, we don't need to do more than what's necessary to implement the spec. I changed the docs though to say rounding is indeterminable, as the spec says.
llvm/include/llvm/IR/Intrinsics.td
742	Split into 2 categories. Was also planning on putting the saturating fixed point ones under "Fixed Point Intrinsics".
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2543	This should actually be an expansion of SMUL_LOHI, since we need the high bits when scaling down the result after multiplication. Updated the code, but one thing I found was that `expandMUL_LOHI` depends on `ADDC` but does not check that it is available. When rerunning the tests, it seems that X86 does not support 32-bit `ADDE` (and `ADDC`): the run fails with `LLVM ERROR: Cannot select: t81: i32,glue = adde t121, t44:1, t80:1`, so I removed these from the lit tests. If we want to support expansion of this, I'm not sure if the solution is to add a check into `expandMUL_LOHI` for `isOperationLegalOrCustom(ISD::ADDC/E, VT)` and, if it is not legal, write out the expanded version. I found a comment in `ExpandIntRes_ADDSUB()` saying that there currently does not appear to be a way of generating a value of `MVT::Glue` if we were to expand the result.
2544	Added a fatal error report here.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5101	Added checks to see if we can use `MULHS` or `SMUL_LOHI`.
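
For readers following the `MULHS`/`SMUL_LOHI` discussion, the way the two halves of the product are recombined can be sketched in plain C++. This is an illustrative model for 32-bit operands only, not the code from the patch; `LoHi`, `smul_lohi`, and `smul_fix_from_lohi` are hypothetical names, and the scale is assumed to be less than 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an SMUL_LOHI node: the low and high 32-bit
// halves of the signed 64-bit product, kept as raw bit patterns.
struct LoHi {
  uint32_t lo;
  uint32_t hi;
};

LoHi smul_lohi(int32_t a, int32_t b) {
  int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  uint64_t bits = static_cast<uint64_t>(wide);
  return {static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}

// Take bits [scale, scale + 32) of the product: the top bits of `lo`
// joined with the bottom bits of `hi`. The two parts do not overlap,
// which is why an OR is enough to join them.
int32_t smul_fix_from_lohi(int32_t a, int32_t b, unsigned scale) {
  assert(scale < 32 && "sketch assumes the scale is below the bit width");
  LoHi p = smul_lohi(a, b);
  if (scale == 0)
    return static_cast<int32_t>(p.lo);  // avoid a shift by 32 below
  uint32_t low_part = p.lo >> scale;
  uint32_t high_part = p.hi << (32 - scale);
  // Conversion back to signed assumes two's complement wraparound.
  return static_cast<int32_t>(low_part | high_part);
}
```

With a scale of 16, 1.5 is encoded as 98304 and 2.0 as 131072; `smul_fix_from_lohi(98304, 131072, 16)` returns 196608, which encodes 3.0.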

ebevhan added inline comments. Nov 21 2018, 1:37 AM

llvm/docs/LangRef.rst
12629	Okay. If we claim that it's indeterminate I guess we can still fold however we like, but the results would seem a bit inconsistent.
12635	I think "operation" might be clearer than "conversion". Or don't mention it at all and just say "It is undefined behavior if the source value does not fit within the range of the fixed point type." The rounding section is also a bit wordy. "If the result value cannot be precisely represented in the given scale, the value is rounded up or down to the closest representable value. The rounding direction is unspecified." Also, my choice of the word "indeterminate" was a bit unfortunate, I think the rest of the LangRef uses "unspecified".
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2543	It's a bit odd that x86 doesn't support ADDC/ADDE. Those nodes are deprecated in favor of ADDCARRY, so expandMUL_LOHI should probably be making expansions with ADDCARRY instead. @craig.topper probably knows more about this.
llvm/test/CodeGen/X86/smul_fix.ll
29	Interesting that the 32-bit target produces better code in most cases.

leonardchan updated this revision to Diff 174948. Nov 21 2018, 9:59 AM

leonardchan marked 6 inline comments as done.

I think @eli.friedman might know more about the ADDE/ADDC vs ADDCARRY than I do.

Do we have examples of real hardware that implements this sort of instruction?

Right now this intrinsic looks like it's just reimplementing what you would get if you just emitted this IR:

%a = sext iX %x to i2X
%b = sext iX %y to i2X
%c = mul i2X %a, %b
%d = lshr i2X %c, PRECISION
%e = trunc i2X %d to iX

So I'm not sure if there's a need for an intrinsic unless that pattern is difficult to match to an instruction.
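
The IR sequence above can be modelled in standalone C++ for the i32 case. This is a minimal sketch, assuming a scale below 32; `smul_fix_i32_model` is a hypothetical name, not an LLVM API. Because the result is truncated to the low 32 bits and the scale is below 32, a logical shift on the widened product yields the same bits as an arithmetic one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model: sign-extend, multiply in twice the width,
// shift right by the scale, truncate back to the original width.
int32_t smul_fix_i32_model(int32_t x, int32_t y, unsigned scale) {
  assert(scale < 32 && "sketch assumes the scale is below the bit width");
  int64_t wide = static_cast<int64_t>(x) * static_cast<int64_t>(y);  // sext + mul
  uint64_t shifted = static_cast<uint64_t>(wide) >> scale;           // lshr
  // trunc; conversion back to signed assumes two's complement wraparound.
  return static_cast<int32_t>(static_cast<uint32_t>(shifted));
}
```

In this model a product that is not exactly representable is rounded toward negative infinity: `smul_fix_i32_model(-1, 1, 16)` returns -1 rather than 0.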

In D54719#1305542, @craig.topper wrote:
Do we have examples of real hardware that implements this sort of instruction?

Right now this intrinsic looks like it's just reimplementing what you would get if you just emitted this IR:
%a = sext iX %x to i2X
%b = sext iX %y to i2X
%c = mul i2X %a, %b
%d = lshr i2X %c, PRECISION
%e = trunc i2X %d to iX
So I'm not sure if there's a need for an intrinsic unless that pattern is difficult to match to an instruction.

I think @ebevhan and @bjope may have hardware with fixed point instructions

In D54719#1305550, @leonardchan wrote:
I think @ebevhan and @bjope may have hardware with fixed point instructions

Yes, our HW supports various kinds of fixed point multiplication (such as llvm.smul.fix.i32(i32 %x, i32 %y, i32 31) and llvm.smul.fix.i24(i24 %x, i24 %y, i32 15)). We already have an intrinsic like llvm.smul.fix downstream.

For our downstream target we keep the intrinsic all the way to ISel. When compiling for other targets we have so far used an IR pass that transforms the intrinsic to regular IR early in opt. I'm afraid that if we do that kind of expansion for our target as well, we would introduce mul/lshr using non-legal types like i64 or i48. If we can't successfully match the IR back to a native instruction, we will get really bad performance by doing the multiplication using the non-legal types. I'm also afraid that the pattern matching can easily be broken, for example by an IR pass that sinks the trunc to a later BB.

efriedma added a subscriber: efriedma. Nov 21 2018, 2:05 PM

efriedma added inline comments.

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2543	Yes, ADDC/ADDE are deprecated, and only work on targets which specifically implement lowering support, so there are a lot of targets where they simply can't be used (either because the target uses ADDCARRY instead, or because the operation simply doesn't exist on the target). expandMUL_LOHI is only used in very few places, though, so I guess we haven't hit this issue before. There are a few different ways to expand an ADD; see DAGTypeLegalizer::ExpandIntRes_ADDSUB. Ideally, we should just generate ADDCARRY and let legalization take care of the rest, but I don't think ADDCARRY legalization is currently implemented. Probably not hard to implement, though. (Maybe keep the existing ADDC/ADDE code for targets which use them.)
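
The difference that matters here is that an `ADDCARRY`-style operation takes and produces the carry as an ordinary boolean value instead of passing it through glue. A minimal sketch of that idea in plain C++, assuming 32-bit parts; `SumCarry`, `add_carry`, and `add64_via_halves` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Result of one add-with-carry step: the 32-bit sum and the carry out.
struct SumCarry {
  uint32_t sum;
  bool carry;
};

// Add two 32-bit parts plus an explicit carry-in, returning an explicit
// carry-out.
SumCarry add_carry(uint32_t a, uint32_t b, bool carry_in) {
  uint64_t wide = static_cast<uint64_t>(a) + b + (carry_in ? 1u : 0u);
  return {static_cast<uint32_t>(wide), (wide >> 32) != 0};
}

// A 64-bit add expanded into two 32-bit steps chained through the carry.
uint64_t add64_via_halves(uint64_t x, uint64_t y) {
  SumCarry lo = add_carry(static_cast<uint32_t>(x),
                          static_cast<uint32_t>(y), false);
  SumCarry hi = add_carry(static_cast<uint32_t>(x >> 32),
                          static_cast<uint32_t>(y >> 32), lo.carry);
  return (static_cast<uint64_t>(hi.sum) << 32) | lo.sum;
}
```

Since the carry is a plain value, each step can be legalized or expanded on its own, which is the property the comment above relies on.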

leonardchan updated this revision to Diff 175795. Nov 28 2018, 5:45 PM

leonardchan marked an inline comment as done. Nov 28 2018, 5:47 PM

leonardchan added inline comments.

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2543	I changed `expandMUL_LOHI` to use `ADDCARRY` if `ADDC` and `ADDE` are not available, and the corresponding expansion in the `DAGLegalizer`.

*ping* Does anyone have any more comments on this patch?

LGTM.

@craig.topper @RKSimon @efriedma Any more comments on this patch?

You might need PromoteIntOp_SMULFIX for certain targets, like 64-bit RISCV.

llvm/docs/LangRef.rst
12630	"The rounding direction is unspecified" seems scary to me... is there really no standard for rounding these operations?
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
624	Do you need to sign-extend here?
3257	Maybe worth adding a comment that the operand that needs to be expanded must be the third operand, since the other two operands have the same type as the result. And actually, I'm not sure how you would hit this in practice; it seems unusual to split a boolean value.

ebevhan added inline comments. Dec 4 2018, 3:39 AM

llvm/docs/LangRef.rst
12630	The Embedded-C standard says that the rounding direction is implementation-defined, and is allowed to be indeterminable. I also think it's a bit unfortunate that we can't express the rounding direction, since it inhibits optimization, but locking it down to either rounding up or down will make things annoying for any target with operations that round the other direction. Saying that it's unspecified lets us properly legalize this to something sensible while allowing targets to implement it with legal operations if they have them. For what it's worth, the legalization will make it round down (toward negative infinity).
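
To make the two rounding directions concrete, the following sketch scales a double-width product down by `scale` bits in both ways. It is an illustration only, assuming a 64-bit product and a scale below 63; `scale_down_floor` and `scale_down_ceil` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Round toward negative infinity, which is what the comment above says
// the legalization does.
int64_t scale_down_floor(int64_t product, unsigned scale) {
  assert(scale < 63 && "sketch assumes the unit fits in int64_t");
  int64_t unit = int64_t{1} << scale;
  int64_t q = product / unit;  // C++ division truncates toward zero
  int64_t r = product % unit;  // remainder has the sign of the dividend
  return (r < 0) ? q - 1 : q;
}

// Round toward positive infinity, the other direction a target might use.
int64_t scale_down_ceil(int64_t product, unsigned scale) {
  assert(scale < 63 && "sketch assumes the unit fits in int64_t");
  int64_t unit = int64_t{1} << scale;
  int64_t q = product / unit;
  int64_t r = product % unit;
  return (r > 0) ? q + 1 : q;
}
```

The two functions agree whenever the product is exactly representable and differ by one unit otherwise, for example -1 versus 0 for a product of -1 at scale 16.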

bjope added inline comments. Dec 4 2018, 6:57 AM

llvm/docs/LangRef.rst
12630	Maybe it should say "target specific" instead? And then the legalization (and any constant folding etc) could use some target transform hook to decide in which direction to do the rounding (I guess it could be set to up/down/any). Does "unspecified" mean that ConstantFolding use one strategy for rounding and legalization another strategy? Or should we only allow constant folding when rounding isn't a problem (such as multiply by zero)? Things could still be annoying for a target that does care about rounding. Or maybe it isn't allowed for a target to decide what the implementation-defined behavior should be? Or should it be the same for all targets? A first implementation could of course go for only supporting "unspecified" (if it is clear what kind of constant folding and other optimizations that are allowed). If a target needs a "target specific" behavior, then I assume the code could be sprinkled with hooks at a later stage.

In D54719#1317563, @efriedma wrote:

You might need PromoteIntOp_SMULFIX for certain targets, like 64-bit RISCV.

Added, although RISCV doesn't seem to support [US]MUL_LOHI, so I couldn't add a test for that.

llvm/docs/LangRef.rst
12630	> Maybe it should say "target specific" instead? And then the legalization (and any constant folding etc) could use some target transform hook to decide in which direction to do the rounding (I guess it could be set to up/down/any).
I would be ok with adding this later. For now I was focusing on laying down the basic code necessary for just using this intrinsic.
> Does "unspecified" mean that ConstantFolding use one strategy for rounding and legalization another strategy? Or should we only allow constant folding when rounding isn't a problem (such as multiply by zero)?
I initially had it as "target specific" but changed it to "unspecified" since the rest of the LangRef doc seems to use this term for describing either implementation-defined, or undefined, behavior.
> Things could still be annoying for a target that does care about rounding. Or maybe it isn't allowed for a target to decide what the implementation-defined behavior should be? Or should it be the same for all targets? A first implementation could of course go for only supporting "unspecified" (if it is clear what kind of constant folding and other optimizations that are allowed). If a target needs a "target specific" behavior, then I assume the code could be sprinkled with hooks at a later stage.
The hooks are a good idea. I figure for optimizations, depending on what's supported (up/down/any), InstCombine and others can use the hook to determine rounding direction.
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
624	Yup
3257	Oh. You're right, this shouldn't have happened in the first place. When expanding to use `ADDCARRY` in `expandMUL_LOHI`, I accidentally created the constant with a VT that was the same as the operands instead of an `MVT::i1`. Fixed this, and this method is no longer necessary.
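
The sign-extension question at line 624 can be illustrated outside of LLVM. This is a minimal sketch, assuming an i16 operation carried out in 32-bit registers whose upper 16 bits may hold arbitrary data; all function names are hypothetical and the scale is assumed to be below 16.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 16 bits of a 32-bit register, ignoring the rest.
int32_t sext_in_reg16(uint32_t reg) {
  return static_cast<int16_t>(static_cast<uint16_t>(reg));
}

// Shared tail: widen, multiply, shift by the scale, keep the low 16 bits.
static int16_t mul_shift_trunc16(int32_t a, int32_t b, unsigned scale) {
  int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  uint64_t shifted = static_cast<uint64_t>(wide) >> scale;
  return static_cast<int16_t>(static_cast<uint16_t>(shifted));
}

// Promoted i16 multiply with the operands sign-extended first.
int16_t smul_fix_i16_promoted(uint32_t a_reg, uint32_t b_reg, unsigned scale) {
  assert(scale < 16 && "sketch assumes the scale is below the bit width");
  return mul_shift_trunc16(sext_in_reg16(a_reg), sext_in_reg16(b_reg), scale);
}

// The same computation without sign extension, for contrast: stale upper
// bits of the operands are shifted down into the 16-bit result.
int16_t smul_fix_i16_no_sext(uint32_t a_reg, uint32_t b_reg, unsigned scale) {
  assert(scale < 16 && "sketch assumes the scale is below the bit width");
  return mul_shift_trunc16(static_cast<int32_t>(a_reg),
                           static_cast<int32_t>(b_reg), scale);
}
```

For a register holding `0xDEADFFFF` (an i16 value of -1 with stale upper bits) multiplied by 1 at scale 8, the sign-extending version returns -1, while the version without sign extension does not.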

efriedma added inline comments. Dec 4 2018, 3:45 PM

llvm/docs/LangRef.rst
12630	Would it make sense to make the rounding mode a parameter to the intrinsic? That way, frontends can choose the semantics they want, and the semantics are clear throughout the optimization pipeline. clang could choose the rounding mode in a target-specific manner if necessary.

leonardchan marked an inline comment as done. Dec 4 2018, 3:56 PM

leonardchan added inline comments.

llvm/docs/LangRef.rst
12630	The only thing that I think would be of concern is just having too many parameters, or having an extra intrinsic to dictate rounding direction. Other than this, I have no other strong opinions regarding this. I imagine though that having an extra parameter/intrinsic to specify rounding would imply that a particular target has the option of specifying rounding in both directions, when really a target would likely just have native rounding in one direction (although this is just an assumption).

*ping* @craig.topper @RKSimon Any more comments on this patch?

craig.topper added inline comments. Dec 5 2018, 1:03 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2530	Just get N->getConstantOperandVal(2)?
2561	This result is only used by one of the 2 ifs below. Should it be pulled into that if. Can you add a comment to describe what's happening here?

leonardchan updated this revision to Diff 176874. Dec 5 2018, 1:31 PM

leonardchan marked 2 inline comments as done.

leonardchan marked an inline comment as done. Dec 5 2018, 5:21 PM

leonardchan added inline comments.

llvm/docs/LangRef.rst
12630	@ebevhan @bjope Do you think an extra parameter/intrinsic is worth adding also in this patch?

craig.topper added inline comments. Dec 6 2018, 11:14 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2554	If Scale is greater than NVTSize then this number is negative.
2569	Why is ResultLL being shifted right twice? I don't think I understand that.
2586	If Scale == NVTSize then the result is exactly in ResultHL and ResultLH, but nothing can prevent at least 1 bit of ResultHH from being used here. The shift amount for the ResultHH shift must be between 0 and NVTSize-1 to not be undefined.
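
The shift-range concern can be sketched independently of the patch. The example below assumes a 64-bit operation expanded into 32-bit parts, so the full product occupies four 32-bit words; it extracts the 64 result bits starting at bit `scale` while keeping every shift amount between 0 and 31. `extract32` and `take64` are hypothetical helpers, not functions from LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Read 32 bits starting at bit `pos` from a little-endian array of words.
// A position that is a multiple of 32 selects a whole word, so no shift by
// 32 (which would be undefined) is ever performed.
uint32_t extract32(const std::array<uint32_t, 4>& words, unsigned pos) {
  assert(pos <= 96 && "32 bits starting at pos must fit in 128 bits");
  unsigned idx = pos / 32;
  unsigned off = pos % 32;
  if (off == 0)
    return words[idx];
  // off != 0 implies pos <= 95, so idx + 1 <= 3 stays in range.
  return (words[idx] >> off) | (words[idx + 1] << (32 - off));
}

// Take bits [scale, scale + 64) of the 128-bit product.
uint64_t take64(const std::array<uint32_t, 4>& product, unsigned scale) {
  assert(scale < 64 && "sketch assumes the scale is below the bit width");
  uint64_t lo = extract32(product, scale);
  uint64_t hi = extract32(product, scale + 32);
  return lo | (hi << 32);
}
```

When the scale equals the part size (32 here), the result is exactly the two middle words, which matches the observation above about ResultHL and ResultLH.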

leonardchan updated this revision to Diff 177032. Dec 6 2018, 1:06 PM

leonardchan marked 5 inline comments as done.

leonardchan added inline comments.

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2569	My bad. It shouldn't have shifted twice.
2586	Done. Added test case for this also.

ebevhan added inline comments. Dec 7 2018, 1:15 AM

llvm/docs/LangRef.rst
12630	It's definitely handy to be able to express the rounding of the operations in the intrinsic, but I think some of the downsides are:
- The number of parameters grows, making the intrinsic more unwieldy.
- There are other roundings than just up and down; rounding towards or away from zero (the former of which I think the fixed-point division ought to try and always follow) are also possibilities.
- We need to support legalization of every rounding mode we add to the intrinsic. Then again, if we add a hook we need to do the same anyway.
- The legalization checks need to check both rounding and scale.
And as you say, it's more likely that the rounding of a specific target should be the same across the board rather than be configurable per operation, if it has support for fixed-point.

ebevhan added inline comments. Dec 7 2018, 1:33 AM

llvm/docs/LangRef.rst
12630	Ignore what I said about the division rounding; I was totally off mark there. I think we can just keep it unspecified for now and if the need arises, add rounding mode with a hook.

Use getConstantOperandVal where possible

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
1132	unsigned Scale = Node->getConstantOperandVal(2);
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
415	unsigned Scale = Node->getConstantOperandVal(2);
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5094	unsigned Scale = Node->getConstantOperandVal(2);

leonardchan updated this revision to Diff 177339. Dec 7 2018, 3:42 PM

leonardchan marked 9 inline comments as done.

*ping* Any more comments on this patch?

LGTM

This revision is now accepted and ready to land. Dec 11 2018, 4:34 PM

Closed by commit rL348912: [Intrinsic] Signed Fixed Point Multiplication Intrinsic (authored by leonardchan). Dec 11 2018, 10:32 PM

This revision was automatically updated to reflect the committed changes.

lebedev.ri added a subscriber: lebedev.ri. Apr 14 2019, 8:18 AM

lebedev.ri added inline comments.

llvm/trunk/docs/LangRef.rst
12820–12829 ↗	(On Diff #177822)	To someone unfamiliar with what fixed-point math is, this is somewhat vague. Would it please be possible to reword this with some more details? The Overview section in https://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic is a nice example of how it would look best. In particular, it isn't all that obvious how scale interacts with lhs/rhs. My current guess:
%r = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %c)
=>
%a2 = sext i4 %a to i8
%b2 = sext i4 %b to i8
%mul = mul nsw nuw i8 %a2, %b2
%c2 = trunc i32 %c to i8
%scale = ashr i8 %mul, %c2 ; does not convey the randomness of rounding though
%r = trunc i8 %scale to i4

Herald added a project: Restricted Project. Apr 14 2019, 8:18 AM

Revision Contents

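Path	Size
llvm/docs/LangRef.rst	66 lines
llvm/include/llvm/CodeGen/ISDOpcodes.h	6 lines
llvm/include/llvm/CodeGen/TargetLowering.h	37 lines
llvm/include/llvm/IR/Intrinsics.td	3 lines
llvm/include/llvm/Target/TargetSelectionDAG.td	1 line
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	10 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp	39 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h	6 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp	6 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	28 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	8 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	1 line
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp	37 lines
llvm/lib/CodeGen/SelectionDAG/TargetLoweringBase.cpp	1 line
llvm/lib/IR/Verifier.cpp	16 lines
llvm/test/CodeGen/X86/smul_fix.ll	329 lines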

Diff 174703

llvm/docs/LangRef.rst

				.. code-block:: llvm

				%res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
				%sum = extractvalue {i32, i1} %res, 0
				%obit = extractvalue {i32, i1} %res, 1
				br i1 %obit, label %overflow, label %normal

				Fixed Point Arithmetic Intrinsics
				---------------------------------

				A fixed point number represents a real data type for a number that has a fixed
				number of digits after a radix point (equivalent to the decimal point '.').
				The number of digits after the radix point is referred as the ``scale``. These
				are useful for representing fractional values to a specific precision. The
				following intrinsics perform fixed point arithmetic operations on 2 operands
				of the same scale, specified as the third argument.


				'``llvm.smul.fix.*``' Intrinsics
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax
				"""""""

				This is an overloaded intrinsic. You can use ``llvm.smul.fix``
				on any integer bit width or vectors of integers.

				::

				declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
				declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
				declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
				declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)

				Overview
				"""""""""

				The '``llvm.smul.fix``' family of intrinsic functions perform signed
				fixed point multiplication on 2 arguments of the same scale.

				Arguments
				""""""""""

				The arguments (%a and %b) and the result may be of integer types of any bit
				width, but they must have the same bit width. ``%a`` and ``%b`` are the two
				values that will undergo signed fixed point multiplication. The argument
				``%scale`` represents the scale of both operands, and must be a constant
				integer.

				Semantics:
				""""""""""

				This operation performs fixed point multiplication on the 2 arguments of a
				specified scale. The result will also be returned in the same scale specified
				in the third argument. In the event a value cannot be precisely represented in
				this scale, the value is either rounded down to the closest fixed point value
				less than the source value, or rounded ip to the closest fixed point value
				greater than the source value. This rounding is dependent on the target.
				ebevhanUnsubmitted Done Reply Inline Actions This doc doesn't say what happens on overflow. Truncation/wraparound? Or should we consider it undefined? If it says that the rounding is dependent on the target, will the rounding mode be target-configurable and exposed through TLI/TTI? We essentially can't touch these intrinsics (constant folding, optimization, even legalization) otherwise. Looking at the legalization sequence, it will definitely lower these into a round-down form. If a target has some legal operations which round up and some non-legal operations, then the legal ones will round up and the non-legal ones will round down. That's sort of messy. It might be safer to say that the rounding is indeterminate, but that's even worse for optimization. ebevhan: This doc doesn't say what happens on overflow. Truncation/wraparound? Or should we consider it…
				leonardchanAuthorUnsubmitted Done Reply Inline Actions Overflow I was going to leave as undefined behavior. Added this into the docs. By target dependent, I meant that whatever legal operation a target could replace this intrinsic with has the choice of rounding up or down. I'm not sure how bad this would be for optimization, but I imagine rounding isn't something that needs to be touched immediately since with this intrinsic, I'm just attempting to match what the standard says. I think one of the conclusions we also came up to in the long thread on llvm-dev was that, for now, we don't need to do more than what's necessary to implement the spec. I changed the docs though to say rounding is indeterminable, as the spec says. leonardchan: Overflow I was going to leave as undefined behavior. Added this into the docs. By target…
				ebevhanUnsubmitted Done Reply Inline Actions Okay. If we claim that it's indeterminate I guess we can still fold however we like, but the results would seem a bit inconsistent. ebevhan: Okay. If we claim that it's indeterminate I guess we can still fold however we like, but the…

				efriedmaUnsubmitted Done Reply Inline Actions "The rounding direction is unspecified" seems scary to me... is there really no standard for rounding these operations? efriedma: "The rounding direction is unspecified" seems scary to me... is there really no standard for…
				ebevhanUnsubmitted Done Reply Inline Actions The Embedded-C standard says that the rounding direction is implementation-defined, and is allowed to be indeterminable. I also think it's a bit unfortunate that we can't express the rounding direction, since it inhibits optimization, but locking it down to either rounding up or down will make things annoying for any target with operations that round the other direction. Saying that it's unspecified lets us properly legalize this to something sensible while allowing targets to implement it with legal operations if they have them. For what it's worth, the legalization will make it round down (toward negative infinity). ebevhan: The Embedded-C standard says that the rounding direction is implementation-defined, and is…
				bjopeUnsubmitted Done Reply Inline Actions Maybe it should say "target specific" instead? And then the legalization (and any constant folding etc) could use some target transform hook to decide in which direction to do the rounding (I guess it could be set to up/down/any). Does "unspecified" mean that ConstantFolding use one strategy for rounding and legalization another strategy? Or should we only allow constant folding when rounding isn't a problem (such as multiply by zero)? Things could still be annoying for a target that does care about rounding. Or maybe it isn't allowed for a target to decide what the implementation-defined behavior should be? Or should it be the same for all targets? A first implementation could of course go for only supporting "unspecified" (if it is clear what kind of constant folding and other optimizations that are allowed). If a target needs a "target specific" behavior, then I assume the code could be sprinkled with hooks at a later stage. bjope: Maybe it should say "target specific" instead? And then the legalization (and any constant…
				leonardchanAuthorUnsubmitted Done Reply Inline Actions Maybe it should say "target specific" instead? And then the legalization (and any constant folding etc) could use some target transform hook to decide in which direction to do the rounding (I guess it could be set to up/down/any). I would be ok with adding this later. For now I was focusing on laying down the basic code necessary for just using this intrinsic. Does "unspecified" mean that ConstantFolding use one strategy for rounding and legalization another strategy? Or should we only allow constant folding when rounding isn't a problem (such as multiply by zero)? I initially had it as "target specific" but changed it to "unspecified" since the rest of the LangRef doc seems to use this term for describing either implementation-defined, or undefined, behavior. Things could still be annoying for a target that does care about rounding. Or maybe it isn't allowed for a target to decide what the implementation-defined behavior should be? Or should it be the same for all targets? A first implementation could of course go for only supporting "unspecified" (if it is clear what kind of constant folding and other optimizations that are allowed). If a target needs a "target specific" behavior, then I assume the code could be sprinkled with hooks at a later stage. The hooks are a good idea. I figure for optimizations, depending on what's supported (up/down/any), InstCombine and others can use the hook to determine rounding direction. leonardchan: > Maybe it should say "target specific" instead? And then the legalization (and any constant…
				efriedma: Would it make sense to make the rounding mode a parameter to the intrinsic? That way, frontends can choose the semantics they want, and the semantics are clear throughout the optimization pipeline. clang could choose the rounding mode in a target-specific manner if necessary.
				leonardchan: The only thing that I think would be of concern is just having too many parameters, or having an extra intrinsic to dictate rounding direction. Other than this, I have no other strong opinions regarding this. I imagine though that having an extra parameter/intrinsic to specify rounding would imply that a particular target has the option of specifying rounding in both directions, when really a target would likely just have native rounding in one direction (although this is just an assumption).
				leonardchan: @ebevhan @bjope Do you think an extra parameter/intrinsic is worth adding also in this patch?
				ebevhan: It's definitely handy to be able to express the rounding of the operations in the intrinsic, but I think some of the downsides are:
				- The number of parameters grows, making the intrinsic more unwieldy.
				- There are other roundings than just up and down; rounding towards or away from zero (the former of which I think the fixed-point division ought to try and always follow) are also possibilities.
				- We need to support legalization of every rounding mode we add to the intrinsic. Then again, if we add a hook we need to do the same anyway.
				- The legalization checks need to check both rounding and scale.
				And as you say, it's more likely that the rounding of a specific target should be the same across the board rather than be configurable per operation, if it has support for fixed-point.
				ebevhan: Ignore what I said about the division rounding; I was totally off mark there. I think we can just keep it unspecified for now and if the need arises, add rounding mode with a hook.

				Examples
				"""""""""

				.. code-block:: llvm
				ebevhan: I think "operation" might be clearer than "conversion". Or don't mention it at all and just say "It is undefined behavior if the source value does not fit within the range of the fixed point type."
				The rounding section is also a bit wordy. "If the result value cannot be precisely represented in the given scale, the value is rounded up or down to the closest representable value. The rounding direction is unspecified."
				Also, my choice of the word "indeterminate" was a bit unfortunate; I think the rest of the LangRef uses "unspecified".

				%res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
				%res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
				%res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)

				; The result in the following could be rounded up to -2 or down to -2.5 depending on the system
				%res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
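The rounding question raised in the thread above can be made concrete with a small standalone model. This is only a sketch and not part of the patch: `smul_fix_floor` and `smul_fix_trunc` are hypothetical helper names, 32-bit operands are an assumption, and the result is narrowed to 32 bits without saturation. Both helpers compute the exact double-width product and then divide by 2^scale, differing only in the rounding rule, which reproduces the two candidate answers in the last LangRef example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM API): fixed-point multiply of two values that
// share `scale` fractional bits, rounding toward negative infinity.
int32_t smul_fix_floor(int32_t a, int32_t b, unsigned scale) {
  assert(scale < 32 && "scale must be smaller than the operand width");
  // Exact product; it carries 2 * scale fractional bits.
  int64_t product = static_cast<int64_t>(a) * b;
  int64_t divisor = int64_t{1} << scale;
  // C++ division truncates toward zero, so adjust negative inexact results.
  int64_t q = product / divisor;
  if (product % divisor != 0 && product < 0)
    --q;
  // No saturation: the value is simply narrowed to 32 bits.
  return static_cast<int32_t>(q);
}

// Hypothetical helper (not LLVM API): same multiply, rounding toward zero.
int32_t smul_fix_trunc(int32_t a, int32_t b, unsigned scale) {
  assert(scale < 32 && "scale must be smaller than the operand width");
  int64_t product = static_cast<int64_t>(a) * b;
  return static_cast<int32_t>(product / (int64_t{1} << scale));
}
```

With operands 3 and -3 at scale 1 (1.5 x -1.5 = -2.25), the floor variant yields -5 (-2.5) and the truncating variant yields -4 (-2), which is exactly the ambiguity that "unspecified" leaves open for constant folding versus legalization.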


	Specialised Arithmetic Intrinsics
	---------------------------------

	'``llvm.canonicalize.*``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:
	"""""""
	[3,159 lines not shown]

llvm/include/llvm/CodeGen/ISDOpcodes.h

[266 lines not shown] enum NodeType {
/// RESULT = [US]SUBSAT(LHS, RHS) - Perform saturation subtraction on 2
/// integers with the same bit width (W). If the true value of LHS - RHS
/// exceeds the largest value that can be represented by W bits, the
/// resulting value is this maximum value. Otherwise, if this value is less
/// than the smallest value that can be represented by W bits, the
/// resulting value is this minimum value.
SSUBSAT, USUBSAT,

		/// RESULT = SMULFIX(LHS, RHS, SCALE) - Perform fixed point multiplication on
		/// 2 integers with the same width and scale. SCALE represents the scale of
		/// both operands as fixed point numbers. A scale of zero is effectively
		/// performing multiplication on 2 integers.
		craig.topper: This says it saturates but I didn't see that implemented in the expansion function.
		leonardchan: My bad, this intrinsic doesn't perform saturation.
		ebevhan: The intrinsic docs in the LangRef mention that the last value must be constant, but this comment doesn't.
		SMULFIX,

/// Simple binary floating point operators.
FADD, FSUB, FMUL, FDIV, FREM,

/// Constrained versions of the binary floating point operators.
/// These will be lowered to the simple operators before final selection.
/// They are used to limit optimizations while the DAG is being
/// optimized.
STRICT_FADD, STRICT_FSUB, STRICT_FMUL, STRICT_FDIV, STRICT_FREM,
[745 lines not shown]

llvm/include/llvm/CodeGen/TargetLowering.h

[791 lines not shown] public:
LegalizeAction getOperationAction(unsigned Op, EVT VT) const {
if (VT.isExtended()) return Expand;
// If a target-specific SDNode requires legalization, require the target
// to provide custom legalization for it.
if (Op >= array_lengthof(OpActions[0])) return Custom;
return OpActions[(unsigned)VT.getSimpleVT().SimpleTy][Op];
}

		/// Custom method defined by each target to indicate if an operation which
		/// may require a saturation bit width is supported natively by the target.
		bjope: Shouldn't this say "scale" instead of "saturation bit width"?
		/// If not, the operation is illegal.
		virtual bool isSupportedFixedPointOperation(unsigned Op, EVT VT,
		unsigned Scale) const {
		return false;
		}

		/// Some fixed point operations may be natively supported by the target but
		/// only for specific scales. This method allows for checking
		/// if the width is supported by the target for a given operation that may
		/// depend on scale.
		LegalizeAction getFixedPointOperationAction(unsigned Op, EVT VT,
		unsigned Scale) const {
		auto Action = getOperationAction(Op, VT);
		if (Action != Legal)
		return Action;

		// This operation is supported in this type but may only work on specific
		// saturation widths.
		bjope: Same as above, isn't the check about "scales" and not "saturation widths"?
		bool Supported;
		switch (Op) {
		default:
		llvm_unreachable("Unexpected fixed point operation.");
		case ISD::SMULFIX:
		Supported = isSupportedFixedPointOperation(Op, VT, Scale);
		break;
		}

		return Supported ? Action : Expand;
		}
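As a rough illustration of how the scale-dependent hook gates legality, the following standalone model mirrors the control flow of `getFixedPointOperationAction` shown above. Everything in it is a simplified stand-in: `ModelTargetLowering`, `Q15Target`, and the single-argument signatures are hypothetical and are not LLVM API. It assumes a target whose multiply is legal for the type but natively supports only a scale of 15.

```cpp
#include <cassert>

// Simplified stand-in for TargetLowering's LegalizeAction enum.
enum LegalizeAction { Legal, Promote, Expand, Custom };

// Hypothetical model of the two hooks in the patch, reduced to one (Op, VT).
struct ModelTargetLowering {
  virtual ~ModelTargetLowering() = default;

  // Models getOperationAction for the single operation/type pair.
  virtual LegalizeAction getOperationAction() const { return Legal; }

  // Default mirrors the patch: no scale is natively supported.
  virtual bool isSupportedFixedPointOperation(unsigned /*Scale*/) const {
    return false;
  }

  // Same shape as the patch: type legality first, then the scale check.
  LegalizeAction getFixedPointOperationAction(unsigned Scale) const {
    LegalizeAction Action = getOperationAction();
    if (Action != Legal)
      return Action;
    return isSupportedFixedPointOperation(Scale) ? Action : Expand;
  }
};

// Hypothetical target that only has a native multiply for scale 15.
struct Q15Target : ModelTargetLowering {
  bool isSupportedFixedPointOperation(unsigned Scale) const override {
    return Scale == 15;
  }
};
```

In this model a `Q15Target` reports `Legal` for scale 15 and falls back to `Expand` for any other scale, while the base class always expands, which matches the default `return false` in the patch.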

LegalizeAction getStrictFPOperationAction(unsigned Op, EVT VT) const {
unsigned EqOpc;
switch (Op) {
default: llvm_unreachable("Unexpected FP pseudo-opcode");
case ISD::STRICT_FADD: EqOpc = ISD::FADD; break;
case ISD::STRICT_FSUB: EqOpc = ISD::FSUB; break;
case ISD::STRICT_FMUL: EqOpc = ISD::FMUL; break;
case ISD::STRICT_FDIV: EqOpc = ISD::FDIV; break;
[2,936 lines not shown]
SDValue getVectorElementPointer(SelectionDAG &DAG, SDValue VecPtr, EVT VecVT,
SDValue Index) const;

/// Method for building the DAG expansion of ISD::[US][ADD|SUB]SAT. This
/// method accepts integers or vectors of integers as its arguments.
SDValue getExpandedSaturationAdditionSubtraction(SDNode *Node,
SelectionDAG &DAG) const;

		/// Method for building the DAG expansion of ISD::SMULFIX. This method accepts
		/// integers or vectors of integers as its arguments.
		ebevhan: This comment clashes with the assertions in the function. It doesn't take vectors.
		SDValue getExpandedFixedPointMultiplication(SDNode *Node,
		SelectionDAG &DAG) const;

//===--------------------------------------------------------------------===//
// Instruction Emitting Hooks
//

/// This method should be implemented by targets that mark instructions with
/// the 'usesCustomInserter' flag. These instructions are special in various
/// ways, which require special support to insert. The specified MachineInstr
/// is created but not inserted into any basic blocks, and this method is
[63 lines not shown]

llvm/include/llvm/IR/Intrinsics.td

	[733 lines not shown]

	def int_smul_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],
	[LLVMMatchType<0>, LLVMMatchType<0>],
	[IntrNoMem, IntrSpeculatable]>;
	def int_umul_with_overflow : Intrinsic<[llvm_anyint_ty, llvm_i1_ty],
	[LLVMMatchType<0>, LLVMMatchType<0>],
	[IntrNoMem, IntrSpeculatable]>;

	//===------------------------- Fixed Point Intrinsics ---------------------===//
				bjope: nit: In the LangRef you are using "Saturation Arithmetic Intrinsics" and "Fixed Point Arithmetic Intrinsics" as two separate chapters. And here we put all of them into one category, "Fixed Point Intrinsics". I'm not sure if the saturating arithmetic should be in a fixed point category. Anyway, when adding an int_smul_fix_sat later I guess it will be in "Fixed Point Arithmetic Intrinsics" in the LangRef (even if it is also saturating). So maybe it is better not to have this split into two categories after all.
				leonardchan: Split into 2 categories. Was also planning on putting the saturating fixed point ones under "Fixed Point Intrinsics".
	//
	def int_sadd_sat : Intrinsic<[llvm_anyint_ty],
	[LLVMMatchType<0>, LLVMMatchType<0>],
	[IntrNoMem, IntrSpeculatable, Commutative]>;
	def int_uadd_sat : Intrinsic<[llvm_anyint_ty],
	[LLVMMatchType<0>, LLVMMatchType<0>],
	[IntrNoMem, IntrSpeculatable, Commutative]>;
	def int_ssub_sat : Intrinsic<[llvm_anyint_ty],
	[LLVMMatchType<0>, LLVMMatchType<0>],
	[IntrNoMem, IntrSpeculatable]>;
	def int_usub_sat : Intrinsic<[llvm_anyint_ty],
	[LLVMMatchType<0>, LLVMMatchType<0>],
	[IntrNoMem, IntrSpeculatable]>;
				def int_smul_fix : Intrinsic<[llvm_anyint_ty],
				[LLVMMatchType<0>, LLVMMatchType<0>, llvm_i32_ty],
				[IntrNoMem, IntrSpeculatable, Commutative]>;

	//===------------------------- Memory Use Markers -------------------------===//
	//
	def int_lifetime_start : Intrinsic<[],
	[llvm_i64_ty, llvm_anyptr_ty],
	[IntrArgMemOnly, NoCapture<1>]>;
	def int_lifetime_end : Intrinsic<[],
	[llvm_i64_ty, llvm_anyptr_ty],
	[306 lines not shown]

llvm/include/llvm/Target/TargetSelectionDAG.td

[374 lines not shown] def umin : SDNode<"ISD::UMIN" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;
def umax : SDNode<"ISD::UMAX" , SDTIntBinOp,
[SDNPCommutative, SDNPAssociative]>;

def saddsat : SDNode<"ISD::SADDSAT" , SDTIntBinOp, [SDNPCommutative]>;
def uaddsat : SDNode<"ISD::UADDSAT" , SDTIntBinOp, [SDNPCommutative]>;
def ssubsat : SDNode<"ISD::SSUBSAT" , SDTIntBinOp>;
def usubsat : SDNode<"ISD::USUBSAT" , SDTIntBinOp>;
		def smulfix : SDNode<"ISD::SMULFIX" , SDTIntBinOp, [SDNPCommutative]>;
		ebevhan: It's marked SDTIntBinOp, but is supposed to have three input operands. I think these nodes might need a new SDT. Also, multiplication is obviously commutative, but I don't know if SDNPCommutative works on nodes that have anything except two operands. It might not have an effect at all. Someone who knows more about DAG might have more info on that.
		craig.topper: Tablegen should use the first two operands as the commutative ones. It used to be an error or an assertion. But I changed it sometime in the last year or so to make it work for FMA in X86.

def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;
def sext_invec : SDNode<"ISD::SIGN_EXTEND_VECTOR_INREG", SDTExtInvec>;
def zext_invec : SDNode<"ISD::ZERO_EXTEND_VECTOR_INREG", SDTExtInvec>;

def abs : SDNode<"ISD::ABS" , SDTIntUnaryOp>;
def bitreverse : SDNode<"ISD::BITREVERSE" , SDTIntUnaryOp>;
def bswap : SDNode<"ISD::BSWAP" , SDTIntUnaryOp>;
[954 lines not shown]

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

[1,122 lines not shown] case ISD::STRICT_FTRUNC:
break;
case ISD::SADDSAT:
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT: {
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;
}
		case ISD::SMULFIX: {
		unsigned Scale = cast<ConstantSDNode>(Node->getOperand(2))->getZExtValue();
		RKSimon: unsigned Scale = Node->getConstantOperandVal(2);
		Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
		Node->getValueType(0), Scale);
		break;
		}
case ISD::MSCATTER:
Action = TLI.getOperationAction(Node->getOpcode(),
cast<MaskedScatterSDNode>(Node)->getValue().getValueType());
break;
case ISD::MSTORE:
Action = TLI.getOperationAction(Node->getOpcode(),
cast<MaskedStoreSDNode>(Node)->getValue().getValueType());
break;
[2,125 lines not shown] bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
}
case ISD::SADDSAT:
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT: {
Results.push_back(TLI.getExpandedSaturationAdditionSubtraction(Node, DAG));
break;
}
		case ISD::SMULFIX: {
		Results.push_back(TLI.getExpandedFixedPointMultiplication(Node, DAG));
		break;
		}
case ISD::SADDO:
case ISD::SSUBO: {
SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);
SDValue Sum = DAG.getNode(Node->getOpcode() == ISD::SADDO ?
ISD::ADD : ISD::SUB, dl, LHS.getValueType(),
LHS, RHS);
Results.push_back(Sum);
[1,346 lines not shown]

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

[139 lines not shown] #endif

case ISD::ADDCARRY:
case ISD::SUBCARRY: Res = PromoteIntRes_ADDSUBCARRY(N, ResNo); break;

case ISD::SADDSAT:
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT: Res = PromoteIntRes_ADDSUBSAT(N); break;
		case ISD::SMULFIX: Res = PromoteIntRes_SMULFIX(N); break;

case ISD::ATOMIC_LOAD:
Res = PromoteIntRes_Atomic0(cast<AtomicSDNode>(N)); break;

case ISD::ATOMIC_LOAD_ADD:
case ISD::ATOMIC_LOAD_SUB:
case ISD::ATOMIC_LOAD_AND:
case ISD::ATOMIC_LOAD_CLR:
[455 lines not shown] SDValue DAGTypeLegalizer::PromoteIntRes_ADDSUBSAT(SDNode *N) {
Op2Promoted =
DAG.getNode(ISD::SHL, dl, PromotedType, Op2Promoted, ShiftAmount);

SDValue Result =
DAG.getNode(Opcode, dl, PromotedType, Op1Promoted, Op2Promoted);
return DAG.getNode(ShiftOp, dl, PromotedType, Result, ShiftAmount);
}

		SDValue DAGTypeLegalizer::PromoteIntRes_SMULFIX(SDNode *N) {
		// Can just promote the operands then continue with operation.
		SDLoc dl(N);
		SDValue Op1Promoted = GetPromotedInteger(N->getOperand(0));
		SDValue Op2Promoted = GetPromotedInteger(N->getOperand(1));
		efriedma: Do you need to sign-extend here?
		leonardchan: Yup
		EVT PromotedType = Op1Promoted.getValueType();
		craig.topper: Get rid of Op1 and Op2 and just do GetPromotedInteger(N->getOperand(0))
		return DAG.getNode(N->getOpcode(), dl, PromotedType, Op1Promoted, Op2Promoted,
		N->getOperand(2));
		}
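The sign-extension question in the thread above matters because a non-zero scale shifts high product bits down into the result. A minimal standalone sketch, assuming an i4 value promoted to 8 bits: the helpers `sext_i4`, `zext_i4`, and `smul_fix_i8_low4` are hypothetical names, floor rounding is an arbitrary choice, and zeroed upper bits stand in for just one possible outcome of a promotion that leaves the upper bits unspecified.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: an i4 value is held in the low 4 bits of a byte.
// Promote to 8 bits by replicating bit 3 (sign extension).
int8_t sext_i4(uint8_t bits) {
  int v = bits & 0xF;
  return static_cast<int8_t>(v >= 8 ? v - 16 : v);
}

// Promote to 8 bits leaving the upper bits zero. This is one possible
// outcome when the promotion does not define the upper bits.
int8_t zext_i4(uint8_t bits) { return static_cast<int8_t>(bits & 0xF); }

// 8-bit fixed-point multiply (floor rounding) on promoted operands,
// returning only the low 4 bits, i.e. the bits the original i4 result keeps.
uint8_t smul_fix_i8_low4(int8_t a, int8_t b, unsigned scale) {
  assert(scale < 8 && "scale must be smaller than the promoted width");
  int product = static_cast<int>(a) * static_cast<int>(b);
  int divisor = 1 << scale;
  int q = product / divisor;
  if (product % divisor != 0 && product < 0)
    --q; // round toward negative infinity
  return static_cast<uint8_t>(static_cast<unsigned>(q) & 0xFu);
}
```

For the i4 operands -2 (bit pattern 0xE) and 3 at scale 1, the sign-extended path gives 0xD, the i4 encoding of -3 (that is, -1 x 1.5 = -1.5), whereas the zero-filled path computes 14 x 3 and yields 0x5, which is wrong.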

SDValue DAGTypeLegalizer::PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo) {
if (ResNo == 1)
return PromoteIntRes_Overflow(N);

// The operation overflowed iff the result in the larger type is not the
// sign extension of its truncation to the original type.
SDValue LHS = SExtPromotedInteger(N->getOperand(0));
SDValue RHS = SExtPromotedInteger(N->getOperand(1));
[909 lines not shown] #endif
case ISD::USUBO: ExpandIntRes_UADDSUBO(N, Lo, Hi); break;
case ISD::UMULO:
case ISD::SMULO: ExpandIntRes_XMULO(N, Lo, Hi); break;

case ISD::SADDSAT:
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT: ExpandIntRes_ADDSUBSAT(N, Lo, Hi); break;
		case ISD::SMULFIX: ExpandIntRes_SMULFIX(N, Lo, Hi); break;
}

// If Lo/Hi is null, the sub-method took care of registering results etc.
if (Lo.getNode())
SetExpandedInteger(SDValue(N, ResNo), Lo, Hi);
}

/// Lower an atomic node to the appropriate builtin call.
[952 lines not shown]
}

void DAGTypeLegalizer::ExpandIntRes_ADDSUBSAT(SDNode *N, SDValue &Lo,
SDValue &Hi) {
SDValue Result = TLI.getExpandedSaturationAdditionSubtraction(N, DAG);
SplitInteger(Result, Lo, Hi);
}

		void DAGTypeLegalizer::ExpandIntRes_SMULFIX(SDNode *N, SDValue &Lo,
		SDValue &Hi) {
		SDLoc dl(N);
		EVT VT = N->getValueType(0);
		SDValue LHS = N->getOperand(0);
		SDValue RHS = N->getOperand(1);
		unsigned Scale = cast<ConstantSDNode>(N->getOperand(2))->getZExtValue();
		craig.topper: Just get N->getConstantOperandVal(2)?
		if (!Scale) {
		SDValue Result = DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);
		SplitInteger(Result, Lo, Hi);
		return;
		}

		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue LL, LH, RL, RH;
		GetExpandedInteger(LHS, LL, LH);
		GetExpandedInteger(RHS, RL, RH);
		if (!TLI.expandMUL(
		DAG.getNode(ISD::MUL, dl, VT, LHS, RHS).getNode(), Lo, Hi, NVT, DAG,
		TargetLowering::MulExpansionKind::OnlyLegalOrCustom, LL, LH, RL, RH))
		ebevhan: Maybe I'm missing something here, but isn't this just a normal, expanded multiplication?
		leonardchan: This should actually be an expansion of SMUL_LOHI since we need the high bits when scaling down the result after multiplication. Updated the code, but one thing I found was that `expandMUL_LOHI` depends on `ADDC` but does not have a check for it. When rerunning the tests, it seems that X86 does not support 32 bit `ADDE` (and `ADDC`):
		LLVM ERROR: Cannot select: t81: i32,glue = adde t121, t44:1, t80:1
		so I removed these from the lit tests, but if we want to support expansion of this, I'm not sure if the solution is to add a check into `expandMUL_LOHI` to check `isOperationLegalOrCustom(ISD::ADDC/E, VT)` and if it is not legal, write out the expanded version. I found a comment in `ExpandIntRes_ADDSUB()` saying that there currently does not appear to be a way of generating a value of `MVT::Glue` if we were to expand the result.
		ebevhan: It's a bit odd that x86 doesn't support ADDC/ADDE. Those nodes are deprecated in favor of ADDCARRY, so expandMUL_LOHI should probably be making expansions with ADDCARRY instead. @craig.topper probably knows more about this.
		efriedma: Yes, ADDC/ADDE are deprecated, and only work on targets which specifically implement lowering support, so there are a lot of targets where they simply can't be used (either because the target uses ADDCARRY instead, or because the operation simply doesn't exist on the target). expandMUL_LOHI is only used in very few places, though, so I guess we haven't hit this issue before. There are a few different ways to expand an ADD; see DAGTypeLegalizer::ExpandIntRes_ADDSUB. Ideally, we should just generate ADDCARRY and let legalization take care of the rest, but I don't think ADDCARRY legalization is currently implemented. Probably not hard to implement, though. (Maybe keep the existing ADDC/ADDE code for targets which use them.)
		leonardchan: I changed `expandMUL_LOHI` to use `ADDCARRY` if `ADDC` and `ADDE` are not available, and the corresponding expansion in the `DAGLegalizer`.
		return;
		ebevhan: If we hit this return, doesn't this mean that legalization failed? Do we want to catch this here?
		leonardchan: Added a fatal error report here.

		Lo = DAG.getNode(ISD::SRL, dl, NVT, Lo, DAG.getConstant(Scale, dl, NVT));
		Hi = DAG.getNode(ISD::SHL, dl, NVT, Hi,
		DAG.getConstant(NVT.getScalarSizeInBits() - Scale, dl, NVT));
		}
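The reviewers' points about needing the high half of the product and keeping shift amounts in range can be illustrated with a standalone model. This is a sketch under stated assumptions, not the code of this revision: `smul_fix_lohi` is a hypothetical helper, the operands are 32 bits wide, and the 64-bit product is split into two 32-bit halves in the way an SMUL_LOHI-style node would produce them. The scaled result takes the upper bits of the low half and the lower bits of the high half, combined with OR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM API): 32-bit fixed-point multiply built from
// the low and high halves of the full 64-bit product.
int32_t smul_fix_lohi(int32_t a, int32_t b, unsigned scale) {
  // Every shift amount below must stay within [0, 31].
  assert(scale < 32 && "scale must be smaller than the operand width");
  int64_t product = static_cast<int64_t>(a) * b;
  uint32_t lo = static_cast<uint32_t>(static_cast<uint64_t>(product));
  uint32_t hi = static_cast<uint32_t>(static_cast<uint64_t>(product) >> 32);

  // A scale of zero is a plain multiply; handling it separately also avoids
  // the out-of-range shift `hi << 32`.
  if (scale == 0)
    return static_cast<int32_t>(lo);

  // Drop `scale` fractional bits from lo and refill the vacated top bits
  // from the bottom of hi. The two bit ranges do not overlap, so OR works.
  uint32_t bits = (lo >> scale) | (hi << (32 - scale));
  // Narrowing to signed is modular on mainstream compilers (and by the
  // standard since C++20).
  return static_cast<int32_t>(bits);
}
```

Because the bits shifted out of `lo` are simply discarded, this particular construction rounds toward negative infinity: 3 and -3 at scale 1 give -5. A product such as 2^20 x 2^20 at scale 16 shows why the high half is required, since the low half alone is zero.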

void DAGTypeLegalizer::ExpandIntRes_SADDSUBO(SDNode *Node,
SDValue &Lo, SDValue &Hi) {
SDValue LHS = Node->getOperand(0);
SDValue RHS = Node->getOperand(1);
		craig.topper: If Scale is greater than NVTSize then this number is negative.
SDLoc dl(Node);

// Expand the result by simply replacing it with the equivalent
// non-overflow-checking operation.
SDValue Sum = DAG.getNode(Node->getOpcode() == ISD::SADDO ?
ISD::ADD : ISD::SUB, dl, LHS.getValueType(),
LHS, RHS);
		craig.topper: This result is only used by one of the 2 ifs below. Should it be pulled into that if. Can you add a comment to describe what's happening here?
SplitInteger(Sum, Lo, Hi);

// Compute the overflow.
//
// LHSSign -> LHS >= 0
// RHSSign -> RHS >= 0
// SumSign -> Sum >= 0
//
		craig.topper: Why is ResultLL being shifted right twice? I don't think I understand that.
		leonardchan: My bad. It shouldn't have shifted twice.
// Add:
// Overflow -> (LHSSign == RHSSign) && (LHSSign != SumSign)
// Sub:
// Overflow -> (LHSSign != RHSSign) && (LHSSign != SumSign)
//
EVT OType = Node->getValueType(1);
SDValue Zero = DAG.getConstant(0, dl, LHS.getValueType());

SDValue LHSSign = DAG.getSetCC(dl, OType, LHS, Zero, ISD::SETGE);
SDValue RHSSign = DAG.getSetCC(dl, OType, RHS, Zero, ISD::SETGE);
SDValue SignsMatch = DAG.getSetCC(dl, OType, LHSSign, RHSSign,
Node->getOpcode() == ISD::SADDO ?
ISD::SETEQ : ISD::SETNE);

SDValue SumSign = DAG.getSetCC(dl, OType, Sum, Zero, ISD::SETGE);
SDValue SumSignNE = DAG.getSetCC(dl, OType, LHSSign, SumSign, ISD::SETNE);

		craig.topper: If Scale == NVTSize then the result is exactly in ResultHL and ResultLH, but nothing can prevent at least 1 bit of ResultHH from being used here. The shift amount for the ResultHH shift must be between 0 and NVTSize-1 to not be undefined.
		leonardchan: Done. Added test case for this also.
SDValue Cmp = DAG.getNode(ISD::AND, dl, OType, SignsMatch, SumSignNE);		SDValue Cmp = DAG.getNode(ISD::AND, dl, OType, SignsMatch, SumSignNE);

// Use the calculated overflow everywhere.		// Use the calculated overflow everywhere.
ReplaceValueWith(SDValue(Node, 1), Cmp);		ReplaceValueWith(SDValue(Node, 1), Cmp);
}		}
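
The sign-comparison rule in the comment above can be checked outside the DAG. The following is a minimal standalone sketch, not LLVM code: the helper names `saddOverflows` and `ssubOverflows` are hypothetical, and the wrapping arithmetic is done on `uint32_t`, assuming the usual two's-complement conversion back to `int32_t`.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping add/sub, computed in unsigned arithmetic to avoid signed-overflow UB.
// Assumption: converting back to int32_t wraps modulo 2^32 (guaranteed since
// C++20, and the behavior of mainstream compilers before that).
static int32_t wrapAdd(int32_t a, int32_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a) + static_cast<uint32_t>(b));
}
static int32_t wrapSub(int32_t a, int32_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a) - static_cast<uint32_t>(b));
}

// Add: Overflow -> (LHSSign == RHSSign) && (LHSSign != SumSign)
bool saddOverflows(int32_t lhs, int32_t rhs) {
  bool lhsSign = lhs >= 0;
  bool rhsSign = rhs >= 0;
  bool sumSign = wrapAdd(lhs, rhs) >= 0;
  return (lhsSign == rhsSign) && (lhsSign != sumSign);
}

// Sub: Overflow -> (LHSSign != RHSSign) && (LHSSign != SumSign)
bool ssubOverflows(int32_t lhs, int32_t rhs) {
  bool lhsSign = lhs >= 0;
  bool rhsSign = rhs >= 0;
  bool diffSign = wrapSub(lhs, rhs) >= 0;
  return (lhsSign != rhsSign) && (lhsSign != diffSign);
}
```

For example, `saddOverflows(INT32_MAX, 1)` reports overflow because both operands are non-negative while the wrapped sum is negative.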

void DAGTypeLegalizer::ExpandIntRes_SDIV(SDNode *N,		void DAGTypeLegalizer::ExpandIntRes_SDIV(SDNode *N,
SDValue &Lo, SDValue &Hi) {		SDValue &Lo, SDValue &Hi) {
[654 lines hidden]	SDValue DAGTypeLegalizer::ExpandIntOp_BR_CC(SDNode *N) {

// Update N to have the operands specified.		// Update N to have the operands specified.
return SDValue(DAG.UpdateNodeOperands(N, N->getOperand(0),		return SDValue(DAG.UpdateNodeOperands(N, N->getOperand(0),
DAG.getCondCode(CCCode), NewLHS, NewRHS,		DAG.getCondCode(CCCode), NewLHS, NewRHS,
N->getOperand(4)), 0);		N->getOperand(4)), 0);
}		}

SDValue DAGTypeLegalizer::ExpandIntOp_SELECT_CC(SDNode *N) {		SDValue DAGTypeLegalizer::ExpandIntOp_SELECT_CC(SDNode *N) {
SDValue NewLHS = N->getOperand(0), NewRHS = N->getOperand(1);		SDValue NewLHS = N->getOperand(0), NewRHS = N->getOperand(1);
		efriedma: Maybe worth adding a comment that the operand that needs to be expanded must be the third operand, since the other two operands have the same type as the result. And actually, I'm not sure how you would hit this in practice; it seems unusual to split a boolean value.
		leonardchan: Oh. You're right, this shouldn't have happened in the first place. When expanding to use `ADDCARRY` in `expandMUL_LOHI`, I accidentally created the constant with a VT that was the same as the operands instead of an `MVT::i1`. Fixed this, and this method is no longer necessary.
ISD::CondCode CCCode = cast<CondCodeSDNode>(N->getOperand(4))->get();		ISD::CondCode CCCode = cast<CondCodeSDNode>(N->getOperand(4))->get();
IntegerExpandSetCCOperands(NewLHS, NewRHS, CCCode, SDLoc(N));		IntegerExpandSetCCOperands(NewLHS, NewRHS, CCCode, SDLoc(N));

// If ExpandSetCCOperands returned a scalar, we need to compare the result		// If ExpandSetCCOperands returned a scalar, we need to compare the result
// against zero to select between true and false values.		// against zero to select between true and false values.
if (!NewRHS.getNode()) {		if (!NewRHS.getNode()) {
NewRHS = DAG.getConstant(0, SDLoc(N), NewLHS.getValueType());		NewRHS = DAG.getConstant(0, SDLoc(N), NewLHS.getValueType());
CCCode = ISD::SETNE;		CCCode = ISD::SETNE;
[475 lines hidden]

llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h

[325 lines hidden]	private:
SDValue PromoteIntRes_SRL(SDNode *N);		SDValue PromoteIntRes_SRL(SDNode *N);
SDValue PromoteIntRes_TRUNCATE(SDNode *N);		SDValue PromoteIntRes_TRUNCATE(SDNode *N);
SDValue PromoteIntRes_UADDSUBO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_UADDSUBO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_ADDSUBCARRY(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_ADDSUBCARRY(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_UNDEF(SDNode *N);		SDValue PromoteIntRes_UNDEF(SDNode *N);
SDValue PromoteIntRes_VAARG(SDNode *N);		SDValue PromoteIntRes_VAARG(SDNode *N);
SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_ADDSUBSAT(SDNode *N);		SDValue PromoteIntRes_ADDSUBSAT(SDNode *N);
		SDValue PromoteIntRes_SMULFIX(SDNode *N);

// Integer Operand Promotion.		// Integer Operand Promotion.
bool PromoteIntegerOperand(SDNode *N, unsigned OpNo);		bool PromoteIntegerOperand(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_ANY_EXTEND(SDNode *N);		SDValue PromoteIntOp_ANY_EXTEND(SDNode *N);
SDValue PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N);		SDValue PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N);
SDValue PromoteIntOp_BITCAST(SDNode *N);		SDValue PromoteIntOp_BITCAST(SDNode *N);
SDValue PromoteIntOp_BUILD_PAIR(SDNode *N);		SDValue PromoteIntOp_BUILD_PAIR(SDNode *N);
SDValue PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo);
[69 lines hidden]	private:
void ExpandIntRes_Shift (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_Shift (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandIntRes_MINMAX (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_MINMAX (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandIntRes_SADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_SADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_UADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_UADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_XMULO (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_XMULO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUBSAT (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ADDSUBSAT (SDNode *N, SDValue &Lo, SDValue &Hi);
		void ExpandIntRes_SMULFIX (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandIntRes_ATOMIC_LOAD (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ATOMIC_LOAD (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandShiftByConstant(SDNode *N, const APInt &Amt,		void ExpandShiftByConstant(SDNode *N, const APInt &Amt,
SDValue &Lo, SDValue &Hi);		SDValue &Lo, SDValue &Hi);
bool ExpandShiftWithKnownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);		bool ExpandShiftWithKnownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);
bool ExpandShiftWithUnknownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);		bool ExpandShiftWithUnknownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);

[239 lines hidden]	private:
SDValue ScalarizeVecRes_SCALAR_TO_VECTOR(SDNode *N);		SDValue ScalarizeVecRes_SCALAR_TO_VECTOR(SDNode *N);
SDValue ScalarizeVecRes_VSELECT(SDNode *N);		SDValue ScalarizeVecRes_VSELECT(SDNode *N);
SDValue ScalarizeVecRes_SELECT(SDNode *N);		SDValue ScalarizeVecRes_SELECT(SDNode *N);
SDValue ScalarizeVecRes_SELECT_CC(SDNode *N);		SDValue ScalarizeVecRes_SELECT_CC(SDNode *N);
SDValue ScalarizeVecRes_SETCC(SDNode *N);		SDValue ScalarizeVecRes_SETCC(SDNode *N);
SDValue ScalarizeVecRes_UNDEF(SDNode *N);		SDValue ScalarizeVecRes_UNDEF(SDNode *N);
SDValue ScalarizeVecRes_VECTOR_SHUFFLE(SDNode *N);		SDValue ScalarizeVecRes_VECTOR_SHUFFLE(SDNode *N);

		SDValue ScalarizeVecRes_SMULFIX(SDNode *N);

// Vector Operand Scalarization: <1 x ty> -> ty.		// Vector Operand Scalarization: <1 x ty> -> ty.
bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);		bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_BITCAST(SDNode *N);		SDValue ScalarizeVecOp_BITCAST(SDNode *N);
SDValue ScalarizeVecOp_UnaryOp(SDNode *N);		SDValue ScalarizeVecOp_UnaryOp(SDNode *N);
SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);		SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);
SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue ScalarizeVecOp_VSELECT(SDNode *N);		SDValue ScalarizeVecOp_VSELECT(SDNode *N);
SDValue ScalarizeVecOp_VSETCC(SDNode *N);		SDValue ScalarizeVecOp_VSETCC(SDNode *N);
[19 lines hidden]	private:
void SplitVecRes_BinOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_BinOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_TernaryOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_TernaryOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_UnaryOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_UnaryOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_ExtendOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_ExtendOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_InregOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_InregOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_ExtVecInRegOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_ExtVecInRegOp(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_StrictFPOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_StrictFPOp(SDNode *N, SDValue &Lo, SDValue &Hi);

		void SplitVecRes_SMULFIX(SDNode *N, SDValue &Lo, SDValue &Hi);

void SplitVecRes_BITCAST(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_BITCAST(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_BUILD_VECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_BUILD_VECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_CONCAT_VECTORS(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_CONCAT_VECTORS(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_EXTRACT_SUBVECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_EXTRACT_SUBVECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_INSERT_SUBVECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_INSERT_SUBVECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_FPOWI(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_FPOWI(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_FCOPYSIGN(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_FCOPYSIGN(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo, SDValue &Hi);
[207 lines hidden]

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

[405 lines hidden]	SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::UMUL_LOHI:		case ISD::UMUL_LOHI:
case ISD::FCANONICALIZE:		case ISD::FCANONICALIZE:
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT:		case ISD::USUBSAT:
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
		case ISD::SMULFIX: {
		unsigned Scale = cast<ConstantSDNode>(Node->getOperand(2))->getZExtValue();
		RKSimon: unsigned Scale = Node->getConstantOperandVal(2);
		Action = TLI.getFixedPointOperationAction(Node->getOpcode(),
		Node->getValueType(0), Scale);
		break;
		}
case ISD::FP_ROUND_INREG:		case ISD::FP_ROUND_INREG:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
cast<VTSDNode>(Node->getOperand(1))->getVT());		cast<VTSDNode>(Node->getOperand(1))->getVT());
break;		break;
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
Node->getOperand(0).getValueType());		Node->getOperand(0).getValueType());
[817 lines hidden]

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

[166 lines hidden]	#endif
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
R = ScalarizeVecRes_StrictFPOp(N);		R = ScalarizeVecRes_StrictFPOp(N);
break;		break;
		case ISD::SMULFIX:
		R = ScalarizeVecRes_SMULFIX(N);
		break;
}		}

// If R is null, the sub-method took care of registering the result.		// If R is null, the sub-method took care of registering the result.
if (R.getNode())		if (R.getNode())
SetScalarizedVector(SDValue(N, ResNo), R);		SetScalarizedVector(SDValue(N, ResNo), R);
}		}

SDValue DAGTypeLegalizer::ScalarizeVecRes_BinOp(SDNode *N) {		SDValue DAGTypeLegalizer::ScalarizeVecRes_BinOp(SDNode *N) {
SDValue LHS = GetScalarizedVector(N->getOperand(0));		SDValue LHS = GetScalarizedVector(N->getOperand(0));
SDValue RHS = GetScalarizedVector(N->getOperand(1));		SDValue RHS = GetScalarizedVector(N->getOperand(1));
return DAG.getNode(N->getOpcode(), SDLoc(N),		return DAG.getNode(N->getOpcode(), SDLoc(N),
LHS.getValueType(), LHS, RHS, N->getFlags());		LHS.getValueType(), LHS, RHS, N->getFlags());
}		}

SDValue DAGTypeLegalizer::ScalarizeVecRes_TernaryOp(SDNode *N) {		SDValue DAGTypeLegalizer::ScalarizeVecRes_TernaryOp(SDNode *N) {
SDValue Op0 = GetScalarizedVector(N->getOperand(0));		SDValue Op0 = GetScalarizedVector(N->getOperand(0));
SDValue Op1 = GetScalarizedVector(N->getOperand(1));		SDValue Op1 = GetScalarizedVector(N->getOperand(1));
SDValue Op2 = GetScalarizedVector(N->getOperand(2));		SDValue Op2 = GetScalarizedVector(N->getOperand(2));
return DAG.getNode(N->getOpcode(), SDLoc(N),		return DAG.getNode(N->getOpcode(), SDLoc(N),
Op0.getValueType(), Op0, Op1, Op2);		Op0.getValueType(), Op0, Op1, Op2);
}		}

		SDValue DAGTypeLegalizer::ScalarizeVecRes_SMULFIX(SDNode *N) {
		SDValue Op0 = GetScalarizedVector(N->getOperand(0));
		SDValue Op1 = GetScalarizedVector(N->getOperand(1));
		SDValue Op2 = N->getOperand(2);
		return DAG.getNode(N->getOpcode(), SDLoc(N), Op0.getValueType(), Op0, Op1,
		Op2);
		}

SDValue DAGTypeLegalizer::ScalarizeVecRes_StrictFPOp(SDNode *N) {		SDValue DAGTypeLegalizer::ScalarizeVecRes_StrictFPOp(SDNode *N) {
EVT VT = N->getValueType(0).getVectorElementType();		EVT VT = N->getValueType(0).getVectorElementType();
unsigned NumOpers = N->getNumOperands();		unsigned NumOpers = N->getNumOperands();
SDValue Chain = N->getOperand(0);		SDValue Chain = N->getOperand(0);
EVT ValueVTs[] = {VT, MVT::Other};		EVT ValueVTs[] = {VT, MVT::Other};
SDLoc dl(N);		SDLoc dl(N);

SmallVector<SDValue, 4> Opers;		SmallVector<SDValue, 4> Opers;
[638 lines hidden]	#endif
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
SplitVecRes_StrictFPOp(N, Lo, Hi);		SplitVecRes_StrictFPOp(N, Lo, Hi);
break;		break;
		case ISD::SMULFIX:
		SplitVecRes_SMULFIX(N, Lo, Hi);
		break;
}		}

// If Lo/Hi is null, the sub-method took care of registering results etc.		// If Lo/Hi is null, the sub-method took care of registering results etc.
if (Lo.getNode())		if (Lo.getNode())
SetSplitVector(SDValue(N, ResNo), Lo, Hi);		SetSplitVector(SDValue(N, ResNo), Lo, Hi);
}		}

void DAGTypeLegalizer::SplitVecRes_BinOp(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::SplitVecRes_BinOp(SDNode *N, SDValue &Lo,
[21 lines hidden]	void DAGTypeLegalizer::SplitVecRes_TernaryOp(SDNode *N, SDValue &Lo,
SDLoc dl(N);		SDLoc dl(N);

Lo = DAG.getNode(N->getOpcode(), dl, Op0Lo.getValueType(),		Lo = DAG.getNode(N->getOpcode(), dl, Op0Lo.getValueType(),
Op0Lo, Op1Lo, Op2Lo);		Op0Lo, Op1Lo, Op2Lo);
Hi = DAG.getNode(N->getOpcode(), dl, Op0Hi.getValueType(),		Hi = DAG.getNode(N->getOpcode(), dl, Op0Hi.getValueType(),
Op0Hi, Op1Hi, Op2Hi);		Op0Hi, Op1Hi, Op2Hi);
}		}

		void DAGTypeLegalizer::SplitVecRes_SMULFIX(SDNode *N, SDValue &Lo,
		SDValue &Hi) {
		SDValue LHSLo, LHSHi;
		GetSplitVector(N->getOperand(0), LHSLo, LHSHi);
		SDValue RHSLo, RHSHi;
		GetSplitVector(N->getOperand(1), RHSLo, RHSHi);
		SDLoc dl(N);
		SDValue Op2 = N->getOperand(2);

		unsigned Opcode = N->getOpcode();
		Lo = DAG.getNode(Opcode, dl, LHSLo.getValueType(), LHSLo, RHSLo, Op2);
		Hi = DAG.getNode(Opcode, dl, LHSHi.getValueType(), LHSHi, RHSHi, Op2);
		}
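
The split above differs from `SplitVecRes_TernaryOp` in that only the first two operands are vectors; the scale (operand 2) is a scalar constant passed unchanged to both halves. The sketch below illustrates that idea on plain arrays. It is not LLVM code: `smulFixElt`, `smulFixV2`, and `smulFixV4Split` are hypothetical names, and the per-element reference assumes a 64-bit intermediate product and a scale below 32.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V2 = std::array<int32_t, 2>;
using V4 = std::array<int32_t, 4>;

// Per-element reference: widen, multiply, shift right by the scale, truncate.
// Assumes scale < 32 and modulo-2^32 conversion back to int32_t.
static int32_t smulFixElt(int32_t a, int32_t b, unsigned scale) {
  int64_t product = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  return static_cast<int32_t>(static_cast<uint64_t>(product) >> scale);
}

// The operation on a "legal" two-element vector.
static V2 smulFixV2(const V2 &x, const V2 &y, unsigned scale) {
  return V2{{smulFixElt(x[0], y[0], scale), smulFixElt(x[1], y[1], scale)}};
}

// Split a four-element operation into Lo and Hi halves. Both halves reuse the
// same scalar scale, mirroring how Op2 is forwarded to Lo and Hi unchanged.
V4 smulFixV4Split(const V4 &x, const V4 &y, unsigned scale) {
  V2 xLo{{x[0], x[1]}}, xHi{{x[2], x[3]}};
  V2 yLo{{y[0], y[1]}}, yHi{{y[2], y[3]}};
  V2 lo = smulFixV2(xLo, yLo, scale);
  V2 hi = smulFixV2(xHi, yHi, scale);
  return V4{{lo[0], lo[1], hi[0], hi[1]}};
}
```

Because the scale is not a vector, there is nothing to split for it, which is why a dedicated splitter is used rather than the generic ternary one.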

void DAGTypeLegalizer::SplitVecRes_BITCAST(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::SplitVecRes_BITCAST(SDNode *N, SDValue &Lo,
SDValue &Hi) {		SDValue &Hi) {
// We know the result is a vector. The input may be either a vector or a		// We know the result is a vector. The input may be either a vector or a
// scalar value.		// scalar value.
EVT LoVT, HiVT;		EVT LoVT, HiVT;
std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));		std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
SDLoc dl(N);		SDLoc dl(N);

[3,588 lines hidden]

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

[5,808 lines hidden]	case Intrinsic::ssub_sat: {
return nullptr;		return nullptr;
}		}
case Intrinsic::usub_sat: {		case Intrinsic::usub_sat: {
SDValue Op1 = getValue(I.getArgOperand(0));		SDValue Op1 = getValue(I.getArgOperand(0));
SDValue Op2 = getValue(I.getArgOperand(1));		SDValue Op2 = getValue(I.getArgOperand(1));
setValue(&I, DAG.getNode(ISD::USUBSAT, sdl, Op1.getValueType(), Op1, Op2));		setValue(&I, DAG.getNode(ISD::USUBSAT, sdl, Op1.getValueType(), Op1, Op2));
return nullptr;		return nullptr;
}		}
		case Intrinsic::smul_fix: {
		SDValue Op1 = getValue(I.getArgOperand(0));
		SDValue Op2 = getValue(I.getArgOperand(1));
		SDValue Op3 = getValue(I.getArgOperand(2));
		setValue(&I,
		DAG.getNode(ISD::SMULFIX, sdl, Op1.getValueType(), Op1, Op2, Op3));
		return nullptr;
		}
case Intrinsic::stacksave: {		case Intrinsic::stacksave: {
SDValue Op = getRoot();		SDValue Op = getRoot();
Res = DAG.getNode(		Res = DAG.getNode(
ISD::STACKSAVE, sdl,		ISD::STACKSAVE, sdl,
DAG.getVTList(TLI.getPointerTy(DAG.getDataLayout()), MVT::Other), Op);		DAG.getVTList(TLI.getPointerTy(DAG.getDataLayout()), MVT::Other), Op);
setValue(&I, Res);		setValue(&I, Res);
DAG.setRoot(Res.getValue(1));		DAG.setRoot(Res.getValue(1));
return nullptr;		return nullptr;
[4,607 lines hidden]

llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

[289 lines hidden]	#endif
case ISD::SHL_PARTS: return "shl_parts";		case ISD::SHL_PARTS: return "shl_parts";
case ISD::SRA_PARTS: return "sra_parts";		case ISD::SRA_PARTS: return "sra_parts";
case ISD::SRL_PARTS: return "srl_parts";		case ISD::SRL_PARTS: return "srl_parts";

case ISD::SADDSAT: return "saddsat";		case ISD::SADDSAT: return "saddsat";
case ISD::UADDSAT: return "uaddsat";		case ISD::UADDSAT: return "uaddsat";
case ISD::SSUBSAT: return "ssubsat";		case ISD::SSUBSAT: return "ssubsat";
case ISD::USUBSAT: return "usubsat";		case ISD::USUBSAT: return "usubsat";
		case ISD::SMULFIX: return "smulfix";

// Conversion operators.		// Conversion operators.
case ISD::SIGN_EXTEND: return "sign_extend";		case ISD::SIGN_EXTEND: return "sign_extend";
case ISD::ZERO_EXTEND: return "zero_extend";		case ISD::ZERO_EXTEND: return "zero_extend";
case ISD::ANY_EXTEND: return "any_extend";		case ISD::ANY_EXTEND: return "any_extend";
case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";		case ISD::SIGN_EXTEND_INREG: return "sign_extend_inreg";
case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";		case ISD::ANY_EXTEND_VECTOR_INREG: return "any_extend_vector_inreg";
case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";		case ISD::SIGN_EXTEND_VECTOR_INREG: return "sign_extend_vector_inreg";
[572 lines hidden]

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

[5,065 lines hidden]	if (Opcode == ISD::UADDSAT) {
APInt MaxVal = APInt::getSignedMaxValue(BitWidth);		APInt MaxVal = APInt::getSignedMaxValue(BitWidth);
SDValue SatMin = DAG.getConstant(MinVal, dl, ResultType);		SDValue SatMin = DAG.getConstant(MinVal, dl, ResultType);
SDValue SatMax = DAG.getConstant(MaxVal, dl, ResultType);		SDValue SatMax = DAG.getConstant(MaxVal, dl, ResultType);
SDValue SumNeg = DAG.getSetCC(dl, BoolVT, SumDiff, Zero, ISD::SETLT);		SDValue SumNeg = DAG.getSetCC(dl, BoolVT, SumDiff, Zero, ISD::SETLT);
Result = DAG.getSelect(dl, ResultType, SumNeg, SatMax, SatMin);		Result = DAG.getSelect(dl, ResultType, SumNeg, SatMax, SatMin);
return DAG.getSelect(dl, ResultType, Overflow, Result, SumDiff);		return DAG.getSelect(dl, ResultType, Overflow, Result, SumDiff);
}		}
}		}

		SDValue
		TargetLowering::getExpandedFixedPointMultiplication(SDNode *Node,
		SelectionDAG &DAG) const {
		assert(Node->getOpcode() == ISD::SMULFIX && "Expected opcode to be SMULFIX.");
		assert(Node->getNumOperands() == 3 &&
		"Expected signed fixed point multiplication to have 3 operands.");

		SDLoc dl(Node);
		SDValue LHS = Node->getOperand(0);
		SDValue RHS = Node->getOperand(1);
		assert(LHS.getValueType().isScalarInteger() &&
		"Expected operands to be integers. Vector of int arguments should "
		"already be unrolled.");
		assert(RHS.getValueType().isScalarInteger() &&
		"Expected operands to be integers. Vector of int arguments should "
		"already be unrolled.");
		assert(LHS.getValueType() == RHS.getValueType() &&
		"Expected both operands to be the same type");

		unsigned Scale = cast<ConstantSDNode>(Node->getOperand(2))->getZExtValue();
		RKSimon: unsigned Scale = Node->getConstantOperandVal(2);
		EVT VT = LHS.getValueType();
		assert(Scale < VT.getScalarSizeInBits() &&
		"Expected scale to be less than the number of bits.");

		SDValue Lo = DAG.getNode(ISD::MUL, dl, VT, LHS, RHS);
		if (Scale) {
		SDValue Hi = DAG.getNode(ISD::MULHS, dl, VT, LHS, RHS);
		ebevhan: Could use a couple comments explaining what we're doing with the values/SRL/SHL. Does this work if MULHS in VT is of dubious legality?
		craig.topper: Ideally we'd use MUL_LOHI if the target supports it. That should allow X86 to use a single multiply instruction in the test cases.
		leonardchan: Added checks to see if we can use `MULHS` or `SMUL_LOHI`.
		EVT ShiftTy = getShiftAmountTy(VT, DAG.getDataLayout());
		craig.topper: Shift amount constants should get their type from getShiftAmountTy.
		Lo = DAG.getNode(ISD::SRL, dl, VT, Lo, DAG.getConstant(Scale, dl, ShiftTy));
		Hi = DAG.getNode(
		ISD::SHL, dl, VT, Hi,
		craig.topper: This is really an OR isn't it? DAG combiner will turn it into that so might as well just use OR.
		DAG.getConstant(VT.getScalarSizeInBits() - Scale, dl, ShiftTy));
		craig.topper: No need for an else after a return.
		return DAG.getNode(ISD::OR, dl, VT, Lo, Hi);
		}
		return Lo;
		}
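
The expansion above builds the result from the low half of the product (`MUL`) shifted right by the scale, ORed with the high half (`MULHS`) shifted left by the bit width minus the scale. The following standalone sketch models that on 32-bit integers and compares it with a widened reference. It is an illustration under stated assumptions rather than the patch's code: `smulFixWide` and `smulFixParts` are hypothetical names, and the lo/hi halves are taken from a 64-bit product instead of real `MUL`/`MULHS` nodes.

```cpp
#include <cassert>
#include <cstdint>

// Reference: compute the full 64-bit product, shift right by the scale, and
// keep the low 32 bits. Assumes scale < 32.
int32_t smulFixWide(int32_t a, int32_t b, unsigned scale) {
  int64_t product = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  return static_cast<int32_t>(static_cast<uint64_t>(product) >> scale);
}

// Model of the expansion: Lo plays the role of ISD::MUL, Hi of ISD::MULHS.
int32_t smulFixParts(int32_t a, int32_t b, unsigned scale) {
  assert(scale < 32 && "Expected scale to be less than the number of bits.");
  uint64_t product =
      static_cast<uint64_t>(static_cast<int64_t>(a) * static_cast<int64_t>(b));
  uint32_t lo = static_cast<uint32_t>(product);        // low 32 bits
  uint32_t hi = static_cast<uint32_t>(product >> 32);  // high 32 bits
  if (scale == 0)
    return static_cast<int32_t>(lo);  // shifting by 32 - 0 would be undefined
  // Drop the low `scale` bits of Lo and fill the vacated top bits from Hi.
  // The two shifted values occupy disjoint bits, so OR is equivalent to ADD.
  uint32_t result = (lo >> scale) | (hi << (32 - scale));
  return static_cast<int32_t>(result);
}
```

With a scale of 2, the operands 6 and 10 represent 1.5 and 2.5, and both functions return 15, which represents 3.75. The `scale == 0` early return corresponds to the `if (Scale)` guard in the expansion.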

llvm/lib/CodeGen/TargetLoweringBase.cpp

[608 lines hidden]	for (MVT VT : MVT::all_valuetypes()) {
setOperationAction(ISD::SMAX, VT, Expand);		setOperationAction(ISD::SMAX, VT, Expand);
setOperationAction(ISD::UMIN, VT, Expand);		setOperationAction(ISD::UMIN, VT, Expand);
setOperationAction(ISD::UMAX, VT, Expand);		setOperationAction(ISD::UMAX, VT, Expand);
setOperationAction(ISD::ABS, VT, Expand);		setOperationAction(ISD::ABS, VT, Expand);
setOperationAction(ISD::SADDSAT, VT, Expand);		setOperationAction(ISD::SADDSAT, VT, Expand);
setOperationAction(ISD::UADDSAT, VT, Expand);		setOperationAction(ISD::UADDSAT, VT, Expand);
setOperationAction(ISD::SSUBSAT, VT, Expand);		setOperationAction(ISD::SSUBSAT, VT, Expand);
setOperationAction(ISD::USUBSAT, VT, Expand);		setOperationAction(ISD::USUBSAT, VT, Expand);
		setOperationAction(ISD::SMULFIX, VT, Expand);

// Overflow operations default to expand		// Overflow operations default to expand
setOperationAction(ISD::SADDO, VT, Expand);		setOperationAction(ISD::SADDO, VT, Expand);
setOperationAction(ISD::SSUBO, VT, Expand);		setOperationAction(ISD::SSUBO, VT, Expand);
setOperationAction(ISD::UADDO, VT, Expand);		setOperationAction(ISD::UADDO, VT, Expand);
setOperationAction(ISD::USUBO, VT, Expand);		setOperationAction(ISD::USUBO, VT, Expand);
setOperationAction(ISD::SMULO, VT, Expand);		setOperationAction(ISD::SMULO, VT, Expand);
setOperationAction(ISD::UMULO, VT, Expand);		setOperationAction(ISD::UMULO, VT, Expand);
[1,242 lines hidden]

llvm/lib/IR/Verifier.cpp

[4,523 lines hidden]	case Intrinsic::usub_sat: {
Assert(Op1->getType()->isIntOrIntVectorTy(),		Assert(Op1->getType()->isIntOrIntVectorTy(),
"first operand of [us][add|sub]_sat must be an int type or vector "		"first operand of [us][add|sub]_sat must be an int type or vector "
"of ints");		"of ints");
Assert(Op2->getType()->isIntOrIntVectorTy(),		Assert(Op2->getType()->isIntOrIntVectorTy(),
"second operand of [us][add|sub]_sat must be an int type or vector "		"second operand of [us][add|sub]_sat must be an int type or vector "
"of ints");		"of ints");
break;		break;
}		}
		case Intrinsic::smul_fix: {
		Value *Op1 = CS.getArgOperand(0);
		Value *Op2 = CS.getArgOperand(1);
		Assert(Op1->getType()->isIntOrIntVectorTy(),
		"first operand of smul_fix must be an int type or vector "
		"of ints");
		Assert(Op2->getType()->isIntOrIntVectorTy(),
		"second operand of smul_fix must be an int type or vector "
		"of ints");

		auto *Op3 = dyn_cast<ConstantInt>(CS.getArgOperand(2));
		Assert(Op3, "third argument of smul_fix must be a constant integer");
		craig.topper: argumenr->argument
		Assert(Op3->getType()->getBitWidth() <= 32,
		craig.topper: I think you need to check that Op3 is a ConstantInt as well. And that it fits in 32 bits.
		"third argument of smul_fix must fit within 32 bits");
		break;
		}
};		};
}		}

/// Carefully grab the subprogram from a local scope.		/// Carefully grab the subprogram from a local scope.
///		///
/// This carefully grabs the subprogram from a local scope, avoiding the		/// This carefully grabs the subprogram from a local scope, avoiding the
/// built-in assertions that would typically fire.		/// built-in assertions that would typically fire.
static DISubprogram *getSubprogram(Metadata *LocalScope) {		static DISubprogram *getSubprogram(Metadata *LocalScope) {
[668 lines hidden]

llvm/test/CodeGen/X86/smul_fix.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mcpu=generic -mtriple=x86_64-linux | FileCheck %s
				; RUN: llc < %s -mcpu=generic -mtriple=i686 -mattr=cmov | FileCheck %s --check-prefix=CHECK32
				RKSimon: You should be able to drop -mcpu. Also, can you please use -check-prefix=X64 and -check-prefix=X86?

				declare i4 @llvm.smul.fix.i4 (i4, i4, i32)
				declare i32 @llvm.smul.fix.i32 (i32, i32, i32)
				declare i64 @llvm.smul.fix.i64 (i64, i64, i32)
				declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32>, <4 x i32>, i32)

				RKSimon: Add nounwind to all the tests to reduce stack codegen?
				define i32 @func(i32 %x, i32 %y) {
				; CHECK-LABEL: func:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movslq %edi, %rcx
				; CHECK-NEXT: imull %esi, %edi
				; CHECK-NEXT: movslq %esi, %rax
				; CHECK-NEXT: imulq %rcx, %rax
				; CHECK-NEXT: shrq $32, %rax
				; CHECK-NEXT: shldl $30, %edi, %eax
				; CHECK-NEXT: # kill: def $eax killed $eax killed $rax
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: func:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %edx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: movl %eax, %ecx
				; CHECK32-NEXT: imull %edx, %ecx
				; CHECK32-NEXT: imull %edx
				; CHECK32-NEXT: shrdl $2, %edx, %ecx
				ebevhan: Interesting that the 32-bit target produces better code in most cases.
				; CHECK32-NEXT: movl %ecx, %eax
				; CHECK32-NEXT: retl
				%tmp = call i32 @llvm.smul.fix.i32(i32 %x, i32 %y, i32 2);
				ret i32 %tmp;
				}

				define i64 @func2(i64 %x, i64 %y) {
				; CHECK-LABEL: func2:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movq %rdi, %rax
				; CHECK-NEXT: movq %rdi, %rcx
				; CHECK-NEXT: imulq %rsi, %rcx
				; CHECK-NEXT: imulq %rsi
				; CHECK-NEXT: movq %rdx, %rax
				; CHECK-NEXT: shldq $62, %rcx, %rax
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: func2:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: pushl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 8
				; CHECK32-NEXT: .cfi_offset %esi, -8
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: movl %ecx, %eax
				; CHECK32-NEXT: mull %esi
				; CHECK32-NEXT: shrl $2, %eax
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: addl %ecx, %edx
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: addl %esi, %edx
				; CHECK32-NEXT: shll $30, %edx
				; CHECK32-NEXT: popl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 4
				; CHECK32-NEXT: retl
				%tmp = call i64 @llvm.smul.fix.i64(i64 %x, i64 %y, i32 2);
				ret i64 %tmp;
				}

				define i4 @func3(i4 %x, i4 %y) {
				; CHECK-LABEL: func3:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movl %edi, %eax
				; CHECK-NEXT: mulb %sil
				; CHECK-NEXT: shrb $2, %al
				; CHECK-NEXT: movsbl %sil, %ecx
				; CHECK-NEXT: movsbl %dil, %edx
				; CHECK-NEXT: imull %ecx, %edx
				; CHECK-NEXT: shrl $8, %edx
				; CHECK-NEXT: shlb $6, %dl
				; CHECK-NEXT: orb %dl, %al
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: func3:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: movsbl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: movsbl {{[0-9]+}}(%esp), %edx
				; CHECK32-NEXT: movl %eax, %ecx
				; CHECK32-NEXT: imull %edx, %ecx
				; CHECK32-NEXT: shlb $6, %ch
				; CHECK32-NEXT: # kill: def $al killed $al killed $eax
				; CHECK32-NEXT: mulb %dl
				; CHECK32-NEXT: shrb $2, %al
				; CHECK32-NEXT: orb %ch, %al
				; CHECK32-NEXT: retl
				%tmp = call i4 @llvm.smul.fix.i4(i4 %x, i4 %y, i32 2);
				ret i4 %tmp;
				}

				define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) {
				; CHECK-LABEL: vec:
				; CHECK: # %bb.0:
				; CHECK-NEXT: pshufd {{.*#+}} xmm2 = xmm1[3,1,2,3]
				; CHECK-NEXT: movd %xmm2, %eax
				; CHECK-NEXT: pshufd {{.*#+}} xmm2 = xmm0[3,1,2,3]
				; CHECK-NEXT: movd %xmm2, %ecx
				; CHECK-NEXT: movslq %ecx, %rdx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: cltq
				; CHECK-NEXT: imulq %rdx, %rax
				; CHECK-NEXT: shrq $32, %rax
				; CHECK-NEXT: shldl $30, %ecx, %eax
				; CHECK-NEXT: movd %eax, %xmm2
				; CHECK-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
				; CHECK-NEXT: movd %xmm3, %eax
				; CHECK-NEXT: pshufd {{.*#+}} xmm3 = xmm0[2,3,0,1]
				; CHECK-NEXT: movd %xmm3, %ecx
				; CHECK-NEXT: movslq %ecx, %rdx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: cltq
				; CHECK-NEXT: imulq %rdx, %rax
				; CHECK-NEXT: shrq $32, %rax
				; CHECK-NEXT: shldl $30, %ecx, %eax
				; CHECK-NEXT: movd %eax, %xmm3
				; CHECK-NEXT: punpckldq {{.*#+}} xmm3 = xmm3[0],xmm2[0],xmm3[1],xmm2[1]
				; CHECK-NEXT: movd %xmm1, %eax
				; CHECK-NEXT: movd %xmm0, %ecx
				; CHECK-NEXT: movslq %ecx, %rdx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: cltq
				; CHECK-NEXT: imulq %rdx, %rax
				; CHECK-NEXT: shrq $32, %rax
				; CHECK-NEXT: shldl $30, %ecx, %eax
				; CHECK-NEXT: movd %eax, %xmm2
				; CHECK-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,2,3]
				; CHECK-NEXT: movd %xmm1, %eax
				; CHECK-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,2,3]
				; CHECK-NEXT: movd %xmm0, %ecx
				; CHECK-NEXT: movslq %ecx, %rdx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: cltq
				; CHECK-NEXT: imulq %rdx, %rax
				; CHECK-NEXT: shrq $32, %rax
				; CHECK-NEXT: shldl $30, %ecx, %eax
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm0[0],xmm2[1],xmm0[1]
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0]
				; CHECK-NEXT: movdqa %xmm2, %xmm0
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: vec:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: pushl %ebp
				; CHECK32-NEXT: .cfi_def_cfa_offset 8
				; CHECK32-NEXT: pushl %ebx
				; CHECK32-NEXT: .cfi_def_cfa_offset 12
				; CHECK32-NEXT: pushl %edi
				; CHECK32-NEXT: .cfi_def_cfa_offset 16
				; CHECK32-NEXT: pushl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 20
				; CHECK32-NEXT: .cfi_offset %esi, -20
				; CHECK32-NEXT: .cfi_offset %edi, -16
				; CHECK32-NEXT: .cfi_offset %ebx, -12
				; CHECK32-NEXT: .cfi_offset %ebp, -8
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %edi
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: movl %eax, %ebp
				; CHECK32-NEXT: imull %ecx, %ebp
				; CHECK32-NEXT: imull %ecx
				; CHECK32-NEXT: movl %edx, %ecx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: shldl $30, %ebp, %ecx
				; CHECK32-NEXT: movl %eax, %ebp
				; CHECK32-NEXT: imull %esi, %ebp
				; CHECK32-NEXT: imull %esi
				; CHECK32-NEXT: movl %edx, %esi
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: shldl $30, %ebp, %esi
				; CHECK32-NEXT: movl %eax, %ebp
				; CHECK32-NEXT: imull %edi, %ebp
				; CHECK32-NEXT: imull %edi
				; CHECK32-NEXT: movl %edx, %edi
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: shldl $30, %ebp, %edi
				; CHECK32-NEXT: movl %eax, %ebp
				; CHECK32-NEXT: imull %ebx, %ebp
				; CHECK32-NEXT: imull %ebx
				; CHECK32-NEXT: shldl $30, %ebp, %edx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: movl %edx, 12(%eax)
				; CHECK32-NEXT: movl %edi, 8(%eax)
				; CHECK32-NEXT: movl %esi, 4(%eax)
				; CHECK32-NEXT: movl %ecx, (%eax)
				; CHECK32-NEXT: popl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 16
				; CHECK32-NEXT: popl %edi
				; CHECK32-NEXT: .cfi_def_cfa_offset 12
				; CHECK32-NEXT: popl %ebx
				; CHECK32-NEXT: .cfi_def_cfa_offset 8
				; CHECK32-NEXT: popl %ebp
				; CHECK32-NEXT: .cfi_def_cfa_offset 4
				; CHECK32-NEXT: retl $4
				%tmp = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 2);
				ret <4 x i32> %tmp;
				}

				; These result in regular integer multiplication
				define i32 @func4(i32 %x, i32 %y) {
				; CHECK-LABEL: func4:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movl %edi, %eax
				; CHECK-NEXT: imull %esi, %eax
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: func4:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: retl
				%tmp = call i32 @llvm.smul.fix.i32(i32 %x, i32 %y, i32 0);
				ret i32 %tmp;
				}

				define i64 @func5(i64 %x, i64 %y) {
				; CHECK-LABEL: func5:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movq %rdi, %rax
				; CHECK-NEXT: imulq %rsi, %rax
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: func5:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: pushl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 8
				; CHECK32-NEXT: .cfi_offset %esi, -8
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: movl %ecx, %eax
				; CHECK32-NEXT: mull %esi
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: addl %ecx, %edx
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: addl %esi, %edx
				; CHECK32-NEXT: popl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 4
				; CHECK32-NEXT: retl
				%tmp = call i64 @llvm.smul.fix.i64(i64 %x, i64 %y, i32 0);
				ret i64 %tmp;
				}

				define i4 @func6(i4 %x, i4 %y) {
				; CHECK-LABEL: func6:
				; CHECK: # %bb.0:
				; CHECK-NEXT: movl %edi, %eax
				; CHECK-NEXT: # kill: def $al killed $al killed $eax
				; CHECK-NEXT: mulb %sil
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: func6:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: movb {{[0-9]+}}(%esp), %al
				; CHECK32-NEXT: mulb {{[0-9]+}}(%esp)
				; CHECK32-NEXT: retl
				%tmp = call i4 @llvm.smul.fix.i4(i4 %x, i4 %y, i32 0);
				ret i4 %tmp;
				}

				define <4 x i32> @vec2(<4 x i32> %x, <4 x i32> %y) {
				; CHECK-LABEL: vec2:
				; CHECK: # %bb.0:
				; CHECK-NEXT: pshufd {{.*#+}} xmm2 = xmm1[3,1,2,3]
				; CHECK-NEXT: movd %xmm2, %eax
				; CHECK-NEXT: pshufd {{.*#+}} xmm2 = xmm0[3,1,2,3]
				; CHECK-NEXT: movd %xmm2, %ecx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: movd %ecx, %xmm2
				; CHECK-NEXT: pshufd {{.*#+}} xmm3 = xmm1[2,3,0,1]
				; CHECK-NEXT: movd %xmm3, %eax
				; CHECK-NEXT: pshufd {{.*#+}} xmm3 = xmm0[2,3,0,1]
				; CHECK-NEXT: movd %xmm3, %ecx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: movd %ecx, %xmm3
				; CHECK-NEXT: punpckldq {{.*#+}} xmm3 = xmm3[0],xmm2[0],xmm3[1],xmm2[1]
				; CHECK-NEXT: movd %xmm1, %eax
				; CHECK-NEXT: movd %xmm0, %ecx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: movd %ecx, %xmm2
				; CHECK-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,2,3]
				; CHECK-NEXT: movd %xmm1, %eax
				; CHECK-NEXT: pshufd {{.*#+}} xmm0 = xmm0[1,1,2,3]
				; CHECK-NEXT: movd %xmm0, %ecx
				; CHECK-NEXT: imull %eax, %ecx
				; CHECK-NEXT: movd %ecx, %xmm0
				; CHECK-NEXT: punpckldq {{.*#+}} xmm2 = xmm2[0],xmm0[0],xmm2[1],xmm0[1]
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm2 = xmm2[0],xmm3[0]
				; CHECK-NEXT: movdqa %xmm2, %xmm0
				; CHECK-NEXT: retq
				;
				; CHECK32-LABEL: vec2:
				; CHECK32: # %bb.0:
				; CHECK32-NEXT: pushl %edi
				; CHECK32-NEXT: .cfi_def_cfa_offset 8
				; CHECK32-NEXT: pushl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 12
				; CHECK32-NEXT: .cfi_offset %esi, -12
				; CHECK32-NEXT: .cfi_offset %edi, -8
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %edx
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: movl {{[0-9]+}}(%esp), %edi
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %edi
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %esi
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %edx
				; CHECK32-NEXT: imull {{[0-9]+}}(%esp), %ecx
				; CHECK32-NEXT: movl %ecx, 12(%eax)
				; CHECK32-NEXT: movl %edx, 8(%eax)
				; CHECK32-NEXT: movl %esi, 4(%eax)
				; CHECK32-NEXT: movl %edi, (%eax)
				; CHECK32-NEXT: popl %esi
				; CHECK32-NEXT: .cfi_def_cfa_offset 8
				; CHECK32-NEXT: popl %edi
				; CHECK32-NEXT: .cfi_def_cfa_offset 4
				; CHECK32-NEXT: retl $4
				%tmp = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 0);
				ret <4 x i32> %tmp;
				}
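
As a reading aid for the test above, here is a minimal Python sketch of the arithmetic these cases exercise. It is not taken from the patch. The function names (`to_signed`, `smul_fix_reference`, `smul_fix_hi_lo`) are hypothetical, and the sketch assumes the scaled result is rounded toward negative infinity, which is what a plain arithmetic shift gives; the rounding and overflow rules the intrinsic actually guarantees should be checked against LangRef.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def smul_fix_reference(x, y, scale, bits):
    """Assumed model: full-precision signed product, shifted right by `scale`,
    then truncated back to `bits` bits."""
    product = to_signed(x, bits) * to_signed(y, bits)
    return to_signed(product >> scale, bits)


def smul_fix_hi_lo(x, y, scale, bits):
    """Same value assembled from the two halves of the double-width product:
    (hi << (bits - scale)) | (lo >> scale). With scale == 0 the hi term is
    shifted out entirely and only the low half (a plain multiply) remains."""
    mask = (1 << bits) - 1
    product = to_signed(x, bits) * to_signed(y, bits)
    lo = product & mask            # low half of the double-width product
    hi = (product >> bits) & mask  # high half of the double-width product
    combined = (lo >> scale) | ((hi << (bits - scale)) & mask)
    return to_signed(combined, bits)


# 1.5 * 2.5 with two fractional bits: 6 * 10 = 60, and 60 >> 2 = 15 (3.75).
print(smul_fix_reference(6, 10, 2, 32))
print(smul_fix_hi_lo(-6, 10, 2, 32))
```

Under these assumptions, `bits - scale` is 30 for the `i32` cases and 62 for the `i64` case with a scale of 2, which lines up with the `shldl $30` and `shldq $62` instructions in the CHECK lines. With a scale of 0 both functions reduce to an ordinary truncating multiply, consistent with the comment on `func4` through `vec2`.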

Revision Contents

Diff 174703
