This is an archive of the discontinued LLVM Phabricator instance.

Differential D54749

Saturating float to int casts.
Closed · Public

Authored by bjope on Nov 20 2018, 3:49 AM.

Details

Reviewers

RKSimon
spatel
t.p.northover
hfinkel
MatzeB
andrew.w.kaylor
scanon
jdoerfert
nikic
bogner
craig.topper
ebevhan

Commits

rGa89d751fb401: Add intrinsics for saturating float to int casts

Summary

This is the first part split off from D54696.

This patch adds support for the fptoui.sat and fptosi.sat intrinsics, which provide basically the same functionality as the existing fptoui and fptosi instructions, but will saturate (or return 0 for NaN) on values unrepresentable in the target type, instead of returning poison. Related mailing list discussion can be found at: https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ.

The intrinsics have overloaded source and result type and support vector operands:

i32 @llvm.fptoui.sat.i32.f32(float %f)
i100 @llvm.fptoui.sat.i100.f64(double %f)
<4 x i32> @llvm.fptoui.sat.v4i32.v4f16(<4 x half> %f)
// etc
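The intended semantics can be modeled in plain C++. The following is only a sketch: the function name `fptosi_sat` is hypothetical, the source type is fixed to double, and the result width is limited to 64 bits, none of which is a restriction of the intrinsic itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical reference model of llvm.fptosi.sat.iN.f64 for 1 <= N <= 64.
// The result is returned sign-extended in an int64_t.
int64_t fptosi_sat(double f, unsigned width) {
  assert(width >= 1 && width <= 64);
  const int64_t max_int =
      static_cast<int64_t>((uint64_t{1} << (width - 1)) - 1);
  const int64_t min_int = -max_int - 1;
  // 2^(width-1) is a power of two, so it is exactly representable in double.
  const double bound = std::ldexp(1.0, static_cast<int>(width) - 1);

  if (std::isnan(f)) return 0;      // NaN converts to 0
  if (f >= bound) return max_int;   // too large, including +infinity
  if (f <= -bound) return min_int;  // too small, including -infinity
  return static_cast<int64_t>(f);   // in range: truncate toward zero
}
```

For example, `fptosi_sat(1e10, 32)` yields the largest i32 value, and `fptosi_sat(-3.9, 32)` yields -3.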

On the SelectionDAG layer two new ISD opcodes are added, FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands and one result. The second operand is an integer constant specifying the scalar saturation width. The idea here is that initially the second operand and the scalar width of the result type are the same, but they may change during type legalization. For example:

i19 @llvm.fptosi.sat.i19.f32(float %f)
// builds
i19 fp_to_sint_sat f, 19
// type legalizes (through integer result promotion)
i32 fp_to_sint_sat f, 19

I went for this approach, because saturated conversion does not compose well. There is no good way of "adjusting" a saturating conversion to i32 into one to i19 short of saturating twice. Specifying the saturation width separately allows directly saturating to the correct width.
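To illustrate why a saturating conversion to i32 cannot simply be narrowed to i19, here is a small integer-only sketch. The helper names are hypothetical. Truncating an i32 result to 19 bits wraps around, so the only way to recover the i19 answer from an i32 one is a second clamp, which is the "saturating twice" mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Keep the low `width` bits of v and sign-extend them, as a plain
// truncation to iN would.
int32_t truncate_to_width(int32_t v, unsigned width) {
  assert(width >= 1 && width <= 32);
  if (width == 32) return v;
  const uint32_t mask = (1u << width) - 1;
  const uint32_t sign = 1u << (width - 1);
  const uint32_t low = static_cast<uint32_t>(v) & mask;
  // (low ^ sign) - sign sign-extends a width-bit value.
  return static_cast<int32_t>(low ^ sign) - static_cast<int32_t>(sign);
}

// Clamp v to the signed range of an iN value (the second saturation).
int32_t saturate_to_width(int32_t v, unsigned width) {
  assert(width >= 1 && width <= 32);
  const int64_t max_int = (int64_t{1} << (width - 1)) - 1;
  const int64_t min_int = -(int64_t{1} << (width - 1));
  return static_cast<int32_t>(
      std::min<int64_t>(std::max<int64_t>(v, min_int), max_int));
}
```

With an input of 1000000 (already a valid i32 result), truncation to 19 bits gives -48576, while clamping gives 262143, the largest i19 value.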

There are two baseline expansions for the fp_to_xint_sat opcodes. If the integer bounds can be exactly represented in the float type and fminnum/fmaxnum are legal, we can expand to something like:

f = fmaxnum f, FP(MIN)
f = fminnum f, FP(MAX)
i = fptoxi f
i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN
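A rough C++ rendering of this first expansion, for f64 to i32 where both integer bounds are exact in double. The function name is hypothetical, and `std::fmax`/`std::fmin` stand in for fmaxnum/fminnum (both return the non-NaN operand when exactly one operand is NaN). Note that the NaN check has to look at the original input, because the clamped value is never NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Sketch of the clamp-then-convert expansion for f64 -> i32 (signed).
int32_t fptosi_sat_clamp(double f) {
  const double fp_min = -2147483648.0;  // FP(MIN), exact in double
  const double fp_max = 2147483647.0;   // FP(MAX), exact in double
  double clamped = std::fmax(f, fp_min);  // fmaxnum: a NaN input yields fp_min
  clamped = std::fmin(clamped, fp_max);   // fminnum
  assert(!std::isnan(clamped));           // clamping removes NaN
  const int32_t i = static_cast<int32_t>(clamped);  // always in range here
  return std::isnan(f) ? 0 : i;  // select f uo f, 0, i (on the original f)
}
```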

If the bounds cannot be exactly represented, we expand to something like this instead:

i = fptoxi f
i = select f ult FP(MIN), MIN, i
i = select f ogt FP(MAX), MAX, i
i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN

It should be noted that this expansion assumes a non-trapping fptoxi.
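The second expansion can be sketched in C++ for f32 to i32, where INT32_MAX is not exactly representable in float. This is an illustration under assumptions: the names are hypothetical, FP(MAX) is taken to be INT32_MAX rounded toward zero (2147483520), and the non-trapping fptoxi is modeled by a helper that returns an arbitrary marker value for out-of-range or NaN inputs, since a C++ cast would be undefined there.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for a non-trapping fptosi: out-of-range and NaN inputs produce
// an arbitrary value (a fixed marker here) instead of trapping.
int32_t nontrapping_fptosi(float f) {
  if (f >= -2147483648.0f && f < 2147483648.0f)
    return static_cast<int32_t>(f);
  return 0x5A5A5A5A;
}

// Sketch of the convert-then-select expansion for f32 -> i32 (signed).
int32_t fptosi_sat_select(float f) {
  const float fp_min = -2147483648.0f;  // FP(MIN), exact in float
  const float fp_max = 2147483520.0f;   // assumed: INT32_MAX rounded toward zero
  assert(fp_max < 2147483648.0f && "FP(MAX) must not round up past MAX");
  int32_t i = nontrapping_fptosi(f);
  // "ult": true if unordered or less than.
  i = !(f >= fp_min) ? std::numeric_limits<int32_t>::min() : i;
  // "ogt": true only if ordered and greater than.
  i = (f > fp_max) ? std::numeric_limits<int32_t>::max() : i;
  // "uo": true only for NaN.
  i = (f != f) ? 0 : i;
  return i;
}
```

Every input for which the marker value is produced is overwritten by one of the three selects, which is why the expansion only needs the conversion not to trap.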

Initial tests are for AArch64, x86_64 and ARM. This exercises all of the scalar and vector legalization. ARM is included to test float softening.

Original patch by @nikic.

Event Timeline

One thing I'm uncertain about here is the operand structure of the FP_TO_XINT_SAT opcode. There are basically three ways I could see this being represented, with a post-legalization saturation to v4i19:

v4i32 = fp_to_xint_sat f, VT:v4i19
v4i32 = fp_to_xint_sat f, VT:i19
v4i32 = fp_to_xint_sat f, 19

That is, either the second operand is the type to which we saturate (possibly a vector), the scalarization thereof, or the scalar bitwidth.

The current implementation is the first and is modeled after what sign_extend_inreg does. The disadvantage is that the argument requires special handling, in particular during vector legalizations. It might be possible to reuse a bit more code in that area if it is stored as the scalarized type. On the other hand that feels somewhat asymmetric.

The last variant is probably not great because it does not make it obvious that the saturation width must be static.

t.p.northover added a subscriber: t.p.northover. Nov 20 2018, 6:00 AM

t.p.northover added inline comments.

docs/LangRef.rst
13105	What's going on here? All the other intrinsics I know specify the return type first in the prototype. I didn't even think it was something you could override.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2524	Maybe put the normal assertion in here? assert(NewOutTy.isInteger() && "Ran out of possibilities!");
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
5686	I was expecting this tag to be the scalar type in the vector situation. Does something get neater this way round?

The current implementation is the first and is modeled after what sign_extend_inreg does.

That's pretty persuasive. Though possibly something we could fix in sign-extend if enough people think the scalar is better.

Put result type first in the type suffix of the intrinsics. Add an assertion.

nikic marked 2 inline comments as done. Nov 20 2018, 7:09 AM

nikic added inline comments.

docs/LangRef.rst
13105	You're absolutely right, I've flipped the order now, here and in tests. It looks like LLVM is very forgiving of incorrect intrinsic name suffixes and just normalizes them to be correct.

Rebase and fix some formatting.

rkruppe added a subscriber: rkruppe. Nov 28 2018, 2:16 PM

Ping. Would appreciate some feedback, especially regarding the structure of the SelectionDAG nodes.

Add X86 tests.

andreadb added a subscriber: andreadb. Dec 12 2018, 5:28 AM

I think the consistency argument is good, so I'm in favour of the structure you chose now. My main issue at the moment is the tests: hard-coding register usage leads to very fragile tests.

It might be better to split the test files into fptosi-sat-*.ll and fptoui-sat-*.ll

test/CodeGen/X86/fptoi-sat-scalar.ll

2 ↗	(On Diff #177839)	Please add i686 coverage:

; RUN: llc < %s -mtriple=i686-linux | FileCheck %s --check-prefix=X86,X86-X87
; RUN: llc < %s -mtriple=i686-linux -mattr=+sse2 | FileCheck %s --check-prefix=X86,X86-SSE
; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s --check-prefix=X64

Add i686 tests, split test files for fptoui and fptosi.

I had to comment out a few more tests on the X86 side, because they would need libcall legalizations on i686, which are not implemented in this patch.

Rebase

In D54749#1329486, @t.p.northover wrote:

I think the consistency argument is good, so I'm in favour of the structure you chose now. My main issue at the moment is the tests: hard-coding register usage leads to very fragile tests.

Is there any way to avoid hardcoded register usage in an automated manner? With vector coverage, this is probably going to need about 15k lines of tests just for two platforms, and they're going to change quite a bit until codegen is finalized. I don't think it's productive to write and rewrite those by hand.

In D54749#1339117, @nikic wrote:

In D54749#1329486, @t.p.northover wrote:

I think the consistency argument is good, so I'm in favour of the structure you chose now. My main issue at the moment is the tests: hard-coding register usage leads to very fragile tests.

Is there any way to avoid hardcoded register usage in an automated manner?

From previous discussions, the hardcoded/spelled-out register names are intentional.

With vector coverage, this is probably going to need about 15k lines of tests just for two platforms, and they're going to change quite a bit until codegen is finalized. I don't think it's productive to write and rewrite those by hand.

+1, update_llc_test_checks.py is the standard tool for the task, and is the only sane way to have meaningful, reasonable test coverage without going insane from having to write CHECK lines by hand.

Ping

sunfish added a subscriber: sunfish. Jan 17 2019, 9:23 AM

Ping

arsenm added a subscriber: arsenm. Feb 8 2019, 11:15 AM

arsenm added inline comments.

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
961–963	I'm pretty sure this will warn. There's no point in doing this anyway since you'll get the appropriate error if you just leave this out
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
2308	Ditto

Remove unimplemented legalization stubs.

Add nounwind to avoid CFI.

Herald added a subscriber: jdoerfert. Feb 12 2019, 2:03 PM

Ping

spatel mentioned this in D60021: InstSimplify: Fold round intrinsics from sitofp. Apr 1 2019, 8:04 AM

aykevl added a subscriber: aykevl. Apr 12 2019, 6:00 AM

lzutao added a subscriber: lzutao. Jun 26 2019, 8:39 PM

Adding more potential reviewers

Rebase

Herald added a project: Restricted Project. Jun 27 2019, 12:00 PM

Herald added a subscriber: hiraditya.

fhahn added a subscriber: fhahn. Jun 28 2019, 11:37 AM

Reviewers: what do we need to get this across the finish line?

llvm/docs/LangRef.rst
14206 ↗	(On Diff #206902)	typo: "the same"
14257 ↗	(On Diff #206902)	Ditto.
llvm/include/llvm/CodeGen/ISDOpcodes.h
552 ↗	(On Diff #206902)	Can we rewrite this to be a bit more clear? I find something like the following easier to understand: If the FP value is NaN, the result is 0. Otherwise it is clamped to the range representable by the type of operand 1, and the fractional part is discarded. This is partially a matter of taste, of course.

This revision now requires changes to proceed. Jul 22 2019, 1:27 PM

jeffvandyke added a subscriber: jeffvandyke. Aug 6 2019, 10:46 AM

Rebase.

For what it's worth, RISC-V implements semantics close to but not exactly like ARM, with NaN being treated as if it was +Infinity.
Source: https://content.riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf#page=63
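As a rough illustration of what this difference would mean for a lowering, the sketch below models a conversion with the RISC-V-style behavior described above (an assumption here: out-of-range inputs saturate and NaN yields the maximum integer) and shows that one extra NaN check would recover the intrinsic's semantics. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Model of a RISC-V-style f32 -> i32 conversion (assumed behavior):
// saturates out-of-range inputs and treats NaN like +infinity.
int32_t riscv_style_fptosi(float f) {
  if (std::isnan(f) || f >= 2147483648.0f)
    return std::numeric_limits<int32_t>::max();
  if (f <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(f);  // in range: truncate toward zero
}

// fptosi.sat semantics built on top of it: only NaN needs fixing up.
int32_t fptosi_sat_via_riscv_style(float f) {
  return std::isnan(f) ? 0 : riscv_style_fptosi(f);
}
```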

(Also, I need these intrinsics as well for a Go compiler).

ping

@RKSimon @scanon @nikic

These intrinsics are necessary to implement Nontrapping Float To Int Conversions, which have now been merged with the mainline WebAssembly spec. Without these intrinsics, a significant amount of code is required to get them to function (using Rust's implementation as an example, amount of code varies by language). Since the WebAssembly standard now requires these operations, I would like to know the status of this PR and how close it is to being merged.

ebevhan added a subscriber: ebevhan. Aug 7 2020, 7:41 AM

Herald added a reviewer: jdoerfert. Aug 7 2020, 7:41 AM

These intrinsics would likely also be useful for implementing Embedded-C's floating point to fixed-point conversion in Clang.

I've tried to get this in for a long time, but I don't think there's enough interest in this functionality. Rust ended up doing the saturation itself, even though that produces code that is much worse than a custom lowered intrinsic. At some point you have to cut your losses :)

That's unfortunate. I was looking forward to this to use in TinyGo. For TinyGo it doesn't matter what the exact behavior is, just that it is something sensible (and not UB). Saturating float operations would be great.

Also, wouldn't Rust start using this once it becomes available in LLVM? I remember from the GitHub issue that performance was a concern.

It seems like there are quite a few people who would like to use this intrinsic, though. The only lack of interest seems to be in reviewing.

I was planning on submitting something similar for the fixed-point support, but now that I know this patch exists, I feel like helping this move forward is the better option.

In D54749#2206518, @ebevhan wrote:

It seems like there are quite a few people who would like to use this intrinsic, though. The only lack of interest seems to be in reviewing.
I was planning on submitting something similar for the fixed-point support, but now that I know this patch exists, I feel like helping this move forward is the better option.

That seems like the right way to proceed (commandeer/reopen) - there was definitely interest from the early reviewer comments, but maybe they're tied up with other stuff, so just add more people to the list.
Personally, I didn't look at this closely because I had higher priority patches and I thought other reviewers would continue. I don't think there are any fundamental blockers. Sometimes you just have to keep pinging. :)

ebevhan mentioned this in D81904: [clang] Do not crash for unsupported fixed point to floating point conversion. Aug 10 2020, 9:57 AM

In D54749#2206737, @spatel wrote:

In D54749#2206518, @ebevhan wrote:

It seems like there are quite a few people who would like to use this intrinsic, though. The only lack of interest seems to be in reviewing.
I was planning on submitting something similar for the fixed-point support, but now that I know this patch exists, I feel like helping this move forward is the better option.

That seems like the right way to proceed (commandeer/reopen) - there was definitely interest from the early reviewer comments, but maybe they're tied up with other stuff, so just add more people to the list.
Personally, I didn't look at this closely because I had higher priority patches and I thought other reviewers would continue. I don't think there are any fundamental blockers. Sometimes you just have to keep pinging. :)

Well, I suspect that the GIsel people will want a GIsel lowering of the intrinsics to match the ISelDAG ones, but other than that the patch looks good from what I can see.

I will probably commandeer this soon and try for another push at getting it in if @nikic doesn't mind.

Ka-Ka added a subscriber: Ka-Ka. Aug 12 2020, 3:25 AM

ebevhan commandeered this revision. Aug 14 2020, 8:50 AM

ebevhan added a reviewer: nikic.

Thank you for picking this up. Some pointers from my side:

I would recommend against including GlobalISel support in the initial implementation. Unless trivial, SDAG and GlobalISel changes should never be made in the same patch, as the reviewers for these parts of the codebase are essentially disjoint.
You may want to replace the VT specifying the saturation width with a simple constant integer operand. This is the approach that the fixed-point ISD opcodes went with, and it should make legalization a bit simpler.
I originally restricted this patch to a minimum viable implementation, in the hope that it would make review easier and get this landed quickly. Given how things turned out, it might make sense to put the soft float and vector legalization (which are part of D54696) back into this patch, so that legalization support is complete and this can stand on its own.
For soft float legalization, I would switch from libcall legalization to expanding and recursively legalizing. Adding so many new compiler-rt functions is an unnecessary burden.
Finally, something worth mentioning is that the legalization implemented here is not compatible with trapping fptoi (at least one of the expansions isn't). I don't believe trapping fptoi's are actually legal per langref, but I've also seen people adjust x86 fptoi lowering to work with trapping fptoi at some point, so I'm a bit confused on what the state here is.

Finally, something worth mentioning is that the legalization implemented here is not compatible with trapping fptoi (at least one of the expansions isn't). I don't believe trapping fptoi's are actually legal per langref, but I've also seen people adjust x86 fptoi lowering to work with trapping fptoi at some point, so I'm a bit confused on what the state here is.

The fptosi instruction doesn't have side-effects. llvm.experimental.constrained.fptosi can raise floating-point exceptions. I'm not sure anyone has looked at actually trapping.

In D54749#2218724, @efriedma wrote:

Finally, something worth mentioning is that the legalization implemented here is not compatible with trapping fptoi (at least one of the expansions isn't). I don't believe trapping fptoi's are actually legal per langref, but I've also seen people adjust x86 fptoi lowering to work with trapping fptoi at some point, so I'm a bit confused on what the state here is.

The fptosi instruction doesn't have side-effects. llvm.experimental.constrained.fptosi can raise floating-point exceptions. I'm not sure anyone has looked at actually trapping.

Thanks for the clarification. The changes I vaguely remembered here were apparently D53794 and D67105, which were indeed related to FPEs. I presume it is not a problem if saturating fptoi causes FPEs as part of normal operation?

In D54749#2218829, @nikic wrote:

In D54749#2218724, @efriedma wrote:

Finally, something worth mentioning is that the legalization implemented here is not compatible with trapping fptoi (at least one of the expansions isn't). I don't believe trapping fptoi's are actually legal per langref, but I've also seen people adjust x86 fptoi lowering to work with trapping fptoi at some point, so I'm a bit confused on what the state here is.

The fptosi instruction doesn't have side-effects. llvm.experimental.constrained.fptosi can raise floating-point exceptions. I'm not sure anyone has looked at actually trapping.

Thanks for the clarification. The changes I vaguely remembered here were apparently D53794 and D67105, which were indeed related to FPEs. I presume it is not a problem if saturating fptoi causes FPEs as part of normal operation?

Right, you can ignore the possibility of floating-point exceptions. (We might at some point want to implement llvm.experimental.constrained.fptosi.sat, but that would be a separate intrinsic.)

In D54749#2218465, @nikic wrote:

Thank you for picking this up. Some pointers from my side:

I would recommend against including GlobalISel support in the initial implementation. Unless trivial, SDAG and GlobalISel changes should never be made in the same patch, as the reviewers for these parts of the codebase are essentially disjoint.

All right.

You may want to replace the VT specifying the saturation width with a simple constant integer operand. This is the approach that the fixed-point ISD opcodes went with, and it should make legalization a bit simpler.

Well, the value on the fixed point nodes isn't for saturation, and I think there are actually a few instances in which the fixed point nodes (such as division) could have used a saturation width for simplifying legalization. I didn't add it, though. It's very convenient for these nodes.

I originally restricted this patch to a minimum viable implementation, in the hope that it would make review easier and get this landed quickly. Given how things turned out, it might make sense to put the soft float and vector legalization (which are part of D54696) back into this patch, so that legalization support is complete and this can stand on its own.

For soft float legalization, I would switch from libcall legalization to expanding and recursively legalizing. Adding so many new compiler-rt functions is an unnecessary burden.

Hm, okay. I actually did originally fold both of those into this patch but then took it out since it wasn't originally part of it.

I agree on the libcalls. I was also planning on doing the expansion directly instead of going for libcalls. In most cases it shouldn't be that much worse than if we had a direct libcall. The worst case is 4 libcalls, but I think that on targets that do not support floating point, it should be expected to be slow anyway.

Rebased and addressed comments.

Included all vector, result expansion and float softening in the patch. The latter two are done by simply performing the standard expansion. In the worst case, this can result in up to 4 libcalls, but I think this is preferable to adding a massive number of new ones.
Replaced the VT operand with an integer constant. For vectors, this is the scalar saturation width.

ebevhan added inline comments. Aug 17 2020, 5:39 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7987 ↗	(On Diff #285971)	I also added this to solve issues in expanding f16 FP_TO_XINT nodes to libcalls. Libcalls with a source of f16 don't seem to exist, so we can't emit such nodes if we will end up making libcalls of them. Maybe this should check operation and type legality, though.

Removed stray merge mistake.

nikic added inline comments. Aug 17 2020, 12:55 PM

llvm/include/llvm/IR/Intrinsics.td
1251 ↗	(On Diff #285990)	These should also be IntrWillReturn, which has been added in the meantime.

blackhole12 removed a subscriber: blackhole12. Aug 17 2020, 1:27 PM

Added IntrWillReturn.

Updated summary on the second operand of the node.

Fixed minor formatting nits.

Adding some more reviewers to get some more eyes on this.

As @spatel mentioned this is probably pretty close to being finished, so it would be great to get this final stretch over and done with.

RKSimon added inline comments. Aug 18 2020, 9:30 AM

llvm/docs/LangRef.rst
16142 ↗	(On Diff #286286)	Add (this includes negative infinity) ?

Addressed comment.

Ping.

Herald added a subscriber: bjope. Aug 26 2020, 3:39 AM

Missing globalisel support

In D54749#2238740, @arsenm wrote:

Missing globalisel support

I did mention GIsel before I commandeered the patch, but @nikic thought it was better to do GIsel in a separate step rather than integrate it into this patch. Is that acceptable?

I have yet to start on the GIsel implementation, however, so I don't really have anything to show just yet.

ebevhan added a child revision: D86632: [Fixed Point] Add codegen for conversion between fixed-point and floating point. Aug 26 2020, 8:37 AM

I can't competently review LLVM code-gen patches, but I can review intrinsic design. If the intended use case here is fixed point, should there be a scale parameter rather than expecting the caller to scale the input? Or is scaling the input basically the expected implementation on all hardware? Scaling the input can be lossy if we overflow to infinity, so that would be making a tacit assumption here that if that happens, the original value must've been outside the representation range of the integer. For float (and bfloat), the exponent goes up to 127, so you would need to be converting to a pretty large fixed-point format to have problems with lossy overflow. For half, the exponent only goes up to 16, so converting to a 32-bit fixed-point format with a scale could overflow, and you probably won't be able to use these intrinsics.
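The lossy-overflow concern for half can be made concrete with a little arithmetic, sketched here in C++. The function names are hypothetical, and the computation is done in double rather than in an actual half type. It relies on the largest finite IEEE binary16 value being 65504, so that under round-to-nearest-even any magnitude of 65520 or more rounds to infinity.

```cpp
#include <cassert>
#include <cmath>

// True when value * 2^frac_bits, if computed in IEEE binary16 with
// round-to-nearest-even, would overflow to infinity.
bool half_scaling_overflows(double value, int frac_bits) {
  assert(frac_bits >= 0);
  // Midpoint between the largest finite half (65504) and 2^16.
  const double overflow_threshold = 65520.0;
  return std::fabs(std::ldexp(value, frac_bits)) >= overflow_threshold;
}

// True when the scaled value would fit in a signed 32-bit integer.
bool scaled_fits_in_i32(double value, int frac_bits) {
  const double scaled = std::ldexp(value, frac_bits);
  return scaled >= -2147483648.0 && scaled <= 2147483647.0;
}
```

For a 32-bit fixed-point format with 16 fractional bits, the value 1.0 scales to 65536: `scaled_fits_in_i32(1.0, 16)` is true, yet `half_scaling_overflows(1.0, 16)` is also true, so a saturating conversion of the scaled half would clamp a value that the integer type can represent.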

Please take care of the clang-format lints.

There are some missing optimizations here. It's probably worth teaching LegalizeVectorOps to expand the conversion into vector ops. And for conversions that correspond to a native instruction, there's probably some more efficient code sequence. But those can be dealt with in followups.

Making the saturation type an argument seemed a little strange at first glance, but it seems to work out reasonably well. Maybe could change the argument to be a type, like ISD::SIGN_EXTEND_INREG.

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2683 ↗	(On Diff #286505)	This can't be the only place that uses this loop; can it be refactored?
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
163 ↗	(On Diff #286505)	Do we need a corresponding ScalarizeVecOp_FP_TO_XINT_SAT?
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7987 ↗	(On Diff #285971)	In theory, legalization should ensure we generate appropriate libcalls. Maybe that doesn't actually happen at the moment, though.

ebevhan mentioned this in D86632: [Fixed Point] Add codegen for conversion between fixed-point and floating point. Aug 27 2020, 1:28 AM

In D54749#2239801, @rjmccall wrote:

I can't competently review LLVM code-gen patches, but I can review intrinsic design. If the intended use case here is fixed point, should there be a scale parameter rather than expecting the caller to scale the input? Or is scaling the input basically the expected implementation on all hardware? Scaling the input can be lossy if we overflow to infinity, so that would be making a tacit assumption here that if that happens, the original value must've been outside the representation range of the integer. For float (and bfloat), the exponent goes up to 127, so you would need to be converting to a pretty large fixed-point format to have problems with lossy overflow. For half, the exponent only goes up to 16, so converting to a 32-bit fixed-point format with a scale could overflow, and you probably won't be able to use these intrinsics.

This is a good point that I hadn't considered. I left a comment in D86632.

I was originally planning on adding such an intrinsic (with a scaling factor) but dropped the idea when I found this patch. You are right in that the small exponent of half precision is problematic, though. Not sure what to do about that.

In D54749#2239911, @efriedma wrote:

Please take care of the clang-format lints.

I presume it's fine to only fix the ones that don't conform to some existing unconventional code layout?

There are some missing optimizations here. It's probably worth teaching LegalizeVectorOps to expand the conversion into vector ops. And for conversions that correspond to a native instruction, there's probably some more efficient code sequence. But those can be dealt with in followups.

For improved native support, I've pulled out D86079 and D86078 from the original patch.

By expand into vector ops, you mean to avoid breaking up the vector operation into scalar operations and then legalizing those, and instead expanding the vector operation directly to the corresponding operations? What would be the criteria for doing this rather than splitting it up first?

Making the saturation type an argument seemed a little strange at first glance, but it seems to work out reasonably well. Maybe could change the argument to be a type, like ISD::SIGN_EXTEND_INREG.

It was originally a type operand similar to SIGN_EXTEND_INREG, but the suggestion was to replace it with a constant operand instead to make things simpler. The only thing we care about is the saturation bit width.

It might have been fine to make it a scalar type operand even for vector types, though.

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2683 ↗	(On Diff #286505)	There are similar loops in some of the other PromoteLegal* functions. The loops in those functions could probably be refactored - they look compatible - but this loop isn't quite the same since the signed nodes can't be used to implement the unsigned ones.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7987 ↗	(On Diff #285971)	There are some conditions in the libcall emission which do cause promotion of the source before emission, but it doesn't happen in every case (even though no libcalls from half->int seem to exist in the first place).

uabelho added a subscriber: uabelho. Aug 27 2020, 4:48 AM

I was originally planning on adding such an intrinsic (with a scaling factor) but dropped the idea when I found this patch. You are right in that the small exponent of half precision is problematic, though. Not sure what to do about that.

It'd be a problem with float and bfloat, too, if we ever have to support any 128-bit fixed-point formats. I think planning around this is probably the better approach.

At least for the IEEE formats, you should be able to just destructure the bit-pattern of the float, right? Normalize and do some extends and shifts.

At least for the IEEE formats, you should be able to just destructure the bit-pattern of the float, right? Normalize and do some extends and shifts.

If we're going down that path, maybe simpler to just mess with the float scaling factor. If a float is close to the largest finite float, in IEEE formats, the fractional part must be zero. So we can do the float-to-int conversion on the unscaled float, and scale the resulting integer. Probably faster than trying to explicitly destructure a float.

In D54749#2242802, @efriedma wrote:

At least for the IEEE formats, you should be able to just destructure the bit-pattern of the float, right? Normalize and do some extends and shifts.

If we're going down that path, maybe simpler to just mess with the float scaling factor. If a float is close to the largest finite float, in IEEE formats, the fractional part must be zero. So we can do the float-to-int conversion on the unscaled float, and scale the resulting integer. Probably faster than trying to explicitly destructure a float.

Ah, good point.

I presume it's fine to only fix the ones that don't conform to some existing unconventional code layout?

Yes.

By expand into vector ops, you mean to avoid breaking up the vector operation into scalar operations and then legalizing those, and instead expanding the vector operation directly to the corresponding operations? What would be the criteria for doing this rather than splitting it up first?

Yes. Probably sufficient to guard this with a check that the vector operations in question are legal.

It was originally a type operand similar to SIGN_EXTEND_INREG, but the suggestion was to replace it with a constant operand instead to make things simpler. The only thing we care about is the saturation bit width.

Not a big deal either way.

It might have been fine to make it a scalar type operand even for vector types, though.

This is how SIGN_EXTEND_INREG works, I think?

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2683 ↗	(On Diff #286505)	Okay.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7987 ↗	(On Diff #285971)	Maybe we have the relevant code in type legalization, but not LegalizeDAG, or something like that?

Addressed comments.

In D54749#2242802, @efriedma wrote:

At least for the IEEE formats, you should be able to just destructure the bit-pattern of the float, right? Normalize and do some extends and shifts.

If we're going down that path, maybe simpler to just mess with the float scaling factor. If a float is close to the largest finite float, in IEEE formats, the fractional part must be zero. So we can do the float-to-int conversion on the unscaled float, and scale the resulting integer. Probably faster than trying to explicitly destructure a float.

I'm not sure how this would work. The float does not have to be close to its finite limit in order to contain a valid and representable fixed-point value.

I'm trying to determine if the only criterion for not being able to use a fmul+fptoint-pattern for fixed-point conversion is whether or not the fixed-point scaling factor fits in the exponent. It should be possible to decompose the operation into a bitcast+mask+some shifting in the case where we can't use fmul+fptoint.

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
1361 ↗	(On Diff #288960)	I'm unsure if I've done the right thing here. We don't even get to this function for any of the test cases.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
7987 ↗	(On Diff #285971)	I think the spot that I had issues with was during integer result expansion, actually. ExpandIntRes_FP_TO_SINT. If the integer result type must be expanded, but the source type is f16 (and legal), it will try to emit a libcall which does not exist. It was probably f16 -> i128 in my case. There are cases for promotion and softening in the function, but those only take effect if the float operand must be promoted or softened, which it doesn't have to be if the type is legal.

Rebased.

I don't know why the premerge tests are failing. I don't see those failures locally, so it's rather hard to debug it, especially since the premerge build lacks proper debug info.

It's probably related to the vector expansion I added, but I don't see why it would be failing on Phab and not on my machine.

ebevhan added a child revision: D86078: [AArch64] Improved lowering for saturating float to int. Sep 8 2020, 9:19 AM

Fixed failing test cases.

Harbormaster completed remote builds in B71056: Diff 290676. Sep 9 2020, 2:57 AM

Removed vector expansion.

Doing this makes the improved expansion for AArch64 in D86078 quite bad.

Harbormaster completed remote builds in B71077: Diff 290714. Sep 9 2020, 6:41 AM

efriedma mentioned this in D87232: [SVE][CodeGen] Lower floating point -> integer conversions. Sep 10 2020, 12:17 PM

Fix clang-format warnings.

Harbormaster completed remote builds in B71564: Diff 291573. Sep 14 2020, 8:07 AM

Any more feedback on this?

Another gentle ping.

Rebased.

Harbormaster completed remote builds in B73302: Diff 294915. Sep 29 2020, 3:07 AM

Design seems reasonable to me, but I can't review LLVM CodeGen patches.

It doesn't seem like anyone has any more immediate feedback. @efriedma, have all of your review comments been adequately addressed?

Are there any outstanding issues left or is this perhaps good enough to be submitted?

Herald added a subscriber: pengfei. Oct 26 2020, 11:40 PM

RKSimon added inline comments. Oct 27 2020, 3:07 AM

llvm/docs/LangRef.rst
16197 ↗	(On Diff #294915)	representable unsigned integer
16247 ↗	(On Diff #294915)	representable signed integer
16250 ↗	(On Diff #294915)	representable signed integer
llvm/include/llvm/CodeGen/ISDOpcodes.h
737 ↗	(On Diff #294915)	Maybe explicitly say that the op1 width may be smaller than or equal to (but not larger than) the result width

Rebase

Address review comments from RKSimon.

Big thanks to @ebevhan for doing lots of work moving forward with the fixed-point number (embedded-c) support.

However, Bevin currently has other assignments, so I'll try to help out by commandeering this patch.

bjope marked 5 inline comments as done. Oct 30 2020, 5:17 AM

Harbormaster completed remote builds in B77029: Diff 301858. Oct 30 2020, 5:46 AM

Harbormaster completed remote builds in B77030: Diff 301861. Oct 30 2020, 5:52 AM

Add back test cases (they were accidentally removed when commandeering this patch and rebasing earlier).

Btw, I think all review comments have been addressed. So it would be nice to know what to do next, e.g. to be able to move forward with the rest of the patches in the stack depending on this one.

Harbormaster completed remote builds in B78519: Diff 304630. Nov 11 2020, 3:14 PM

Another little ping.

rkruppe removed a subscriber: rkruppe. Nov 26 2020, 5:31 AM

Gentle ping again.

I've basically just inherited this from @ebevhan, who continued the initial work done by @nikic, since there was some interest in these intrinsics on llvm-dev.

As far as I know, this patch alone just adds new intrinsics and a simple lowering for them. So apart from adding a bunch of code and tests, it shouldn't affect the existing codegen. This patch is, however, currently a blocker for continuing the work on Embedded-C support when it comes to some fixed<->floating point conversions. So it would be nice to understand what I need to do to move forward with this patch. Any more concerns? Is it good to go? Do we need to find other ways to move forward with the Embedded-C support (such as doing lots of expansion in clang instead of using intrinsics, although that seems wrong if there are other potential use cases for these intrinsics)?

I'm not entirely sure what you're still waiting for, really. It's a big patch with a lot of diffuse responsibilities, but you've gotten sign-off from individual people across at least most of it. Do you feel like there's something significant that hasn't been reviewed?

In D54749#2458355, @rjmccall wrote:

I'm not entirely sure what you're still waiting for, really. It's a big patch with a lot of diffuse responsibilities, but you've gotten sign-off from individual people across at least most of it. Do you feel like there's something significant that hasn't been reviewed?

Well, I've just assumed that we should wait for an LGTM after the last fixups of earlier review comments.

But as you say, there haven't been any objections to adding the intrinsics, and there are no outstanding complaints. So maybe it is safe to assume that no one will object if I simply land this.

This revision was not accepted when it landed; it landed in state Needs Review. Dec 18 2020, 2:10 AM

This revision was landed with ongoing or failed builds.

Closed by commit rGa89d751fb401: Add intrinsics for saturating float to int casts (authored by bjope).

This revision was automatically updated to reflect the committed changes.

bjope added a commit: rGa89d751fb401: Add intrinsics for saturating float to int casts.

@bjope There are two more revisions based on this one, D86079 implements improved X86 lowerings and is already accepted, and D86078 implements improved AArch64 lowerings. Do you plan to land these as well?

In D54749#2489043, @nikic wrote:

@bjope There are two more revisions based on this one, D86079 implements improved X86 lowerings and is already accepted, and D86078 implements improved AArch64 lowerings. Do you plan to land these as well?

My plan has been to move forward with D86632 (which is ready to land and makes use of the new intrinsics for conversions between fixed point and floating point).
And then also to land the patch that improves lowering for X86 (D86079) as it has been accepted.

I haven't looked closely at the patch that improves lowering for AArch64 (D86078). I need to figure out if there are any review comments that haven't been addressed yet. If it ends up being within my comfort zone I'll try to move forward with that patch as well :-) I'll at least make sure there is some kind of status update to that patch, as I don't think we can rely on Bevin picking it up again soon.

spatel mentioned this in D114964: [DAG] Create fptoui.sat from clamped fptoui. Dec 3 2021, 8:02 AM

Revision Contents

Path	Size
docs/LangRef.rst	113 lines
include/llvm/CodeGen/ISDOpcodes.h	11 lines
include/llvm/CodeGen/TargetLowering.h	5 lines
include/llvm/IR/Intrinsics.td	7 lines
include/llvm/Target/TargetSelectionDAG.td	5 lines
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	39 lines
lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp	16 lines
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp	19 lines
lib/CodeGen/SelectionDAG/LegalizeTypes.h	10 lines
lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp	2 lines
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	56 lines
lib/CodeGen/SelectionDAG/SelectionDAG.cpp	4 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	14 lines
lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	2 lines
lib/CodeGen/SelectionDAG/TargetLowering.cpp	97 lines
lib/CodeGen/SelectionDAG/TargetLoweringBase.cpp	2 lines
test/CodeGen/AArch64/fptoi-sat-scalar.ll	972 lines

Diff 174769

docs/LangRef.rst


	Examples:			Examples:
	"""""""""			"""""""""

	.. code-block:: llvm			.. code-block:: llvm

	%a = load i16, i16* @x, align 2			%a = load i16, i16* @x, align 2
	%res = call float @llvm.convert.from.fp16(i16 %a)			%res = call float @llvm.convert.from.fp16(i16 %a)

				Saturating floating-point to integer conversions
				------------------------------------------------

				The ``fptoui`` and ``fptosi`` instructions return a
				:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
				representable by the result type. These intrinsics provide an alternative
				conversion, which will saturate towards the smallest and largest representable
				integer values instead.

				'``llvm.fptoui.sat.*``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
				floating-point argument type and any integer result type, or vectors thereof.
				Not all targets may support all types, however.

				::

				declare i32 @llvm.fptoui.sat.i32.f32(float %f)
				t.p.northover: What's going on here? All the other intrinsics I know specify the return type first in the prototype. I didn't even think it was something you could override.
				nikic: You're absolutely right, I've flipped the order now, here and in tests. It looks like LLVM is very forgiving of incorrect intrinsic name suffixes and just normalizes them to be correct.
				declare i19 @llvm.fptoui.sat.i19.f64(double %f)
				declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)

				Overview:
				"""""""""

				This intrinsic converts the argument into an unsigned integer using saturating
				semantics.

				Arguments:
				""""""""""

				The argument may be any floating-point or vector of floating-point type. The
				return value may be any integer or vector of integer type. The number of vector
				elements in argument and return must be same.

				Semantics:
				""""""""""

				The conversion to integer is performed subject to the following rules:

				- If the argument is any NaN, zero is returned.
				- If the argument is smaller than zero, zero is returned.
				- If the argument is larger than the largest representable integer of the
				result type (this includes positive infinity), the largest representable
				integer is returned.
				- Otherwise, the result of rounding the argument towards zero is returned.

				Example:
				""""""""

				.. code-block:: text

				%a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9) ; yields i8: 123
				%b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7) ; yields i8: 0
				%c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0) ; yields i8: 255
				%d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0

				'``llvm.fptosi.sat.*``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
				floating-point argument type and any integer result type, or vectors thereof.
				Not all targets may support all types, however.

				::

				declare i32 @llvm.fptosi.sat.i32.f32(float %f)
				declare i19 @llvm.fptosi.sat.i19.f64(double %f)
				declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)

				Overview:
				"""""""""

				This intrinsic converts the argument into a signed integer using saturating
				semantics.

				Arguments:
				""""""""""

				The argument may be any floating-point or vector of floating-point type. The
				return value may be any integer or vector of integer type. The number of vector
				elements in argument and return must be same.

				Semantics:
				""""""""""

				The conversion to integer is performed subject to the following rules:

				- If the argument is any NaN, zero is returned.
				- If the argument is smaller than the smallest representable integer of the
				result type (this includes negative infinity), the smallest representable
				integer is returned.
				- If the argument is larger than the largest representable integer of the
				result type (this includes positive infinity), the largest representable
				integer is returned.
				- Otherwise, the result of rounding the argument towards zero is returned.

				Example:
				""""""""

				.. code-block:: text

				%a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9) ; yields i8: 23
				%b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8) ; yields i8: -128
				%c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0) ; yields i8: 127
				%d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
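The saturation rules in the LangRef text above can be modeled on the host as follows. This is a hedged reference sketch rather than anything from the patch: the function names `fptoui_sat` and `fptosi_sat` and the runtime `width` parameter are illustrative, and the input is taken as `double` for simplicity.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative model of llvm.fptoui.sat for a `width`-bit unsigned result.
uint64_t fptoui_sat(double x, unsigned width) {
  assert(width >= 1 && width <= 64);
  // NaN and anything at or below zero map to zero.
  if (std::isnan(x) || x <= 0.0)
    return 0;
  const uint64_t maxInt = width == 64 ? UINT64_MAX : (uint64_t{1} << width) - 1;
  // 2^width is exactly representable, so this compare is exact.
  if (x >= std::ldexp(1.0, static_cast<int>(width)))
    return maxInt;
  return static_cast<uint64_t>(x); // in range: truncates toward zero
}

// Illustrative model of llvm.fptosi.sat for a `width`-bit signed result.
int64_t fptosi_sat(double x, unsigned width) {
  assert(width >= 1 && width <= 64);
  if (std::isnan(x))
    return 0;
  const int64_t maxInt =
      static_cast<int64_t>((uint64_t{1} << (width - 1)) - 1);
  const int64_t minInt = -maxInt - 1;
  const double bound = std::ldexp(1.0, static_cast<int>(width) - 1); // 2^(width-1)
  if (x <= -bound)
    return minInt;
  if (x >= bound)
    return maxInt;
  return static_cast<int64_t>(x); // in range: truncates toward zero
}
```

With `width` set to 8, this model reproduces the `i8` results listed in the two Example blocks, including zero for NaN.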

	.. _dbg_intrinsics:			.. _dbg_intrinsics:

	Debugger Intrinsics			Debugger Intrinsics
	-------------------			-------------------

	The LLVM debugger intrinsics (which all start with ``llvm.dbg.``			The LLVM debugger intrinsics (which all start with ``llvm.dbg.``
	prefix), are described in the `LLVM Source Level			prefix), are described in the `LLVM Source Level
	Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_			Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_

include/llvm/CodeGen/ISDOpcodes.h

enum NodeType {
ZERO_EXTEND_VECTOR_INREG,		ZERO_EXTEND_VECTOR_INREG,

/// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned		/// FP_TO_[US]INT - Convert a floating point value to a signed or unsigned
/// integer. These have the same semantics as fptosi and fptoui in IR. If		/// integer. These have the same semantics as fptosi and fptoui in IR. If
/// the FP value cannot fit in the integer type, the results are undefined.		/// the FP value cannot fit in the integer type, the results are undefined.
FP_TO_SINT,		FP_TO_SINT,
FP_TO_UINT,		FP_TO_UINT,

		/// FP_TO_[US]INT_SAT - Convert floating point value in operand 0 to a
		/// signed or unsigned integer type given in operand 1. If the FP value cannot
		/// fit in the integer type, then if the FP value is NaN return 0, otherwise
		/// return the largest/smallest integer value, if the FP value is
		/// larger/smaller (or +INF/-INF) than the largest/smallest integer value.
		///
		/// The type in operand 1 may be smaller than the result type as a result of
		/// integer type legalization.
		FP_TO_SINT_SAT,
		FP_TO_UINT_SAT,

/// X = FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating point type		/// X = FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating point type
/// down to the precision of the destination VT. TRUNC is a flag, which is		/// down to the precision of the destination VT. TRUNC is a flag, which is
/// always an integer that is zero or one. If TRUNC is 0, this is a		/// always an integer that is zero or one. If TRUNC is 0, this is a
/// normal rounding, if it is 1, this FP_ROUND is known to not change the		/// normal rounding, if it is 1, this FP_ROUND is known to not change the
/// value of Y.		/// value of Y.
///		///
/// The TRUNC = 1 case is used in cases where we know that the value will		/// The TRUNC = 1 case is used in cases where we know that the value will
/// not be modified by the node, because Y is not using any of the extra		/// not be modified by the node, because Y is not using any of the extra

include/llvm/CodeGen/TargetLowering.h

public:
/// \param N Node to expand		/// \param N Node to expand
/// \param Result output after conversion		/// \param Result output after conversion
/// \returns True, if the expansion was successful, false otherwise		/// \returns True, if the expansion was successful, false otherwise
bool expandUINT_TO_FP(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;		bool expandUINT_TO_FP(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;

/// Expand fminnum/fmaxnum into fminnum_ieee/fmaxnum_ieee with quieted inputs.		/// Expand fminnum/fmaxnum into fminnum_ieee/fmaxnum_ieee with quieted inputs.
SDValue expandFMINNUM_FMAXNUM(SDNode *N, SelectionDAG &DAG) const;		SDValue expandFMINNUM_FMAXNUM(SDNode *N, SelectionDAG &DAG) const;

		/// Expand FP_TO_[US]INT_SAT into FP_TO_[US]INT and selects or min/max.
		/// \param N Node to expand
		/// \returns The expansion result
		SDValue expandFP_TO_INT_SAT(SDNode *N, SelectionDAG &DAG) const;
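The "min/max" strategy named in the comment above can be sketched on the host like this. It is only an illustration under stated assumptions, not the expansion the patch emits: the function name is hypothetical, and the sketch fixes the types to `double` and `int32_t`, a pair where both integer bounds are exactly representable in the source type.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical illustration: clamp in the floating-point domain, convert,
// then select zero for NaN. Only valid when INT32_MIN and INT32_MAX are
// exactly representable in the source type, as they are for double.
int32_t fpToSInt32SatViaMinMax(double x) {
  const double lo = -2147483648.0; // INT32_MIN
  const double hi = 2147483647.0;  // INT32_MAX
  // std::fmax/std::fmin return the non-NaN operand, so `clamped` is never NaN.
  const double clamped = std::fmin(std::fmax(x, lo), hi);
  assert(lo <= clamped && clamped <= hi);
  // The conversion is now always in range; NaN still needs an explicit select.
  return std::isnan(x) ? 0 : static_cast<int32_t>(clamped);
}
```

When a bound is not exactly representable (for example `INT32_MAX` in binary32), clamping would round the bound, which is presumably why the comment also mentions a select-based form.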

/// Expand CTPOP nodes. Expands vector/scalar CTPOP nodes,		/// Expand CTPOP nodes. Expands vector/scalar CTPOP nodes,
/// vector nodes can only succeed if all operations are legal/custom.		/// vector nodes can only succeed if all operations are legal/custom.
/// \param N Node to expand		/// \param N Node to expand
/// \param Result output after conversion		/// \param Result output after conversion
/// \returns True, if the expansion was successful, false otherwise		/// \returns True, if the expansion was successful, false otherwise
bool expandCTPOP(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;		bool expandCTPOP(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;

/// Expand CTLZ/CTLZ_ZERO_UNDEF nodes. Expands vector/scalar CTLZ nodes,		/// Expand CTLZ/CTLZ_ZERO_UNDEF nodes. Expands vector/scalar CTLZ nodes,

include/llvm/IR/Intrinsics.td

	def int_sideeffect : Intrinsic<[], [], [IntrInaccessibleMemOnly]>;			def int_sideeffect : Intrinsic<[], [], [IntrInaccessibleMemOnly]>;

	// Intrisics to support half precision floating point format			// Intrisics to support half precision floating point format
	let IntrProperties = [IntrNoMem] in {			let IntrProperties = [IntrNoMem] in {
	def int_convert_to_fp16 : Intrinsic<[llvm_i16_ty], [llvm_anyfloat_ty]>;			def int_convert_to_fp16 : Intrinsic<[llvm_i16_ty], [llvm_anyfloat_ty]>;
	def int_convert_from_fp16 : Intrinsic<[llvm_anyfloat_ty], [llvm_i16_ty]>;			def int_convert_from_fp16 : Intrinsic<[llvm_anyfloat_ty], [llvm_i16_ty]>;
	}			}

				// Saturating floating point to integer intrinsics
				def int_fptoui_sat : Intrinsic<[llvm_anyint_ty], [llvm_anyfloat_ty],
				[IntrNoMem, IntrSpeculatable]>;

				def int_fptosi_sat : Intrinsic<[llvm_anyint_ty], [llvm_anyfloat_ty],
				[IntrNoMem, IntrSpeculatable]>;

	// Clear cache intrinsic, default to ignore (ie. emit nothing)			// Clear cache intrinsic, default to ignore (ie. emit nothing)
	// maps to void __clear_cache() on supporting platforms			// maps to void __clear_cache() on supporting platforms
	def int_clear_cache : Intrinsic<[], [llvm_ptr_ty, llvm_ptr_ty],			def int_clear_cache : Intrinsic<[], [llvm_ptr_ty, llvm_ptr_ty],
	[], "llvm.clear_cache">;			[], "llvm.clear_cache">;

	// Intrinsic to detect whether its argument is a constant.			// Intrinsic to detect whether its argument is a constant.
	def int_is_constant : Intrinsic<[llvm_i1_ty], [llvm_any_ty], [IntrNoMem], "llvm.is.constant">;			def int_is_constant : Intrinsic<[llvm_i1_ty], [llvm_any_ty], [IntrNoMem], "llvm.is.constant">;


include/llvm/Target/TargetSelectionDAG.td

def SDTFPExtendOp : SDTypeProfile<1, 1, [ // fextend
SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<1, 0>, SDTCisSameNumEltsAs<0, 1>		SDTCisFP<0>, SDTCisFP<1>, SDTCisOpSmallerThanOp<1, 0>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;
def SDTIntToFPOp : SDTypeProfile<1, 1, [ // [su]int_to_fp		def SDTIntToFPOp : SDTypeProfile<1, 1, [ // [su]int_to_fp
SDTCisFP<0>, SDTCisInt<1>, SDTCisSameNumEltsAs<0, 1>		SDTCisFP<0>, SDTCisInt<1>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;
def SDTFPToIntOp : SDTypeProfile<1, 1, [ // fp_to_[su]int		def SDTFPToIntOp : SDTypeProfile<1, 1, [ // fp_to_[su]int
SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>		SDTCisInt<0>, SDTCisFP<1>, SDTCisSameNumEltsAs<0, 1>
]>;		]>;
		def SDTFPToIntSatOp : SDTypeProfile<1, 2, [ // fp_to_[su]int_sat
		SDTCisInt<0>, SDTCisFP<1>, SDTCisVT<2, OtherVT>, SDTCisSameNumEltsAs<0, 1>
		]>;
def SDTExtInreg : SDTypeProfile<1, 2, [ // sext_inreg		def SDTExtInreg : SDTypeProfile<1, 2, [ // sext_inreg
SDTCisSameAs<0, 1>, SDTCisInt<0>, SDTCisVT<2, OtherVT>,		SDTCisSameAs<0, 1>, SDTCisInt<0>, SDTCisVT<2, OtherVT>,
SDTCisVTSmallerThanOp<2, 1>		SDTCisVTSmallerThanOp<2, 1>
]>;		]>;
def SDTExtInvec : SDTypeProfile<1, 1, [ // sext_invec		def SDTExtInvec : SDTypeProfile<1, 1, [ // sext_invec
SDTCisInt<0>, SDTCisVec<0>, SDTCisInt<1>, SDTCisVec<1>,		SDTCisInt<0>, SDTCisVec<0>, SDTCisInt<1>, SDTCisVec<1>,
SDTCisOpSmallerThanOp<1, 0>		SDTCisOpSmallerThanOp<1, 0>
]>;		]>;
...
def fpround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;		def fpround : SDNode<"ISD::FP_ROUND" , SDTFPRoundOp>;
def fpextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;		def fpextend : SDNode<"ISD::FP_EXTEND" , SDTFPExtendOp>;
def fcopysign : SDNode<"ISD::FCOPYSIGN" , SDTFPSignOp>;		def fcopysign : SDNode<"ISD::FCOPYSIGN" , SDTFPSignOp>;

def sint_to_fp : SDNode<"ISD::SINT_TO_FP" , SDTIntToFPOp>;		def sint_to_fp : SDNode<"ISD::SINT_TO_FP" , SDTIntToFPOp>;
def uint_to_fp : SDNode<"ISD::UINT_TO_FP" , SDTIntToFPOp>;		def uint_to_fp : SDNode<"ISD::UINT_TO_FP" , SDTIntToFPOp>;
def fp_to_sint : SDNode<"ISD::FP_TO_SINT" , SDTFPToIntOp>;		def fp_to_sint : SDNode<"ISD::FP_TO_SINT" , SDTFPToIntOp>;
def fp_to_uint : SDNode<"ISD::FP_TO_UINT" , SDTFPToIntOp>;		def fp_to_uint : SDNode<"ISD::FP_TO_UINT" , SDTFPToIntOp>;
		def fp_to_sint_sat : SDNode<"ISD::FP_TO_SINT_SAT" , SDTFPToIntSatOp>;
		def fp_to_uint_sat : SDNode<"ISD::FP_TO_UINT_SAT" , SDTFPToIntSatOp>;
def f16_to_fp : SDNode<"ISD::FP16_TO_FP" , SDTIntToFPOp>;		def f16_to_fp : SDNode<"ISD::FP16_TO_FP" , SDTIntToFPOp>;
def fp_to_f16 : SDNode<"ISD::FP_TO_FP16" , SDTFPToIntOp>;		def fp_to_f16 : SDNode<"ISD::FP_TO_FP16" , SDTFPToIntOp>;

def setcc : SDNode<"ISD::SETCC" , SDTSetCC>;		def setcc : SDNode<"ISD::SETCC" , SDTSetCC>;
def select : SDNode<"ISD::SELECT" , SDTSelect>;		def select : SDNode<"ISD::SELECT" , SDTSelect>;
def vselect : SDNode<"ISD::VSELECT" , SDTVSelect>;		def vselect : SDNode<"ISD::VSELECT" , SDTVSelect>;
def selectcc : SDNode<"ISD::SELECT_CC" , SDTSelectCC>;		def selectcc : SDNode<"ISD::SELECT_CC" , SDTSelectCC>;


lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

private:
SDValue ExpandFCOPYSIGN(SDNode *Node) const;		SDValue ExpandFCOPYSIGN(SDNode *Node) const;
SDValue ExpandFABS(SDNode *Node) const;		SDValue ExpandFABS(SDNode *Node) const;
SDValue ExpandLegalINT_TO_FP(bool isSigned, SDValue Op0, EVT DestVT,		SDValue ExpandLegalINT_TO_FP(bool isSigned, SDValue Op0, EVT DestVT,
const SDLoc &dl);		const SDLoc &dl);
SDValue PromoteLegalINT_TO_FP(SDValue LegalOp, EVT DestVT, bool isSigned,		SDValue PromoteLegalINT_TO_FP(SDValue LegalOp, EVT DestVT, bool isSigned,
const SDLoc &dl);		const SDLoc &dl);
SDValue PromoteLegalFP_TO_INT(SDValue LegalOp, EVT DestVT, bool isSigned,		SDValue PromoteLegalFP_TO_INT(SDValue LegalOp, EVT DestVT, bool isSigned,
const SDLoc &dl);		const SDLoc &dl);
		SDValue PromoteLegalFP_TO_INT_SAT(SDNode *Node, const SDLoc &dl);

SDValue ExpandBITREVERSE(SDValue Op, const SDLoc &dl);		SDValue ExpandBITREVERSE(SDValue Op, const SDLoc &dl);
SDValue ExpandBSWAP(SDValue Op, const SDLoc &dl);		SDValue ExpandBSWAP(SDValue Op, const SDLoc &dl);

SDValue ExpandExtractFromVectorThroughStack(SDValue Op);		SDValue ExpandExtractFromVectorThroughStack(SDValue Op);
SDValue ExpandInsertToVectorThroughStack(SDValue Op);		SDValue ExpandInsertToVectorThroughStack(SDValue Op);
SDValue ExpandVectorBuildThroughStack(SDNode* Node);		SDValue ExpandVectorBuildThroughStack(SDNode* Node);

...
case ISD::STRICT_FTRUNC:
// is also legal, but if ISD::FSQRT requires expansion then so does		// is also legal, but if ISD::FSQRT requires expansion then so does
// ISD::STRICT_FSQRT.		// ISD::STRICT_FSQRT.
Action = TLI.getStrictFPOperationAction(Node->getOpcode(),		Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
break;		break;
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT: {		case ISD::USUBSAT:
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
}
case ISD::MSCATTER:		case ISD::MSCATTER:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
cast<MaskedScatterSDNode>(Node)->getValue().getValueType());		cast<MaskedScatterSDNode>(Node)->getValue().getValueType());
break;		break;
case ISD::MSTORE:		case ISD::MSTORE:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
cast<MaskedStoreSDNode>(Node)->getValue().getValueType());		cast<MaskedStoreSDNode>(Node)->getValue().getValueType());
break;		break;
...
SDValue SelectionDAGLegalize::PromoteLegalFP_TO_INT(SDValue LegalOp, EVT DestVT,
// Okay, we found the operation and type to use.		// Okay, we found the operation and type to use.
SDValue Operation = DAG.getNode(OpToUse, dl, NewOutTy, LegalOp);		SDValue Operation = DAG.getNode(OpToUse, dl, NewOutTy, LegalOp);

// Truncate the result of the extended FP_TO_*INT operation to the desired		// Truncate the result of the extended FP_TO_*INT operation to the desired
// size.		// size.
return DAG.getNode(ISD::TRUNCATE, dl, DestVT, Operation);		return DAG.getNode(ISD::TRUNCATE, dl, DestVT, Operation);
}		}

		/// Promote FP_TO_*INT_SAT operation to a larger result type. At this point
		/// the result and operand types are legal and there must be a legal
		/// FP_TO_*INT_SAT operation for a larger result type.
		SDValue SelectionDAGLegalize::PromoteLegalFP_TO_INT_SAT(SDNode *Node,
		const SDLoc &dl) {
		unsigned Opcode = Node->getOpcode();

		// Scan for the appropriate larger type to use.
		EVT NewOutTy = Node->getValueType(0);
		while (true) {
		NewOutTy = (MVT::SimpleValueType)(NewOutTy.getSimpleVT().SimpleTy+1);
		t.p.northover: Maybe put the normal assertion in here? assert(NewOutTy.isInteger() && "Ran out of possibilities!");
		assert(NewOutTy.isInteger() && "Ran out of possibilities!");

		if (TLI.isOperationLegalOrCustom(Opcode, NewOutTy)) {
		break;
		}
		}

		// Saturation width is determined by second operand, so we don't have to
		// perform any fixup and can directly truncate the result.
		SDValue Result = DAG.getNode(Opcode, dl, NewOutTy,
		Node->getOperand(0), Node->getOperand(1));
		return DAG.getNode(ISD::TRUNCATE, dl, Node->getValueType(0), Result);
		}
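The reason a plain truncate suffices after promotion can be shown with a small host-side model. This is a sketch under assumptions, not LLVM code: `fpToSIntSat32` stands in for a node whose result type is 32 bits wide while its second operand carries the saturation width, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical model of a promoted FP_TO_SINT_SAT node: the result type is
// 32 bits wide, but the value saturates to `satWidth` bits.
int32_t fpToSIntSat32(float x, unsigned satWidth) {
  assert(satWidth >= 1 && satWidth <= 32);
  if (std::isnan(x))
    return 0;
  const int64_t maxInt = (int64_t{1} << (satWidth - 1)) - 1;
  const int64_t minInt = -maxInt - 1;
  // 2^(satWidth-1) is a power of two, so it is exact in binary32.
  const float bound = std::ldexp(1.0f, static_cast<int>(satWidth) - 1);
  if (x <= -bound)
    return static_cast<int32_t>(minInt);
  if (x >= bound)
    return static_cast<int32_t>(maxInt);
  return static_cast<int32_t>(x); // in range: truncates toward zero
}

// Promotion sketch: an i8 result computed through the wider node, then
// truncated. The value already lies in [-128, 127], so the truncate is
// value-preserving and no fixup is needed.
int8_t promotedFpToSIntSat8(float x) {
  return static_cast<int8_t>(fpToSIntSat32(x, 8));
}
```

Here `fpToSIntSat32(999.0f, 8)` already yields 127 before the truncate, whereas the same call with a saturation width of 32 yields 999.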

/// Legalize a BITREVERSE scalar/vector operation as a series of mask + shifts.		/// Legalize a BITREVERSE scalar/vector operation as a series of mask + shifts.
SDValue SelectionDAGLegalize::ExpandBITREVERSE(SDValue Op, const SDLoc &dl) {		SDValue SelectionDAGLegalize::ExpandBITREVERSE(SDValue Op, const SDLoc &dl) {
EVT VT = Op.getValueType();		EVT VT = Op.getValueType();
EVT SHVT = TLI.getShiftAmountTy(VT, DAG.getDataLayout());		EVT SHVT = TLI.getShiftAmountTy(VT, DAG.getDataLayout());
unsigned Sz = VT.getScalarSizeInBits();		unsigned Sz = VT.getScalarSizeInBits();

SDValue Tmp, Tmp2, Tmp3;		SDValue Tmp, Tmp2, Tmp3;

...
bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))		if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
if (TLI.expandFP_TO_UINT(Node, Tmp1, DAG))		if (TLI.expandFP_TO_UINT(Node, Tmp1, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		Results.push_back(TLI.expandFP_TO_INT_SAT(Node, DAG));
		break;
case ISD::VAARG:		case ISD::VAARG:
Results.push_back(DAG.expandVAArg(Node));		Results.push_back(DAG.expandVAArg(Node));
Results.push_back(Results[0].getValue(1));		Results.push_back(Results[0].getValue(1));
break;		break;
case ISD::VACOPY:		case ISD::VACOPY:
Results.push_back(DAG.expandVACopy(Node));		Results.push_back(DAG.expandVACopy(Node));
break;		break;
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
...
case ISD::BSWAP: {
break;		break;
}		}
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
Tmp1 = PromoteLegalFP_TO_INT(Node->getOperand(0), Node->getValueType(0),		Tmp1 = PromoteLegalFP_TO_INT(Node->getOperand(0), Node->getValueType(0),
Node->getOpcode() == ISD::FP_TO_SINT, dl);		Node->getOpcode() == ISD::FP_TO_SINT, dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
		case ISD::FP_TO_UINT_SAT:
		case ISD::FP_TO_SINT_SAT:
		Results.push_back(PromoteLegalFP_TO_INT_SAT(Node, dl));
		break;
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
Tmp1 = PromoteLegalINT_TO_FP(Node->getOperand(0), Node->getValueType(0),		Tmp1 = PromoteLegalINT_TO_FP(Node->getOperand(0), Node->getValueType(0),
Node->getOpcode() == ISD::SINT_TO_FP, dl);		Node->getOpcode() == ISD::SINT_TO_FP, dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::VAARG: {		case ISD::VAARG: {
SDValue Chain = Node->getOperand(0); // Get the chain.		SDValue Chain = Node->getOperand(0); // Get the chain.

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

#endif
case ISD::FABS: Res = SoftenFloatOp_FABS(N); break;		case ISD::FABS: Res = SoftenFloatOp_FABS(N); break;
case ISD::FCOPYSIGN: Res = SoftenFloatOp_FCOPYSIGN(N); break;		case ISD::FCOPYSIGN: Res = SoftenFloatOp_FCOPYSIGN(N); break;
case ISD::FNEG: Res = SoftenFloatOp_FNEG(N); break;		case ISD::FNEG: Res = SoftenFloatOp_FNEG(N); break;
case ISD::FP_EXTEND: Res = SoftenFloatOp_FP_EXTEND(N); break;		case ISD::FP_EXTEND: Res = SoftenFloatOp_FP_EXTEND(N); break;
case ISD::FP_TO_FP16: // Same as FP_ROUND for softening purposes		case ISD::FP_TO_FP16: // Same as FP_ROUND for softening purposes
case ISD::FP_ROUND: Res = SoftenFloatOp_FP_ROUND(N); break;		case ISD::FP_ROUND: Res = SoftenFloatOp_FP_ROUND(N); break;
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = SoftenFloatOp_FP_TO_XINT(N); break;		case ISD::FP_TO_UINT: Res = SoftenFloatOp_FP_TO_XINT(N); break;
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT: Res = SoftenFloatOp_FP_TO_XINT_SAT(N); break;
case ISD::SELECT: Res = SoftenFloatOp_SELECT(N); break;		case ISD::SELECT: Res = SoftenFloatOp_SELECT(N); break;
case ISD::SELECT_CC: Res = SoftenFloatOp_SELECT_CC(N); break;		case ISD::SELECT_CC: Res = SoftenFloatOp_SELECT_CC(N); break;
case ISD::SETCC: Res = SoftenFloatOp_SETCC(N); break;		case ISD::SETCC: Res = SoftenFloatOp_SETCC(N); break;
case ISD::STORE:		case ISD::STORE:
Res = SoftenFloatOp_STORE(N, OpNo);		Res = SoftenFloatOp_STORE(N, OpNo);
// Do not try to analyze or soften this node again if the value is		// Do not try to analyze or soften this node again if the value is
// or can be held in a register. In that case, Res.getNode() should		// or can be held in a register. In that case, Res.getNode() should
// be equal to N.		// be equal to N.
...
SDValue DAGTypeLegalizer::SoftenFloatOp_FP_TO_XINT(SDNode *N) {

SDValue Op = GetSoftenedFloat(N->getOperand(0));		SDValue Op = GetSoftenedFloat(N->getOperand(0));
SDValue Res = TLI.makeLibCall(DAG, LC, NVT, Op, false, dl).first;		SDValue Res = TLI.makeLibCall(DAG, LC, NVT, Op, false, dl).first;

// Truncate the result if the libcall returns a larger type.		// Truncate the result if the libcall returns a larger type.
return DAG.getNode(ISD::TRUNCATE, dl, RVT, Res);		return DAG.getNode(ISD::TRUNCATE, dl, RVT, Res);
}		}

		SDValue DAGTypeLegalizer::SoftenFloatOp_FP_TO_XINT_SAT(SDNode *N) {
		llvm_unreachable("fp_to_xint soften float op not implemented yet");
		return SDValue();
		}
		arsenm: I'm pretty sure this will warn. There's no point in doing this anyway since you'll get the appropriate error if you just leave this out.

SDValue DAGTypeLegalizer::SoftenFloatOp_SELECT(SDNode *N) {		SDValue DAGTypeLegalizer::SoftenFloatOp_SELECT(SDNode *N) {
SDValue Op1 = GetSoftenedFloat(N->getOperand(1));		SDValue Op1 = GetSoftenedFloat(N->getOperand(1));
SDValue Op2 = GetSoftenedFloat(N->getOperand(2));		SDValue Op2 = GetSoftenedFloat(N->getOperand(2));

if (Op1 == N->getOperand(1) && Op2 == N->getOperand(2))		if (Op1 == N->getOperand(1) && Op2 == N->getOperand(2))
return SDValue();		return SDValue();

return SDValue(DAG.UpdateNodeOperands(N, N->getOperand(0), Op1, Op2),		return SDValue(DAG.UpdateNodeOperands(N, N->getOperand(0), Op1, Op2),
[797 lines not shown]	bool DAGTypeLegalizer::PromoteFloatOperand(SDNode *N, unsigned OpNo) {
switch (N->getOpcode()) {		switch (N->getOpcode()) {
default:		default:
llvm_unreachable("Do not know how to promote this operator's operand!");		llvm_unreachable("Do not know how to promote this operator's operand!");

case ISD::BITCAST: R = PromoteFloatOp_BITCAST(N, OpNo); break;		case ISD::BITCAST: R = PromoteFloatOp_BITCAST(N, OpNo); break;
case ISD::FCOPYSIGN: R = PromoteFloatOp_FCOPYSIGN(N, OpNo); break;		case ISD::FCOPYSIGN: R = PromoteFloatOp_FCOPYSIGN(N, OpNo); break;
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: R = PromoteFloatOp_FP_TO_XINT(N, OpNo); break;		case ISD::FP_TO_UINT: R = PromoteFloatOp_FP_TO_XINT(N, OpNo); break;
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		R = PromoteFloatOp_FP_TO_XINT_SAT(N, OpNo); break;
case ISD::FP_EXTEND: R = PromoteFloatOp_FP_EXTEND(N, OpNo); break;		case ISD::FP_EXTEND: R = PromoteFloatOp_FP_EXTEND(N, OpNo); break;
case ISD::SELECT_CC: R = PromoteFloatOp_SELECT_CC(N, OpNo); break;		case ISD::SELECT_CC: R = PromoteFloatOp_SELECT_CC(N, OpNo); break;
case ISD::SETCC: R = PromoteFloatOp_SETCC(N, OpNo); break;		case ISD::SETCC: R = PromoteFloatOp_SETCC(N, OpNo); break;
case ISD::STORE: R = PromoteFloatOp_STORE(N, OpNo); break;		case ISD::STORE: R = PromoteFloatOp_STORE(N, OpNo); break;
}		}

if (R.getNode())		if (R.getNode())
ReplaceValueWith(SDValue(N, 0), R);		ReplaceValueWith(SDValue(N, 0), R);
[27 lines not shown]
}		}

// Convert the promoted float value to the desired integer type		// Convert the promoted float value to the desired integer type
SDValue DAGTypeLegalizer::PromoteFloatOp_FP_TO_XINT(SDNode *N, unsigned OpNo) {		SDValue DAGTypeLegalizer::PromoteFloatOp_FP_TO_XINT(SDNode *N, unsigned OpNo) {
SDValue Op = GetPromotedFloat(N->getOperand(0));		SDValue Op = GetPromotedFloat(N->getOperand(0));
return DAG.getNode(N->getOpcode(), SDLoc(N), N->getValueType(0), Op);		return DAG.getNode(N->getOpcode(), SDLoc(N), N->getValueType(0), Op);
}		}

		SDValue DAGTypeLegalizer::PromoteFloatOp_FP_TO_XINT_SAT(SDNode *N, unsigned OpNo) {
		SDValue Op = GetPromotedFloat(N->getOperand(0));
		return DAG.getNode(
		N->getOpcode(), SDLoc(N), N->getValueType(0), Op, N->getOperand(1));
		}

SDValue DAGTypeLegalizer::PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo) {		SDValue DAGTypeLegalizer::PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo) {
SDValue Op = GetPromotedFloat(N->getOperand(0));		SDValue Op = GetPromotedFloat(N->getOperand(0));
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);

// Desired VT is same as promoted type. Use promoted float directly.		// Desired VT is same as promoted type. Use promoted float directly.
if (VT == Op->getValueType(0))		if (VT == Op->getValueType(0))
return Op;		return Op;

[347 lines not shown]

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

[110 lines not shown]	#endif

case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;		case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;

case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;		case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;

		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		Res = PromoteIntRes_FP_TO_XINT_SAT(N); break;

case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;		case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;

case ISD::AND:		case ISD::AND:
case ISD::OR:		case ISD::OR:
case ISD::XOR:		case ISD::XOR:
case ISD::ADD:		case ISD::ADD:
case ISD::SUB:		case ISD::SUB:
case ISD::MUL: Res = PromoteIntRes_SimpleIntBinOp(N); break;		case ISD::MUL: Res = PromoteIntRes_SimpleIntBinOp(N); break;
[336 lines not shown]	SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT(SDNode *N) {
// NOTE: fp-to-uint to fp-to-sint promotion guarantees zero extend. For example:		// NOTE: fp-to-uint to fp-to-sint promotion guarantees zero extend. For example:
// before legalization: fp-to-uint16, 65534. -> 0xfffe		// before legalization: fp-to-uint16, 65534. -> 0xfffe
// after legalization: fp-to-sint32, 65534. -> 0x0000fffe		// after legalization: fp-to-sint32, 65534. -> 0x0000fffe
return DAG.getNode(N->getOpcode() == ISD::FP_TO_UINT ?		return DAG.getNode(N->getOpcode() == ISD::FP_TO_UINT ?
ISD::AssertZext : ISD::AssertSext, dl, NVT, Res,		ISD::AssertZext : ISD::AssertSext, dl, NVT, Res,
DAG.getValueType(N->getValueType(0).getScalarType()));		DAG.getValueType(N->getValueType(0).getScalarType()));
}		}
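
The NOTE in `PromoteIntRes_FP_TO_XINT` can be checked with a small standalone sketch. This is not code from the patch: plain C++ casts stand in for the DAG nodes, and the function names `narrowUnsigned` and `wideSigned` are hypothetical. For an in-range input such as 65534, the wider signed conversion yields the same value with zero upper bits, which appears to be the property the `AssertZext` node records.

```cpp
#include <cstdint>

// Hypothetical stand-in for fp-to-uint16 before legalization.
// Only valid for inputs in [0, 65535]; out-of-range casts are undefined in C++.
uint16_t narrowUnsigned(float X) { return static_cast<uint16_t>(X); }

// Hypothetical stand-in for fp-to-sint32 after the result type is promoted.
// For inputs in [0, 65535] the upper 16 bits of the result are zero.
int32_t wideSigned(float X) { return static_cast<int32_t>(X); }
```

With `X = 65534.0f`, `narrowUnsigned` gives `0xfffe` and `wideSigned` gives `0x0000fffe`, matching the values quoted in the comment.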

		SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT_SAT(SDNode *N) {
		// Promote the result type, while keeping the original type in Op1.
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
		SDLoc dl(N);
		return DAG.getNode(
		N->getOpcode(), dl, NVT, N->getOperand(0), N->getOperand(1));
		}
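
A minimal sketch of the semantics this promotion relies on, not code from the patch: once the result type is promoted, the saturation width carried in operand 1 still decides the clamp bounds. The helper names `fpToSIntSat` and `fpToUIntSat`, the `double` source type, and the fixed 64-bit container are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical model: signed saturating conversion. The result lives in a
// 64-bit container but is clamped to the range of a SatWidth-bit integer.
int64_t fpToSIntSat(double X, unsigned SatWidth) {
  assert(SatWidth >= 1 && SatWidth <= 64);
  if (std::isnan(X))
    return 0; // NaN maps to 0, as described in the patch summary.
  const int64_t Max =
      SatWidth == 64 ? INT64_MAX : (int64_t(1) << (SatWidth - 1)) - 1;
  const int64_t Min = -Max - 1;
  // Compare in floating point first so the final cast is always in range.
  if (X <= static_cast<double>(Min))
    return Min;
  if (X >= static_cast<double>(Max))
    return Max;
  return static_cast<int64_t>(X); // Truncates toward zero.
}

// Hypothetical model: unsigned counterpart with the same container idea.
uint64_t fpToUIntSat(double X, unsigned SatWidth) {
  assert(SatWidth >= 1 && SatWidth <= 64);
  if (std::isnan(X) || X <= 0.0)
    return 0; // NaN and non-positive inputs clamp to 0.
  const uint64_t Max =
      SatWidth == 64 ? UINT64_MAX : (uint64_t(1) << SatWidth) - 1;
  if (X >= static_cast<double>(Max))
    return Max;
  return static_cast<uint64_t>(X);
}
```

For example, `fpToSIntSat(1000.0, 8)` clamps to 127 even though the container is 64 bits wide, mirroring an i8 saturation width that survives promotion to a wider result type.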

SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {
EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDLoc dl(N);		SDLoc dl(N);

return DAG.getNode(N->getOpcode(), dl, NVT, N->getOperand(0));		return DAG.getNode(N->getOpcode(), dl, NVT, N->getOperand(0));
}		}

SDValue DAGTypeLegalizer::PromoteIntRes_INT_EXTEND(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_INT_EXTEND(SDNode *N) {
[970 lines not shown]	#endif
case ISD::CTLZ_ZERO_UNDEF:		case ISD::CTLZ_ZERO_UNDEF:
case ISD::CTLZ: ExpandIntRes_CTLZ(N, Lo, Hi); break;		case ISD::CTLZ: ExpandIntRes_CTLZ(N, Lo, Hi); break;
case ISD::CTPOP: ExpandIntRes_CTPOP(N, Lo, Hi); break;		case ISD::CTPOP: ExpandIntRes_CTPOP(N, Lo, Hi); break;
case ISD::CTTZ_ZERO_UNDEF:		case ISD::CTTZ_ZERO_UNDEF:
case ISD::CTTZ: ExpandIntRes_CTTZ(N, Lo, Hi); break;		case ISD::CTTZ: ExpandIntRes_CTTZ(N, Lo, Hi); break;
case ISD::FLT_ROUNDS_: ExpandIntRes_FLT_ROUNDS(N, Lo, Hi); break;		case ISD::FLT_ROUNDS_: ExpandIntRes_FLT_ROUNDS(N, Lo, Hi); break;
case ISD::FP_TO_SINT: ExpandIntRes_FP_TO_SINT(N, Lo, Hi); break;		case ISD::FP_TO_SINT: ExpandIntRes_FP_TO_SINT(N, Lo, Hi); break;
case ISD::FP_TO_UINT: ExpandIntRes_FP_TO_UINT(N, Lo, Hi); break;		case ISD::FP_TO_UINT: ExpandIntRes_FP_TO_UINT(N, Lo, Hi); break;
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT: ExpandIntRes_FP_TO_XINT_SAT(N, Lo, Hi); break;
case ISD::LOAD: ExpandIntRes_LOAD(cast<LoadSDNode>(N), Lo, Hi); break;		case ISD::LOAD: ExpandIntRes_LOAD(cast<LoadSDNode>(N), Lo, Hi); break;
case ISD::MUL: ExpandIntRes_MUL(N, Lo, Hi); break;		case ISD::MUL: ExpandIntRes_MUL(N, Lo, Hi); break;
case ISD::READCYCLECOUNTER: ExpandIntRes_READCYCLECOUNTER(N, Lo, Hi); break;		case ISD::READCYCLECOUNTER: ExpandIntRes_READCYCLECOUNTER(N, Lo, Hi); break;
case ISD::SDIV: ExpandIntRes_SDIV(N, Lo, Hi); break;		case ISD::SDIV: ExpandIntRes_SDIV(N, Lo, Hi); break;
case ISD::SIGN_EXTEND: ExpandIntRes_SIGN_EXTEND(N, Lo, Hi); break;		case ISD::SIGN_EXTEND: ExpandIntRes_SIGN_EXTEND(N, Lo, Hi); break;
case ISD::SIGN_EXTEND_INREG: ExpandIntRes_SIGN_EXTEND_INREG(N, Lo, Hi); break;		case ISD::SIGN_EXTEND_INREG: ExpandIntRes_SIGN_EXTEND_INREG(N, Lo, Hi); break;
case ISD::SREM: ExpandIntRes_SREM(N, Lo, Hi); break;		case ISD::SREM: ExpandIntRes_SREM(N, Lo, Hi); break;
case ISD::TRUNCATE: ExpandIntRes_TRUNCATE(N, Lo, Hi); break;		case ISD::TRUNCATE: ExpandIntRes_TRUNCATE(N, Lo, Hi); break;
[819 lines not shown]	if (getTypeAction(Op.getValueType()) == TargetLowering::TypePromoteFloat)
Op = GetPromotedFloat(Op);		Op = GetPromotedFloat(Op);

RTLIB::Libcall LC = RTLIB::getFPTOUINT(Op.getValueType(), VT);		RTLIB::Libcall LC = RTLIB::getFPTOUINT(Op.getValueType(), VT);
assert(LC != RTLIB::UNKNOWN_LIBCALL && "Unexpected fp-to-uint conversion!");		assert(LC != RTLIB::UNKNOWN_LIBCALL && "Unexpected fp-to-uint conversion!");
SplitInteger(TLI.makeLibCall(DAG, LC, VT, Op, false/*irrelevant*/, dl).first,		SplitInteger(TLI.makeLibCall(DAG, LC, VT, Op, false/*irrelevant*/, dl).first,
Lo, Hi);		Lo, Hi);
}		}

		void DAGTypeLegalizer::ExpandIntRes_FP_TO_XINT_SAT(SDNode *N, SDValue &Lo,
		SDValue &Hi) {
		llvm_unreachable("fp_to_xint_sat expand int res not implemented yet");
		arsenm: Ditto
		}

void DAGTypeLegalizer::ExpandIntRes_LOAD(LoadSDNode *N,		void DAGTypeLegalizer::ExpandIntRes_LOAD(LoadSDNode *N,
SDValue &Lo, SDValue &Hi) {		SDValue &Lo, SDValue &Hi) {
if (ISD::isNormalLoad(N)) {		if (ISD::isNormalLoad(N)) {
ExpandRes_NormalLoad(N, Lo, Hi);		ExpandRes_NormalLoad(N, Lo, Hi);
return;		return;
}		}

assert(ISD::isUNINDEXEDLoad(N) && "Indexed load during type legalization!");		assert(ISD::isUNINDEXEDLoad(N) && "Indexed load during type legalization!");
[1,402 lines not shown]

lib/CodeGen/SelectionDAG/LegalizeTypes.h

[300 lines not shown]	private:
SDValue PromoteIntRes_BITREVERSE(SDNode *N);		SDValue PromoteIntRes_BITREVERSE(SDNode *N);
SDValue PromoteIntRes_BUILD_PAIR(SDNode *N);		SDValue PromoteIntRes_BUILD_PAIR(SDNode *N);
SDValue PromoteIntRes_Constant(SDNode *N);		SDValue PromoteIntRes_Constant(SDNode *N);
SDValue PromoteIntRes_CTLZ(SDNode *N);		SDValue PromoteIntRes_CTLZ(SDNode *N);
SDValue PromoteIntRes_CTPOP(SDNode *N);		SDValue PromoteIntRes_CTPOP(SDNode *N);
SDValue PromoteIntRes_CTTZ(SDNode *N);		SDValue PromoteIntRes_CTTZ(SDNode *N);
SDValue PromoteIntRes_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue PromoteIntRes_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue PromoteIntRes_FP_TO_XINT(SDNode *N);		SDValue PromoteIntRes_FP_TO_XINT(SDNode *N);
		SDValue PromoteIntRes_FP_TO_XINT_SAT(SDNode *N);
SDValue PromoteIntRes_FP_TO_FP16(SDNode *N);		SDValue PromoteIntRes_FP_TO_FP16(SDNode *N);
SDValue PromoteIntRes_INT_EXTEND(SDNode *N);		SDValue PromoteIntRes_INT_EXTEND(SDNode *N);
SDValue PromoteIntRes_LOAD(LoadSDNode *N);		SDValue PromoteIntRes_LOAD(LoadSDNode *N);
SDValue PromoteIntRes_MLOAD(MaskedLoadSDNode *N);		SDValue PromoteIntRes_MLOAD(MaskedLoadSDNode *N);
SDValue PromoteIntRes_MGATHER(MaskedGatherSDNode *N);		SDValue PromoteIntRes_MGATHER(MaskedGatherSDNode *N);
SDValue PromoteIntRes_Overflow(SDNode *N);		SDValue PromoteIntRes_Overflow(SDNode *N);
SDValue PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_SELECT(SDNode *N);		SDValue PromoteIntRes_SELECT(SDNode *N);
[73 lines not shown]	private:
void ExpandIntRes_READCYCLECOUNTER (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_READCYCLECOUNTER (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_SIGN_EXTEND (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_SIGN_EXTEND (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_SIGN_EXTEND_INREG (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_SIGN_EXTEND_INREG (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_TRUNCATE (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_TRUNCATE (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ZERO_EXTEND (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ZERO_EXTEND (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_FLT_ROUNDS (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_FLT_ROUNDS (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_FP_TO_SINT (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_FP_TO_SINT (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_FP_TO_UINT (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_FP_TO_UINT (SDNode *N, SDValue &Lo, SDValue &Hi);
		void ExpandIntRes_FP_TO_XINT_SAT (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandIntRes_Logical (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_Logical (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUB (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ADDSUB (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUBC (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ADDSUBC (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUBE (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ADDSUBE (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUBCARRY (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ADDSUBCARRY (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_BITREVERSE (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_BITREVERSE (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_BSWAP (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_BSWAP (SDNode *N, SDValue &Lo, SDValue &Hi);
[117 lines not shown]	private:
SDValue SoftenFloatOp_COPY_TO_REG(SDNode *N);		SDValue SoftenFloatOp_COPY_TO_REG(SDNode *N);
SDValue SoftenFloatOp_BR_CC(SDNode *N);		SDValue SoftenFloatOp_BR_CC(SDNode *N);
SDValue SoftenFloatOp_FABS(SDNode *N);		SDValue SoftenFloatOp_FABS(SDNode *N);
SDValue SoftenFloatOp_FCOPYSIGN(SDNode *N);		SDValue SoftenFloatOp_FCOPYSIGN(SDNode *N);
SDValue SoftenFloatOp_FNEG(SDNode *N);		SDValue SoftenFloatOp_FNEG(SDNode *N);
SDValue SoftenFloatOp_FP_EXTEND(SDNode *N);		SDValue SoftenFloatOp_FP_EXTEND(SDNode *N);
SDValue SoftenFloatOp_FP_ROUND(SDNode *N);		SDValue SoftenFloatOp_FP_ROUND(SDNode *N);
SDValue SoftenFloatOp_FP_TO_XINT(SDNode *N);		SDValue SoftenFloatOp_FP_TO_XINT(SDNode *N);
		SDValue SoftenFloatOp_FP_TO_XINT_SAT(SDNode *N);
SDValue SoftenFloatOp_SELECT(SDNode *N);		SDValue SoftenFloatOp_SELECT(SDNode *N);
SDValue SoftenFloatOp_SELECT_CC(SDNode *N);		SDValue SoftenFloatOp_SELECT_CC(SDNode *N);
SDValue SoftenFloatOp_SETCC(SDNode *N);		SDValue SoftenFloatOp_SETCC(SDNode *N);
SDValue SoftenFloatOp_STORE(SDNode *N, unsigned OpNo);		SDValue SoftenFloatOp_STORE(SDNode *N, unsigned OpNo);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Float Expansion Support: LegalizeFloatTypes.cpp		// Float Expansion Support: LegalizeFloatTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
[83 lines not shown]	private:
SDValue PromoteFloatRes_UNDEF(SDNode *N);		SDValue PromoteFloatRes_UNDEF(SDNode *N);
SDValue PromoteFloatRes_XINT_TO_FP(SDNode *N);		SDValue PromoteFloatRes_XINT_TO_FP(SDNode *N);

bool PromoteFloatOperand(SDNode *N, unsigned OpNo);		bool PromoteFloatOperand(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_FP_TO_XINT(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_FP_TO_XINT(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_FP_TO_XINT_SAT(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_STORE(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_STORE(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_SELECT_CC(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_SELECT_CC(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_SETCC(SDNode *N, unsigned OpNo);		SDValue PromoteFloatOp_SETCC(SDNode *N, unsigned OpNo);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Scalarization Support: LegalizeVectorTypes.cpp		// Scalarization Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

[27 lines not shown]	private:
SDValue ScalarizeVecRes_LOAD(LoadSDNode *N);		SDValue ScalarizeVecRes_LOAD(LoadSDNode *N);
SDValue ScalarizeVecRes_SCALAR_TO_VECTOR(SDNode *N);		SDValue ScalarizeVecRes_SCALAR_TO_VECTOR(SDNode *N);
SDValue ScalarizeVecRes_VSELECT(SDNode *N);		SDValue ScalarizeVecRes_VSELECT(SDNode *N);
SDValue ScalarizeVecRes_SELECT(SDNode *N);		SDValue ScalarizeVecRes_SELECT(SDNode *N);
SDValue ScalarizeVecRes_SELECT_CC(SDNode *N);		SDValue ScalarizeVecRes_SELECT_CC(SDNode *N);
SDValue ScalarizeVecRes_SETCC(SDNode *N);		SDValue ScalarizeVecRes_SETCC(SDNode *N);
SDValue ScalarizeVecRes_UNDEF(SDNode *N);		SDValue ScalarizeVecRes_UNDEF(SDNode *N);
SDValue ScalarizeVecRes_VECTOR_SHUFFLE(SDNode *N);		SDValue ScalarizeVecRes_VECTOR_SHUFFLE(SDNode *N);
		SDValue ScalarizeVecRes_FP_TO_XINT_SAT(SDNode *N);

// Vector Operand Scalarization: <1 x ty> -> ty.		// Vector Operand Scalarization: <1 x ty> -> ty.
bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);		bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_BITCAST(SDNode *N);		SDValue ScalarizeVecOp_BITCAST(SDNode *N);
SDValue ScalarizeVecOp_UnaryOp(SDNode *N);		SDValue ScalarizeVecOp_UnaryOp(SDNode *N);
SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);		SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);
SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue ScalarizeVecOp_VSELECT(SDNode *N);		SDValue ScalarizeVecOp_VSELECT(SDNode *N);
SDValue ScalarizeVecOp_VSETCC(SDNode *N);		SDValue ScalarizeVecOp_VSETCC(SDNode *N);
SDValue ScalarizeVecOp_STORE(StoreSDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_STORE(StoreSDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo);
		SDValue ScalarizeVecOp_FP_TO_XINT_SAT(SDNode *N);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Vector Splitting Support: LegalizeVectorTypes.cpp		// Vector Splitting Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Given a processed vector Op which was split into vectors of half the size,		/// Given a processed vector Op which was split into vectors of half the size,
/// this method returns the halves. The first elements of Op coincide with the		/// this method returns the halves. The first elements of Op coincide with the
/// elements of Lo; the remaining elements of Op coincide with the elements of		/// elements of Lo; the remaining elements of Op coincide with the elements of
[24 lines not shown]	private:
void SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_LOAD(LoadSDNode *LD, SDValue &Lo, SDValue &Hi);		void SplitVecRes_LOAD(LoadSDNode *LD, SDValue &Lo, SDValue &Hi);
void SplitVecRes_MLOAD(MaskedLoadSDNode *MLD, SDValue &Lo, SDValue &Hi);		void SplitVecRes_MLOAD(MaskedLoadSDNode *MLD, SDValue &Lo, SDValue &Hi);
void SplitVecRes_MGATHER(MaskedGatherSDNode *MGT, SDValue &Lo, SDValue &Hi);		void SplitVecRes_MGATHER(MaskedGatherSDNode *MGT, SDValue &Lo, SDValue &Hi);
void SplitVecRes_SCALAR_TO_VECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_SCALAR_TO_VECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_SETCC(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_SETCC(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_VECTOR_SHUFFLE(ShuffleVectorSDNode *N, SDValue &Lo,		void SplitVecRes_VECTOR_SHUFFLE(ShuffleVectorSDNode *N, SDValue &Lo,
SDValue &Hi);		SDValue &Hi);
		void SplitVecRes_FP_TO_XINT_SAT(SDNode *N, SDValue &Lo, SDValue &Hi);

// Vector Operand Splitting: <128 x ty> -> 2 x <64 x ty>.		// Vector Operand Splitting: <128 x ty> -> 2 x <64 x ty>.
bool SplitVectorOperand(SDNode *N, unsigned OpNo);		bool SplitVectorOperand(SDNode *N, unsigned OpNo);
SDValue SplitVecOp_VSELECT(SDNode *N, unsigned OpNo);		SDValue SplitVecOp_VSELECT(SDNode *N, unsigned OpNo);
SDValue SplitVecOp_VECREDUCE(SDNode *N, unsigned OpNo);		SDValue SplitVecOp_VECREDUCE(SDNode *N, unsigned OpNo);
SDValue SplitVecOp_UnaryOp(SDNode *N);		SDValue SplitVecOp_UnaryOp(SDNode *N);
SDValue SplitVecOp_TruncateHelper(SDNode *N);		SDValue SplitVecOp_TruncateHelper(SDNode *N);

SDValue SplitVecOp_BITCAST(SDNode *N);		SDValue SplitVecOp_BITCAST(SDNode *N);
SDValue SplitVecOp_EXTRACT_SUBVECTOR(SDNode *N);		SDValue SplitVecOp_EXTRACT_SUBVECTOR(SDNode *N);
SDValue SplitVecOp_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue SplitVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue SplitVecOp_ExtVecInRegOp(SDNode *N);		SDValue SplitVecOp_ExtVecInRegOp(SDNode *N);
SDValue SplitVecOp_STORE(StoreSDNode *N, unsigned OpNo);		SDValue SplitVecOp_STORE(StoreSDNode *N, unsigned OpNo);
SDValue SplitVecOp_MSTORE(MaskedStoreSDNode *N, unsigned OpNo);		SDValue SplitVecOp_MSTORE(MaskedStoreSDNode *N, unsigned OpNo);
SDValue SplitVecOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);		SDValue SplitVecOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);
SDValue SplitVecOp_MGATHER(MaskedGatherSDNode *MGT, unsigned OpNo);		SDValue SplitVecOp_MGATHER(MaskedGatherSDNode *MGT, unsigned OpNo);
SDValue SplitVecOp_CONCAT_VECTORS(SDNode *N);		SDValue SplitVecOp_CONCAT_VECTORS(SDNode *N);
SDValue SplitVecOp_VSETCC(SDNode *N);		SDValue SplitVecOp_VSETCC(SDNode *N);
SDValue SplitVecOp_FP_ROUND(SDNode *N);		SDValue SplitVecOp_FP_ROUND(SDNode *N);
SDValue SplitVecOp_FCOPYSIGN(SDNode *N);		SDValue SplitVecOp_FCOPYSIGN(SDNode *N);
		SDValue SplitVecOp_FP_TO_XINT_SAT(SDNode *N);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Vector Widening Support: LegalizeVectorTypes.cpp		// Vector Widening Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Given a processed vector Op which was widened into a larger vector, this		/// Given a processed vector Op which was widened into a larger vector, this
/// method returns the larger vector. The elements of the returned vector		/// method returns the larger vector. The elements of the returned vector
/// consist of the elements of Op followed by elements containing rubbish.		/// consist of the elements of Op followed by elements containing rubbish.
[28 lines not shown]	private:
SDValue WidenVecRes_UNDEF(SDNode *N);		SDValue WidenVecRes_UNDEF(SDNode *N);
SDValue WidenVecRes_VECTOR_SHUFFLE(ShuffleVectorSDNode *N);		SDValue WidenVecRes_VECTOR_SHUFFLE(ShuffleVectorSDNode *N);

SDValue WidenVecRes_Ternary(SDNode *N);		SDValue WidenVecRes_Ternary(SDNode *N);
SDValue WidenVecRes_Binary(SDNode *N);		SDValue WidenVecRes_Binary(SDNode *N);
SDValue WidenVecRes_BinaryCanTrap(SDNode *N);		SDValue WidenVecRes_BinaryCanTrap(SDNode *N);
SDValue WidenVecRes_StrictFP(SDNode *N);		SDValue WidenVecRes_StrictFP(SDNode *N);
SDValue WidenVecRes_Convert(SDNode *N);		SDValue WidenVecRes_Convert(SDNode *N);
		SDValue WidenVecRes_FP_TO_XINT_SAT(SDNode *N);
SDValue WidenVecRes_FCOPYSIGN(SDNode *N);		SDValue WidenVecRes_FCOPYSIGN(SDNode *N);
SDValue WidenVecRes_POWI(SDNode *N);		SDValue WidenVecRes_POWI(SDNode *N);
SDValue WidenVecRes_Shift(SDNode *N);		SDValue WidenVecRes_Shift(SDNode *N);
SDValue WidenVecRes_Unary(SDNode *N);		SDValue WidenVecRes_Unary(SDNode *N);
SDValue WidenVecRes_InregOp(SDNode *N);		SDValue WidenVecRes_InregOp(SDNode *N);

// Widen Vector Operand.		// Widen Vector Operand.
bool WidenVectorOperand(SDNode *N, unsigned OpNo);		bool WidenVectorOperand(SDNode *N, unsigned OpNo);
SDValue WidenVecOp_BITCAST(SDNode *N);		SDValue WidenVecOp_BITCAST(SDNode *N);
SDValue WidenVecOp_CONCAT_VECTORS(SDNode *N);		SDValue WidenVecOp_CONCAT_VECTORS(SDNode *N);
SDValue WidenVecOp_EXTEND(SDNode *N);		SDValue WidenVecOp_EXTEND(SDNode *N);
SDValue WidenVecOp_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue WidenVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue WidenVecOp_EXTRACT_SUBVECTOR(SDNode *N);		SDValue WidenVecOp_EXTRACT_SUBVECTOR(SDNode *N);
SDValue WidenVecOp_STORE(SDNode* N);		SDValue WidenVecOp_STORE(SDNode* N);
SDValue WidenVecOp_MSTORE(SDNode* N, unsigned OpNo);		SDValue WidenVecOp_MSTORE(SDNode* N, unsigned OpNo);
SDValue WidenVecOp_MGATHER(SDNode* N, unsigned OpNo);		SDValue WidenVecOp_MGATHER(SDNode* N, unsigned OpNo);
SDValue WidenVecOp_MSCATTER(SDNode* N, unsigned OpNo);		SDValue WidenVecOp_MSCATTER(SDNode* N, unsigned OpNo);
SDValue WidenVecOp_SETCC(SDNode* N);		SDValue WidenVecOp_SETCC(SDNode* N);

SDValue WidenVecOp_Convert(SDNode *N);		SDValue WidenVecOp_Convert(SDNode *N);
		SDValue WidenVecOp_FP_TO_XINT_SAT(SDNode *N);
SDValue WidenVecOp_FCOPYSIGN(SDNode *N);		SDValue WidenVecOp_FCOPYSIGN(SDNode *N);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Vector Widening Utilities Support: LegalizeVectorTypes.cpp		// Vector Widening Utilities Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Helper function to generate a set of loads to load a vector with a		/// Helper function to generate a set of loads to load a vector with a
/// resulting wider type. It takes:		/// resulting wider type. It takes:
[108 lines not shown]

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

[395 lines not shown]	SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::UMAX:		case ISD::UMAX:
case ISD::SMUL_LOHI:		case ISD::SMUL_LOHI:
case ISD::UMUL_LOHI:		case ISD::UMUL_LOHI:
case ISD::FCANONICALIZE:		case ISD::FCANONICALIZE:
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT:		case ISD::USUBSAT:
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
case ISD::FP_ROUND_INREG:		case ISD::FP_ROUND_INREG:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
cast<VTSDNode>(Node->getOperand(1))->getVT());		cast<VTSDNode>(Node->getOperand(1))->getVT());
break;		break;
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
[819 lines not shown]

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

[166 lines not shown]	#endif
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
R = ScalarizeVecRes_StrictFPOp(N);		R = ScalarizeVecRes_StrictFPOp(N);
break;		break;

		case ISD::FP_TO_UINT_SAT:
		case ISD::FP_TO_SINT_SAT:
		R = ScalarizeVecRes_FP_TO_XINT_SAT(N);
		break;
}		}

// If R is null, the sub-method took care of registering the result.		// If R is null, the sub-method took care of registering the result.
if (R.getNode())		if (R.getNode())
SetScalarizedVector(SDValue(N, ResNo), R);		SetScalarizedVector(SDValue(N, ResNo), R);
}		}

SDValue DAGTypeLegalizer::ScalarizeVecRes_BinOp(SDNode *N) {		SDValue DAGTypeLegalizer::ScalarizeVecRes_BinOp(SDNode *N) {
[311 lines not shown]	SDValue Res = DAG.getNode(ISD::SETCC, DL, MVT::i1, LHS, RHS,
N->getOperand(2));		N->getOperand(2));
// Vectors may have a different boolean contents to scalars. Promote the		// Vectors may have a different boolean contents to scalars. Promote the
// value appropriately.		// value appropriately.
ISD::NodeType ExtendCode =		ISD::NodeType ExtendCode =
TargetLowering::getExtendForContent(TLI.getBooleanContents(OpVT));		TargetLowering::getExtendForContent(TLI.getBooleanContents(OpVT));
return DAG.getNode(ExtendCode, DL, NVT, Res);		return DAG.getNode(ExtendCode, DL, NVT, Res);
}		}

		SDValue DAGTypeLegalizer::ScalarizeVecRes_FP_TO_XINT_SAT(SDNode *N) {
		llvm_unreachable("fp_to_xint_sat scalarize vec res not implemented yet");
		return SDValue();
		}


//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Operand Vector Scalarization <1 x ty> -> ty.		// Operand Vector Scalarization <1 x ty> -> ty.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

bool DAGTypeLegalizer::ScalarizeVectorOperand(SDNode *N, unsigned OpNo) {		bool DAGTypeLegalizer::ScalarizeVectorOperand(SDNode *N, unsigned OpNo) {
LLVM_DEBUG(dbgs() << "Scalarize node operand " << OpNo << ": "; N->dump(&DAG);		LLVM_DEBUG(dbgs() << "Scalarize node operand " << OpNo << ": "; N->dump(&DAG);
dbgs() << "\n");		dbgs() << "\n");
[17 lines not shown]	#endif
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
Res = ScalarizeVecOp_UnaryOp(N);		Res = ScalarizeVecOp_UnaryOp(N);
break;		break;
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		Res = ScalarizeVecOp_FP_TO_XINT_SAT(N);
		break;
case ISD::CONCAT_VECTORS:		case ISD::CONCAT_VECTORS:
Res = ScalarizeVecOp_CONCAT_VECTORS(N);		Res = ScalarizeVecOp_CONCAT_VECTORS(N);
break;		break;
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
Res = ScalarizeVecOp_EXTRACT_VECTOR_ELT(N);		Res = ScalarizeVecOp_EXTRACT_VECTOR_ELT(N);
break;		break;
case ISD::VSELECT:		case ISD::VSELECT:
Res = ScalarizeVecOp_VSELECT(N);		Res = ScalarizeVecOp_VSELECT(N);
[132 lines not shown]
SDValue DAGTypeLegalizer::ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo) {		SDValue DAGTypeLegalizer::ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo) {
SDValue Elt = GetScalarizedVector(N->getOperand(0));		SDValue Elt = GetScalarizedVector(N->getOperand(0));
SDValue Res = DAG.getNode(ISD::FP_ROUND, SDLoc(N),		SDValue Res = DAG.getNode(ISD::FP_ROUND, SDLoc(N),
N->getValueType(0).getVectorElementType(), Elt,		N->getValueType(0).getVectorElementType(), Elt,
N->getOperand(1));		N->getOperand(1));
return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Res);		return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Res);
}		}

		SDValue DAGTypeLegalizer::ScalarizeVecOp_FP_TO_XINT_SAT(SDNode *N) {
		llvm_unreachable("fp_to_xint_sat scalarize vec op not implemented yet");
		return SDValue();
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Result Vector Splitting		// Result Vector Splitting
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This method is called when the specified result of the specified node is		/// This method is called when the specified result of the specified node is
/// found to need vector splitting. At this point, the node may also have		/// found to need vector splitting. At this point, the node may also have
/// invalid operands or may have other results that need legalization, we just		/// invalid operands or may have other results that need legalization, we just
/// know that (at least) one result needs vector splitting.		/// know that (at least) one result needs vector splitting.
▲ Show 20 Lines • Show All 152 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
SplitVecRes_StrictFPOp(N, Lo, Hi);		SplitVecRes_StrictFPOp(N, Lo, Hi);
break;		break;
		case ISD::FP_TO_UINT_SAT:
		case ISD::FP_TO_SINT_SAT:
		SplitVecRes_FP_TO_XINT_SAT(N, Lo, Hi);
		break;
}		}

// If Lo/Hi is null, the sub-method took care of registering results etc.		// If Lo/Hi is null, the sub-method took care of registering results etc.
if (Lo.getNode())		if (Lo.getNode())
SetSplitVector(SDValue(N, ResNo), Lo, Hi);		SetSplitVector(SDValue(N, ResNo), Lo, Hi);
}		}

void DAGTypeLegalizer::SplitVecRes_BinOp(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::SplitVecRes_BinOp(SDNode *N, SDValue &Lo,
▲ Show 20 Lines • Show All 777 Lines • ▼ Show 20 Lines	if (useBuildVector) {
// At least one input vector was used. Create a new shuffle vector.		// At least one input vector was used. Create a new shuffle vector.
Output = DAG.getVectorShuffle(NewVT, dl, Op0, Op1, Ops);		Output = DAG.getVectorShuffle(NewVT, dl, Op0, Op1, Ops);
}		}

Ops.clear();		Ops.clear();
}		}
}		}

		void DAGTypeLegalizer::SplitVecRes_FP_TO_XINT_SAT(SDNode *N, SDValue &Lo,
		SDValue &Hi) {
		llvm_unreachable("fp_to_xint_sat split vec res not implemented yet");
		}


//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Operand Vector Splitting		// Operand Vector Splitting
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This method is called when the specified operand of the specified node is		/// This method is called when the specified operand of the specified node is
/// found to need vector splitting. At this point, all of the result types of		/// found to need vector splitting. At this point, all of the result types of
/// the node are known to be legal, but other operands of the node may need		/// the node are known to be legal, but other operands of the node may need
▲ Show 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	case ISD::FP_TO_UINT:
break;		break;
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
if (N->getValueType(0).bitsLT(N->getOperand(0).getValueType()))		if (N->getValueType(0).bitsLT(N->getOperand(0).getValueType()))
Res = SplitVecOp_TruncateHelper(N);		Res = SplitVecOp_TruncateHelper(N);
else		else
Res = SplitVecOp_UnaryOp(N);		Res = SplitVecOp_UnaryOp(N);
break;		break;
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		Res = SplitVecOp_FP_TO_XINT_SAT(N);
		break;
case ISD::CTTZ:		case ISD::CTTZ:
case ISD::CTLZ:		case ISD::CTLZ:
case ISD::CTPOP:		case ISD::CTPOP:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND:		case ISD::ANY_EXTEND:
case ISD::FTRUNC:		case ISD::FTRUNC:
▲ Show 20 Lines • Show All 599 Lines • ▼ Show 20 Lines
}		}

SDValue DAGTypeLegalizer::SplitVecOp_FCOPYSIGN(SDNode *N) {		SDValue DAGTypeLegalizer::SplitVecOp_FCOPYSIGN(SDNode *N) {
// The result (and the first input) has a legal vector type, but the second		// The result (and the first input) has a legal vector type, but the second
// input needs splitting.		// input needs splitting.
return DAG.UnrollVectorOp(N, N->getValueType(0).getVectorNumElements());		return DAG.UnrollVectorOp(N, N->getValueType(0).getVectorNumElements());
}		}

		SDValue DAGTypeLegalizer::SplitVecOp_FP_TO_XINT_SAT(SDNode *N) {
		llvm_unreachable("fp_to_xint_sat split vec op not implemented yet");
		return SDValue();
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Result Vector Widening		// Result Vector Widening
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {		void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
LLVM_DEBUG(dbgs() << "Widen node result " << ResNo << ": "; N->dump(&DAG);		LLVM_DEBUG(dbgs() << "Widen node result " << ResNo << ": "; N->dump(&DAG);
dbgs() << "\n");		dbgs() << "\n");
▲ Show 20 Lines • Show All 124 Lines • ▼ Show 20 Lines	#endif
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
Res = WidenVecRes_Convert(N);		Res = WidenVecRes_Convert(N);
break;		break;

		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		Res = WidenVecRes_FP_TO_XINT_SAT(N);
		break;

case ISD::FABS:		case ISD::FABS:
case ISD::FCEIL:		case ISD::FCEIL:
case ISD::FCOS:		case ISD::FCOS:
case ISD::FEXP:		case ISD::FEXP:
case ISD::FEXP2:		case ISD::FEXP2:
case ISD::FFLOOR:		case ISD::FFLOOR:
case ISD::FLOG:		case ISD::FLOG:
case ISD::FLOG10:		case ISD::FLOG10:
▲ Show 20 Lines • Show All 408 Lines • ▼ Show 20 Lines	if (N->getNumOperands() == 1)
Ops[i] = DAG.getNode(Opcode, DL, EltVT, Val);		Ops[i] = DAG.getNode(Opcode, DL, EltVT, Val);
else		else
Ops[i] = DAG.getNode(Opcode, DL, EltVT, Val, N->getOperand(1), Flags);		Ops[i] = DAG.getNode(Opcode, DL, EltVT, Val, N->getOperand(1), Flags);
}		}

return DAG.getBuildVector(WidenVT, DL, Ops);		return DAG.getBuildVector(WidenVT, DL, Ops);
}		}

		SDValue DAGTypeLegalizer::WidenVecRes_FP_TO_XINT_SAT(SDNode *N) {
		llvm_unreachable("fp_to_xint_sat widen vec res not implemented yet");
		return SDValue();
		}

SDValue DAGTypeLegalizer::WidenVecRes_EXTEND_VECTOR_INREG(SDNode *N) {		SDValue DAGTypeLegalizer::WidenVecRes_EXTEND_VECTOR_INREG(SDNode *N) {
unsigned Opcode = N->getOpcode();		unsigned Opcode = N->getOpcode();
SDValue InOp = N->getOperand(0);		SDValue InOp = N->getOperand(0);
SDLoc DL(N);		SDLoc DL(N);

EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT WidenVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
EVT WidenSVT = WidenVT.getVectorElementType();		EVT WidenSVT = WidenVT.getVectorElementType();
unsigned WidenNumElts = WidenVT.getVectorNumElements();		unsigned WidenNumElts = WidenVT.getVectorNumElements();
▲ Show 20 Lines • Show All 769 Lines • ▼ Show 20 Lines	#endif
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
Res = WidenVecOp_Convert(N);		Res = WidenVecOp_Convert(N);
break;		break;

		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT:
		Res = WidenVecOp_FP_TO_XINT_SAT(N);
		break;
}		}

// If Res is null, the sub-method took care of registering the result.		// If Res is null, the sub-method took care of registering the result.
if (!Res.getNode()) return false;		if (!Res.getNode()) return false;

// If the result is N, the sub-method updated N in place. Tell the legalizer		// If the result is N, the sub-method updated N in place. Tell the legalizer
// core about this.		// core about this.
if (Res.getNode() == N)		if (Res.getNode() == N)
▲ Show 20 Lines • Show All 108 Lines • ▼ Show 20 Lines	Ops[i] = DAG.getNode(
Opcode, dl, EltVT,		Opcode, dl, EltVT,
DAG.getNode(		DAG.getNode(
ISD::EXTRACT_VECTOR_ELT, dl, InEltVT, InOp,		ISD::EXTRACT_VECTOR_ELT, dl, InEltVT, InOp,
DAG.getConstant(i, dl, TLI.getVectorIdxTy(DAG.getDataLayout()))));		DAG.getConstant(i, dl, TLI.getVectorIdxTy(DAG.getDataLayout()))));

return DAG.getBuildVector(VT, dl, Ops);		return DAG.getBuildVector(VT, dl, Ops);
}		}

		SDValue DAGTypeLegalizer::WidenVecOp_FP_TO_XINT_SAT(SDNode *N) {
		llvm_unreachable("fp_to_xint_sat widen vec op not implemented yet");
		return SDValue();
		}

SDValue DAGTypeLegalizer::WidenVecOp_BITCAST(SDNode *N) {		SDValue DAGTypeLegalizer::WidenVecOp_BITCAST(SDNode *N) {
EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
SDValue InOp = GetWidenedVector(N->getOperand(0));		SDValue InOp = GetWidenedVector(N->getOperand(0));
EVT InWidenVT = InOp.getValueType();		EVT InWidenVT = InOp.getValueType();
SDLoc dl(N);		SDLoc dl(N);

// Check if we can convert between two legal vector types and extract.		// Check if we can convert between two legal vector types and extract.
unsigned InWidenSize = InWidenVT.getSizeInBits();		unsigned InWidenSize = InWidenVT.getSizeInBits();

lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show First 20 Lines • Show All 8,627 Lines • ▼ Show 20 Lines	for (i= 0; i != NE; ++i) {
case ISD::SRL:		case ISD::SRL:
case ISD::ROTL:		case ISD::ROTL:
case ISD::ROTR:		case ISD::ROTR:
Scalars.push_back(getNode(N->getOpcode(), dl, EltVT, Operands[0],		Scalars.push_back(getNode(N->getOpcode(), dl, EltVT, Operands[0],
getShiftAmountOperand(Operands[0].getValueType(),		getShiftAmountOperand(Operands[0].getValueType(),
Operands[1])));		Operands[1])));
break;		break;
case ISD::SIGN_EXTEND_INREG:		case ISD::SIGN_EXTEND_INREG:
case ISD::FP_ROUND_INREG: {		case ISD::FP_ROUND_INREG:
		case ISD::FP_TO_SINT_SAT:
		case ISD::FP_TO_UINT_SAT: {
EVT ExtVT = cast<VTSDNode>(Operands[1])->getVT().getVectorElementType();		EVT ExtVT = cast<VTSDNode>(Operands[1])->getVT().getVectorElementType();
Scalars.push_back(getNode(N->getOpcode(), dl, EltVT,		Scalars.push_back(getNode(N->getOpcode(), dl, EltVT,
Operands[0],		Operands[0],
getValueType(ExtVT)));		getValueType(ExtVT)));
}		}
}		}
}		}


lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 5,673 Lines • ▼ Show 20 Lines	setValue(&I, DAG.getNode(ISD::BITCAST, sdl, MVT::i16,
MVT::i32))));		MVT::i32))));
return nullptr;		return nullptr;
case Intrinsic::convert_from_fp16:		case Intrinsic::convert_from_fp16:
setValue(&I, DAG.getNode(ISD::FP_EXTEND, sdl,		setValue(&I, DAG.getNode(ISD::FP_EXTEND, sdl,
TLI.getValueType(DAG.getDataLayout(), I.getType()),		TLI.getValueType(DAG.getDataLayout(), I.getType()),
DAG.getNode(ISD::BITCAST, sdl, MVT::f16,		DAG.getNode(ISD::BITCAST, sdl, MVT::f16,
getValue(I.getArgOperand(0)))));		getValue(I.getArgOperand(0)))));
return nullptr;		return nullptr;
		case Intrinsic::fptosi_sat: {
		EVT Type = TLI.getValueType(DAG.getDataLayout(), I.getType());
		setValue(&I, DAG.getNode(ISD::FP_TO_SINT_SAT, sdl, Type,
		getValue(I.getArgOperand(0)),
		DAG.getValueType(Type)));
		t.p.northover: I was expecting this tag to be the scalar type in the vector situation. Does something get neater this way round?
		return nullptr;
		}
		case Intrinsic::fptoui_sat: {
		EVT Type = TLI.getValueType(DAG.getDataLayout(), I.getType());
		setValue(&I, DAG.getNode(ISD::FP_TO_UINT_SAT, sdl, Type,
		getValue(I.getArgOperand(0)),
		DAG.getValueType(Type)));
		return nullptr;
		}
case Intrinsic::pcmarker: {		case Intrinsic::pcmarker: {
SDValue Tmp = getValue(I.getArgOperand(0));		SDValue Tmp = getValue(I.getArgOperand(0));
DAG.setRoot(DAG.getNode(ISD::PCMARKER, sdl, MVT::Other, getRoot(), Tmp));		DAG.setRoot(DAG.getNode(ISD::PCMARKER, sdl, MVT::Other, getRoot(), Tmp));
return nullptr;		return nullptr;
}		}
case Intrinsic::readcyclecounter: {		case Intrinsic::readcyclecounter: {
SDValue Op = getRoot();		SDValue Op = getRoot();
Res = DAG.getNode(ISD::READCYCLECOUNTER, sdl,		Res = DAG.getNode(ISD::READCYCLECOUNTER, sdl,

lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 308 Lines • ▼ Show 20 Lines	#endif
case ISD::FLT_ROUNDS_: return "flt_rounds";		case ISD::FLT_ROUNDS_: return "flt_rounds";
case ISD::FP_ROUND_INREG: return "fp_round_inreg";		case ISD::FP_ROUND_INREG: return "fp_round_inreg";
case ISD::FP_EXTEND: return "fp_extend";		case ISD::FP_EXTEND: return "fp_extend";

case ISD::SINT_TO_FP: return "sint_to_fp";		case ISD::SINT_TO_FP: return "sint_to_fp";
case ISD::UINT_TO_FP: return "uint_to_fp";		case ISD::UINT_TO_FP: return "uint_to_fp";
case ISD::FP_TO_SINT: return "fp_to_sint";		case ISD::FP_TO_SINT: return "fp_to_sint";
case ISD::FP_TO_UINT: return "fp_to_uint";		case ISD::FP_TO_UINT: return "fp_to_uint";
		case ISD::FP_TO_SINT_SAT: return "fp_to_sint_sat";
		case ISD::FP_TO_UINT_SAT: return "fp_to_uint_sat";
case ISD::BITCAST: return "bitcast";		case ISD::BITCAST: return "bitcast";
case ISD::ADDRSPACECAST: return "addrspacecast";		case ISD::ADDRSPACECAST: return "addrspacecast";
case ISD::FP16_TO_FP: return "fp16_to_fp";		case ISD::FP16_TO_FP: return "fp16_to_fp";
case ISD::FP_TO_FP16: return "fp_to_fp16";		case ISD::FP_TO_FP16: return "fp_to_fp16";

// Control flow instructions		// Control flow instructions
case ISD::BR: return "br";		case ISD::BR: return "br";
case ISD::BRIND: return "brind";		case ISD::BRIND: return "brind";

lib/CodeGen/SelectionDAG/TargetLowering.cpp

Show First 20 Lines • Show All 5,057 Lines • ▼ Show 20 Lines	if (Opcode == ISD::UADDSAT) {
APInt MaxVal = APInt::getSignedMaxValue(BitWidth);		APInt MaxVal = APInt::getSignedMaxValue(BitWidth);
SDValue SatMin = DAG.getConstant(MinVal, dl, ResultType);		SDValue SatMin = DAG.getConstant(MinVal, dl, ResultType);
SDValue SatMax = DAG.getConstant(MaxVal, dl, ResultType);		SDValue SatMax = DAG.getConstant(MaxVal, dl, ResultType);
SDValue SumNeg = DAG.getSetCC(dl, BoolVT, SumDiff, Zero, ISD::SETLT);		SDValue SumNeg = DAG.getSetCC(dl, BoolVT, SumDiff, Zero, ISD::SETLT);
Result = DAG.getSelect(dl, ResultType, SumNeg, SatMax, SatMin);		Result = DAG.getSelect(dl, ResultType, SumNeg, SatMax, SatMin);
return DAG.getSelect(dl, ResultType, Overflow, Result, SumDiff);		return DAG.getSelect(dl, ResultType, Overflow, Result, SumDiff);
}		}
}		}

		SDValue TargetLowering::expandFP_TO_INT_SAT(
		SDNode *Node, SelectionDAG &DAG) const {
		bool IsSigned = Node->getOpcode() == ISD::FP_TO_SINT_SAT;
		SDLoc dl(SDValue(Node, 0));
		SDValue Src = Node->getOperand(0);

		// DstVT is the result type, while SatVT is the size to which we saturate
		EVT SrcVT = Src.getValueType();
		EVT SatVT = cast<VTSDNode>(Node->getOperand(1))->getVT();
		EVT DstVT = Node->getValueType(0);

		unsigned SatWidth = SatVT.getScalarSizeInBits();
		unsigned DstWidth = DstVT.getScalarSizeInBits();
		assert(SatWidth <= DstWidth &&
		"Expected saturation width smaller than or equal to result width");

		// Determine minimum and maximum integer values and their corresponding
		// floating-point values.
		APInt MinInt, MaxInt;
		if (IsSigned) {
		MinInt = APInt::getSignedMinValue(SatWidth).sextOrSelf(DstWidth);
		MaxInt = APInt::getSignedMaxValue(SatWidth).sextOrSelf(DstWidth);
		} else {
		MinInt = APInt::getMinValue(SatWidth).zextOrSelf(DstWidth);
		MaxInt = APInt::getMaxValue(SatWidth).zextOrSelf(DstWidth);
		}

		APFloat MinFloat(DAG.EVTToAPFloatSemantics(SrcVT));
		APFloat MaxFloat(DAG.EVTToAPFloatSemantics(SrcVT));

		APFloat::opStatus MinStatus = MinFloat.convertFromAPInt(
		MinInt, IsSigned, APFloat::rmTowardZero);
		APFloat::opStatus MaxStatus = MaxFloat.convertFromAPInt(
		MaxInt, IsSigned, APFloat::rmTowardZero);
		bool AreExactFloatBounds = !(MinStatus & APFloat::opStatus::opInexact)
		&& !(MaxStatus & APFloat::opStatus::opInexact);

		SDValue MinFloatNode = DAG.getConstantFP(MinFloat, dl, SrcVT);
		SDValue MaxFloatNode = DAG.getConstantFP(MaxFloat, dl, SrcVT);

		// If the integer bounds are exactly representable as floats and min/max are
		// legal, emit a min+max+fptoi sequence. Otherwise we have to use a sequence
		// of comparisons and selects.
		bool MinMaxLegal = isOperationLegal(ISD::FMINNUM, SrcVT)
		&& isOperationLegal(ISD::FMAXNUM, SrcVT);
		if (AreExactFloatBounds && MinMaxLegal) {
		SDValue Clamped = Src;

		// Clamp Src by MinFloat from below. If Src is NaN the result is MinFloat.
		Clamped = DAG.getNode(ISD::FMAXNUM, dl, SrcVT, Clamped, MinFloatNode);
		// Clamp by MaxFloat from above. NaN cannot occur.
		Clamped = DAG.getNode(ISD::FMINNUM, dl, SrcVT, Clamped, MaxFloatNode);
		// Convert clamped value to integer.
		SDValue FpToInt = DAG.getNode(
		IsSigned ? ISD::FP_TO_SINT : ISD::FP_TO_UINT, dl, DstVT, Clamped);

		// In the unsigned case we're done, because we mapped NaN to MinFloat,
		// which will cast to zero.
		if (!IsSigned)
		return FpToInt;

		// Otherwise, select 0 if Src is NaN.
		SDValue ZeroInt = DAG.getConstant(0, dl, DstVT);
		return DAG.getSelectCC(
		dl, Src, Src, ZeroInt, FpToInt, ISD::CondCode::SETUO);
		}

		SDValue MinIntNode = DAG.getConstant(MinInt, dl, DstVT);
		SDValue MaxIntNode = DAG.getConstant(MaxInt, dl, DstVT);

		// Result of direct conversion. The assumption here is that the operation is
		// non-trapping and it's fine to apply it to an out-of-range value if we
		// select it away later.
		SDValue FpToInt = DAG.getNode(
		IsSigned ? ISD::FP_TO_SINT : ISD::FP_TO_UINT, dl, DstVT, Src);

		SDValue Select = FpToInt;

		// If Src ULT MinFloat, select MinInt. In particular, this also selects
		// MinInt if Src is NaN.
		Select = DAG.getSelectCC(
		dl, Src, MinFloatNode, MinIntNode, Select, ISD::CondCode::SETULT);
		// If Src OGT MaxFloat, select MaxInt.
		Select = DAG.getSelectCC(
		dl, Src, MaxFloatNode, MaxIntNode, Select, ISD::CondCode::SETOGT);

		// In the unsigned case we are done, because we mapped NaN to MinInt, which
		// is already zero.
		if (!IsSigned)
		return Select;

		// Otherwise, select 0 if Src is NaN.
		SDValue ZeroInt = DAG.getConstant(0, dl, DstVT);
		return DAG.getSelectCC(
		dl, Src, Src, ZeroInt, Select, ISD::CondCode::SETUO);
		}
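
The two lowering strategies in expandFP_TO_INT_SAT can be modelled outside the DAG in plain C++. The sketch below is only a reference model of the intended semantics for a float source and a signed result; it is not the patch's code. The function names are hypothetical, and the result is held in an int64_t regardless of the saturation width. The first function follows the compare-and-select shape, the second follows the min+max+fptoi shape and is only valid when both bounds are exactly representable in float (here assumed to mean a width of at most 25 bits).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical model of the compare-and-select path for a signed result.
// Valid for 1 <= SatWidth <= 64.
int64_t fptosi_sat_model(float Src, unsigned SatWidth) {
  assert(SatWidth >= 1 && SatWidth <= 64);
  // MaxInt = 2^(SatWidth-1) - 1, MinInt = -2^(SatWidth-1).
  const int64_t MaxInt =
      static_cast<int64_t>((uint64_t(1) << (SatWidth - 1)) - 1);
  const int64_t MinInt = -MaxInt - 1;
  // Both powers of two below are exactly representable in float.
  const float Lo = -std::ldexp(1.0f, static_cast<int>(SatWidth) - 1);
  const float HiExclusive = std::ldexp(1.0f, static_cast<int>(SatWidth) - 1);

  if (std::isnan(Src))
    return 0; // NaN maps to zero.
  if (Src < Lo)
    return MinInt; // Saturate from below (also covers -inf).
  if (Src >= HiExclusive)
    return MaxInt; // Saturate from above (also covers +inf).
  return static_cast<int64_t>(Src); // In range: truncate toward zero.
}

// Hypothetical model of the min+max+fptoi path. Assumes both bounds are
// exact in float, which holds for SatWidth <= 25 (24-bit significand).
int64_t fptosi_sat_clamp(float Src, unsigned SatWidth) {
  assert(SatWidth >= 1 && SatWidth <= 25);
  const int64_t MaxInt = (int64_t(1) << (SatWidth - 1)) - 1;
  const int64_t MinInt = -MaxInt - 1;
  // std::fmax returns the non-NaN operand, so a NaN input becomes MinInt.
  float Clamped = std::fmax(Src, static_cast<float>(MinInt));
  // After the first clamp the value cannot be NaN.
  Clamped = std::fmin(Clamped, static_cast<float>(MaxInt));
  const int64_t Result = static_cast<int64_t>(Clamped);
  // Signed case: a NaN input was clamped to MinInt, so select zero instead.
  return std::isnan(Src) ? 0 : Result;
}
```

Under these assumptions both functions should agree wherever the clamp variant is applicable, for example for an 8-bit saturation width with inputs such as 300.0f, -300.0f, -1.5f and NaN.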

lib/CodeGen/TargetLoweringBase.cpp

Show First 20 Lines • Show All 608 Lines • ▼ Show 20 Lines	for (MVT VT : MVT::all_valuetypes()) {
setOperationAction(ISD::SMAX, VT, Expand);		setOperationAction(ISD::SMAX, VT, Expand);
setOperationAction(ISD::UMIN, VT, Expand);		setOperationAction(ISD::UMIN, VT, Expand);
setOperationAction(ISD::UMAX, VT, Expand);		setOperationAction(ISD::UMAX, VT, Expand);
setOperationAction(ISD::ABS, VT, Expand);		setOperationAction(ISD::ABS, VT, Expand);
setOperationAction(ISD::SADDSAT, VT, Expand);		setOperationAction(ISD::SADDSAT, VT, Expand);
setOperationAction(ISD::UADDSAT, VT, Expand);		setOperationAction(ISD::UADDSAT, VT, Expand);
setOperationAction(ISD::SSUBSAT, VT, Expand);		setOperationAction(ISD::SSUBSAT, VT, Expand);
setOperationAction(ISD::USUBSAT, VT, Expand);		setOperationAction(ISD::USUBSAT, VT, Expand);
		setOperationAction(ISD::FP_TO_SINT_SAT, VT, Expand);
		setOperationAction(ISD::FP_TO_UINT_SAT, VT, Expand);

// Overflow operations default to expand		// Overflow operations default to expand
setOperationAction(ISD::SADDO, VT, Expand);		setOperationAction(ISD::SADDO, VT, Expand);
setOperationAction(ISD::SSUBO, VT, Expand);		setOperationAction(ISD::SSUBO, VT, Expand);
setOperationAction(ISD::UADDO, VT, Expand);		setOperationAction(ISD::UADDO, VT, Expand);
setOperationAction(ISD::USUBO, VT, Expand);		setOperationAction(ISD::USUBO, VT, Expand);
setOperationAction(ISD::SMULO, VT, Expand);		setOperationAction(ISD::SMULO, VT, Expand);
setOperationAction(ISD::UMULO, VT, Expand);		setOperationAction(ISD::UMULO, VT, Expand);

test/CodeGen/AArch64/fptoi-sat-scalar.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64 < %s | FileCheck %s

				;
				; 32-bit float to signed integer
				;

				declare i1 @llvm.fptosi.sat.i1.f32 (float)
				declare i8 @llvm.fptosi.sat.i8.f32 (float)
				declare i13 @llvm.fptosi.sat.i13.f32 (float)
				declare i16 @llvm.fptosi.sat.i16.f32 (float)
				declare i19 @llvm.fptosi.sat.i19.f32 (float)
				declare i32 @llvm.fptosi.sat.i32.f32 (float)
				declare i50 @llvm.fptosi.sat.i50.f32 (float)
				declare i64 @llvm.fptosi.sat.i64.f32 (float)
				declare i100 @llvm.fptosi.sat.i100.f32(float)
				declare i128 @llvm.fptosi.sat.i128.f32(float)

				define i1 @test_signed_i1_f32(float %f) {
				; CHECK-LABEL: test_signed_i1_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fmov s1, #-1.00000000
				; CHECK-NEXT: fmov s2, wzr
				; CHECK-NEXT: fmaxnm s1, s0, s1
				; CHECK-NEXT: fminnm s1, s1, s2
				; CHECK-NEXT: fcvtzs w8, s1
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w8, wzr, w8, vs
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%x = call i1 @llvm.fptosi.sat.i1.f32(float %f)
				ret i1 %x
				}

				define i8 @test_signed_i8_f32(float %f) {
				; CHECK-LABEL: test_signed_i8_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI1_0
				; CHECK-NEXT: adrp x9, .LCPI1_1
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI1_0]
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI1_1]
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: fmaxnm s1, s0, s1
				; CHECK-NEXT: fminnm s1, s1, s2
				; CHECK-NEXT: fcvtzs w8, s1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i8 @llvm.fptosi.sat.i8.f32(float %f)
				ret i8 %x
				}

				define i13 @test_signed_i13_f32(float %f) {
				; CHECK-LABEL: test_signed_i13_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI2_0
				; CHECK-NEXT: adrp x9, .LCPI2_1
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI2_0]
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI2_1]
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: fmaxnm s1, s0, s1
				; CHECK-NEXT: fminnm s1, s1, s2
				; CHECK-NEXT: fcvtzs w8, s1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i13 @llvm.fptosi.sat.i13.f32(float %f)
				ret i13 %x
				}

				define i16 @test_signed_i16_f32(float %f) {
				; CHECK-LABEL: test_signed_i16_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI3_0
				; CHECK-NEXT: adrp x9, .LCPI3_1
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI3_0]
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI3_1]
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: fmaxnm s1, s0, s1
				; CHECK-NEXT: fminnm s1, s1, s2
				; CHECK-NEXT: fcvtzs w8, s1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i16 @llvm.fptosi.sat.i16.f32(float %f)
				ret i16 %x
				}

				define i19 @test_signed_i19_f32(float %f) {
				; CHECK-LABEL: test_signed_i19_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI4_0
				; CHECK-NEXT: adrp x9, .LCPI4_1
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI4_0]
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI4_1]
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: fmaxnm s1, s0, s1
				; CHECK-NEXT: fminnm s1, s1, s2
				; CHECK-NEXT: fcvtzs w8, s1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i19 @llvm.fptosi.sat.i19.f32(float %f)
				ret i19 %x
				}

				define i32 @test_signed_i32_f32(float %f) {
				; CHECK-LABEL: test_signed_i32_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x9, .LCPI5_0
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI5_0]
				; CHECK-NEXT: adrp x9, .LCPI5_1
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI5_1]
				; CHECK-NEXT: fcvtzs w8, s0
				; CHECK-NEXT: orr w10, wzr, #0x80000000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr w9, wzr, #0x7fffffff
				; CHECK-NEXT: csel w8, w10, w8, lt
				; CHECK-NEXT: fcmp s0, s2
				; CHECK-NEXT: csel w8, w9, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i32 @llvm.fptosi.sat.i32.f32(float %f)
				ret i32 %x
				}

				define i50 @test_signed_i50_f32(float %f) {
				; CHECK-LABEL: test_signed_i50_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x9, .LCPI6_0
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI6_0]
				; CHECK-NEXT: adrp x9, .LCPI6_1
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI6_1]
				; CHECK-NEXT: fcvtzs x8, s0
				; CHECK-NEXT: orr x10, xzr, #0xfffe000000000000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr x9, xzr, #0x1ffffffffffff
				; CHECK-NEXT: csel x8, x10, x8, lt
				; CHECK-NEXT: fcmp s0, s2
				; CHECK-NEXT: csel x8, x9, x8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel x0, xzr, x8, vs
				; CHECK-NEXT: ret
				%x = call i50 @llvm.fptosi.sat.i50.f32(float %f)
				ret i50 %x
				}

				define i64 @test_signed_i64_f32(float %f) {
				; CHECK-LABEL: test_signed_i64_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x9, .LCPI7_0
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI7_0]
				; CHECK-NEXT: adrp x9, .LCPI7_1
				; CHECK-NEXT: ldr s2, [x9, :lo12:.LCPI7_1]
				; CHECK-NEXT: fcvtzs x8, s0
				; CHECK-NEXT: orr x10, xzr, #0x8000000000000000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr x9, xzr, #0x7fffffffffffffff
				; CHECK-NEXT: csel x8, x10, x8, lt
				; CHECK-NEXT: fcmp s0, s2
				; CHECK-NEXT: csel x8, x9, x8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel x0, xzr, x8, vs
				; CHECK-NEXT: ret
				%x = call i64 @llvm.fptosi.sat.i64.f32(float %f)
				ret i64 %x
				}

				;define i100 @test_signed_i100_f32(float %f) {
				; %x = call i100 @llvm.fptosi.sat.i100.f32(float %f)
				; ret i100 %x
				;}
				;
				;define i128 @test_signed_i128_f32(float %f) {
				; %x = call i128 @llvm.fptosi.sat.i128.f32(float %f)
				; ret i128 %x
				;}

				;
				; 32-bit float to unsigned integer
				;

				declare i1 @llvm.fptoui.sat.i1.f32 (float)
				declare i8 @llvm.fptoui.sat.i8.f32 (float)
				declare i13 @llvm.fptoui.sat.i13.f32 (float)
				declare i16 @llvm.fptoui.sat.i16.f32 (float)
				declare i19 @llvm.fptoui.sat.i19.f32 (float)
				declare i32 @llvm.fptoui.sat.i32.f32 (float)
				declare i50 @llvm.fptoui.sat.i50.f32 (float)
				declare i64 @llvm.fptoui.sat.i64.f32 (float)
				declare i100 @llvm.fptoui.sat.i100.f32(float)
				declare i128 @llvm.fptoui.sat.i128.f32(float)

				define i1 @test_unsigned_i1_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i1_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fmov s1, wzr
				; CHECK-NEXT: fmaxnm s0, s0, s1
				; CHECK-NEXT: fmov s1, #1.00000000
				; CHECK-NEXT: fminnm s0, s0, s1
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%x = call i1 @llvm.fptoui.sat.i1.f32(float %f)
				ret i1 %x
				}

				define i8 @test_unsigned_i8_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i8_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI9_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI9_0]
				; CHECK-NEXT: fmov s2, wzr
				; CHECK-NEXT: fmaxnm s0, s0, s2
				; CHECK-NEXT: fminnm s0, s0, s1
				; CHECK-NEXT: fcvtzu w0, s0
				; CHECK-NEXT: ret
				%x = call i8 @llvm.fptoui.sat.i8.f32(float %f)
				ret i8 %x
				}

				define i13 @test_unsigned_i13_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i13_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI10_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI10_0]
				; CHECK-NEXT: fmov s2, wzr
				; CHECK-NEXT: fmaxnm s0, s0, s2
				; CHECK-NEXT: fminnm s0, s0, s1
				; CHECK-NEXT: fcvtzu w0, s0
				; CHECK-NEXT: ret
				%x = call i13 @llvm.fptoui.sat.i13.f32(float %f)
				ret i13 %x
				}

				define i16 @test_unsigned_i16_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i16_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI11_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI11_0]
				; CHECK-NEXT: fmov s2, wzr
				; CHECK-NEXT: fmaxnm s0, s0, s2
				; CHECK-NEXT: fminnm s0, s0, s1
				; CHECK-NEXT: fcvtzu w0, s0
				; CHECK-NEXT: ret
				%x = call i16 @llvm.fptoui.sat.i16.f32(float %f)
				ret i16 %x
				}

				define i19 @test_unsigned_i19_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i19_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI12_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI12_0]
				; CHECK-NEXT: fmov s2, wzr
				; CHECK-NEXT: fmaxnm s0, s0, s2
				; CHECK-NEXT: fminnm s0, s0, s1
				; CHECK-NEXT: fcvtzu w0, s0
				; CHECK-NEXT: ret
				%x = call i19 @llvm.fptoui.sat.i19.f32(float %f)
				ret i19 %x
				}

				define i32 @test_unsigned_i32_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i32_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI13_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI13_0]
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csinv w0, w8, wzr, le
				; CHECK-NEXT: ret
				%x = call i32 @llvm.fptoui.sat.i32.f32(float %f)
				ret i32 %x
				}

				define i50 @test_unsigned_i50_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i50_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI14_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI14_0]
				; CHECK-NEXT: fcvtzu x8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel x8, xzr, x8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr x9, xzr, #0x3ffffffffffff
				; CHECK-NEXT: csel x0, x9, x8, gt
				; CHECK-NEXT: ret
				%x = call i50 @llvm.fptoui.sat.i50.f32(float %f)
				ret i50 %x
				}

				define i64 @test_unsigned_i64_f32(float %f) {
				; CHECK-LABEL: test_unsigned_i64_f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI15_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI15_0]
				; CHECK-NEXT: fcvtzu x8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel x8, xzr, x8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csinv x0, x8, xzr, le
				; CHECK-NEXT: ret
				%x = call i64 @llvm.fptoui.sat.i64.f32(float %f)
				ret i64 %x
				}

				;define i100 @test_unsigned_i100_f32(float %f) {
				; %x = call i100 @llvm.fptoui.sat.i100.f32(float %f)
				; ret i100 %x
				;}
				;
				;define i128 @test_unsigned_i128_f32(float %f) {
				; %x = call i128 @llvm.fptoui.sat.i128.f32(float %f)
				; ret i128 %x
				;}
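
The f32 tests above show two code shapes: fmaxnm/fminnm followed by a convert for i1 through i19, and fcmp/csel sequences for i32, i50 and i64. This appears to follow from the AreExactFloatBounds check in expandFP_TO_INT_SAT, which asks whether the integer bounds are exactly representable in the source floating-point type. A minimal sketch of that criterion follows. The helper names are hypothetical, and it assumes a binary floating-point format whose exponent range is large enough for the bound (true for float and double at these widths), with 24 significand bits for float and 53 for double.

```cpp
#include <cstdint>

// Signed bounds are -2^(W-1) and 2^(W-1)-1. The lower bound is a power of
// two and therefore exact; the upper bound has W-1 one-bits, so it is exact
// only if W-1 fits in the significand.
constexpr bool signedBoundsExact(unsigned SatWidth, unsigned SignificandBits) {
  return SatWidth - 1 <= SignificandBits;
}

// Unsigned bounds are 0 and 2^W-1. Zero is exact; the upper bound has W
// one-bits, so it is exact only if W fits in the significand.
constexpr bool unsignedBoundsExact(unsigned SatWidth,
                                   unsigned SignificandBits) {
  return SatWidth <= SignificandBits;
}

// float source: the narrow widths are exact, the wide ones are not.
static_assert(signedBoundsExact(19, 24), "i19 bounds are exact in float");
static_assert(!signedBoundsExact(32, 24), "i32 bounds are inexact in float");
static_assert(unsignedBoundsExact(19, 24), "u19 bounds are exact in float");
static_assert(!unsignedBoundsExact(32, 24), "u32 bounds are inexact in float");
```

By the same reasoning a double source would have exact signed bounds for i32 and i50 but not for i64, so a clamp-based sequence would be expected there whenever the min/max operations are legal.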

				;
				; 64-bit float to signed integer
				;

				declare i1 @llvm.fptosi.sat.i1.f64 (double)
				declare i8 @llvm.fptosi.sat.i8.f64 (double)
				declare i13 @llvm.fptosi.sat.i13.f64 (double)
				declare i16 @llvm.fptosi.sat.i16.f64 (double)
				declare i19 @llvm.fptosi.sat.i19.f64 (double)
				declare i32 @llvm.fptosi.sat.i32.f64 (double)
				declare i50 @llvm.fptosi.sat.i50.f64 (double)
				declare i64 @llvm.fptosi.sat.i64.f64 (double)
				declare i100 @llvm.fptosi.sat.i100.f64(double)
				declare i128 @llvm.fptosi.sat.i128.f64(double)

				define i1 @test_signed_i1_f64(double %f) {
				; CHECK-LABEL: test_signed_i1_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fmov d1, #-1.00000000
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs w8, d1
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: csel w8, wzr, w8, vs
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%x = call i1 @llvm.fptosi.sat.i1.f64(double %f)
				ret i1 %x
				}

				define i8 @test_signed_i8_f64(double %f) {
				; CHECK-LABEL: test_signed_i8_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI17_0
				; CHECK-NEXT: adrp x9, .LCPI17_1
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI17_0]
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI17_1]
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs w8, d1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i8 @llvm.fptosi.sat.i8.f64(double %f)
				ret i8 %x
				}

				define i13 @test_signed_i13_f64(double %f) {
				; CHECK-LABEL: test_signed_i13_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI18_0
				; CHECK-NEXT: adrp x9, .LCPI18_1
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI18_0]
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI18_1]
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs w8, d1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i13 @llvm.fptosi.sat.i13.f64(double %f)
				ret i13 %x
				}

				define i16 @test_signed_i16_f64(double %f) {
				; CHECK-LABEL: test_signed_i16_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI19_0
				; CHECK-NEXT: adrp x9, .LCPI19_1
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI19_0]
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI19_1]
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs w8, d1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i16 @llvm.fptosi.sat.i16.f64(double %f)
				ret i16 %x
				}

				define i19 @test_signed_i19_f64(double %f) {
				; CHECK-LABEL: test_signed_i19_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI20_0
				; CHECK-NEXT: adrp x9, .LCPI20_1
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI20_0]
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI20_1]
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs w8, d1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i19 @llvm.fptosi.sat.i19.f64(double %f)
				ret i19 %x
				}

				define i32 @test_signed_i32_f64(double %f) {
				; CHECK-LABEL: test_signed_i32_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI21_0
				; CHECK-NEXT: adrp x9, .LCPI21_1
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI21_0]
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI21_1]
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs w8, d1
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i32 @llvm.fptosi.sat.i32.f64(double %f)
				ret i32 %x
				}

				define i50 @test_signed_i50_f64(double %f) {
				; CHECK-LABEL: test_signed_i50_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI22_0
				; CHECK-NEXT: adrp x9, .LCPI22_1
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI22_0]
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI22_1]
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: fmaxnm d1, d0, d1
				; CHECK-NEXT: fminnm d1, d1, d2
				; CHECK-NEXT: fcvtzs x8, d1
				; CHECK-NEXT: csel x0, xzr, x8, vs
				; CHECK-NEXT: ret
				%x = call i50 @llvm.fptosi.sat.i50.f64(double %f)
				ret i50 %x
				}

				define i64 @test_signed_i64_f64(double %f) {
				; CHECK-LABEL: test_signed_i64_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x9, .LCPI23_0
				; CHECK-NEXT: ldr d1, [x9, :lo12:.LCPI23_0]
				; CHECK-NEXT: adrp x9, .LCPI23_1
				; CHECK-NEXT: ldr d2, [x9, :lo12:.LCPI23_1]
				; CHECK-NEXT: fcvtzs x8, d0
				; CHECK-NEXT: orr x10, xzr, #0x8000000000000000
				; CHECK-NEXT: fcmp d0, d1
				; CHECK-NEXT: orr x9, xzr, #0x7fffffffffffffff
				; CHECK-NEXT: csel x8, x10, x8, lt
				; CHECK-NEXT: fcmp d0, d2
				; CHECK-NEXT: csel x8, x9, x8, gt
				; CHECK-NEXT: fcmp d0, d0
				; CHECK-NEXT: csel x0, xzr, x8, vs
				; CHECK-NEXT: ret
				%x = call i64 @llvm.fptosi.sat.i64.f64(double %f)
				ret i64 %x
				}

				;define i100 @test_signed_i100_f64(double %f) {
				; %x = call i100 @llvm.fptosi.sat.i100.f64(double %f)
				; ret i100 %x
				;}
				;
				;define i128 @test_signed_i128_f64(double %f) {
				; %x = call i128 @llvm.fptosi.sat.i128.f64(double %f)
				; ret i128 %x
				;}

				;
				; 64-bit float to unsigned integer
				;

				declare i1 @llvm.fptoui.sat.i1.f64 (double)
				declare i8 @llvm.fptoui.sat.i8.f64 (double)
				declare i13 @llvm.fptoui.sat.i13.f64 (double)
				declare i16 @llvm.fptoui.sat.i16.f64 (double)
				declare i19 @llvm.fptoui.sat.i19.f64 (double)
				declare i32 @llvm.fptoui.sat.i32.f64 (double)
				declare i50 @llvm.fptoui.sat.i50.f64 (double)
				declare i64 @llvm.fptoui.sat.i64.f64 (double)
				declare i100 @llvm.fptoui.sat.i100.f64(double)
				declare i128 @llvm.fptoui.sat.i128.f64(double)

				define i1 @test_unsigned_i1_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i1_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fmov d1, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d1
				; CHECK-NEXT: fmov d1, #1.00000000
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu w8, d0
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%x = call i1 @llvm.fptoui.sat.i1.f64(double %f)
				ret i1 %x
				}

				define i8 @test_unsigned_i8_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i8_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI25_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI25_0]
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d2
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu w0, d0
				; CHECK-NEXT: ret
				%x = call i8 @llvm.fptoui.sat.i8.f64(double %f)
				ret i8 %x
				}

				define i13 @test_unsigned_i13_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i13_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI26_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI26_0]
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d2
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu w0, d0
				; CHECK-NEXT: ret
				%x = call i13 @llvm.fptoui.sat.i13.f64(double %f)
				ret i13 %x
				}

				define i16 @test_unsigned_i16_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i16_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI27_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI27_0]
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d2
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu w0, d0
				; CHECK-NEXT: ret
				%x = call i16 @llvm.fptoui.sat.i16.f64(double %f)
				ret i16 %x
				}

				define i19 @test_unsigned_i19_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i19_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI28_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI28_0]
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d2
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu w0, d0
				; CHECK-NEXT: ret
				%x = call i19 @llvm.fptoui.sat.i19.f64(double %f)
				ret i19 %x
				}

				define i32 @test_unsigned_i32_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i32_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI29_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI29_0]
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d2
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu w0, d0
				; CHECK-NEXT: ret
				%x = call i32 @llvm.fptoui.sat.i32.f64(double %f)
				ret i32 %x
				}

				define i50 @test_unsigned_i50_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i50_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI30_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI30_0]
				; CHECK-NEXT: fmov d2, xzr
				; CHECK-NEXT: fmaxnm d0, d0, d2
				; CHECK-NEXT: fminnm d0, d0, d1
				; CHECK-NEXT: fcvtzu x0, d0
				; CHECK-NEXT: ret
				%x = call i50 @llvm.fptoui.sat.i50.f64(double %f)
				ret i50 %x
				}

				define i64 @test_unsigned_i64_f64(double %f) {
				; CHECK-LABEL: test_unsigned_i64_f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI31_0
				; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI31_0]
				; CHECK-NEXT: fcvtzu x8, d0
				; CHECK-NEXT: fcmp d0, #0.0
				; CHECK-NEXT: csel x8, xzr, x8, lt
				; CHECK-NEXT: fcmp d0, d1
				; CHECK-NEXT: csinv x0, x8, xzr, le
				; CHECK-NEXT: ret
				%x = call i64 @llvm.fptoui.sat.i64.f64(double %f)
				ret i64 %x
				}

				;define i100 @test_unsigned_i100_f64(double %f) {
				; %x = call i100 @llvm.fptoui.sat.i100.f64(double %f)
				; ret i100 %x
				;}
				;
				;define i128 @test_unsigned_i128_f64(double %f) {
				; %x = call i128 @llvm.fptoui.sat.i128.f64(double %f)
				; ret i128 %x
				;}

				;
				; 16-bit float to signed integer
				;

				declare i1 @llvm.fptosi.sat.i1.f16 (half)
				declare i8 @llvm.fptosi.sat.i8.f16 (half)
				declare i13 @llvm.fptosi.sat.i13.f16 (half)
				declare i16 @llvm.fptosi.sat.i16.f16 (half)
				declare i19 @llvm.fptosi.sat.i19.f16 (half)
				declare i32 @llvm.fptosi.sat.i32.f16 (half)
				declare i50 @llvm.fptosi.sat.i50.f16 (half)
				declare i64 @llvm.fptosi.sat.i64.f16 (half)
				declare i100 @llvm.fptosi.sat.i100.f16(half)
				declare i128 @llvm.fptosi.sat.i128.f16(half)

				define i1 @test_signed_i1_f16(half %f) {
				; CHECK-LABEL: test_signed_i1_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fmov s1, #-1.00000000
				; CHECK-NEXT: fcvtzs w8, s0
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csinv w8, w8, wzr, ge
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w8, wzr, w8, vs
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%x = call i1 @llvm.fptosi.sat.i1.f16(half %f)
				ret i1 %x
				}

				define i8 @test_signed_i8_f16(half %f) {
				; CHECK-LABEL: test_signed_i8_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI33_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI33_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI33_1
				; CHECK-NEXT: orr w8, wzr, #0xffffff80
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI33_1]
				; CHECK-NEXT: fcvtzs w9, s0
				; CHECK-NEXT: csel w8, w8, w9, lt
				; CHECK-NEXT: orr w9, wzr, #0x7f
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel w8, w9, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i8 @llvm.fptosi.sat.i8.f16(half %f)
				ret i8 %x
				}

				define i13 @test_signed_i13_f16(half %f) {
				; CHECK-LABEL: test_signed_i13_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI34_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI34_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI34_1
				; CHECK-NEXT: orr w8, wzr, #0xfffff000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI34_1]
				; CHECK-NEXT: fcvtzs w9, s0
				; CHECK-NEXT: csel w8, w8, w9, lt
				; CHECK-NEXT: orr w9, wzr, #0xfff
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel w8, w9, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i13 @llvm.fptosi.sat.i13.f16(half %f)
				ret i13 %x
				}

				define i16 @test_signed_i16_f16(half %f) {
				; CHECK-LABEL: test_signed_i16_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI35_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI35_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI35_1
				; CHECK-NEXT: orr w8, wzr, #0xffff8000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI35_1]
				; CHECK-NEXT: fcvtzs w9, s0
				; CHECK-NEXT: csel w8, w8, w9, lt
				; CHECK-NEXT: orr w9, wzr, #0x7fff
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel w8, w9, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i16 @llvm.fptosi.sat.i16.f16(half %f)
				ret i16 %x
				}

				define i19 @test_signed_i19_f16(half %f) {
				; CHECK-LABEL: test_signed_i19_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI36_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI36_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI36_1
				; CHECK-NEXT: orr w8, wzr, #0xfffc0000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI36_1]
				; CHECK-NEXT: fcvtzs w9, s0
				; CHECK-NEXT: csel w8, w8, w9, lt
				; CHECK-NEXT: orr w9, wzr, #0x3ffff
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel w8, w9, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i19 @llvm.fptosi.sat.i19.f16(half %f)
				ret i19 %x
				}

				define i32 @test_signed_i32_f16(half %f) {
				; CHECK-LABEL: test_signed_i32_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI37_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI37_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI37_1
				; CHECK-NEXT: orr w8, wzr, #0x80000000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI37_1]
				; CHECK-NEXT: fcvtzs w9, s0
				; CHECK-NEXT: csel w8, w8, w9, lt
				; CHECK-NEXT: orr w9, wzr, #0x7fffffff
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel w8, w9, w8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel w0, wzr, w8, vs
				; CHECK-NEXT: ret
				%x = call i32 @llvm.fptosi.sat.i32.f16(half %f)
				ret i32 %x
				}

				define i50 @test_signed_i50_f16(half %f) {
				; CHECK-LABEL: test_signed_i50_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI38_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI38_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI38_1
				; CHECK-NEXT: orr x8, xzr, #0xfffe000000000000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI38_1]
				; CHECK-NEXT: fcvtzs x9, s0
				; CHECK-NEXT: csel x8, x8, x9, lt
				; CHECK-NEXT: orr x9, xzr, #0x1ffffffffffff
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel x8, x9, x8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel x0, xzr, x8, vs
				; CHECK-NEXT: ret
				%x = call i50 @llvm.fptosi.sat.i50.f16(half %f)
				ret i50 %x
				}

				define i64 @test_signed_i64_f16(half %f) {
				; CHECK-LABEL: test_signed_i64_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI39_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI39_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: adrp x9, .LCPI39_1
				; CHECK-NEXT: orr x8, xzr, #0x8000000000000000
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: ldr s1, [x9, :lo12:.LCPI39_1]
				; CHECK-NEXT: fcvtzs x9, s0
				; CHECK-NEXT: csel x8, x8, x9, lt
				; CHECK-NEXT: orr x9, xzr, #0x7fffffffffffffff
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csel x8, x9, x8, gt
				; CHECK-NEXT: fcmp s0, s0
				; CHECK-NEXT: csel x0, xzr, x8, vs
				; CHECK-NEXT: ret
				%x = call i64 @llvm.fptosi.sat.i64.f16(half %f)
				ret i64 %x
				}

				;define i100 @test_signed_i100_f16(half %f) {
				; %x = call i100 @llvm.fptosi.sat.i100.f16(half %f)
				; ret i100 %x
				;}
				;
				;define i128 @test_signed_i128_f16(half %f) {
				; %x = call i128 @llvm.fptosi.sat.i128.f16(half %f)
				; ret i128 %x
				;}

				;
				; 16-bit float to unsigned integer
				;

				declare i1 @llvm.fptoui.sat.i1.f16 (half)
				declare i8 @llvm.fptoui.sat.i8.f16 (half)
				declare i13 @llvm.fptoui.sat.i13.f16 (half)
				declare i16 @llvm.fptoui.sat.i16.f16 (half)
				declare i19 @llvm.fptoui.sat.i19.f16 (half)
				declare i32 @llvm.fptoui.sat.i32.f16 (half)
				declare i50 @llvm.fptoui.sat.i50.f16 (half)
				declare i64 @llvm.fptoui.sat.i64.f16 (half)
				declare i100 @llvm.fptoui.sat.i100.f16(half)
				declare i128 @llvm.fptoui.sat.i128.f16(half)

				define i1 @test_unsigned_i1_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i1_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fmov s1, #1.00000000
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csinc w8, w8, wzr, le
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%x = call i1 @llvm.fptoui.sat.i1.f16(half %f)
				ret i1 %x
				}

				define i8 @test_unsigned_i8_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i8_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI41_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI41_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr w9, wzr, #0xff
				; CHECK-NEXT: csel w0, w9, w8, gt
				; CHECK-NEXT: ret
				%x = call i8 @llvm.fptoui.sat.i8.f16(half %f)
				ret i8 %x
				}

				define i13 @test_unsigned_i13_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i13_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI42_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI42_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr w9, wzr, #0x1fff
				; CHECK-NEXT: csel w0, w9, w8, gt
				; CHECK-NEXT: ret
				%x = call i13 @llvm.fptoui.sat.i13.f16(half %f)
				ret i13 %x
				}

				define i16 @test_unsigned_i16_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i16_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI43_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI43_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr w9, wzr, #0xffff
				; CHECK-NEXT: csel w0, w9, w8, gt
				; CHECK-NEXT: ret
				%x = call i16 @llvm.fptoui.sat.i16.f16(half %f)
				ret i16 %x
				}

				define i19 @test_unsigned_i19_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i19_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI44_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI44_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr w9, wzr, #0x7ffff
				; CHECK-NEXT: csel w0, w9, w8, gt
				; CHECK-NEXT: ret
				%x = call i19 @llvm.fptoui.sat.i19.f16(half %f)
				ret i19 %x
				}

				define i32 @test_unsigned_i32_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i32_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI45_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI45_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu w8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel w8, wzr, w8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csinv w0, w8, wzr, le
				; CHECK-NEXT: ret
				%x = call i32 @llvm.fptoui.sat.i32.f16(half %f)
				ret i32 %x
				}

				define i50 @test_unsigned_i50_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i50_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI46_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI46_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu x8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel x8, xzr, x8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: orr x9, xzr, #0x3ffffffffffff
				; CHECK-NEXT: csel x0, x9, x8, gt
				; CHECK-NEXT: ret
				%x = call i50 @llvm.fptoui.sat.i50.f16(half %f)
				ret i50 %x
				}

				define i64 @test_unsigned_i64_f16(half %f) {
				; CHECK-LABEL: test_unsigned_i64_f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adrp x8, .LCPI47_0
				; CHECK-NEXT: ldr s1, [x8, :lo12:.LCPI47_0]
				; CHECK-NEXT: fcvt s0, h0
				; CHECK-NEXT: fcvtzu x8, s0
				; CHECK-NEXT: fcmp s0, #0.0
				; CHECK-NEXT: csel x8, xzr, x8, lt
				; CHECK-NEXT: fcmp s0, s1
				; CHECK-NEXT: csinv x0, x8, xzr, le
				; CHECK-NEXT: ret
				%x = call i64 @llvm.fptoui.sat.i64.f16(half %f)
				ret i64 %x
				}

				;define i100 @test_unsigned_i100_f16(half %f) {
				; %x = call i100 @llvm.fptoui.sat.i100.f16(half %f)
				; ret i100 %x
				;}
				;
				;define i128 @test_unsigned_i128_f16(half %f) {
				; %x = call i128 @llvm.fptoui.sat.i128.f16(half %f)
				; ret i128 %x
				;}

Revision Contents

Diff 174769

docs/LangRef.rst

include/llvm/CodeGen/ISDOpcodes.h

include/llvm/CodeGen/TargetLowering.h

include/llvm/IR/Intrinsics.td

include/llvm/Target/TargetSelectionDAG.td

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

lib/CodeGen/SelectionDAG/LegalizeTypes.h

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

lib/CodeGen/SelectionDAG/SelectionDAG.cpp

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

lib/CodeGen/SelectionDAG/TargetLowering.cpp

lib/CodeGen/TargetLoweringBase.cpp

test/CodeGen/AArch64/fptoi-sat-scalar.ll
