This is an archive of the discontinued LLVM Phabricator instance.

Differential D63782

[FPEnv] Add fptosi and fptoui constrained intrinsics
Closed, Public

Authored by kpn on Jun 25 2019, 11:56 AM.

Details

Reviewers

andrew.w.kaylor
craig.topper
hfinkel
mehdi_amini
aemerson
javed.absar
kbarton

Commits

rGddf13c00edf1: [FPEnv] Add fptosi and fptoui constrained intrinsics.
rL370228: [FPEnv] Add fptosi and fptoui constrained intrinsics.

Summary

Constrained floating-point intrinsics are still needed for converting FP values to signed and unsigned integers.

Quoting from D32319: "The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed."

This diff replaces D43515. That ticket has too much history to be easily read.

Diff Detail

Repository: rL LLVM

Event Timeline

kpn created this revision. Jun 25 2019, 11:56 AM

Herald added a project: Restricted Project. Jun 25 2019, 11:56 AM

Herald added subscribers: llvm-commits, jsji, nemanjai.

kpn mentioned this in D43515: More math intrinsics for conservative math handling. Jun 25 2019, 11:57 AM

What happens if the input float is out of range? fptosi/fptoui instructions produce poison; not sure if you want that here.

In D63782#1558430, @efriedma wrote:

What happens if the input float is out of range? fptosi/fptoui instructions produce poison; not sure if you want that here.

I'm not well versed in poison. Are you worried about constant folding away the fptoXi? If a run-time conversion could raise a trap I expect constant folding to be skipped. Would that avoid poison?

The problem with poison is that it eventually leads to UB, and then your program has no defined meaning. Practically, it might mean some codepath that involves a call to llvm.experimental.constrained.fptosi.i32.f64 could get folded away because it's provably UB, or something like that.

In any case, we should explicitly state what happens, since it isn't specified by IEEE754.

In D63782#1558430, @efriedma wrote:

What happens if the input float is out of range? fptosi/fptoui instructions produce poison; not sure if you want that here.

Is this a general comment or referring to a change in Kevin's patch?

Unless I'm misunderstanding, we should be leaving the constrained converts alone until a hardware instruction is produced. In the ConvertToInteger cases, the hw instructions should flag Invalid and then the rounding mode will determine the result.

If you're asking what would happen if we found a poison value as the fptoXi's operand, that's a tougher question...

Is this a general comment or referring to a change in Kevin's patch?

It's a question about the semantics of the proposed llvm.experimental.constrained.fptosi/llvm.experimental.constrained.fptoui. Specifically, what does it do when the result is larger than the largest integer representable in the destination format? LangRef for fptoui/fptosi says "If the value cannot fit in ty2, the result is a poison value."
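To make the "cannot fit" condition concrete, here is a minimal C++ sketch of the range test implied by the LangRef wording quoted above. The helper name checkedFPToSI32 is hypothetical and is not part of LLVM; it only models when a double-to-i32 conversion has a representable result, and it says nothing about which exception flags a target would set.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, not an LLVM API. Returns true and stores the truncated
// value when it is representable in int32_t. Returns false in the case where
// LangRef says the plain fptosi instruction yields poison.
bool checkedFPToSI32(double Value, std::int32_t &Result) {
  // NaN and infinities have no integer equivalent.
  if (!std::isfinite(Value))
    return false;
  // Truncate toward zero first, then range-check the truncated value.
  // Both bounds are exactly representable as doubles.
  const double Truncated = std::trunc(Value);
  if (Truncated < -2147483648.0 || Truncated > 2147483647.0)
    return false;
  Result = static_cast<std::int32_t>(Truncated);
  return true;
}

// Small usage example covering both sides of the boundary.
void exampleUsage() {
  std::int32_t Out = 0;
  assert(checkedFPToSI32(-2.7, Out) && Out == -2);
  assert(checkedFPToSI32(2147483647.9, Out) && Out == 2147483647);
  assert(!checkedFPToSI32(2147483648.0, Out));
  assert(!checkedFPToSI32(std::numeric_limits<double>::quiet_NaN(), Out));
}
```

The check is done on the truncated value rather than the input, so 2147483647.9 still counts as representable while 2147483648.0 does not.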

Unless I'm misunderstanding, we should be leaving the constrained converts alone until a hardware instruction is produced.

We have to define the semantics; I mean, I guess we could say "If the value cannot fit in the destination type, the result is computed in a target-specific way", but we'd have to state it explicitly. And it's sort of awkward.

In D63782#1559923, @efriedma wrote:

Unless I'm misunderstanding, we should be leaving the constrained converts alone until a hardware instruction is produced.

We have to define the semantics; I mean, I guess we could say "If the value cannot fit in the destination type, the result is computed in a target-specific way", but we'd have to state it explicitly. And it's sort of awkward.

I think IEEE-754 does define this:

When a numeric operand would convert to an integer outside the range of the destination format, the invalid operation exception shall be signaled if this situation cannot otherwise be indicated.

There's some language that follows about IEEE specific converts and rounding mode too.

Let me be clear that this is in one of my blind spots wrt IEEE-754, so take my opinions lightly. And it probably goes without saying that I haven't thought through the edge cases...

I think IEEE-754 does define this

Your citation doesn't actually specify what value is returned, only that an exception is raised.

In D63782#1559923, @efriedma wrote:

Is this a general comment or referring to a change in Kevin's patch?

It's a question about the semantics of the proposed llvm.experimental.constrained.fptosi/llvm.experimental.constrained.fptoui. Specifically, what does it do when the result is larger than the largest integer representable in the destination format? LangRef for fptoui/fptosi says "If the value cannot fit in ty2, the result is a poison value."

Unless I'm misunderstanding, we should be leaving the constrained converts alone until a hardware instruction is produced.

We have to define the semantics; I mean, I guess we could say "If the value cannot fit in the destination type, the result is computed in a target-specific way", but we'd have to state it explicitly. And it's sort of awkward.

And we've generally avoided giving target-independent LLVM operations target-defined semantics, and I prefer that we continue to avoid doing that.

In D63782#1559964, @efriedma wrote:

I think IEEE-754 does define this

Your citation doesn't actually specify what value is returned, only that an exception is raised.

Ok, I agree with that:

The invalid operation exception is signaled if and only if there is no usefully definable result.

So pragmatically, an invalid exception is an alarm that the code is off track. As long as the exception is handled appropriately (default or an alternative), the result of the invalid operation shouldn't matter. Whatever LLVM wants to do with the value gets no arguments from me, since we've already self-destructed (unless the program handles the exception gracefully, but that wouldn't require a defined result from the invalid operation anyway).

Thinking about LangRef, aren't all the constrained intrinsics special considering the side-effects? Would it be acceptable for LangRef to read something like:

Semantics:

@llvm.experimental.constrained.fadd maps to IEEE-754's addition(x, y)

[Also, I'm noticing that the LangRef semantics for the existing constrained intrinsics are pretty sloppy. Those will probably need to be reworked.]

In D63782#1560123, @cameron.mcinally wrote:
In D63782#1559964, @efriedma wrote:

I think IEEE-754 does define this

Your citation doesn't actually specify what value is returned, only that an exception is raised.

Ok, I agree with that:
The invalid operation exception is signaled if and only if there is no usefully definable result.
So pragmatically, an invalid exception is an alarm that the code is off track. As long as the exception is handled appropriately (default or an alternative), the result of the invalid operation shouldn't matter. Whatever LLVM wants to do with the value gets no arguments from me, since we've already self-destructed (unless the program handles the exception gracefully, but that wouldn't require a defined result from the invalid operation anyway).

But the exception could be masked couldn't it?

Thinking about LangRef, aren't all the constrained intrinsics special considering the side-effects? Would it be acceptable for LangRef to read something like:
Semantics:

@llvm.experimental.constrained.fadd maps to IEEE-754's addition(x, y)
[Also, I'm noticing that the LangRef semantics for the existing constrained intrinsics are pretty sloppy. Those will probably need to be reworked.]

In D63782#1560136, @craig.topper wrote:

In D63782#1560123, @cameron.mcinally wrote:

So pragmatically, an invalid exception is an alarm that the code is off track. As long as the exception is handled appropriately (default or an alternative), the result of the invalid operation shouldn't matter. Whatever LLVM wants to do with the value gets no arguments from me, since we've already self-destructed (unless the program handles the exception gracefully, but that wouldn't require a defined result from the invalid operation anyway).

But the exception could be masked couldn't it?

Yeah, but I'm not sure if it matters. The program has already failed, so there's no guarantee that the results are useful.

Our typical user enables traps (inv, divz, and ovf) during development to convince themselves that the code is safe. Production runs are then done with traps disabled. But, of course, traps may be reenabled if a runtime problem is later found. They're basically a sanity check.

In D63782#1560156, @cameron.mcinally wrote:

In D63782#1560136, @craig.topper wrote:

In D63782#1560123, @cameron.mcinally wrote:

So pragmatically, an invalid exception is an alarm that the code is off track. As long as the exception is handled appropriately (default or an alternative), the result of the invalid operation shouldn't matter. Whatever LLVM wants to do with the value gets no arguments from me, since we've already self-destructed (unless the program handles the exception gracefully, but that wouldn't require a defined result from the invalid operation anyway).

But the exception could be masked couldn't it?

Yeah, but I'm not sure if it matters. The program has already failed, so there's no guarantee that the results are useful.

Our typical user enables traps (inv, divz, and ovf) during development to convince themselves that the code is safe. Production runs are then done with traps disabled. But, of course, traps may be reenabled if a runtime problem is later found. They're basically a sanity check.

We run with traps _on_, except we don't care about Inexact, and we do this in the products that ship to customers. Traps get handled in some way that is reasonable to a higher level of the software, for example by at least sometimes having a handler insert a value to replace the value that didn't get produced by the instruction that trapped. So I want a constrained fptoXi that is given valid inputs to never generate poison or get folded in a way that hides a trap.

For the constrained intrinsics, at least in the case where the exception semantics argument is "fpexcept.strict", we need to make sure the invalid exception is raised. If the exception semantics argument is "fpexcept.ignore" or "fpexcept.maytrap" we could allow the conversion to be optimized away.
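The rule in the comment above can be sketched as a tiny C++ model. The enum and both function names are hypothetical illustrations, not LLVM's own types; only the three metadata strings come from the discussion.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the exception-semantics argument of a constrained
// intrinsic. These names are illustrative and are not LLVM's own.
enum class ExceptionBehavior { Ignore, MayTrap, Strict };

// Maps the metadata string to the model enum. Returns false for any string
// other than the three mentioned in the comment above.
bool parseExceptionBehavior(const std::string &Text, ExceptionBehavior &Out) {
  if (Text == "fpexcept.ignore") {
    Out = ExceptionBehavior::Ignore;
    return true;
  }
  if (Text == "fpexcept.maytrap") {
    Out = ExceptionBehavior::MayTrap;
    return true;
  }
  if (Text == "fpexcept.strict") {
    Out = ExceptionBehavior::Strict;
    return true;
  }
  return false;
}

// Under the rule proposed in the comment, only fpexcept.strict obliges the
// compiler to keep a conversion so that the Invalid exception is raised.
bool mayOptimizeAwayConversion(ExceptionBehavior Behavior) {
  return Behavior != ExceptionBehavior::Strict;
}

void exampleUsage() {
  ExceptionBehavior Behavior = ExceptionBehavior::Ignore;
  assert(parseExceptionBehavior("fpexcept.strict", Behavior));
  assert(!mayOptimizeAwayConversion(Behavior));
}
```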

If we say the result of the constrained intrinsic is poison, will the fact that the intrinsic is defined as having side effects keep it from being eliminated?

If we say the result of the constrained intrinsic is poison, will the fact that the intrinsic is defined as having side effects keep it from being eliminated?

If a program uses a poison value in certain ways (defined in LangRef), the behavior of the program as a whole is undefined. So the optimizer can generate code that does anything in that case; most likely, it will erase the entire basic block containing the call. Side-effects like modifying the FP exception bits don't matter.

That said, if llvm.experimental.constrained.fptosi.i32.f64 is allowed to raise an unmasked exception, that would keep it from being eliminated: the return value would never be used.

I might be opening a can of worms here and I'm not a language expert, but it isn't clear to me from reading the C99 standard that defining fptosi/fptoui as returning poison values in the unrepresentable case allows correct implementation of the C standard. That is, it doesn't seem to me that the standard actually says this is undefined behavior. It just says the resulting value is unspecified, and the exception behavior is explicitly defined. On the other hand, C++ does say clearly that it is undefined behavior, right?

I understand that in the unconstrained case LLVM doesn't care about FP exceptions and that we would like the LLVM IR definition to be more precise than the C standard. I'm just trying to get my head wrapped around why we're doing what we are in that case and what we need to do to correctly implement strict FP semantics.

If we say that the constrained version returns undef in the unrepresentable case and clearly emphasize how this differs from the standard fptosi/fptoui instructions, would that have the effect of keeping the operation around so that it can raise the exception while still giving us well-defined IR semantics?

In D63782#1561585, @andrew.w.kaylor wrote:

I might be opening a can of worms here and I'm not a language expert, but it isn't clear to me from reading the C99 standard that defining fptosi/fptoui as returning poison values in the unrepresentable case allows correct implementation of the C standard. That is, it doesn't seem to me that the standard actually says this is undefined behavior. It just says the resulting value is unspecified, and the exception behavior is explicitly defined. On the other hand, C++ does say clearly that it is undefined behavior, right?

I understand that in the unconstrained case LLVM doesn't care about FP exceptions and that we would like the LLVM IR definition to be more precise than the C standard. I'm just trying to get my head wrapped around why we're doing what we are in that case and what we need to do to correctly implement strict FP semantics.

If we say that the constrained version returns undef in the unrepresentable case and clearly emphasize how this differs from the standard fptosi/fptoui instructions, would that have the effect of keeping the operation around so that it can raise the exception while still giving us well-defined IR semantics?

Ping? Can anyone address this above comment?

Herald added subscribers: wuzish, MaskRay. Jul 8 2019, 9:24 AM

In D63782#1573780, @kpn wrote:

In D63782#1561585, @andrew.w.kaylor wrote:

I might be opening a can of worms here and I'm not a language expert, but it isn't clear to me from reading the C99 standard that defining fptosi/fptoui as returning poison values in the unrepresentable case allows correct implementation of the C standard. That is, it doesn't seem to me that the standard actually says this is undefined behavior. It just says the resulting value is unspecified, and the exception behavior is explicitly defined. On the other hand, C++ does say clearly that it is undefined behavior, right?

It looks like it is undefined behavior in a recent(-ish) draft of the C Standard:

6.3.1.4 Real floating and integer

1 When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

And it also appears to be UB in a draft of the C++ Standard:

7.10 Floating-integral conversions

1 A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.

I understand that in the unconstrained case LLVM doesn't care about FP exceptions and that we would like the LLVM IR definition to be more precise than the C standard. I'm just trying to get my head wrapped around why we're doing what we are in that case and what we need to do to correctly implement strict FP semantics.

If we say that the constrained version returns undef in the unrepresentable case and clearly emphasize how this differs from the standard fptosi/fptoui instructions, would that have the effect of keeping the operation around so that it can raise the exception while still giving us well-defined IR semantics?

I'd be okay with a constrained fptoXi returning poison, as long as the operation isn't replaced by a poison value. In other words, what happens after the trap (and processing) is immaterial, so it's fine for the compiler to treat it as undefined behavior.

How aggressive is LLVM's UB handling? Would it remove an entire block/function if UB is found in it? @eli.friedman

How aggressive is LLVM's UB handling? Would it remove an entire block/function if UB is found in it?

If LLVM can prove a basic block unconditionally executes UB, it will be erased. But "unconditionally" is an important qualifier. For example, consider the following function: void f(void g()) { g(); *(int*)0 = 0; }. The call to g isn't erased because we can't prove g will return.
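As a hedged illustration of why "unconditionally" matters, the sketch below gives f a callee that never returns normally. The names neverReturns and callSurvives are made up for this example. Because the null store is never reached, this particular execution contains no undefined operation, which is why the compiler has to keep the call to g.

```cpp
#include <cassert>
#include <stdexcept>

// Same shape as the function in the comment above: the store through a null
// pointer is undefined behavior, but it is only reached if G returns.
void f(void (*G)()) {
  G();
  int *Null = nullptr;
  *Null = 0;
}

// Hypothetical callee that never returns normally.
void neverReturns() { throw std::runtime_error("g did not return"); }

// Returns true when the exception thrown by the callee propagates out of f,
// which means the null store was never executed.
bool callSurvives() {
  try {
    f(neverReturns);
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}

void exampleUsage() { assert(callSurvives()); }
```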

In D63782#1579595, @efriedma wrote:

How aggressive is LLVM's UB handling? Would it remove an entire block/function if UB is found in it?

If LLVM can prove a basic block unconditionally executes UB, it will be erased. But "unconditionally" is an important qualifier. For example, consider the following function: void f(void g()) { g(); *(int*)0 = 0; }. The call to g isn't erased because we can't prove g will return.

The constrained FP intrinsics should be opaque enough (besides 'ignore'), so that sounds fine to me.

Seems like we should let invalid fptoXi's return poison, like the unconstrained versions do. Anyone see a problem with this?

In D63782#1582778, @cameron.mcinally wrote:

In D63782#1579595, @efriedma wrote:

How aggressive is LLVM's UB handling? Would it remove an entire block/function if UB is found in it?

If LLVM can prove a basic block unconditionally executes UB, it will be erased. But "unconditionally" is an important qualifier. For example, consider the following function: void f(void g()) { g(); *(int*)0 = 0; }. The call to g isn't erased because we can't prove g will return.

The constrained FP intrinsics should be opaque enough (besides 'ignore'), so that sounds fine to me.

Seems like we should let invalid fptoXi's return poison, like the unconstrained versions do. Anyone see a problem with this?

Is there any chance I can get this patch into 9? What do I need to do to make that happen?

Is there any chance I can get this patch into 9? What do I need to do to make that happen?

Constant folding the constrained intrinsics is a long way off. IMO, it shouldn't hold up these two (they're experimental right now anyway).

We'll have to address all the operations that can signal invalid in the future too...

That's just my $0.02 though. I'm open to other opinions.

craig.topper added inline comments. Jul 17 2019, 2:15 PM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	Is this what we want from a strict implementation? "icc -fp-model=strict" goes out of its way to generate an invalid exception when the input type doesn't fit and the hardware can't generate an exception on its own. https://godbolt.org/z/ABU80i
lib/CodeGen/SelectionDAG/TargetLowering.cpp
4828 ↗	(On Diff #206495)	Shouldn't this be the caller's responsibility? It is for the data result.
4829 ↗	(On Diff #206495)	Put else on the above line.
4838 ↗	(On Diff #206495)	I think the use of "Strict" here is being overloaded. The "Strict" in shouldUseStrictFP_TO_INT isn't the same "strict" as the intrinsic. We may want the same behavior but we should clarify the terminology maybe.
4864 ↗	(On Diff #206495)	Same question as above
4865 ↗	(On Diff #206495)	Put else on this line
test/CodeGen/X86/fp-intrinsics.ll
293 ↗	(On Diff #206495)	I hope there are more instructions here. This is a signed conversion instruction. Or are we constant folding the setcc and select in the fp_to_uint expansion?
344 ↗	(On Diff #206495)	Where's the fptosi test case?

kpn marked 6 inline comments as done. Jul 18 2019, 10:54 AM

kpn added inline comments.

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	No, I don't think this is what we want. I'll remove that sentence.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
4828 ↗	(On Diff #206495)	That's a fair question. Splitting the responsibilities could be seen as confusing. But if you grep for "Legalize the chain result" you'll find 30+ places where we're already handling the chain the way we are here. So I think that we should be consistent here and if we want to change it everywhere then maybe another ticket would be better for that change.
4838 ↗	(On Diff #206495)	Ok. How about UseOnlyOneFP_TO_SINT?
test/CodeGen/X86/fp-intrinsics.ll
293 ↗	(On Diff #206495)	It looks like it's constant folding away the rest of the instructions since 42 fits within the lowest bits of a value. I can change it to take a variable and then check for the other interesting instructions.
344 ↗	(On Diff #206495)	I had to move the tests over to the PowerPC's e500 chip with the SPE feature. That's the in-tree target that still has scalar FP_TO_SINT marked Legal. Most everywhere else it is not marked Legal and currently fails with the existing mutation support.

craig.topper added a subscriber: RKSimon. Jul 18 2019, 11:10 AM

craig.topper added inline comments.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
4828 ↗	(On Diff #206495)	Most of those are in the type legalizer though, aren't they? I guess my concern here is that this is a helper function that can be called from multiple places. Doing part of the replacement inside makes the behavior of this function confusing.
4838 ↗	(On Diff #206495)	@RKSimon do you have an opinion here?
test/CodeGen/X86/fp-intrinsics.ll
344 ↗	(On Diff #206495)	Ok. Should we file bugs to implement support in the other targets?

kpn marked an inline comment as done and an inline comment as not done. Jul 18 2019, 11:37 AM

kpn added inline comments.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
4828 ↗	(On Diff #206495)	OK, that's fair. I'll change it.
test/CodeGen/X86/fp-intrinsics.ll
344 ↗	(On Diff #206495)	The problem is in TargetLoweringBase::getStrictFPOperationAction(). The target is queried to see how it handles the non-strict version of a strict node, and then any result that isn't Legal gets bashed into Expand. So Promote and Custom can't be handled. That's the problem that needs to be solved. Short term I'm not sure what to do. Maybe yank out the lines that bash Promote and Custom? That's not been tried and I don't know exactly what will happen. Long term the mutation compatibility code needs to die. Also, it was pointed out to me recently that the strict and non-strict nodes need to have the same action registered. Which is true. Do we want to put in checks to verify that this is the case at some point?
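The behavior described in this comment can be modeled in a few lines of C++. This is a toy stand-in written for illustration, assuming only what the comment states: the enum and the function name are hypothetical and are not the real TargetLoweringBase interface.

```cpp
#include <cassert>

// Toy stand-in for the legalizer's action kinds. Illustrative only.
enum class LegalizeAction { Legal, Promote, Expand, LibCall, Custom };

// Models the behavior described above: the action registered for the
// non-strict node is looked up, and anything other than Legal is turned into
// Expand, so Promote and Custom are never seen for the strict node.
LegalizeAction strictActionFromNonStrict(LegalizeAction NonStrictAction) {
  if (NonStrictAction != LegalizeAction::Legal)
    return LegalizeAction::Expand;
  return LegalizeAction::Legal;
}

void exampleUsage() {
  assert(strictActionFromNonStrict(LegalizeAction::Legal) ==
         LegalizeAction::Legal);
  assert(strictActionFromNonStrict(LegalizeAction::Custom) ==
         LegalizeAction::Expand);
}
```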

I've been thinking about the getStrictFPOperationAction issue. I believe this function makes no sense at all and really should go away. Instead, the strict opcodes should use their own operation actions just like everything else. In current code, those should now be set up in a reasonable manner: they all default to Expand, unless the target overrides them (to whatever makes sense).

If the operation action is anything but Expand, I believe it should simply be respected as-is (i.e. Legal stays until ISel, Custom calls the target hook, Promote -if ever used- should be implemented in common code in a way that preserves strictness, and LibCall should emit a library call - possibly a strict version if necessary).

Now, if the operation action of a strict FP operation is Expand, common code should try to implement an expansion rule if possible in a way that preserves strictness. If that is not possible, then and only then should we think about falling back to a non-strict implementation. That fallback (and only that fallback) now should look at the operation action of the non-strict equivalent. If that is anything but Legal, then the expansion logic should replace the strict node with a non-strict node and push that non-strict node back to the legalizer to handle it in whatever way is indicated by its operation action. If the operation action is Legal, then we can leave the strict node in and mutate it to the non-strict node only shortly before ISel as is done today.

As to when an expansion is possible in a way that preserves strictness, and when to fall back to the non-strict equivalent: For a vector op, I believe it always makes sense to expand a strict vector op by scalarizing it to strict component ops. (Except, possibly, if that scalar op would itself fall back to its non-strict equivalent -- then we might as well do the fallback on the vector type.) For scalar ops, for some it could be possible to expand them while respecting strictness, e.g. for some of the fp-to-sint cases above. That can be decided on a case-by-case basis. If at the end of ExpandNode we haven't found a way to expand a strict node, we can do the fallback then.

Does this make sense?
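The decision procedure proposed above can be sketched as a small C++ function. Everything here is a hypothetical model for discussion: the enums, the HasStrictExpansion flag, and the function name are invented for the sketch and do not correspond to existing LLVM interfaces.

```cpp
#include <cassert>

// Toy stand-in for the legalizer's action kinds. Illustrative only.
enum class LegalizeAction { Legal, Promote, Expand, LibCall, Custom };

// The four outcomes described in the proposal.
enum class StrictLowering {
  RespectStrictAction,        // Strict action is not Expand: honor it as-is.
  ExpandPreservingStrictness, // Common code has a strictness-preserving rule.
  RelegalizeAsNonStrict,      // Replace with the non-strict node and push it
                              // back to the legalizer.
  MutateToNonStrictBeforeISel // Keep the strict node and mutate it shortly
                              // before ISel.
};

// HasStrictExpansion is an assumed input meaning "common code knows how to
// expand this strict node without losing strictness".
StrictLowering chooseStrictLowering(LegalizeAction StrictAction,
                                    bool HasStrictExpansion,
                                    LegalizeAction NonStrictAction) {
  if (StrictAction != LegalizeAction::Expand)
    return StrictLowering::RespectStrictAction;
  if (HasStrictExpansion)
    return StrictLowering::ExpandPreservingStrictness;
  // Only the fallback path looks at the non-strict operation action.
  if (NonStrictAction != LegalizeAction::Legal)
    return StrictLowering::RelegalizeAsNonStrict;
  return StrictLowering::MutateToNonStrictBeforeISel;
}

void exampleUsage() {
  assert(chooseStrictLowering(LegalizeAction::Custom, false,
                              LegalizeAction::Expand) ==
         StrictLowering::RespectStrictAction);
  assert(chooseStrictLowering(LegalizeAction::Expand, false,
                              LegalizeAction::Legal) ==
         StrictLowering::MutateToNonStrictBeforeISel);
}
```

The ordering of the checks is the point of the sketch: the non-strict action is consulted last, and only when the strict node is marked Expand and no strict expansion exists.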

In D63782#1593727, @uweigand wrote:

I've been thinking about the getStrictFPOperationAction issue. I believe this function makes no sense at all and really should go away. Instead, the strict opcodes should use their own operation actions just like everything else. In current code, those should now be set up in a reasonable manner: they all default to Expand, unless the target overrides them (to whatever makes sense).

If the operation action is anything but Expand, I believe it should simply be respected as-is (i.e. Legal stays until ISel, Custom calls the target hook, Promote -if ever used- should be implemented in common code in a way that preserves strictness, and LibCall should emit a library call - possibly a strict version if necessary).

Now, if the operation action of a strict FP operation is Expand, common code should try to implement an expansion rule if possible in a way that preserves strictness. If that is not possible, then and only then should we think about falling back to a non-strict implementation. That fallback (and only that fallback) now should look at the operation action of the non-strict equivalent. If that is anything but Legal, then the expansion logic should replace the strict node with a non-strict node and push that non-strict node back to the legalizer to handle it in whatever way is indicated by its operation action. If the operation action is Legal, then we can leave the strict node in and mutate it to the non-strict node only shortly before ISel as is done today.

As to when an expansion is possible in a way that preserves strictness, and when to fall back to the non-strict equivalent: For a vector op, I believe it always makes sense to expand a strict vector op by scalarizing it to strict component ops. (Except, possibly, if that scalar op would itself fall back to its non-strict equivalent -- then we might as well do the fallback on the vector type.) For scalar ops, for some it could be possible to expand them while respecting strictness, e.g. for some of the fp-to-sint cases above. That can be decided on a case-by-case basis. If at the end of ExpandNode we haven't found a way to expand a strict node, we can do the fallback then.

Does this make sense?

Makes sense to me. As I recall, the mutation was always intended as a temporary solution.

Makes sense to me. As I recall, the mutation was always intended as a temporary solution.

I've now posted a patch implementing something along those lines here: https://reviews.llvm.org/D65226

The only difference is that if the strict operation is to fall back to the non-strict operation, and that non-strict operation is marked Custom, I'm now not attempting to call that Custom handler. This is mostly because I haven't found any case where this would actually be invoked today (and therefore couldn't test any code added to handle this case), and also because it doesn't really make much sense. Going forward, when this will show up (e.g. because of fptosi/fptoui), the target simply should mark the strict operation as Custom as well. (This will actually work correctly with D65226.)

Address review comments.

When I changed the fp-intrinsics.ll test (which is scalar) to operate on a variable instead of a constant I found an issue that required adding a fourth ReplaceNode*() method to LegalizeDAG.cpp.

pengfei added inline comments. Aug 1 2019, 1:55 AM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	I don't understand the behavior of icc here. The SDM says this instruction can raise a floating-point invalid exception when the result is out of range. Why does icc add code to raise another divz exception?

craig.topper added inline comments. Aug 1 2019, 8:27 AM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	The instruction raises an exception based on a 32-bit result size. But the C code has a 16-bit result size. We need to generate an exception if the value doesn't fit in 16 bits to match the C code. But the available instructions can't do that.
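A minimal C++ sketch of the emulation idea, under stated assumptions: the function name strictFPToSI16 is hypothetical, the value returned for an out-of-range input is an arbitrary placeholder, and std::feraiseexcept is used as a portable stand-in for whatever instruction sequence a compiler would actually emit.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdint>

// Hypothetical emulation of a strict double-to-int16 conversion on hardware
// whose narrowest conversion instruction only detects 32-bit overflow.
std::int16_t strictFPToSI16(double Value) {
  const double Truncated = std::trunc(Value);
  // The negated form also catches NaN, for which both comparisons are false.
  if (!(Truncated >= -32768.0 && Truncated <= 32767.0)) {
    // The hardware would not flag values such as 40000.0, which fit in 32
    // bits, so the Invalid flag is raised explicitly here.
    std::feraiseexcept(FE_INVALID);
    return 0; // Arbitrary placeholder; the thread leaves this value open.
  }
  return static_cast<std::int16_t>(Truncated);
}

void exampleUsage() {
  std::feclearexcept(FE_ALL_EXCEPT);
  assert(strictFPToSI16(100.9) == 100);
  assert(std::fetestexcept(FE_INVALID) == 0);
  assert(strictFPToSI16(40000.0) == 0);
  assert(std::fetestexcept(FE_INVALID) != 0);
}
```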

cameron.mcinally added inline comments. Aug 1 2019, 9:06 AM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	That seems like a hardware problem. Does LLVM have a policy on fixing up hardware deficiencies? It's not obvious that we should care about the hardware doing the wrong thing. @eli.friedman

Herald added a subscriber: jdoerfert. Aug 1 2019, 9:06 AM

efriedma added inline comments. Aug 1 2019, 11:37 AM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	In general, IR instructions have set semantics, and we generate whatever native code is necessary to make the generated code match the specified semantics. (This influences the way we specify instructions; for example, one of the reasons shl produces poison for large shift amounts is to allow it to be directly mapped to a hardware instruction on common targets.) In this case, I think it's pretty clear; if we're going to have a "strict" fp-to-int conversion that promises to raise an overflow exception if and only if the conversion overflows, that should work as documented regardless of what the underlying hardware supports. Every architecture has some fp-to-int conversions that need to be emulated, and the exact set can vary depending on the specific CPU target. We don't want to produce unpredictable exceptions for emulated conversions. If the performance penalty for this is too large, we could allow frontends to request a non-strict conversion that's allowed to generate arbitrary exceptions... but I'm not sure that would be useful in code that cares about exceptions.

cameron.mcinally added inline comments. Aug 1 2019, 11:59 AM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	Ok. Thanks for the clarification, Eli. That seems reasonable. I agree with @pengfei that it would be nice if we could generate an artificial overflow instead of a divz, but that's picking nits.

cameron.mcinally added inline comments. Aug 1 2019, 12:06 PM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	Oh, just caught that it's div(0,0), so produces an Invalid exception. Comment withdrawn...
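A hedged way to observe this from C++ is sketched below. It assumes a target with IEEE-754 exception flags (for example x86-64), and it uses volatile operands so the division is performed at run time instead of being folded by the compiler. The function name is made up for the example.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Returns the subset of {FE_INVALID, FE_DIVBYZERO} raised by computing
// 0.0 / 0.0 at run time. Assumes the target implements IEEE-754 flags.
int flagsRaisedByZeroOverZero() {
  volatile double Zero = 0.0; // volatile: forces a run-time division
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double Quotient = Zero / Zero;
  (void)Quotient;
  return std::fetestexcept(FE_INVALID | FE_DIVBYZERO);
}

void exampleUsage() {
  // On an IEEE-754 target, 0/0 signals Invalid and not Divide-by-Zero.
  assert(flagsRaisedByZeroOverZero() == FE_INVALID);
}
```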

pengfei added inline comments. Aug 1 2019, 6:34 PM

include/llvm/CodeGen/ISDOpcodes.h
307 ↗	(On Diff #206495)	Thanks @cameron.mcinally, I just learned that SIMD div(0,0) raises an invalid exception instead of a divz exception.

uweigand mentioned this in D65226: [Strict FP] Allow custom operation actions.Aug 5 2019, 10:08 AM

craig.topper added inline comments.Aug 5 2019, 11:40 AM

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
2931 ↗	(On Diff #211562)	I think the comment has a typo or mistake in it.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
4829 ↗	(On Diff #211562)	Remove the commented out code.
4867 ↗	(On Diff #211562)	Remove the commented out code.

Address review comments. Rebase.

Ping

craig.topper added inline comments.Aug 27 2019, 8:29 AM

lib/CodeGen/SelectionDAG/TargetLowering.cpp
5378 ↗	(On Diff #213641)	This isn't used below is it? The else block here returns true but this block continues. But the continuation code assigns Result without ever reading it.

kpn marked an inline comment as done.Aug 27 2019, 9:27 AM

kpn added inline comments.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
5378 ↗	(On Diff #213641)	The else is a call to getNode() for regular ISD::FP_TO_SINT. Then both legs fall through to the return true.

LGTM

lib/CodeGen/SelectionDAG/TargetLowering.cpp
5378 ↗	(On Diff #213641)	Oops. I think the lack of curly braces on the else and the comments in this area made me miss the indentation change.

This revision is now accepted and ready to land.Aug 27 2019, 9:53 AM

Closed by commit rL370228: [FPEnv] Add fptosi and fptoui constrained intrinsics. (authored by kpn).Aug 28 2019, 9:35 AM

This revision was automatically updated to reflect the committed changes.

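Revision Contents

Path	Size
llvm/trunk/docs/LangRef.rst	66 lines
llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h	7 lines
llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h	2 lines
llvm/trunk/include/llvm/CodeGen/TargetLowering.h	4 lines
llvm/trunk/include/llvm/IR/IntrinsicInst.h	2 lines
llvm/trunk/include/llvm/IR/Intrinsics.td	11 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	29 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp	20 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h	1 line
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp	12 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	31 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp	2 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	8 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	2 lines
llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp	48 lines
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp	2 lines
llvm/trunk/lib/IR/IntrinsicInst.cpp	2 lines
llvm/trunk/lib/IR/Verifier.cpp	29 lines
llvm/trunk/test/CodeGen/PowerPC/fp-intrinsics-fptosi-legal.ll	19 lines
llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll	36 lines
llvm/trunk/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll	882 lines
llvm/trunk/test/Feature/fp-intrinsics.ll	25 lines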

Diff 217675

llvm/trunk/docs/LangRef.rst


	Show First 20 Lines • Show All 15,274 Lines • ▼ Show 20 Lines

	Semantics:			Semantics:
	""""""""""			""""""""""

	The result produced is the product of the first two operands added to the third			The result produced is the product of the first two operands added to the third
	operand computed with infinite precision, and then rounded to the target			operand computed with infinite precision, and then rounded to the target
	precision.			precision.

				'``llvm.experimental.constrained.fptoui``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <ty2>
				@llvm.experimental.constrained.fptoui(<type> <value>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
				floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptoui``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is an unsigned integer converted from the floating
				point operand. The value is truncated, so it is rounded towards zero.

				'``llvm.experimental.constrained.fptosi``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <ty2>
				@llvm.experimental.constrained.fptosi(<type> <value>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.fptosi``' intrinsic converts
				:ref:`floating-point <t_floating>` ``value`` to type ``ty2``.

				Arguments:
				""""""""""

				The first argument to the '``llvm.experimental.constrained.fptosi``'
				intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
				<t_vector>` of floating point values.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				The result produced is a signed integer converted from the floating
				point operand. The value is truncated, so it is rounded towards zero.

	'``llvm.experimental.constrained.fptrunc``' Intrinsic			'``llvm.experimental.constrained.fptrunc``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	▲ Show 20 Lines • Show All 2,256 Lines • Show Last 20 Lines
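
The Semantics text added above says both conversions truncate, that is, round towards zero. C++ floating-to-integer casts have the same rule for in-range inputs, so a small sketch can illustrate it. The helper names `toSigned` and `toUnsigned` are hypothetical and only stand in for what the intrinsics compute; out-of-range inputs are undefined behaviour in C++ and are deliberately not exercised here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for fptosi / fptoui on in-range inputs.
// A C++ floating-to-integer cast discards the fractional part,
// which is rounding towards zero for both signs.
std::int32_t toSigned(double x) { return static_cast<std::int32_t>(x); }
std::uint32_t toUnsigned(double x) { return static_cast<std::uint32_t>(x); }
```

For example, `toSigned(-2.7)` yields `-2` rather than `-3`, and `toUnsigned(2.7)` yields `2`.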

llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h

Show First 20 Lines • Show All 296 Lines • ▼ Show 20 Lines	enum NodeType {
/// These will be lowered to the equivalent non-constrained pseudo-op		/// These will be lowered to the equivalent non-constrained pseudo-op
/// (or expanded to the equivalent library call) before final selection.		/// (or expanded to the equivalent library call) before final selection.
/// They are used to limit optimizations while the DAG is being optimized.		/// They are used to limit optimizations while the DAG is being optimized.
STRICT_FSQRT, STRICT_FPOW, STRICT_FPOWI, STRICT_FSIN, STRICT_FCOS,		STRICT_FSQRT, STRICT_FPOW, STRICT_FPOWI, STRICT_FSIN, STRICT_FCOS,
STRICT_FEXP, STRICT_FEXP2, STRICT_FLOG, STRICT_FLOG10, STRICT_FLOG2,		STRICT_FEXP, STRICT_FEXP2, STRICT_FLOG, STRICT_FLOG10, STRICT_FLOG2,
STRICT_FRINT, STRICT_FNEARBYINT, STRICT_FMAXNUM, STRICT_FMINNUM,		STRICT_FRINT, STRICT_FNEARBYINT, STRICT_FMAXNUM, STRICT_FMINNUM,
STRICT_FCEIL, STRICT_FFLOOR, STRICT_FROUND, STRICT_FTRUNC,		STRICT_FCEIL, STRICT_FFLOOR, STRICT_FROUND, STRICT_FTRUNC,

		/// STRICT_FP_TO_[US]INT - Convert a floating point value to a signed or
		/// unsigned integer. These have the same semantics as fptosi and fptoui
		/// in IR.
		/// They are used to limit optimizations while the DAG is being optimized.
		STRICT_FP_TO_SINT,
		STRICT_FP_TO_UINT,

/// X = STRICT_FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating		/// X = STRICT_FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating
/// point type down to the precision of the destination VT. TRUNC is a		/// point type down to the precision of the destination VT. TRUNC is a
/// flag, which is always an integer that is zero or one. If TRUNC is 0,		/// flag, which is always an integer that is zero or one. If TRUNC is 0,
/// this is a normal rounding, if it is 1, this FP_ROUND is known to not		/// this is a normal rounding, if it is 1, this FP_ROUND is known to not
/// change the value of Y.		/// change the value of Y.
///		///
/// The TRUNC = 1 case is used in cases where we know that the value will		/// The TRUNC = 1 case is used in cases where we know that the value will
/// not be modified by the node, because Y is not using any of the extra		/// not be modified by the node, because Y is not using any of the extra
▲ Show 20 Lines • Show All 779 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/CodeGen/SelectionDAGNodes.h

Show First 20 Lines • Show All 703 Lines • ▼ Show 20 Lines	switch (NodeType) {
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
return true;		return true;
}		}
}		}

/// Test if this node has a post-isel opcode, directly		/// Test if this node has a post-isel opcode, directly
/// corresponding to a MachineInstr opcode.		/// corresponding to a MachineInstr opcode.
▲ Show 20 Lines • Show All 1,944 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/CodeGen/TargetLowering.h

Show First 20 Lines • Show All 954 Lines • ▼ Show 20 Lines	switch (Op) {
case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;		case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;
case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;		case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;
case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break;		case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break;
case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break;		case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break;
case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;		case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;
case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;		case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;
case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;		case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;
case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;		case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;
		case ISD::STRICT_FP_TO_SINT: EqOpc = ISD::FP_TO_SINT; break;
		case ISD::STRICT_FP_TO_UINT: EqOpc = ISD::FP_TO_UINT; break;
case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;		case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;
case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;		case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;
}		}

return getOperationAction(EqOpc, VT);		return getOperationAction(EqOpc, VT);
}		}

/// Return true if the specified operation is legal on this target or can be		/// Return true if the specified operation is legal on this target or can be
▲ Show 20 Lines • Show All 3,023 Lines • ▼ Show 20 Lines
/// \param Result output after conversion		/// \param Result output after conversion
/// \returns True, if the expansion was successful, false otherwise		/// \returns True, if the expansion was successful, false otherwise
bool expandFP_TO_SINT(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;		bool expandFP_TO_SINT(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;

/// Expand float to UINT conversion		/// Expand float to UINT conversion
/// \param N Node to expand		/// \param N Node to expand
/// \param Result output after conversion		/// \param Result output after conversion
/// \returns True, if the expansion was successful, false otherwise		/// \returns True, if the expansion was successful, false otherwise
bool expandFP_TO_UINT(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;		bool expandFP_TO_UINT(SDNode *N, SDValue &Result, SDValue &Chain, SelectionDAG &DAG) const;

/// Expand UINT(i64) to double(f64) conversion		/// Expand UINT(i64) to double(f64) conversion
/// \param N Node to expand		/// \param N Node to expand
/// \param Result output after conversion		/// \param Result output after conversion
/// \returns True, if the expansion was successful, false otherwise		/// \returns True, if the expansion was successful, false otherwise
bool expandUINT_TO_FP(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;		bool expandUINT_TO_FP(SDNode *N, SDValue &Result, SelectionDAG &DAG) const;

/// Expand fminnum/fmaxnum into fminnum_ieee/fmaxnum_ieee with quieted inputs.		/// Expand fminnum/fmaxnum into fminnum_ieee/fmaxnum_ieee with quieted inputs.
▲ Show 20 Lines • Show All 188 Lines • Show Last 20 Lines
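
The two header hunks above follow one pattern: a predicate that recognises the strict opcodes, and a switch that maps each strict opcode to its non-strict equivalent so the target's existing operation action can be looked up. The sketch below reproduces that pattern on a toy enum. Everything in namespace `toy` is made up for illustration and is not LLVM's `ISD::NodeType`; only the shape of the two switches is taken from the diff.

```cpp
#include <cassert>

namespace toy {

// Toy opcode set; a small subset chosen for illustration only.
enum Opcode {
  FTRUNC,
  FP_TO_SINT,
  FP_TO_UINT,
  STRICT_FTRUNC,
  STRICT_FP_TO_SINT,
  STRICT_FP_TO_UINT
};

// Mirrors the shape of the isStrictFPOpcode() switch: list every strict opcode.
bool isStrictFPOpcode(Opcode Op) {
  switch (Op) {
  case STRICT_FTRUNC:
  case STRICT_FP_TO_SINT:
  case STRICT_FP_TO_UINT:
    return true;
  default:
    return false;
  }
}

// Mirrors the shape of the EqOpc switch: strict opcode -> non-strict opcode.
// Non-strict opcodes are returned unchanged.
Opcode nonStrictEquivalent(Opcode Op) {
  switch (Op) {
  case STRICT_FTRUNC:     return FTRUNC;
  case STRICT_FP_TO_SINT: return FP_TO_SINT;
  case STRICT_FP_TO_UINT: return FP_TO_UINT;
  default:                return Op;
  }
}

} // namespace toy
```

The practical consequence is that adding a new strict opcode means touching both switches, which is what this revision does for `STRICT_FP_TO_SINT` and `STRICT_FP_TO_UINT`.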

llvm/trunk/include/llvm/IR/IntrinsicInst.h

Show First 20 Lines • Show All 253 Lines • ▼ Show 20 Lines	public:
static bool classof(const IntrinsicInst *I) {		static bool classof(const IntrinsicInst *I) {
switch (I->getIntrinsicID()) {		switch (I->getIntrinsicID()) {
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
case Intrinsic::experimental_constrained_fptrunc:		case Intrinsic::experimental_constrained_fptrunc:
case Intrinsic::experimental_constrained_fpext:		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
▲ Show 20 Lines • Show All 617 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/IR/Intrinsics.td

Show First 20 Lines • Show All 616 Lines • ▼ Show 20 Lines	let IntrProperties = [IntrInaccessibleMemOnly, IntrWillReturn] in {

def int_experimental_constrained_fma : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fma : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptosi : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

		def int_experimental_constrained_fptoui : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;

def int_experimental_constrained_fptrunc : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fptrunc : Intrinsic<[ llvm_anyfloat_ty ],
[ llvm_anyfloat_ty,		[ llvm_anyfloat_ty,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

def int_experimental_constrained_fpext : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_fpext : Intrinsic<[ llvm_anyfloat_ty ],
[ llvm_anyfloat_ty,		[ llvm_anyfloat_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
▲ Show 20 Lines • Show All 74 Lines • ▼ Show 20 Lines	def int_experimental_constrained_round : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_trunc : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_trunc : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
}		}
// FIXME: Add intrinsics for fcmp, fptoui and fptosi.		// FIXME: Add intrinsic for fcmp.
		// FIXME: Consider maybe adding intrinsics for sitofp, uitofp.

//===------------------------- Expect Intrinsics --------------------------===//		//===------------------------- Expect Intrinsics --------------------------===//
//		//
def int_expect : Intrinsic<[llvm_anyint_ty],		def int_expect : Intrinsic<[llvm_anyint_ty],
[LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem, IntrWillReturn]>;		[LLVMMatchType<0>, LLVMMatchType<0>], [IntrNoMem, IntrWillReturn]>;

//===-------------------- Bit Manipulation Intrinsics ---------------------===//		//===-------------------- Bit Manipulation Intrinsics ---------------------===//
//		//
▲ Show 20 Lines • Show All 537 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Show First 20 Lines • Show All 230 Lines • ▼ Show 20 Lines	void ReplaceNode(SDNode *Old, const SDValue *New) {
for (unsigned i = 0, e = Old->getNumValues(); i != e; ++i) {		for (unsigned i = 0, e = Old->getNumValues(); i != e; ++i) {
LLVM_DEBUG(dbgs() << (i == 0 ? " with: " : " and: ");		LLVM_DEBUG(dbgs() << (i == 0 ? " with: " : " and: ");
New[i]->dump(&DAG));		New[i]->dump(&DAG));
if (UpdatedNodes)		if (UpdatedNodes)
UpdatedNodes->insert(New[i].getNode());		UpdatedNodes->insert(New[i].getNode());
}		}
ReplacedNode(Old);		ReplacedNode(Old);
}		}

		void ReplaceNodeWithValue(SDValue Old, SDValue New) {
		LLVM_DEBUG(dbgs() << " ... replacing: "; Old->dump(&DAG);
		dbgs() << " with: "; New->dump(&DAG));

		DAG.ReplaceAllUsesOfValueWith(Old, New);
		if (UpdatedNodes)
		UpdatedNodes->insert(New.getNode());
		ReplacedNode(Old.getNode());
		}
};		};

} // end anonymous namespace		} // end anonymous namespace

/// Return a vector shuffle operation which		/// Return a vector shuffle operation which
/// performs the same shuffle in terms of order or result bytes, but on a type		/// performs the same shuffle in terms of order or result bytes, but on a type
/// whose vector element type is narrower than the original shuffle type.		/// whose vector element type is narrower than the original shuffle type.
/// e.g. <v4i32> <0, 1, 0, 1> -> v8i16 <0, 1, 2, 3, 0, 1, 2, 3>		/// e.g. <v4i32> <0, 1, 0, 1> -> v8i16 <0, 1, 2, 3, 0, 1, 2, 3>
▲ Show 20 Lines • Show All 2,628 Lines • ▼ Show 20 Lines	case ISD::SINT_TO_FP:
Tmp1 = ExpandLegalINT_TO_FP(Node->getOpcode() == ISD::SINT_TO_FP,		Tmp1 = ExpandLegalINT_TO_FP(Node->getOpcode() == ISD::SINT_TO_FP,
Node->getOperand(0), Node->getValueType(0), dl);		Node->getOperand(0), Node->getValueType(0), dl);
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))		if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
		case ISD::STRICT_FP_TO_SINT:
		if (TLI.expandFP_TO_SINT(Node, Tmp1, DAG)) {
		ReplaceNode(Node, Tmp1.getNode());
		LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_FP_TO_SINT node\n");
		return true;
		}
		break;
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
if (TLI.expandFP_TO_UINT(Node, Tmp1, DAG))		if (TLI.expandFP_TO_UINT(Node, Tmp1, Tmp2, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
		case ISD::STRICT_FP_TO_UINT:
		if (TLI.expandFP_TO_UINT(Node, Tmp1, Tmp2, DAG)) {
		// Relink the chain.
		DAG.ReplaceAllUsesOfValueWith(SDValue(Node,1), Tmp2);
		// Replace the new UINT result.
		ReplaceNodeWithValue(SDValue(Node, 0), Tmp1);
		LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_FP_TO_UINT node\n");
		return true;
		}
		break;
case ISD::LROUND:		case ISD::LROUND:
Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LROUND_F32,		Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LROUND_F32,
RTLIB::LROUND_F64, RTLIB::LROUND_F80,		RTLIB::LROUND_F64, RTLIB::LROUND_F80,
RTLIB::LROUND_F128,		RTLIB::LROUND_F128,
RTLIB::LROUND_PPCF128));		RTLIB::LROUND_PPCF128));
break;		break;
case ISD::LLROUND:		case ISD::LLROUND:
Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32,		Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32,
▲ Show 20 Lines • Show All 1,695 Lines • Show Last 20 Lines
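
The `expandFP_TO_UINT` hook called above is needed on targets that only have a signed conversion. Its body is not part of this excerpt, so the following is only a sketch of one common way such an expansion can work, written as plain C++ on `double` and 64-bit integers. The helper name `fpToUintViaSint` is hypothetical, and the code assumes the input is in the range [0, 2^64).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM code: double -> uint64_t built only from a
// signed conversion. Assumes 0 <= x < 2^64.
std::uint64_t fpToUintViaSint(double x) {
  const double TwoTo63 = 9223372036854775808.0; // 2^63, exactly representable
  if (x < TwoTo63) {
    // Small values fit in the signed range, so convert directly.
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(x));
  }
  // Large values: subtract 2^63 (exact for inputs in [2^63, 2^64)), convert
  // the remainder as signed, then set the top bit again with an XOR.
  std::int64_t Low = static_cast<std::int64_t>(x - TwoTo63);
  return static_cast<std::uint64_t>(Low) ^ (std::uint64_t{1} << 63);
}
```

In the strict case the extra floating-point subtraction is itself an operation that has to stay ordered with respect to the FP environment, which is consistent with the diff giving `expandFP_TO_UINT` a `Chain` output that the caller relinks.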

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

Show First 20 Lines • Show All 106 Lines • ▼ Show 20 Lines	#endif
case ISD::SIGN_EXTEND_VECTOR_INREG:		case ISD::SIGN_EXTEND_VECTOR_INREG:
case ISD::ZERO_EXTEND_VECTOR_INREG:		case ISD::ZERO_EXTEND_VECTOR_INREG:
Res = PromoteIntRes_EXTEND_VECTOR_INREG(N); break;		Res = PromoteIntRes_EXTEND_VECTOR_INREG(N); break;

case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;		case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;

		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;		case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;

case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;		case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;

case ISD::FLT_ROUNDS_: Res = PromoteIntRes_FLT_ROUNDS(N); break;		case ISD::FLT_ROUNDS_: Res = PromoteIntRes_FLT_ROUNDS(N); break;

case ISD::AND:		case ISD::AND:
▲ Show 20 Lines • Show All 366 Lines • ▼ Show 20 Lines	SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT(SDNode *N) {
// not Legal, check to see if we can use FP_TO_SINT instead. (If both UINT		// not Legal, check to see if we can use FP_TO_SINT instead. (If both UINT
// and SINT conversions are Custom, there is no way to tell which is		// and SINT conversions are Custom, there is no way to tell which is
// preferable. We choose SINT because that's the right thing on PPC.)		// preferable. We choose SINT because that's the right thing on PPC.)
if (N->getOpcode() == ISD::FP_TO_UINT &&		if (N->getOpcode() == ISD::FP_TO_UINT &&
!TLI.isOperationLegal(ISD::FP_TO_UINT, NVT) &&		!TLI.isOperationLegal(ISD::FP_TO_UINT, NVT) &&
TLI.isOperationLegalOrCustom(ISD::FP_TO_SINT, NVT))		TLI.isOperationLegalOrCustom(ISD::FP_TO_SINT, NVT))
NewOpc = ISD::FP_TO_SINT;		NewOpc = ISD::FP_TO_SINT;

SDValue Res = DAG.getNode(NewOpc, dl, NVT, N->getOperand(0));		if (N->getOpcode() == ISD::STRICT_FP_TO_UINT &&
		!TLI.isOperationLegal(ISD::STRICT_FP_TO_UINT, NVT) &&
		TLI.isOperationLegalOrCustom(ISD::STRICT_FP_TO_SINT, NVT))
		NewOpc = ISD::STRICT_FP_TO_SINT;

		SDValue Res;
		if (N->isStrictFPOpcode()) {
		Res = DAG.getNode(NewOpc, dl, { NVT, MVT::Other },
		{ N->getOperand(0), N->getOperand(1) });
		// Legalize the chain result - switch anything that used the old chain to
		// use the new one.
		ReplaceValueWith(SDValue(N, 1), Res.getValue(1));
		} else
		Res = DAG.getNode(NewOpc, dl, NVT, N->getOperand(0));

// Assert that the converted value fits in the original type. If it doesn't		// Assert that the converted value fits in the original type. If it doesn't
// (eg: because the value being converted is too big), then the result of the		// (eg: because the value being converted is too big), then the result of the
// original operation was undefined anyway, so the assert is still correct.		// original operation was undefined anyway, so the assert is still correct.
//		//
// NOTE: fp-to-uint to fp-to-sint promotion guarantees zero extend. For example:		// NOTE: fp-to-uint to fp-to-sint promotion guarantees zero extend. For example:
// before legalization: fp-to-uint16, 65534. -> 0xfffe		// before legalization: fp-to-uint16, 65534. -> 0xfffe
// after legalization: fp-to-sint32, 65534. -> 0x0000fffe		// after legalization: fp-to-sint32, 65534. -> 0x0000fffe
return DAG.getNode(N->getOpcode() == ISD::FP_TO_UINT ?		return DAG.getNode((N->getOpcode() == ISD::FP_TO_UINT ||
		N->getOpcode() == ISD::STRICT_FP_TO_UINT) ?
ISD::AssertZext : ISD::AssertSext, dl, NVT, Res,		ISD::AssertZext : ISD::AssertSext, dl, NVT, Res,
DAG.getValueType(N->getValueType(0).getScalarType()));		DAG.getValueType(N->getValueType(0).getScalarType()));
}		}

SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {
EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDLoc dl(N);		SDLoc dl(N);

▲ Show 20 Lines • Show All 3,696 Lines • Show Last 20 Lines
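
The NOTE in `PromoteIntRes_FP_TO_XINT` above claims that replacing an fp-to-uint16 with an fp-to-sint32 still yields a zero-extended result, using 65534.0 as the example. The sketch below checks that arithmetic in plain C++. The helper names are hypothetical, and the code models only the value computation, not the DAG nodes or the `AssertZext` node itself.

```cpp
#include <cassert>
#include <cstdint>

// Model of the promoted node: convert to the wider signed type.
std::int32_t promotedConvert(double x) { return static_cast<std::int32_t>(x); }

// The property AssertZext records for an original i16 result:
// every bit above the low 16 is zero.
bool upperBitsAreZero(std::int32_t v) {
  return (static_cast<std::uint32_t>(v) >> 16) == 0;
}
```

For 65534.0 the promoted conversion gives `0x0000fffe`, whose low 16 bits are the `0xfffe` the original fp-to-uint16 would have produced and whose upper bits are zero.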

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h

Show First 20 Lines • Show All 709 Lines • ▼ Show 20 Lines	private:
SDValue ScalarizeVecRes_VECTOR_SHUFFLE(SDNode *N);		SDValue ScalarizeVecRes_VECTOR_SHUFFLE(SDNode *N);

SDValue ScalarizeVecRes_MULFIX(SDNode *N);		SDValue ScalarizeVecRes_MULFIX(SDNode *N);

// Vector Operand Scalarization: <1 x ty> -> ty.		// Vector Operand Scalarization: <1 x ty> -> ty.
bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);		bool ScalarizeVectorOperand(SDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_BITCAST(SDNode *N);		SDValue ScalarizeVecOp_BITCAST(SDNode *N);
SDValue ScalarizeVecOp_UnaryOp(SDNode *N);		SDValue ScalarizeVecOp_UnaryOp(SDNode *N);
		SDValue ScalarizeVecOp_UnaryOp_StrictFP(SDNode *N);
SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);		SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);
SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue ScalarizeVecOp_VSELECT(SDNode *N);		SDValue ScalarizeVecOp_VSELECT(SDNode *N);
SDValue ScalarizeVecOp_VSETCC(SDNode *N);		SDValue ScalarizeVecOp_VSETCC(SDNode *N);
SDValue ScalarizeVecOp_STORE(StoreSDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_STORE(StoreSDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_STRICT_FP_ROUND(SDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_STRICT_FP_ROUND(SDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_VECREDUCE(SDNode *N);		SDValue ScalarizeVecOp_VECREDUCE(SDNode *N);
▲ Show 20 Lines • Show All 257 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

Show First 20 Lines • Show All 327 Lines • ▼ Show 20 Lines	SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
// If we're asked to expand a strict vector floating-point operation,		// If we're asked to expand a strict vector floating-point operation,
// by default we're going to simply unroll it. That is usually the		// by default we're going to simply unroll it. That is usually the
// best approach, except in the case where the resulting strict (scalar)		// best approach, except in the case where the resulting strict (scalar)
// operations would themselves use the fallback mutation to non-strict.		// operations would themselves use the fallback mutation to non-strict.
// In that specific case, just do the fallback on the vector op.		// In that specific case, just do the fallback on the vector op.
▲ Show 20 Lines • Show All 513 Lines • ▼ Show 20 Lines	SDValue VectorLegalizer::Expand(SDValue Op) {
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
return ExpandStrictFPOp(Op);		return ExpandStrictFPOp(Op);
case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
case ISD::VECREDUCE_SMAX:		case ISD::VECREDUCE_SMAX:
case ISD::VECREDUCE_SMIN:		case ISD::VECREDUCE_SMIN:
▲ Show 20 Lines • Show All 308 Lines • ▼ Show 20 Lines	if (TLI.expandABS(Op.getNode(), Result, DAG))
return Result;		return Result;

// Otherwise go ahead and unroll.		// Otherwise go ahead and unroll.
return DAG.UnrollVectorOp(Op.getNode());		return DAG.UnrollVectorOp(Op.getNode());
}		}

SDValue VectorLegalizer::ExpandFP_TO_UINT(SDValue Op) {		SDValue VectorLegalizer::ExpandFP_TO_UINT(SDValue Op) {
// Attempt to expand using TargetLowering.		// Attempt to expand using TargetLowering.
SDValue Result;		SDValue Result, Chain;
if (TLI.expandFP_TO_UINT(Op.getNode(), Result, DAG))		if (TLI.expandFP_TO_UINT(Op.getNode(), Result, Chain, DAG)) {
		if (Op.getNode()->isStrictFPOpcode())
		// Relink the chain
		DAG.ReplaceAllUsesOfValueWith(Op.getValue(1), Chain);
return Result;		return Result;
		}

// Otherwise go ahead and unroll.		// Otherwise go ahead and unroll.
return DAG.UnrollVectorOp(Op.getNode());		return DAG.UnrollVectorOp(Op.getNode());
}		}

SDValue VectorLegalizer::ExpandUINT_TO_FLOAT(SDValue Op) {		SDValue VectorLegalizer::ExpandUINT_TO_FLOAT(SDValue Op) {
EVT VT = Op.getOperand(0).getValueType();		EVT VT = Op.getOperand(0).getValueType();
SDLoc DL(Op);		SDLoc DL(Op);
▲ Show 20 Lines • Show All 239 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Show First 20 Lines • Show All 165 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
R = ScalarizeVecRes_StrictFPOp(N);		R = ScalarizeVecRes_StrictFPOp(N);
break;		break;
case ISD::UADDO:		case ISD::UADDO:
case ISD::SADDO:		case ISD::SADDO:
case ISD::USUBO:		case ISD::USUBO:
case ISD::SSUBO:		case ISD::SSUBO:
case ISD::UMULO:		case ISD::UMULO:
▲ Show 20 Lines • Show All 417 Lines • ▼ Show 20 Lines	#endif
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
Res = ScalarizeVecOp_UnaryOp(N);		Res = ScalarizeVecOp_UnaryOp(N);
break;		break;
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
		Res = ScalarizeVecOp_UnaryOp_StrictFP(N);
		break;
case ISD::CONCAT_VECTORS:		case ISD::CONCAT_VECTORS:
Res = ScalarizeVecOp_CONCAT_VECTORS(N);		Res = ScalarizeVecOp_CONCAT_VECTORS(N);
break;		break;
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
Res = ScalarizeVecOp_EXTRACT_VECTOR_ELT(N);		Res = ScalarizeVecOp_EXTRACT_VECTOR_ELT(N);
break;		break;
case ISD::VSELECT:		case ISD::VSELECT:
Res = ScalarizeVecOp_VSELECT(N);		Res = ScalarizeVecOp_VSELECT(N);
▲ Show 20 Lines • Show All 59 Lines • ▼ Show 20 Lines	SDValue DAGTypeLegalizer::ScalarizeVecOp_UnaryOp(SDNode *N) {
SDValue Elt = GetScalarizedVector(N->getOperand(0));		SDValue Elt = GetScalarizedVector(N->getOperand(0));
SDValue Op = DAG.getNode(N->getOpcode(), SDLoc(N),		SDValue Op = DAG.getNode(N->getOpcode(), SDLoc(N),
N->getValueType(0).getScalarType(), Elt);		N->getValueType(0).getScalarType(), Elt);
// Revectorize the result so the types line up with what the uses of this		// Revectorize the result so the types line up with what the uses of this
// expression expect.		// expression expect.
return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Op);		return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Op);
}		}

		/// If the input is a vector that needs to be scalarized, it must be <1 x ty>.
		/// Do the strict FP operation on the element instead.
		SDValue DAGTypeLegalizer::ScalarizeVecOp_UnaryOp_StrictFP(SDNode *N) {
		assert(N->getValueType(0).getVectorNumElements() == 1 &&
		"Unexpected vector type!");
		SDValue Elt = GetScalarizedVector(N->getOperand(1));
		SDValue Res = DAG.getNode(N->getOpcode(), SDLoc(N),
		{ N->getValueType(0).getScalarType(), MVT::Other },
		{ N->getOperand(0), Elt });
		// Legalize the chain result - switch anything that used the old chain to
		// use the new one.
		ReplaceValueWith(SDValue(N, 1), Res.getValue(1));
		// Revectorize the result so the types line up with what the uses of this
		// expression expect.
		return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Res);
		}

/// The vectors to concatenate have length one - use a BUILD_VECTOR instead.		/// The vectors to concatenate have length one - use a BUILD_VECTOR instead.
SDValue DAGTypeLegalizer::ScalarizeVecOp_CONCAT_VECTORS(SDNode *N) {		SDValue DAGTypeLegalizer::ScalarizeVecOp_CONCAT_VECTORS(SDNode *N) {
SmallVector<SDValue, 8> Ops(N->getNumOperands());		SmallVector<SDValue, 8> Ops(N->getNumOperands());
for (unsigned i = 0, e = N->getNumOperands(); i < e; ++i)		for (unsigned i = 0, e = N->getNumOperands(); i < e; ++i)
Ops[i] = GetScalarizedVector(N->getOperand(i));		Ops[i] = GetScalarizedVector(N->getOperand(i));
return DAG.getBuildVector(N->getValueType(0), SDLoc(N), Ops);		return DAG.getBuildVector(N->getValueType(0), SDLoc(N), Ops);
}		}

▲ Show 20 Lines • Show All 188 Lines • ▼ Show 20 Lines	#endif
case ISD::FLOG2:		case ISD::FLOG2:
case ISD::FNEARBYINT:		case ISD::FNEARBYINT:
case ISD::FNEG:		case ISD::FNEG:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::FP_ROUND:		case ISD::FP_ROUND:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
		case ISD::STRICT_FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::FRINT:		case ISD::FRINT:
case ISD::FROUND:		case ISD::FROUND:
case ISD::FSIN:		case ISD::FSIN:
case ISD::FSQRT:		case ISD::FSQRT:
case ISD::FTRUNC:		case ISD::FTRUNC:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
▲ Show 20 Lines • Show All 1,087 Lines • ▼ Show 20 Lines	#endif
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
if (N->getValueType(0).bitsLT(N->getOperand(0).getValueType()))		if (N->getValueType(0).bitsLT(N->getOperand(0).getValueType()))
Res = SplitVecOp_TruncateHelper(N);		Res = SplitVecOp_TruncateHelper(N);
else		else
Res = SplitVecOp_UnaryOp(N);		Res = SplitVecOp_UnaryOp(N);
break;		break;
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::CTTZ:		case ISD::CTTZ:
case ISD::CTLZ:		case ISD::CTLZ:
case ISD::CTPOP:		case ISD::CTPOP:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND:		case ISD::ANY_EXTEND:
▲ Show 20 Lines • Show All 811 Lines • ▼ Show 20 Lines	#endif
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
Res = WidenVecRes_Convert(N);		Res = WidenVecRes_Convert(N);
break;		break;

case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_FP_TO_SINT:
		case ISD::STRICT_FP_TO_UINT:
Res = WidenVecRes_Convert_StrictFP(N);		Res = WidenVecRes_Convert_StrictFP(N);
break;		break;

case ISD::FABS:		case ISD::FABS:
case ISD::FCEIL:		case ISD::FCEIL:
case ISD::FCOS:		case ISD::FCOS:
case ISD::FEXP:		case ISD::FEXP:
case ISD::FEXP2:		case ISD::FEXP2:
▲ Show 20 Lines • Show All 1,299 Lines • ▼ Show 20 Lines	#endif
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
Res = WidenVecOp_EXTEND(N);		Res = WidenVecOp_EXTEND(N);
break;		break;

case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
		case ISD::STRICT_FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
		case ISD::STRICT_FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
Res = WidenVecOp_Convert(N);		Res = WidenVecOp_Convert(N);
break;		break;

case ISD::VECREDUCE_FADD:		case ISD::VECREDUCE_FADD:
case ISD::VECREDUCE_FMUL:		case ISD::VECREDUCE_FMUL:
▲ Show 20 Lines • Show All 944 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show First 20 Lines • Show All 7,772 Lines • ▼ Show 20 Lines	SDNode* SelectionDAG::mutateStrictFPToFP(SDNode *Node) {
case ISD::STRICT_FMAXNUM: NewOpc = ISD::FMAXNUM; break;		case ISD::STRICT_FMAXNUM: NewOpc = ISD::FMAXNUM; break;
case ISD::STRICT_FMINNUM: NewOpc = ISD::FMINNUM; break;		case ISD::STRICT_FMINNUM: NewOpc = ISD::FMINNUM; break;
case ISD::STRICT_FCEIL: NewOpc = ISD::FCEIL; break;		case ISD::STRICT_FCEIL: NewOpc = ISD::FCEIL; break;
case ISD::STRICT_FFLOOR: NewOpc = ISD::FFLOOR; break;		case ISD::STRICT_FFLOOR: NewOpc = ISD::FFLOOR; break;
case ISD::STRICT_FROUND: NewOpc = ISD::FROUND; break;		case ISD::STRICT_FROUND: NewOpc = ISD::FROUND; break;
case ISD::STRICT_FTRUNC: NewOpc = ISD::FTRUNC; break;		case ISD::STRICT_FTRUNC: NewOpc = ISD::FTRUNC; break;
case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; break;		case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; break;
case ISD::STRICT_FP_EXTEND: NewOpc = ISD::FP_EXTEND; break;		case ISD::STRICT_FP_EXTEND: NewOpc = ISD::FP_EXTEND; break;
		case ISD::STRICT_FP_TO_SINT: NewOpc = ISD::FP_TO_SINT; break;
		case ISD::STRICT_FP_TO_UINT: NewOpc = ISD::FP_TO_UINT; break;
}		}

assert(Node->getNumValues() == 2 && "Unexpected number of results!");		assert(Node->getNumValues() == 2 && "Unexpected number of results!");

// We're taking this node out of the chain, so we need to re-link things.		// We're taking this node out of the chain, so we need to re-link things.
SDValue InputChain = Node->getOperand(0);		SDValue InputChain = Node->getOperand(0);
SDValue OutputChain = SDValue(Node, 1);		SDValue OutputChain = SDValue(Node, 1);
ReplaceAllUsesOfValueWith(OutputChain, InputChain);		ReplaceAllUsesOfValueWith(OutputChain, InputChain);
▲ Show 20 Lines • Show All 1,847 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 6,100 Lines • ▼ Show 20 Lines	setValue(&I, DAG.getNode(ISD::FMA, sdl,
getValue(I.getArgOperand(2))));		getValue(I.getArgOperand(2))));
return;		return;
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
case Intrinsic::experimental_constrained_fptrunc:		case Intrinsic::experimental_constrained_fptrunc:
case Intrinsic::experimental_constrained_fpext:		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
▲ Show 20 Lines • Show All 777 Lines • ▼ Show 20 Lines	case Intrinsic::experimental_constrained_fdiv:
Opcode = ISD::STRICT_FDIV;		Opcode = ISD::STRICT_FDIV;
break;		break;
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
Opcode = ISD::STRICT_FREM;		Opcode = ISD::STRICT_FREM;
break;		break;
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
Opcode = ISD::STRICT_FMA;		Opcode = ISD::STRICT_FMA;
break;		break;
		case Intrinsic::experimental_constrained_fptosi:
		Opcode = ISD::STRICT_FP_TO_SINT;
		break;
		case Intrinsic::experimental_constrained_fptoui:
		Opcode = ISD::STRICT_FP_TO_UINT;
		break;
case Intrinsic::experimental_constrained_fptrunc:		case Intrinsic::experimental_constrained_fptrunc:
Opcode = ISD::STRICT_FP_ROUND;		Opcode = ISD::STRICT_FP_ROUND;
break;		break;
case Intrinsic::experimental_constrained_fpext:		case Intrinsic::experimental_constrained_fpext:
Opcode = ISD::STRICT_FP_EXTEND;		Opcode = ISD::STRICT_FP_EXTEND;
break;		break;
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
Opcode = ISD::STRICT_FSQRT;		Opcode = ISD::STRICT_FSQRT;
▲ Show 20 Lines • Show All 3,627 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 319 Lines • ▼ Show 20 Lines	#endif
case ISD::FLT_ROUNDS_: return "flt_rounds";		case ISD::FLT_ROUNDS_: return "flt_rounds";
case ISD::FP_ROUND_INREG: return "fp_round_inreg";		case ISD::FP_ROUND_INREG: return "fp_round_inreg";
case ISD::FP_EXTEND: return "fp_extend";		case ISD::FP_EXTEND: return "fp_extend";
case ISD::STRICT_FP_EXTEND: return "strict_fp_extend";		case ISD::STRICT_FP_EXTEND: return "strict_fp_extend";

case ISD::SINT_TO_FP: return "sint_to_fp";		case ISD::SINT_TO_FP: return "sint_to_fp";
case ISD::UINT_TO_FP: return "uint_to_fp";		case ISD::UINT_TO_FP: return "uint_to_fp";
case ISD::FP_TO_SINT: return "fp_to_sint";		case ISD::FP_TO_SINT: return "fp_to_sint";
		case ISD::STRICT_FP_TO_SINT: return "strict_fp_to_sint";
case ISD::FP_TO_UINT: return "fp_to_uint";		case ISD::FP_TO_UINT: return "fp_to_uint";
		case ISD::STRICT_FP_TO_UINT: return "strict_fp_to_uint";
case ISD::BITCAST: return "bitcast";		case ISD::BITCAST: return "bitcast";
case ISD::ADDRSPACECAST: return "addrspacecast";		case ISD::ADDRSPACECAST: return "addrspacecast";
case ISD::FP16_TO_FP: return "fp16_to_fp";		case ISD::FP16_TO_FP: return "fp16_to_fp";
case ISD::FP_TO_FP16: return "fp_to_fp16";		case ISD::FP_TO_FP16: return "fp_to_fp16";
case ISD::LROUND: return "lround";		case ISD::LROUND: return "lround";
case ISD::LLROUND: return "llround";		case ISD::LLROUND: return "llround";
case ISD::LRINT: return "lrint";		case ISD::LRINT: return "lrint";
case ISD::LLRINT: return "llrint";		case ISD::LLRINT: return "llrint";
▲ Show 20 Lines • Show All 620 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp

Show First 20 Lines • Show All 5,591 Lines • ▼ Show 20 Lines	bool TargetLowering::expandROT(SDNode *Node, SDValue &Result,
SDValue And1 = DAG.getNode(ISD::AND, DL, ShVT, NegOp1, BitWidthMinusOneC);		SDValue And1 = DAG.getNode(ISD::AND, DL, ShVT, NegOp1, BitWidthMinusOneC);
Result = DAG.getNode(ISD::OR, DL, VT, DAG.getNode(ShOpc, DL, VT, Op0, And0),		Result = DAG.getNode(ISD::OR, DL, VT, DAG.getNode(ShOpc, DL, VT, Op0, And0),
DAG.getNode(HsOpc, DL, VT, Op0, And1));		DAG.getNode(HsOpc, DL, VT, Op0, And1));
return true;		return true;
}		}

bool TargetLowering::expandFP_TO_SINT(SDNode *Node, SDValue &Result,		bool TargetLowering::expandFP_TO_SINT(SDNode *Node, SDValue &Result,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDValue Src = Node->getOperand(0);		unsigned OpNo = Node->isStrictFPOpcode() ? 1 : 0;
		SDValue Src = Node->getOperand(OpNo);
EVT SrcVT = Src.getValueType();		EVT SrcVT = Src.getValueType();
EVT DstVT = Node->getValueType(0);		EVT DstVT = Node->getValueType(0);
SDLoc dl(SDValue(Node, 0));		SDLoc dl(SDValue(Node, 0));

// FIXME: Only f32 to i64 conversions are supported.		// FIXME: Only f32 to i64 conversions are supported.
if (SrcVT != MVT::f32 || DstVT != MVT::i64)		if (SrcVT != MVT::f32 || DstVT != MVT::i64)
return false;		return false;

		if (Node->isStrictFPOpcode())
		// When a NaN is converted to an integer a trap is allowed. We can't
		// use this expansion here because it would eliminate that trap. Other
		// traps are also allowed and cannot be eliminated. See
		// IEEE 754-2008 sec 5.8.
		return false;

// Expand f32 -> i64 conversion		// Expand f32 -> i64 conversion
// This algorithm comes from compiler-rt's implementation of fixsfdi:		// This algorithm comes from compiler-rt's implementation of fixsfdi:
// https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/builtins/fixsfdi.c		// https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/builtins/fixsfdi.c
unsigned SrcEltBits = SrcVT.getScalarSizeInBits();		unsigned SrcEltBits = SrcVT.getScalarSizeInBits();
EVT IntVT = SrcVT.changeTypeToInteger();		EVT IntVT = SrcVT.changeTypeToInteger();
EVT IntShVT = getShiftAmountTy(IntVT, DAG.getDataLayout());		EVT IntShVT = getShiftAmountTy(IntVT, DAG.getDataLayout());

SDValue ExponentMask = DAG.getConstant(0x7F800000, dl, IntVT);		SDValue ExponentMask = DAG.getConstant(0x7F800000, dl, IntVT);
Show All 37 Lines	SDValue Ret = DAG.getNode(ISD::SUB, dl, DstVT,
DAG.getNode(ISD::XOR, dl, DstVT, R, Sign), Sign);		DAG.getNode(ISD::XOR, dl, DstVT, R, Sign), Sign);

Result = DAG.getSelectCC(dl, Exponent, DAG.getConstant(0, dl, IntVT),		Result = DAG.getSelectCC(dl, Exponent, DAG.getConstant(0, dl, IntVT),
DAG.getConstant(0, dl, DstVT), Ret, ISD::SETLT);		DAG.getConstant(0, dl, DstVT), Ret, ISD::SETLT);
return true;		return true;
}		}

bool TargetLowering::expandFP_TO_UINT(SDNode *Node, SDValue &Result,		bool TargetLowering::expandFP_TO_UINT(SDNode *Node, SDValue &Result,
		SDValue &Chain,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDLoc dl(SDValue(Node, 0));		SDLoc dl(SDValue(Node, 0));
SDValue Src = Node->getOperand(0);		unsigned OpNo = Node->isStrictFPOpcode() ? 1 : 0;
		SDValue Src = Node->getOperand(OpNo);

EVT SrcVT = Src.getValueType();		EVT SrcVT = Src.getValueType();
EVT DstVT = Node->getValueType(0);		EVT DstVT = Node->getValueType(0);
EVT SetCCVT =		EVT SetCCVT =
getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), SrcVT);		getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), SrcVT);

// Only expand vector types if we have the appropriate vector bit operations.		// Only expand vector types if we have the appropriate vector bit operations.
if (DstVT.isVector() && (!isOperationLegalOrCustom(ISD::FP_TO_SINT, DstVT) ||		unsigned SIntOpcode = Node->isStrictFPOpcode() ? ISD::STRICT_FP_TO_SINT :
		ISD::FP_TO_SINT;
		if (DstVT.isVector() && (!isOperationLegalOrCustom(SIntOpcode, DstVT) ||
!isOperationLegalOrCustomOrPromote(ISD::XOR, SrcVT)))		!isOperationLegalOrCustomOrPromote(ISD::XOR, SrcVT)))
return false;		return false;

// If the maximum float value is smaller then the signed integer range,		// If the maximum float value is smaller then the signed integer range,
// the destination signmask can't be represented by the float, so we can		// the destination signmask can't be represented by the float, so we can
// just use FP_TO_SINT directly.		// just use FP_TO_SINT directly.
const fltSemantics &APFSem = DAG.EVTToAPFloatSemantics(SrcVT);		const fltSemantics &APFSem = DAG.EVTToAPFloatSemantics(SrcVT);
APFloat APF(APFSem, APInt::getNullValue(SrcVT.getScalarSizeInBits()));		APFloat APF(APFSem, APInt::getNullValue(SrcVT.getScalarSizeInBits()));
APInt SignMask = APInt::getSignMask(DstVT.getScalarSizeInBits());		APInt SignMask = APInt::getSignMask(DstVT.getScalarSizeInBits());
if (APFloat::opOverflow &		if (APFloat::opOverflow &
APF.convertFromAPInt(SignMask, false, APFloat::rmNearestTiesToEven)) {		APF.convertFromAPInt(SignMask, false, APFloat::rmNearestTiesToEven)) {
		if (Node->isStrictFPOpcode()) {
		Result = DAG.getNode(ISD::STRICT_FP_TO_SINT, dl, { DstVT, MVT::Other },
		{ Node->getOperand(0), Src });
		Chain = Result.getValue(1);
		} else
Result = DAG.getNode(ISD::FP_TO_SINT, dl, DstVT, Src);		Result = DAG.getNode(ISD::FP_TO_SINT, dl, DstVT, Src);
return true;		return true;
}		}

SDValue Cst = DAG.getConstantFP(APF, dl, SrcVT);		SDValue Cst = DAG.getConstantFP(APF, dl, SrcVT);
SDValue Sel = DAG.getSetCC(dl, SetCCVT, Src, Cst, ISD::SETLT);		SDValue Sel = DAG.getSetCC(dl, SetCCVT, Src, Cst, ISD::SETLT);

bool Strict = shouldUseStrictFP_TO_INT(SrcVT, DstVT, /*IsSigned*/ false);		bool Strict = Node->isStrictFPOpcode() ||
		shouldUseStrictFP_TO_INT(SrcVT, DstVT, /*IsSigned*/ false);

if (Strict) {		if (Strict) {
// Expand based on maximum range of FP_TO_SINT, if the value exceeds the		// Expand based on maximum range of FP_TO_SINT, if the value exceeds the
// signmask then offset (the result of which should be fully representable).		// signmask then offset (the result of which should be fully representable).
// Sel = Src < 0x8000000000000000		// Sel = Src < 0x8000000000000000
// Val = select Sel, Src, Src - 0x8000000000000000		// Val = select Sel, Src, Src - 0x8000000000000000
// Ofs = select Sel, 0, 0x8000000000000000		// Ofs = select Sel, 0, 0x8000000000000000
// Result = fp_to_sint(Val) ^ Ofs		// Result = fp_to_sint(Val) ^ Ofs

// TODO: Should any fast-math-flags be set for the FSUB?		// TODO: Should any fast-math-flags be set for the FSUB?
SDValue Val = DAG.getSelect(dl, SrcVT, Sel, Src,		SDValue SrcBiased;
DAG.getNode(ISD::FSUB, dl, SrcVT, Src, Cst));		if (Node->isStrictFPOpcode())
		SrcBiased = DAG.getNode(ISD::STRICT_FSUB, dl, { SrcVT, MVT::Other },
		{ Node->getOperand(0), Src, Cst });
		else
		SrcBiased = DAG.getNode(ISD::FSUB, dl, SrcVT, Src, Cst);
		SDValue Val = DAG.getSelect(dl, SrcVT, Sel, Src, SrcBiased);
SDValue Ofs = DAG.getSelect(dl, DstVT, Sel, DAG.getConstant(0, dl, DstVT),		SDValue Ofs = DAG.getSelect(dl, DstVT, Sel, DAG.getConstant(0, dl, DstVT),
DAG.getConstant(SignMask, dl, DstVT));		DAG.getConstant(SignMask, dl, DstVT));
Result = DAG.getNode(ISD::XOR, dl, DstVT,		SDValue SInt;
DAG.getNode(ISD::FP_TO_SINT, dl, DstVT, Val), Ofs);		if (Node->isStrictFPOpcode()) {
		SInt = DAG.getNode(ISD::STRICT_FP_TO_SINT, dl, { DstVT, MVT::Other },
		{ SrcBiased.getValue(1), Val });
		Chain = SInt.getValue(1);
		} else
		SInt = DAG.getNode(ISD::FP_TO_SINT, dl, DstVT, Val);
		Result = DAG.getNode(ISD::XOR, dl, DstVT, SInt, Ofs);
} else {		} else {
// Expand based on maximum range of FP_TO_SINT:		// Expand based on maximum range of FP_TO_SINT:
// True = fp_to_sint(Src)		// True = fp_to_sint(Src)
// False = 0x8000000000000000 + fp_to_sint(Src - 0x8000000000000000)		// False = 0x8000000000000000 + fp_to_sint(Src - 0x8000000000000000)
// Result = select (Src < 0x8000000000000000), True, False		// Result = select (Src < 0x8000000000000000), True, False

SDValue True = DAG.getNode(ISD::FP_TO_SINT, dl, DstVT, Src);		SDValue True = DAG.getNode(ISD::FP_TO_SINT, dl, DstVT, Src);
// TODO: Should any fast-math-flags be set for the FSUB?		// TODO: Should any fast-math-flags be set for the FSUB?
▲ Show 20 Lines • Show All 1,304 Lines • Show Last 20 Lines
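
The "Strict" branch comment in expandFP_TO_UINT above describes how an unsigned conversion is built from a single signed conversion. A minimal scalar sketch of that arithmetic for f64 to i64 follows. It is an illustration only: `fpToUintViaSint` is a hypothetical name, not an LLVM API, it operates on plain doubles and not on SelectionDAG nodes, and it does not model the chain operand. It also assumes the input lies in [0, 2^64), since out-of-range inputs are not meaningful for fptoui.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): scalar model of
//   Sel    = Src < 0x8000000000000000
//   Val    = select Sel, Src, Src - 0x8000000000000000
//   Ofs    = select Sel, 0, 0x8000000000000000
//   Result = fp_to_sint(Val) ^ Ofs
inline std::uint64_t fpToUintViaSint(double Src) {
  // Assumption of this sketch: only in-range, non-NaN inputs are handled.
  assert(Src >= 0.0 && Src < 18446744073709551616.0 && "input out of range");

  const double Cst = 9223372036854775808.0; // 2^63, exactly representable
  const std::uint64_t SignMask = 0x8000000000000000ULL;

  const bool Sel = Src < Cst;
  // For Src >= 2^63 the subtraction is exact and brings the value into the
  // range that a signed 64-bit conversion can represent.
  const double Val = Sel ? Src : Src - Cst;
  const std::uint64_t Ofs = Sel ? 0 : SignMask;

  // Exactly one signed conversion, then restore the top bit with XOR.
  const auto SInt = static_cast<std::int64_t>(Val);
  return static_cast<std::uint64_t>(SInt) ^ Ofs;
}
```

Because only one conversion is performed on whichever value was selected, this shape matches the X86 test expectation in this diff that the expansion "should have only one conversion instruction".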

llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp

Show First 20 Lines • Show All 710 Lines • ▼ Show 20 Lines	for (MVT VT : MVT::all_valuetypes()) {
setOperationAction(ISD::STRICT_FCEIL, VT, Expand);		setOperationAction(ISD::STRICT_FCEIL, VT, Expand);
setOperationAction(ISD::STRICT_FFLOOR, VT, Expand);		setOperationAction(ISD::STRICT_FFLOOR, VT, Expand);
setOperationAction(ISD::STRICT_FROUND, VT, Expand);		setOperationAction(ISD::STRICT_FROUND, VT, Expand);
setOperationAction(ISD::STRICT_FTRUNC, VT, Expand);		setOperationAction(ISD::STRICT_FTRUNC, VT, Expand);
setOperationAction(ISD::STRICT_FMAXNUM, VT, Expand);		setOperationAction(ISD::STRICT_FMAXNUM, VT, Expand);
setOperationAction(ISD::STRICT_FMINNUM, VT, Expand);		setOperationAction(ISD::STRICT_FMINNUM, VT, Expand);
setOperationAction(ISD::STRICT_FP_ROUND, VT, Expand);		setOperationAction(ISD::STRICT_FP_ROUND, VT, Expand);
setOperationAction(ISD::STRICT_FP_EXTEND, VT, Expand);		setOperationAction(ISD::STRICT_FP_EXTEND, VT, Expand);
		setOperationAction(ISD::STRICT_FP_TO_SINT, VT, Expand);
		setOperationAction(ISD::STRICT_FP_TO_UINT, VT, Expand);

// For most targets @llvm.get.dynamic.area.offset just returns 0.		// For most targets @llvm.get.dynamic.area.offset just returns 0.
setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);		setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);

// Vector reduction default to expand.		// Vector reduction default to expand.
setOperationAction(ISD::VECREDUCE_FADD, VT, Expand);		setOperationAction(ISD::VECREDUCE_FADD, VT, Expand);
setOperationAction(ISD::VECREDUCE_FMUL, VT, Expand);		setOperationAction(ISD::VECREDUCE_FMUL, VT, Expand);
setOperationAction(ISD::VECREDUCE_ADD, VT, Expand);		setOperationAction(ISD::VECREDUCE_ADD, VT, Expand);
▲ Show 20 Lines • Show All 1,247 Lines • Show Last 20 Lines

llvm/trunk/lib/IR/IntrinsicInst.cpp

Show First 20 Lines • Show All 183 Lines • ▼ Show 20 Lines	ConstrainedFPIntrinsic::ExceptionBehaviorToStr(ExceptionBehavior UseExcept) {
}		}
return ExceptStr;		return ExceptStr;
}		}

bool ConstrainedFPIntrinsic::isUnaryOp() const {		bool ConstrainedFPIntrinsic::isUnaryOp() const {
switch (getIntrinsicID()) {		switch (getIntrinsicID()) {
default:		default:
return false;		return false;
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
case Intrinsic::experimental_constrained_fptrunc:		case Intrinsic::experimental_constrained_fptrunc:
case Intrinsic::experimental_constrained_fpext:		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
▲ Show 20 Lines • Show All 60 Lines • Show Last 20 Lines

llvm/trunk/lib/IR/Verifier.cpp

Show First 20 Lines • Show All 4,276 Lines • ▼ Show 20 Lines	case Intrinsic::coro_id: {
break;		break;
}		}
case Intrinsic::experimental_constrained_fadd:		case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:		case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:		case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:		case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:		case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui:
case Intrinsic::experimental_constrained_fptrunc:		case Intrinsic::experimental_constrained_fptrunc:
case Intrinsic::experimental_constrained_fpext:		case Intrinsic::experimental_constrained_fpext:
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_pow:		case Intrinsic::experimental_constrained_pow:
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
▲ Show 20 Lines • Show All 475 Lines • ▼ Show 20 Lines	void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
case Intrinsic::experimental_constrained_maxnum:		case Intrinsic::experimental_constrained_maxnum:
case Intrinsic::experimental_constrained_minnum:		case Intrinsic::experimental_constrained_minnum:
Assert((NumOperands == 4), "invalid arguments for constrained FP intrinsic",		Assert((NumOperands == 4), "invalid arguments for constrained FP intrinsic",
&FPI);		&FPI);
HasExceptionMD = true;		HasExceptionMD = true;
HasRoundingMD = true;		HasRoundingMD = true;
break;		break;

		case Intrinsic::experimental_constrained_fptosi:
		case Intrinsic::experimental_constrained_fptoui: {
		Assert((NumOperands == 2),
		"invalid arguments for constrained FP intrinsic", &FPI);
		HasExceptionMD = true;

		Value *Operand = FPI.getArgOperand(0);
		uint64_t NumSrcElem = 0;
		Assert(Operand->getType()->isFPOrFPVectorTy(),
		"Intrinsic first argument must be floating point", &FPI);
		if (auto *OperandT = dyn_cast<VectorType>(Operand->getType())) {
		NumSrcElem = OperandT->getNumElements();
		}

		Operand = &FPI;
		Assert((NumSrcElem > 0) == Operand->getType()->isVectorTy(),
		"Intrinsic first argument and result disagree on vector use", &FPI);
		Assert(Operand->getType()->isIntOrIntVectorTy(),
		"Intrinsic result must be an integer", &FPI);
		if (auto *OperandT = dyn_cast<VectorType>(Operand->getType())) {
		Assert(NumSrcElem == OperandT->getNumElements(),
		"Intrinsic first argument and result vector lengths must be equal",
		&FPI);
		}
		}
		break;

case Intrinsic::experimental_constrained_fptrunc:		case Intrinsic::experimental_constrained_fptrunc:
case Intrinsic::experimental_constrained_fpext: {		case Intrinsic::experimental_constrained_fpext: {
if (FPI.getIntrinsicID() == Intrinsic::experimental_constrained_fptrunc) {		if (FPI.getIntrinsicID() == Intrinsic::experimental_constrained_fptrunc) {
Assert((NumOperands == 3),		Assert((NumOperands == 3),
"invalid arguments for constrained FP intrinsic", &FPI);		"invalid arguments for constrained FP intrinsic", &FPI);
HasRoundingMD = true;		HasRoundingMD = true;
} else {		} else {
Assert((NumOperands == 2),		Assert((NumOperands == 2),
▲ Show 20 Lines • Show All 699 Lines • Show Last 20 Lines
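
The checks added to visitConstrainedFPIntrinsic for fptosi/fptoui can be summarised with a small standalone model. This is a sketch under stated assumptions, not the verifier's code: `TypeDesc` and `checkConstrainedFPToInt` are hypothetical names, and a type is reduced to "floating point or integer" plus an element count, with 0 meaning "not a vector". The failure strings are copied from the diff above.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for an LLVM type.
struct TypeDesc {
  bool IsFloatingPoint;
  bool IsInteger;
  unsigned NumElements; // 0 for scalars, N for <N x ty>
};

// Returns an empty string if the signature passes, else the first failure,
// in the same order as the Assert calls in the diff.
inline std::string checkConstrainedFPToInt(const TypeDesc &Src,
                                           const TypeDesc &Dst,
                                           unsigned NumOperands) {
  if (NumOperands != 2) // the value plus the exception-behavior metadata
    return "invalid arguments for constrained FP intrinsic";
  if (!Src.IsFloatingPoint)
    return "Intrinsic first argument must be floating point";
  if ((Src.NumElements > 0) != (Dst.NumElements > 0))
    return "Intrinsic first argument and result disagree on vector use";
  if (!Dst.IsInteger)
    return "Intrinsic result must be an integer";
  if (Dst.NumElements > 0 && Src.NumElements != Dst.NumElements)
    return "Intrinsic first argument and result vector lengths must be equal";
  return "";
}

// Usage: a scalar f64 -> i32 call passes; <4 x float> -> <2 x i32> does not.
inline void exampleChecks() {
  const TypeDesc F64{true, false, 0}, I32{false, true, 0};
  const TypeDesc V4F32{true, false, 4}, V2I32{false, true, 2};
  assert(checkConstrainedFPToInt(F64, I32, 2).empty());
  assert(!checkConstrainedFPToInt(V4F32, V2I32, 2).empty());
}
```

Note that only exception-behavior metadata is expected here (two operands, HasExceptionMD), unlike fptrunc, which also carries rounding-mode metadata.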

llvm/trunk/test/CodeGen/PowerPC/fp-intrinsics-fptosi-legal.ll

				; RUN: llc -O3 -mtriple=powerpc-unknown-linux-gnu -mcpu=e500 -mattr=spe < %s | FileCheck %s

				; PowerPC SPE is a rare in-tree target that has the FP_TO_SINT node marked
				; as Legal.

				; Verify that fptosi(42.1) isn't simplified when the rounding mode is
				; unknown.
				; Verify that no gross errors happen.
				; CHECK-LABEL: @f20
				; COMMON: cfdctsiz
				define i32 @f20(double %a) {
				entry:
				%result = call i32 @llvm.experimental.constrained.fptosi.i32.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
				declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)

llvm/trunk/test/CodeGen/X86/fp-intrinsics.ll

Show First 20 Lines • Show All 280 Lines • ▼ Show 20 Lines	entry:
%rem = call double @llvm.experimental.constrained.frem.f64(		%rem = call double @llvm.experimental.constrained.frem.f64(
double 1.000000e+00,		double 1.000000e+00,
double 1.000000e+01,		double 1.000000e+01,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret double %rem		ret double %rem
}		}

		; Verify that fptoui(%x) isn't simplified when the rounding mode is
		; unknown. The expansion should have only one conversion instruction.
		; Verify that no gross errors happen.
		; CHECK-LABEL: @f20u
		; NO-FMA: cmpltsd
		; NO-FMA: movapd
		; NO-FMA: andpd
		; NO-FMA: xorl
		; NO-FMA: ucomisd
		; NO-FMA: subsd
		; NO-FMA: andnpd
		; NO-FMA: orpd
		; NO-FMA: cvttsd2si
		; NO-FMA: setae
		; NO-FMA: shll
		; NO-FMA: xorl
		;
		; HAS-FMA: vcmpltsd
		; HAS-FMA: vsubsd
		; HAS-FMA: vblendvpd
		; HAS-FMA: vcvttsd2si
		; HAS-FMA: xorl
		; HAS-FMA: vucomisd
		; HAS-FMA: setae
		; HAS-FMA: shll
		; HAS-FMA: xorl
		define i32 @f20u(double %x) {
		entry:
		%result = call i32 @llvm.experimental.constrained.fptoui.i32.f64(double %x,
		metadata !"fpexcept.strict")
		ret i32 %result
		}

; Verify that round(42.1) isn't simplified when the rounding mode is		; Verify that round(42.1) isn't simplified when the rounding mode is
; unknown.		; unknown.
; Verify that no gross errors happen.		; Verify that no gross errors happen.
; CHECK-LABEL: @f21		; CHECK-LABEL: @f21
; COMMON: cvtsd2ss		; COMMON: cvtsd2ss
define float @f21() {		define float @f21() {
entry:		entry:
%result = call float @llvm.experimental.constrained.fptrunc.f32.f64(		%result = call float @llvm.experimental.constrained.fptrunc.f32.f64(
Show All 27 Lines
declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)		declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)		declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)
declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)		declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
		declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
		declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)
declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)		declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)		declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)

llvm/trunk/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

Show First 20 Lines • Show All 3,862 Lines • ▼ Show 20 Lines	%min = call <4 x double> @llvm.experimental.constrained.minnum.v4f64(
double 46.0, double 47.0>,		double 46.0, double 47.0>,
<4 x double> <double 40.0, double 41.0,		<4 x double> <double 40.0, double 41.0,
double 42.0, double 43.0>,		double 42.0, double 43.0>,
metadata !"round.dynamic",		metadata !"round.dynamic",
metadata !"fpexcept.strict")		metadata !"fpexcept.strict")
ret <4 x double> %min		ret <4 x double> %min
}		}

		define <1 x i32> @constrained_vector_fptosi_v1i32_v1f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v1i32_v1f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v1i32_v1f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i32> @llvm.experimental.constrained.fptosi.v1i32.v1f32(
		<1 x float><float 42.0>,
		metadata !"fpexcept.strict")
		ret <1 x i32> %result
		}

		define <2 x i32> @constrained_vector_fptosi_v2i32_v2f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v2i32_v2f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v2i32_v2f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vmovd %eax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i32> @llvm.experimental.constrained.fptosi.v2i32.v2f32(
		<2 x float><float 42.0, float 43.0>,
		metadata !"fpexcept.strict")
		ret <2 x i32> %result
		}

		define <3 x i32> @constrained_vector_fptosi_v3i32_v3f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v3i32_v3f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v3i32_v3f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vmovd %eax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i32> @llvm.experimental.constrained.fptosi.v3i32.v3f32(
		<3 x float><float 42.0, float 43.0,
		float 44.0>,
		metadata !"fpexcept.strict")
		ret <3 x i32> %result
		}

		define <4 x i32> @constrained_vector_fptosi_v4i32_v4f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v4i32_v4f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm2
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v4i32_v4f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vmovd %eax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i32> @llvm.experimental.constrained.fptosi.v4i32.v4f32(
		<4 x float><float 42.0, float 43.0,
		float 44.0, float 45.0>,
		metadata !"fpexcept.strict")
		ret <4 x i32> %result
		}

		define <1 x i64> @constrained_vector_fptosi_v1i64_v1f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v1i64_v1f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v1i64_v1f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i64> @llvm.experimental.constrained.fptosi.v1i64.v1f32(
		<1 x float><float 42.0>,
		metadata !"fpexcept.strict")
		ret <1 x i64> %result
		}

		define <2 x i64> @constrained_vector_fptosi_v2i64_v2f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v2i64_v2f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v2i64_v2f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i64> @llvm.experimental.constrained.fptosi.v2i64.v2f32(
		<2 x float><float 42.0, float 43.0>,
		metadata !"fpexcept.strict")
		ret <2 x i64> %result
		}

		define <3 x i64> @constrained_vector_fptosi_v3i64_v3f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v3i64_v3f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rdx
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rcx
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v3i64_v3f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i64> @llvm.experimental.constrained.fptosi.v3i64.v3f32(
		<3 x float><float 42.0, float 43.0,
		float 44.0>,
		metadata !"fpexcept.strict")
		ret <3 x i64> %result
		}

		define <4 x i64> @constrained_vector_fptosi_v4i64_v4f32() {
		; CHECK-LABEL: constrained_vector_fptosi_v4i64_v4f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm2
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v4i64_v4f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm2
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0]
		; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i64> @llvm.experimental.constrained.fptosi.v4i64.v4f32(
		<4 x float><float 42.0, float 43.0,
		float 44.0, float 45.0>,
		metadata !"fpexcept.strict")
		ret <4 x i64> %result
		}

		define <1 x i32> @constrained_vector_fptosi_v1i32_v1f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v1i32_v1f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v1i32_v1f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i32> @llvm.experimental.constrained.fptosi.v1i32.v1f64(
		<1 x double><double 42.1>,
		metadata !"fpexcept.strict")
		ret <1 x i32> %result
		}


		define <2 x i32> @constrained_vector_fptosi_v2i32_v2f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v2i32_v2f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v2i32_v2f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vmovd %eax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i32> @llvm.experimental.constrained.fptosi.v2i32.v2f64(
		<2 x double><double 42.1, double 42.2>,
		metadata !"fpexcept.strict")
		ret <2 x i32> %result
		}

		define <3 x i32> @constrained_vector_fptosi_v3i32_v3f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v3i32_v3f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v3i32_v3f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vmovd %eax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i32> @llvm.experimental.constrained.fptosi.v3i32.v3f64(
		<3 x double><double 42.1, double 42.2,
		double 42.3>,
		metadata !"fpexcept.strict")
		ret <3 x i32> %result
		}

		define <4 x i32> @constrained_vector_fptosi_v4i32_v4f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v4i32_v4f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm2
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v4i32_v4f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vmovd %eax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i32> @llvm.experimental.constrained.fptosi.v4i32.v4f64(
		<4 x double><double 42.1, double 42.2,
		double 42.3, double 42.4>,
		metadata !"fpexcept.strict")
		ret <4 x i32> %result
		}

		define <1 x i64> @constrained_vector_fptosi_v1i64_v1f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v1i64_v1f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v1i64_v1f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i64> @llvm.experimental.constrained.fptosi.v1i64.v1f64(
		<1 x double><double 42.1>,
		metadata !"fpexcept.strict")
		ret <1 x i64> %result
		}

		define <2 x i64> @constrained_vector_fptosi_v2i64_v2f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v2i64_v2f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v2i64_v2f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i64> @llvm.experimental.constrained.fptosi.v2i64.v2f64(
		<2 x double><double 42.1, double 42.2>,
		metadata !"fpexcept.strict")
		ret <2 x i64> %result
		}

		define <3 x i64> @constrained_vector_fptosi_v3i64_v3f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v3i64_v3f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rdx
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rcx
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v3i64_v3f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i64> @llvm.experimental.constrained.fptosi.v3i64.v3f64(
		<3 x double><double 42.1, double 42.2,
		double 42.3>,
		metadata !"fpexcept.strict")
		ret <3 x i64> %result
		}

		define <4 x i64> @constrained_vector_fptosi_v4i64_v4f64() {
		; CHECK-LABEL: constrained_vector_fptosi_v4i64_v4f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm2
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptosi_v4i64_v4f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm2
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0]
		; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i64> @llvm.experimental.constrained.fptosi.v4i64.v4f64(
		<4 x double><double 42.1, double 42.2,
		double 42.3, double 42.4>,
		metadata !"fpexcept.strict")
		ret <4 x i64> %result
		}

		define <1 x i32> @constrained_vector_fptoui_v1i32_v1f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v1i32_v1f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v1i32_v1f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i32> @llvm.experimental.constrained.fptoui.v1i32.v1f32(
		<1 x float><float 42.0>,
		metadata !"fpexcept.strict")
		ret <1 x i32> %result
		}

		define <2 x i32> @constrained_vector_fptoui_v2i32_v2f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v2i32_v2f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v2i32_v2f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %ecx
		; AVX-NEXT: vmovd %ecx, %xmm0
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f32(
		<2 x float><float 42.0, float 43.0>,
		metadata !"fpexcept.strict")
		ret <2 x i32> %result
		}

		define <3 x i32> @constrained_vector_fptoui_v3i32_v3f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v3i32_v3f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v3i32_v3f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %ecx
		; AVX-NEXT: vmovd %ecx, %xmm0
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i32> @llvm.experimental.constrained.fptoui.v3i32.v3f32(
		<3 x float><float 42.0, float 43.0,
		float 44.0>,
		metadata !"fpexcept.strict")
		ret <3 x i32> %result
		}

		define <4 x i32> @constrained_vector_fptoui_v4i32_v4f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v4i32_v4f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm2
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v4i32_v4f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %ecx
		; AVX-NEXT: vmovd %ecx, %xmm0
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i32> @llvm.experimental.constrained.fptoui.v4i32.v4f32(
		<4 x float><float 42.0, float 43.0,
		float 44.0, float 45.0>,
		metadata !"fpexcept.strict")
		ret <4 x i32> %result
		}

		define <1 x i64> @constrained_vector_fptoui_v1i64_v1f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v1i64_v1f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v1i64_v1f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i64> @llvm.experimental.constrained.fptoui.v1i64.v1f32(
		<1 x float><float 42.0>,
		metadata !"fpexcept.strict")
		ret <1 x i64> %result
		}

		define <2 x i64> @constrained_vector_fptoui_v2i64_v2f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v2i64_v2f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v2i64_v2f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i64> @llvm.experimental.constrained.fptoui.v2i64.v2f32(
		<2 x float><float 42.0, float 43.0>,
		metadata !"fpexcept.strict")
		ret <2 x i64> %result
		}

		define <3 x i64> @constrained_vector_fptoui_v3i64_v3f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v3i64_v3f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rdx
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rcx
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v3i64_v3f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i64> @llvm.experimental.constrained.fptoui.v3i64.v3f32(
		<3 x float><float 42.0, float 43.0,
		float 44.0>,
		metadata !"fpexcept.strict")
		ret <3 x i64> %result
		}

		define <4 x i64> @constrained_vector_fptoui_v4i64_v4f32() {
		; CHECK-LABEL: constrained_vector_fptoui_v4i64_v4f32:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm2
		; CHECK-NEXT: cvttss2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v4i64_v4f32:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vcvttss2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm2
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0]
		; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i64> @llvm.experimental.constrained.fptoui.v4i64.v4f32(
		<4 x float><float 42.0, float 43.0,
		float 44.0, float 45.0>,
		metadata !"fpexcept.strict")
		ret <4 x i64> %result
		}

		define <1 x i32> @constrained_vector_fptoui_v1i32_v1f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v1i32_v1f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v1i32_v1f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i32> @llvm.experimental.constrained.fptoui.v1i32.v1f64(
		<1 x double><double 42.1>,
		metadata !"fpexcept.strict")
		ret <1 x i32> %result
		}

		define <2 x i32> @constrained_vector_fptoui_v2i32_v2f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v2i32_v2f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v2i32_v2f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %ecx
		; AVX-NEXT: vmovd %ecx, %xmm0
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f64(
		<2 x double><double 42.1, double 42.2>,
		metadata !"fpexcept.strict")
		ret <2 x i32> %result
		}

		define <3 x i32> @constrained_vector_fptoui_v3i32_v3f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v3i32_v3f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v3i32_v3f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %ecx
		; AVX-NEXT: vmovd %ecx, %xmm0
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i32> @llvm.experimental.constrained.fptoui.v3i32.v3f64(
		<3 x double><double 42.1, double 42.2,
		double 42.3>,
		metadata !"fpexcept.strict")
		ret <3 x i32> %result
		}

		define <4 x i32> @constrained_vector_fptoui_v4i32_v4f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v4i32_v4f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm1
		; CHECK-NEXT: punpckldq {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm2
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %eax
		; CHECK-NEXT: movd %eax, %xmm0
		; CHECK-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v4i32_v4f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %ecx
		; AVX-NEXT: vmovd %ecx, %xmm0
		; AVX-NEXT: vpinsrd $1, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %eax
		; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i32> @llvm.experimental.constrained.fptoui.v4i32.v4f64(
		<4 x double><double 42.1, double 42.2,
		double 42.3, double 42.4>,
		metadata !"fpexcept.strict")
		ret <4 x i32> %result
		}

		define <1 x i64> @constrained_vector_fptoui_v1i64_v1f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v1i64_v1f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v1i64_v1f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: retq
		entry:
		%result = call <1 x i64> @llvm.experimental.constrained.fptoui.v1i64.v1f64(
		<1 x double><double 42.1>,
		metadata !"fpexcept.strict")
		ret <1 x i64> %result
		}

		define <2 x i64> @constrained_vector_fptoui_v2i64_v2f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v2i64_v2f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v2i64_v2f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: retq
		entry:
		%result = call <2 x i64> @llvm.experimental.constrained.fptoui.v2i64.v2f64(
		<2 x double><double 42.1, double 42.2>,
		metadata !"fpexcept.strict")
		ret <2 x i64> %result
		}

		define <3 x i64> @constrained_vector_fptoui_v3i64_v3f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v3i64_v3f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rdx
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rcx
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v3i64_v3f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <3 x i64> @llvm.experimental.constrained.fptoui.v3i64.v3f64(
		<3 x double><double 42.1, double 42.2,
		double 42.3>,
		metadata !"fpexcept.strict")
		ret <3 x i64> %result
		}

		define <4 x i64> @constrained_vector_fptoui_v4i64_v4f64() {
		; CHECK-LABEL: constrained_vector_fptoui_v4i64_v4f64:
		; CHECK: # %bb.0: # %entry
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm0
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm2
		; CHECK-NEXT: cvttsd2si {{.*}}(%rip), %rax
		; CHECK-NEXT: movq %rax, %xmm1
		; CHECK-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]
		; CHECK-NEXT: retq
		;
		; AVX-LABEL: constrained_vector_fptoui_v4i64_v4f64:
		; AVX: # %bb.0: # %entry
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm0
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm1[0],xmm0[0]
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm1
		; AVX-NEXT: vcvttsd2si {{.*}}(%rip), %rax
		; AVX-NEXT: vmovq %rax, %xmm2
		; AVX-NEXT: vpunpcklqdq {{.*#+}} xmm1 = xmm2[0],xmm1[0]
		; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
		; AVX-NEXT: retq
		entry:
		%result = call <4 x i64> @llvm.experimental.constrained.fptoui.v4i64.v4f64(
		<4 x double><double 42.1, double 42.2,
		double 42.3, double 42.4>,
		metadata !"fpexcept.strict")
		ret <4 x i64> %result
		}


define <1 x float> @constrained_vector_fptrunc_v1f64() {
; CHECK-LABEL: constrained_vector_fptrunc_v1f64:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
; CHECK-NEXT: cvtsd2ss %xmm0, %xmm0
; CHECK-NEXT: retq
;
; AVX-LABEL: constrained_vector_fptrunc_v1f64:
declare <2 x double> @llvm.experimental.constrained.exp2.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.log.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.log10.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.log2.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.rint.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.nearbyint.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.maxnum.v2f64(<2 x double>, <2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.minnum.v2f64(<2 x double>, <2 x double>, metadata, metadata)
		declare <2 x i32> @llvm.experimental.constrained.fptosi.v2i32.v2f32(<2 x float>, metadata)
		declare <2 x i64> @llvm.experimental.constrained.fptosi.v2i64.v2f32(<2 x float>, metadata)
		declare <2 x i32> @llvm.experimental.constrained.fptosi.v2i32.v2f64(<2 x double>, metadata)
		declare <2 x i64> @llvm.experimental.constrained.fptosi.v2i64.v2f64(<2 x double>, metadata)
		declare <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f32(<2 x float>, metadata)
		declare <2 x i64> @llvm.experimental.constrained.fptoui.v2i64.v2f32(<2 x float>, metadata)
		declare <2 x i32> @llvm.experimental.constrained.fptoui.v2i32.v2f64(<2 x double>, metadata)
		declare <2 x i64> @llvm.experimental.constrained.fptoui.v2i64.v2f64(<2 x double>, metadata)
declare <2 x float> @llvm.experimental.constrained.fptrunc.v2f32.v2f64(<2 x double>, metadata, metadata)		declare <2 x float> @llvm.experimental.constrained.fptrunc.v2f32.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.fpext.v2f64.v2f32(<2 x float>, metadata)		declare <2 x double> @llvm.experimental.constrained.fpext.v2f64.v2f32(<2 x float>, metadata)
declare <2 x double> @llvm.experimental.constrained.ceil.v2f64(<2 x double>, metadata, metadata)		declare <2 x double> @llvm.experimental.constrained.ceil.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.floor.v2f64(<2 x double>, metadata, metadata)		declare <2 x double> @llvm.experimental.constrained.floor.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.round.v2f64(<2 x double>, metadata, metadata)		declare <2 x double> @llvm.experimental.constrained.round.v2f64(<2 x double>, metadata, metadata)
declare <2 x double> @llvm.experimental.constrained.trunc.v2f64(<2 x double>, metadata, metadata)		declare <2 x double> @llvm.experimental.constrained.trunc.v2f64(<2 x double>, metadata, metadata)

; Scalar width declarations		; Scalar width declarations
declare <1 x float> @llvm.experimental.constrained.exp2.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.exp2.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.log.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.log.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.log10.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.log10.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.log2.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.log2.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.rint.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.rint.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.nearbyint.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.nearbyint.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.maxnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.maxnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.minnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.minnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)
		declare <1 x i32> @llvm.experimental.constrained.fptosi.v1i32.v1f32(<1 x float>, metadata)
		declare <1 x i64> @llvm.experimental.constrained.fptosi.v1i64.v1f32(<1 x float>, metadata)
		declare <1 x i32> @llvm.experimental.constrained.fptosi.v1i32.v1f64(<1 x double>, metadata)
		declare <1 x i64> @llvm.experimental.constrained.fptosi.v1i64.v1f64(<1 x double>, metadata)
		declare <1 x i32> @llvm.experimental.constrained.fptoui.v1i32.v1f32(<1 x float>, metadata)
		declare <1 x i64> @llvm.experimental.constrained.fptoui.v1i64.v1f32(<1 x float>, metadata)
		declare <1 x i32> @llvm.experimental.constrained.fptoui.v1i32.v1f64(<1 x double>, metadata)
		declare <1 x i64> @llvm.experimental.constrained.fptoui.v1i64.v1f64(<1 x double>, metadata)
declare <1 x float> @llvm.experimental.constrained.fptrunc.v1f32.v1f64(<1 x double>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.fptrunc.v1f32.v1f64(<1 x double>, metadata, metadata)
declare <1 x double> @llvm.experimental.constrained.fpext.v1f64.v1f32(<1 x float>, metadata)		declare <1 x double> @llvm.experimental.constrained.fpext.v1f64.v1f32(<1 x float>, metadata)
declare <1 x float> @llvm.experimental.constrained.ceil.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.ceil.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.floor.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.floor.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.round.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.round.v1f32(<1 x float>, metadata, metadata)
declare <1 x float> @llvm.experimental.constrained.trunc.v1f32(<1 x float>, metadata, metadata)		declare <1 x float> @llvm.experimental.constrained.trunc.v1f32(<1 x float>, metadata, metadata)

; Illegal width declarations		; Illegal width declarations
declare <3 x float> @llvm.experimental.constrained.rint.v3f32(<3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.rint.v3f32(<3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.rint.v3f64(<3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.rint.v3f64(<3 x double>, metadata, metadata)
declare <3 x float> @llvm.experimental.constrained.nearbyint.v3f32(<3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.nearbyint.v3f32(<3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.nearbyint.v3f64(<3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.nearbyint.v3f64(<3 x double>, metadata, metadata)
declare <3 x float> @llvm.experimental.constrained.maxnum.v3f32(<3 x float>, <3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.maxnum.v3f32(<3 x float>, <3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.maxnum.v3f64(<3 x double>, <3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.maxnum.v3f64(<3 x double>, <3 x double>, metadata, metadata)
declare <3 x float> @llvm.experimental.constrained.minnum.v3f32(<3 x float>, <3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.minnum.v3f32(<3 x float>, <3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.minnum.v3f64(<3 x double>, <3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.minnum.v3f64(<3 x double>, <3 x double>, metadata, metadata)
		declare <3 x i32> @llvm.experimental.constrained.fptosi.v3i32.v3f32(<3 x float>, metadata)
		declare <3 x i64> @llvm.experimental.constrained.fptosi.v3i64.v3f32(<3 x float>, metadata)
		declare <3 x i32> @llvm.experimental.constrained.fptosi.v3i32.v3f64(<3 x double>, metadata)
		declare <3 x i64> @llvm.experimental.constrained.fptosi.v3i64.v3f64(<3 x double>, metadata)
		declare <3 x i32> @llvm.experimental.constrained.fptoui.v3i32.v3f32(<3 x float>, metadata)
		declare <3 x i64> @llvm.experimental.constrained.fptoui.v3i64.v3f32(<3 x float>, metadata)
		declare <3 x i32> @llvm.experimental.constrained.fptoui.v3i32.v3f64(<3 x double>, metadata)
		declare <3 x i64> @llvm.experimental.constrained.fptoui.v3i64.v3f64(<3 x double>, metadata)
declare <3 x float> @llvm.experimental.constrained.fptrunc.v3f32.v3f64(<3 x double>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.fptrunc.v3f32.v3f64(<3 x double>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.fpext.v3f64.v3f32(<3 x float>, metadata)		declare <3 x double> @llvm.experimental.constrained.fpext.v3f64.v3f32(<3 x float>, metadata)
declare <3 x float> @llvm.experimental.constrained.ceil.v3f32(<3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.ceil.v3f32(<3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.ceil.v3f64(<3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.ceil.v3f64(<3 x double>, metadata, metadata)
declare <3 x float> @llvm.experimental.constrained.floor.v3f32(<3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.floor.v3f32(<3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.floor.v3f64(<3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.floor.v3f64(<3 x double>, metadata, metadata)
declare <3 x float> @llvm.experimental.constrained.round.v3f32(<3 x float>, metadata, metadata)		declare <3 x float> @llvm.experimental.constrained.round.v3f32(<3 x float>, metadata, metadata)
declare <3 x double> @llvm.experimental.constrained.round.v3f64(<3 x double>, metadata, metadata)		declare <3 x double> @llvm.experimental.constrained.round.v3f64(<3 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.exp2.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.exp2.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.log.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.log.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.log10.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.log10.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.log2.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.log2.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.rint.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.rint.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.nearbyint.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.maxnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.maxnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.minnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.minnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)
		declare <4 x i32> @llvm.experimental.constrained.fptosi.v4i32.v4f32(<4 x float>, metadata)
		declare <4 x i64> @llvm.experimental.constrained.fptosi.v4i64.v4f32(<4 x float>, metadata)
		declare <4 x i32> @llvm.experimental.constrained.fptosi.v4i32.v4f64(<4 x double>, metadata)
		declare <4 x i64> @llvm.experimental.constrained.fptosi.v4i64.v4f64(<4 x double>, metadata)
		declare <4 x i32> @llvm.experimental.constrained.fptoui.v4i32.v4f32(<4 x float>, metadata)
		declare <4 x i64> @llvm.experimental.constrained.fptoui.v4i64.v4f32(<4 x float>, metadata)
		declare <4 x i32> @llvm.experimental.constrained.fptoui.v4i32.v4f64(<4 x double>, metadata)
		declare <4 x i64> @llvm.experimental.constrained.fptoui.v4i64.v4f64(<4 x double>, metadata)
declare <4 x float> @llvm.experimental.constrained.fptrunc.v4f32.v4f64(<4 x double>, metadata, metadata)		declare <4 x float> @llvm.experimental.constrained.fptrunc.v4f32.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.fpext.v4f64.v4f32(<4 x float>, metadata)		declare <4 x double> @llvm.experimental.constrained.fpext.v4f64.v4f32(<4 x float>, metadata)
declare <4 x double> @llvm.experimental.constrained.ceil.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.ceil.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.floor.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.floor.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.round.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.round.v4f64(<4 x double>, metadata, metadata)
declare <4 x double> @llvm.experimental.constrained.trunc.v4f64(<4 x double>, metadata, metadata)		declare <4 x double> @llvm.experimental.constrained.trunc.v4f64(<4 x double>, metadata, metadata)

llvm/trunk/test/Feature/fp-intrinsics.ll

	define double @f17() {			define double @f17() {
	entry:			entry:
	%result = call double @llvm.experimental.constrained.fma.f64(double 42.1, double 42.1, double 42.1,			%result = call double @llvm.experimental.constrained.fma.f64(double 42.1, double 42.1, double 42.1,
	metadata !"round.dynamic",			metadata !"round.dynamic",
	metadata !"fpexcept.strict")			metadata !"fpexcept.strict")
	ret double %result			ret double %result
	}			}

				; Verify that fptoui(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f18
				; CHECK: call zeroext i32 @llvm.experimental.constrained.fptoui
				define zeroext i32 @f18() {
				entry:
				%result = call zeroext i32 @llvm.experimental.constrained.fptoui.i32.f64(
				double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that fptosi(42.1) isn't simplified when the rounding mode is
				; unknown.
				; CHECK-LABEL: f19
				; CHECK: call i32 @llvm.experimental.constrained.fptosi
				define i32 @f19() {
				entry:
				%result = call i32 @llvm.experimental.constrained.fptosi.i32.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

	; Verify that fptrunc(42.1) isn't simplified when the rounding mode is			; Verify that fptrunc(42.1) isn't simplified when the rounding mode is
	; unknown.			; unknown.
	; CHECK-LABEL: f20			; CHECK-LABEL: f20
	; CHECK: call float @llvm.experimental.constrained.fptrunc			; CHECK: call float @llvm.experimental.constrained.fptrunc
	define float @f20() {			define float @f20() {
	entry:			entry:
	%result = call float @llvm.experimental.constrained.fptrunc.f32.f64(			%result = call float @llvm.experimental.constrained.fptrunc.f32.f64(
	double 42.1,			double 42.1,
	declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
				declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)
	declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)			declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)			declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)


Diff 217675
