This is an archive of the discontinued LLVM Phabricator instance.

docs/LangRef.rst
15705	I know you're just copying what we previously said about the constrained rint, but we should probably say that it "will" raise this exception.
15724	We should describe what is returned if the value is too large to be represented as a long. The llvm.lrint doesn't do that either, but it should too.
15736	It seems like we have a problem here. The intrinsics are overloaded to take any integer as a return type, but not all integers will match up with a real library call. Is that handled anywhere?
16010	Can you describe the rounding mode that will be used? You should also describe the conditions under which an exception will be raised.
include/llvm/CodeGen/ISDOpcodes.h
303	What's the logic behind the ordering here? If there's no good reason to insert new nodes in the middle of the existing list, it would be kinder to people maintaining out-of-tree branches if you would append them to the end of this group.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
208	Why are you changing this assertion?
2045	It's not clear to me why this is necessary now but hasn't been before. Since we're expanding to a library call, I don't think we need to preserve the chain that the strict FP node was using.
2062	If these changes do need to stay, you'll need to update this comment.
2940	Shouldn't something be getting pushed to Results here?
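
On the 16010 question about which rounding mode `lround` uses: the C library `lround` rounds halfway cases away from zero and does not consult the current rounding mode, which is presumably why the constrained form carries no rounding-mode argument. A minimal C++ sketch, assuming a hosted libm with `<cfenv>` support; `lround_under_mode` is a hypothetical helper and not part of this patch:

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: evaluates lround(x) while the runtime rounding mode is
// temporarily set to `mode` (one of the FE_* rounding macros).
long lround_under_mode(int mode, double x) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double input = x;                  // keep the call from being folded
  volatile long result = std::lround(input);  // ties go away from zero
  std::fesetround(saved);                     // restore the caller's mode
  return result;
}
```

For instance, `lround_under_mode(FE_DOWNWARD, 2.5)` is expected to return 3, the same as under `FE_TONEAREST`.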

kpn marked 5 inline comments as done. Jul 16 2019, 7:34 AM

kpn added inline comments.

include/llvm/CodeGen/ISDOpcodes.h
303	Eh, I was just lumping the rounding nodes together. It's not important. Avoiding pain downstream overwhelms that weak reason. I'll change it.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
208	Because it is stronger than SelectionDAG::ReplaceAllUsesWith(SDNode *From, SDNode *To) needs it to be (unless I'm wrong), and because without this change the test case cannot pass. The conversion to a libcall results in a node with a result value, a chain, and glue. There was no glue before, so the added result would trigger the assert.
2045	So we don't need the chain to preserve ordering since it's a function call? That makes things simpler. Here, at least. I don't know what happens in ReplaceNode when a Chain result gets swapped out with a Glue result.
2062	I'd love to see a test case for this. You'll notice the assert() I added below because I don't have one. That would make it much easier to update the comment. It does need to be updated.
2940	No. If we go that route then we end up in ReplaceNode(SDNode *Old, const SDValue *New) with multiple SDValues to replace but only one of them being given. This results in ReplaceNode() running off into some other memory and typically crashing. We need to instead call ReplaceNode(SDNode *Old, SDNode *New) so all values that need to be looked at are present. That's why I call ReplaceNode() myself here, and since we're bypassing the rest of the function I mostly duplicated the debug message and the return statement.

kpn added inline comments. Jul 16 2019, 11:17 AM

docs/LangRef.rst
15724	What do we want this to be? My draft copy of C99 says the return value is "unspecified". What does that translate to in LLVM-land? Is this listed in IEEE 754?

cameron.mcinally added inline comments. Jul 16 2019, 1:21 PM

docs/LangRef.rst
15724	That should throw an Invalid exception. And I think we agreed in your other Diff that it should return a poison value.

cameron.mcinally added inline comments. Jul 16 2019, 1:24 PM

docs/LangRef.rst

15724	7.2 Invalid operation

<...snip...>

For operations producing no result in floating-point format, the operations that signal the invalid operation exception are:

i) conversion of a floating-point number to an integer format, when the source is NaN, infinity, or a value that would convert to an integer outside the range of the result format under the applicable rounding attribute
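
The clause quoted above can be observed from C++ through `<cfenv>`. This is only a sketch: `lrint_raises_invalid` is a hypothetical helper, and whether the flag is actually raised depends on the C library and target following IEEE 754 for this conversion. The numeric result for such inputs is unspecified in C, so the sketch inspects only the exception flag.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical helper: reports whether lrint(x) set the "invalid" flag.
// The volatile accesses keep the conversion between the two fenv calls and
// prevent it from being folded at compile time.
bool lrint_raises_invalid(double x) {
  volatile double input = x;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile long result = std::lrint(input);
  (void)result;  // the value itself is unspecified when out of range
  return std::fetestexcept(FE_INVALID) != 0;
}
```

On an IEEE 754 conforming target, NaN, infinity, and a finite value such as `1e30` should all report `true`, while an in-range input such as `2.5` should report `false`.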

kpn added inline comments. Jul 17 2019, 6:07 AM

docs/LangRef.rst
15724	Poison only in the constant folding case, right? What about in the non-constant-folding case?

cameron.mcinally added a subscriber: eli.friedman. Jul 17 2019, 8:43 AM

cameron.mcinally added inline comments.

docs/LangRef.rst
15724	> Poison only in the constant folding case, right? What about in the non-constant-folding case?
	@eli.friedman Besides the obvious flag raising, that would be undefined. It would be up to the hardware.

craig.topper added inline comments. Jul 17 2019, 9:48 PM

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
118	Why do these need to be handled here, but the non-strict versions aren't?
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
174	We don't handle the non-vector version of these here. So I don't think we need to handle these yet.
lib/IR/Verifier.cpp
4250	Probably need to ensure these don't get used with vectors, to match their non-constrained counterparts.

efriedma added a subscriber: efriedma. Jul 18 2019, 3:48 PM

efriedma added inline comments.

docs/LangRef.rst
15724	LangRef defines the semantics of the IR; what the current version of the optimizer does or does not constant-fold, or otherwise optimize, is irrelevant. llvm.experimental.constrained.llrint should probably do the same thing as llvm.llrint, whatever that is, if the rounding mode is the default. If that somehow didn't get defined in LangRef, please make a separate patch for that.
15736	If you try to call an intrinsic where there is no underlying library call, it'll fail to compile, I assume. This doesn't really seem like a problem.

Address review comments.

I found that a minor tweak in ExpandArgFPLibCall() to mutate strict nodes makes the changes elsewhere much nicer and similar to the other library calls. I then moved the conversion to libcalls for lrint() and lround() into the ConvertNodeToLibCall() function. This patch should now be more in line with the traditional handling of library calls.

Herald added a subscriber: jdoerfert. Aug 9 2019, 9:24 AM

Tweak documentation hopefully as requested. Remove a stray blank line that somehow got in there.

craig.topper added inline comments. Aug 27 2019, 8:23 AM

lib/CodeGen/TargetLoweringBase.cpp
756	What places check the result? Can they be fixed?

kpn added inline comments. Aug 28 2019, 9:44 AM

lib/CodeGen/TargetLoweringBase.cpp
756	Hmmm, in LegalizeDAG.cpp we're checking the value in LegalizeOp around line 1153. We're checking the value in ExpandNode around 3692. If these aren't libcalls then we'll do the mutation in SelectionDAGISel.cpp, but that checks the value as well. And it comes too late for mutating if it will become a libcall. That's what I'm seeing: a failure during instruction selection because mutation to LRINT happened too late. I'm not sure, though, that they are broken. Operation actions don't seem to be defined exclusively for operands or exclusively for values. And there's oodles of code checking operands, but oodles of code checking values. Some by definition, since, for example, the Expand code must be looking at values. Anything that isn't marked will default to Legal and cause us problems. I'm not sure we want to be playing whack-a-mole forever adding exceptions here and there. This is a bit of a shotgun approach, but it does cover all the bases we need covered. I'll experiment a bit more and see if I can get this thing to behave without this block of code here.

Address review comments.

If I tweak ExpandNode() to know that we shouldn't attempt to bypass creating a libcall when encountering a strict lrint/lround then I no longer need to register these nodes for any of the integer types.

Ping.

This looks good to me, but I'd like @craig.topper to verify that he's happy with the parts he commented on.

LGTM other than that one comment.

lib/IR/Verifier.cpp
4724	Should this break be inside the curly braces? I don't think I've seen the style used here anywhere else.

This revision is now accepted and ready to land. Oct 3 2019, 1:49 PM

Changes pushed to r373900. I don't know why the ticket was left open.

kpn mentioned this in D68810: Document rounding for llvm.lround and llvm.lrint. Oct 10 2019, 9:34 AM

Hi,

The semantics described by this patch for lrint regarding the rounding mode are a bit peculiar and different from all the other constrained FP intrinsics.
It says:

The rounding mode is described, not determined, by the rounding mode argument. The actual rounding mode is determined by the runtime floating-point environment. The rounding mode argument is only intended as information to the compiler.

So does this mean that the only accepted rounding mode is !dynamic? Why is lrint different from the remaining intrinsics?

I was trying to implement the semantics of lrint in Alive2 and came across this discrepancy.

Thanks!

Herald added a subscriber: pengfei. Feb 15 2022, 10:43 AM

@uweigand (D64746#3323862), in reply to @nlopes (D64746#3323792):

I believe the quoted statement above is both correct and also identical to the semantics of the rounding mode argument of all other constrained intrinsics. For all of them, the rounding mode argument is a promise by the user to the compiler, not an instruction to the compiler to change anything about the rounding mode. All of these operations will actually use the current default rounding mode, but the presence of a constrained rounding mode argument is an implicit promise by the user to the compiler that the current rounding mode has been set up in a particular way, and the compiler (e.g. for optimization purposes) may rely on that promise.
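
To illustrate the point that the operation follows the runtime rounding mode rather than anything the compiler selects, here is a small C++ sketch using the libm `lrint` that the intrinsic is documented to match. `lrint_under_mode` is a hypothetical helper for this illustration, and the sketch assumes a target where `fesetround` supports all four standard modes:

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: evaluates lrint(x) while the runtime rounding mode is
// temporarily set to `mode` (one of the FE_* rounding macros).
long lrint_under_mode(int mode, double x) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  volatile double input = x;                 // keep the call from being folded
  volatile long result = std::lrint(input);  // rounds per the runtime mode
  std::fesetround(saved);                    // restore the caller's mode
  return result;
}
```

The same input, 2.5, gives 2 under `FE_TONEAREST` (ties to even) and 3 under `FE_UPWARD`, so a rounding-mode argument that merely describes the environment cannot change what the call returns.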

@nlopes (D64746#3323899), in reply to @uweigand (D64746#3323862):

Thank you.
Can I read what you wrote as "if the rounding mode argument is not !dynamic and if it differs from the run-time rounding mode, the intrinsic returns poison"?

Essentially you want to allow the compiler to use the given rounding mode for optimizations, but still be able to lower the intrinsic to a single libcall that will use the run-time rounding mode, not the one given as argument.
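
One way to write down the reading proposed above is a toy model in C++. This is not LLVM code and not how Alive2 encodes it; `RoundingArg`, `MaybePoison`, `fenv_mode`, and `constrained_lrint_model` are all hypothetical names, and poison is modelled as a plain flag rather than with its real propagation rules:

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical stand-ins for the rounding-mode metadata argument.
enum class RoundingArg { Dynamic, ToNearest, Downward, Upward, TowardZero };

struct MaybePoison {
  bool poison;  // true: the value must not be relied on
  long value;   // meaningful only when poison is false
};

// Maps a non-Dynamic rounding argument to the matching <cfenv> macro.
int fenv_mode(RoundingArg arg) {
  switch (arg) {
    case RoundingArg::ToNearest: return FE_TONEAREST;
    case RoundingArg::Downward:  return FE_DOWNWARD;
    case RoundingArg::Upward:    return FE_UPWARD;
    default:                     return FE_TOWARDZERO;  // TowardZero
  }
}

// Model: the argument is a promise about the environment. A broken promise
// yields poison; otherwise the result is whatever libm lrint returns under
// the runtime rounding mode.
MaybePoison constrained_lrint_model(double x, RoundingArg promised) {
  if (promised != RoundingArg::Dynamic &&
      std::fegetround() != fenv_mode(promised)) {
    return {true, 0};
  }
  volatile double input = x;  // keep the call from being folded
  volatile long result = std::lrint(input);
  return {false, result};
}
```

Under round-to-nearest, `constrained_lrint_model(2.5, RoundingArg::Upward)` reports poison in this model, while `RoundingArg::Dynamic` always defers to the environment.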

@uweigand (D64746#3324270), in reply to @nlopes (D64746#3323899):

This matches my understanding. (Though I must admit I'm not 100% confident I fully understand the precise distinction between "undefined behavior", a poison value, and an undef value ... It's certainly one of those :-) But from my reading of the IR reference, "poison" does indeed look correct here.)

@nlopes (D64746#3324418), in reply to @uweigand (D64746#3324270):

OK, thank you!
It's poison. UB would be too strong. I've put on my todo list to fix LangRef as well.

@uweigand (D64746#3325599), in reply to @nlopes (D64746#3324418):

The reason why I was hesitating about poison vs. UB is exceptions. This is clearer when talking about the exception metadata as opposed to the rounding mode metadata: if exception metadata promises that FP exceptions are disabled, but they are actually enabled at runtime, compiler optimizations are free to generate code triggering spurious exceptions that might kill the program outright - that seems more like UB than poison to me.

With the rounding mode, this is much less obvious to me, but I guess in theory it could happen that if the actual rounding mode does not match the mode promised by the metadata, a compiler optimization might possibly generate code that could now e.g. introduce an underflow where the unoptimized code would not have one, and if in addition exceptions are also enabled, that extra underflow could now trigger an extra exception. However, I'm not sure if that can actually ever occur in practice ...

@nlopes (D64746#3325655), in reply to @uweigand (D64746#3325599):

Thank you!
Makes sense. I need to think a bit more about this.
Ideally we want to allow intrinsics marked as not throwing exceptions to be executed speculatively. The semantics needs to be crafted to allow that scenario.
I think returning poison for the rounding mode is OK as that is sufficient to trigger an exception (as poison can be replaced with any value). So we can keep the mismatch in rounding modes as poison. And then move the complexity to the exception semantics. It's not obvious to me what's the ideal semantics yet.

In reply to @nlopes (D64746#3325655):

I'm a little worried about the runtime rounding mode being changed and speculative execution moving an instruction to a point where the wrong rounding mode is in effect. I think a dynamic rounding mode intrinsic call shouldn't be moved past a function call. I'm pretty sure the PowerPC backend already has this rule. But it's still possible for a function call to change the rounding mode with the rounding mode metadata properly representing this, but code movement optimizations might not take this into account and we'll have a problem.

I'm pretty sure the rounding metadata is there because the compiler can't know the actual rounding mode used at runtime. So I don't know how the compiler can know about UB and I don't know how it can use undef or poison.
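
The code-motion concern can be sketched in plain C++. This is only an illustration under assumptions: `callee_sets_upward` stands in for an opaque call that changes the floating-point environment, and `convert_around_call` (also hypothetical) computes the same conversion on both sides of it, the way a hoisted copy and the original instruction would:

```cpp
#include <cfenv>
#include <cmath>

// Stand-in for an opaque callee that changes the rounding mode.
void callee_sets_upward() { std::fesetround(FE_UPWARD); }

struct Orderings {
  long hoisted;   // conversion evaluated before the call
  long in_place;  // conversion evaluated after the call, as written
};

// Starts from round-to-nearest, converts x before and after the call, and
// restores the caller's rounding mode before returning.
Orderings convert_around_call(double x) {
  const int saved = std::fegetround();
  std::fesetround(FE_TONEAREST);
  volatile double input = x;                 // keep the calls from being folded
  volatile long before = std::lrint(input);  // what a hoisted copy computes
  callee_sets_upward();
  volatile long after = std::lrint(input);   // what the source program asks for
  std::fesetround(saved);
  return {before, after};
}
```

For `x = 2.5` the two fields differ (2 versus 3), which is the kind of divergence that moving a rounding-mode-dependent operation across such a call would introduce.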

Revision Contents

Path	Size
docs/LangRef.rst	144 lines
include/llvm/CodeGen/ISDOpcodes.h	3 lines
include/llvm/CodeGen/SelectionDAGNodes.h	4 lines
include/llvm/CodeGen/TargetLowering.h	4 lines
include/llvm/IR/IntrinsicInst.h	4 lines
include/llvm/IR/Intrinsics.td	14 lines
include/llvm/Target/TargetSelectionDAG.td	20 lines
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	65 lines
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp	22 lines
lib/CodeGen/SelectionDAG/LegalizeTypes.h	1 line
lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp	8 lines
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	24 lines
lib/CodeGen/SelectionDAG/SelectionDAG.cpp	4 lines
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	16 lines
lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	4 lines
lib/CodeGen/TargetLoweringBase.cpp	18 lines
lib/IR/IntrinsicInst.cpp	4 lines
lib/IR/Verifier.cpp	13 lines
test/CodeGen/X86/fp-intrinsics.ll	84 lines
test/CodeGen/X86/vector-constrained-fp-intrinsics.ll	1540 lines
test/Feature/fp-intrinsics.ll	92 lines

Diff 209864

docs/LangRef.rst


	[15,679 lines not shown]

	This function returns the same values as the libm ``rint`` functions			This function returns the same values as the libm ``rint`` functions
	would, and handles error conditions in the same way. The rounding mode is			would, and handles error conditions in the same way. The rounding mode is
	described, not determined, by the rounding mode argument. The actual rounding			described, not determined, by the rounding mode argument. The actual rounding
	mode is determined by the runtime floating-point environment. The rounding			mode is determined by the runtime floating-point environment. The rounding
	mode argument is only intended as information to the compiler.			mode argument is only intended as information to the compiler.


				'``llvm.experimental.constrained.lrint``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <inttype>
				@llvm.experimental.constrained.lrint(<fptype> <op1>,
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
				operand rounded to the nearest integer. It may raise an inexact floating-point
				andrew.w.kaylor (inline comment, done): I know you're just copying what we previously said about the constrained rint, but we should probably say that it "will" raise this exception.
				exception if the operand is not an integer.

				Arguments:
				""""""""""

				The first argument is a floating-point number. The return value is an
				integer type.

				The second and third arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				This function returns the same values as the libm ``lrint`` functions
				would, and handles error conditions in the same way. The rounding mode is
				described, not determined, by the rounding mode argument. The actual rounding
				mode is determined by the runtime floating-point environment. The rounding
				mode argument is only intended as information to the compiler.
				andrew.w.kaylor (inline comment, not done): We should describe what is returned if the value is too large to be represented as a long. The llvm.lrint doesn't do that either, but it should too.
				kpn, author (inline comment, not done): What do we want this to be? My draft copy of C99 says the return value is "unspecified". What does that translate to in LLVM-land? Is this listed in IEEE 754?
				cameron.mcinally (inline comment, not done): That should throw an Invalid exception. And I think we agreed in your other Diff that it should return a poison value.
				cameron.mcinally (inline comment, not done): 7.2 Invalid operation <...snip...> For operations producing no result in floating-point format, the operations that signal the invalid operation exception are: i) conversion of a floating-point number to an integer format, when the source is NaN, infinity, or a value that would convert to an integer outside the range of the result format under the applicable rounding attribute
				kpn, author (inline comment, not done): Poison only in the constant folding case, right? What about in the non-constant-folding case?
				cameron.mcinally (inline comment, not done): > Poison only in the constant folding case, right? What about in the non-constant-folding case?
				@eli.friedman Besides the obvious flag raising, that would be undefined. It would be up to the hardware.
				efriedma (inline comment, not done): LangRef defines the semantics of the IR; what the current version of the optimizer does or does not constant-fold, or otherwise optimize, is irrelevant. llvm.experimental.constrained.llrint should probably do the same thing as llvm.llrint, whatever that is, if the rounding mode is the default. If that somehow didn't get defined in LangRef, please make a separate patch for that.


				'``llvm.experimental.constrained.llrint``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <inttype>
				@llvm.experimental.constrained.llrint(<fptype> <op1>,
				andrew.w.kaylor (inline comment, not done): It seems like we have a problem here. The intrinsics are overloaded to take any integer as a return type, but not all integers will match up with a real library call. Is that handled anywhere?
				efriedma (inline comment, not done): If you try to call an intrinsic where there is no underlying library call, it'll fail to compile, I assume. This doesn't really seem like a problem.
				metadata <rounding mode>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
				operand rounded to the nearest integer. It may raise an inexact floating-point
				exception if the operand is not an integer.

				Arguments:
				""""""""""

				The first argument is a floating-point number. The return value is an
				integer type.

				The second and third arguments specify the rounding mode and exception
				behavior as described above.

				Semantics:
				""""""""""

				This function returns the same values as the libm ``llrint`` functions
				would, and handles error conditions in the same way. The rounding mode is
				described, not determined, by the rounding mode argument. The actual rounding
				mode is determined by the runtime floating-point environment. The rounding
				mode argument is only intended as information to the compiler.


	'``llvm.experimental.constrained.nearbyint``' Intrinsic			'``llvm.experimental.constrained.nearbyint``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	[206 lines not shown]

	Semantics:			Semantics:
	""""""""""			""""""""""

	This function returns the same values as the libm ``round`` functions			This function returns the same values as the libm ``round`` functions
	would and handles error conditions in the same way.			would and handles error conditions in the same way.


				'``llvm.experimental.constrained.lround``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <inttype>
				@llvm.experimental.constrained.lround(<fptype> <op1>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.lround``' intrinsic returns the first
				operand rounded to the nearest integer.

				Arguments:
				""""""""""

				The first argument is a floating-point number. The return value is an
				integer type.
				andrew.w.kaylor (inline comment, done): Can you describe the rounding mode that will be used? You should also describe the conditions under which an exception will be raised.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				This function returns the same values as the libm ``lround`` functions
				would and handles error conditions in the same way.


				'``llvm.experimental.constrained.llround``' Intrinsic
				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

				Syntax:
				"""""""

				::

				declare <inttype>
				@llvm.experimental.constrained.llround(<fptype> <op1>,
				metadata <exception behavior>)

				Overview:
				"""""""""

				The '``llvm.experimental.constrained.llround``' intrinsic returns the first
				operand rounded to the nearest integer.

				Arguments:
				""""""""""

				The first argument is a floating-point number. The return value is an
				integer type.

				The second argument specifies the exception behavior as described above.

				Semantics:
				""""""""""

				This function returns the same values as the libm ``llround`` functions
				would and handles error conditions in the same way.


	'``llvm.experimental.constrained.trunc``' Intrinsic			'``llvm.experimental.constrained.trunc``' Intrinsic
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	Syntax:			Syntax:
	"""""""			"""""""

	::			::

	[1,502 lines not shown]

include/llvm/CodeGen/ISDOpcodes.h

[294 lines not shown]	enum NodeType {

/// Constrained versions of libm-equivalent floating point intrinsics.		/// Constrained versions of libm-equivalent floating point intrinsics.
/// These will be lowered to the equivalent non-constrained pseudo-op		/// These will be lowered to the equivalent non-constrained pseudo-op
/// (or expanded to the equivalent library call) before final selection.		/// (or expanded to the equivalent library call) before final selection.
/// They are used to limit optimizations while the DAG is being optimized.		/// They are used to limit optimizations while the DAG is being optimized.
STRICT_FSQRT, STRICT_FPOW, STRICT_FPOWI, STRICT_FSIN, STRICT_FCOS,		STRICT_FSQRT, STRICT_FPOW, STRICT_FPOWI, STRICT_FSIN, STRICT_FCOS,
STRICT_FEXP, STRICT_FEXP2, STRICT_FLOG, STRICT_FLOG10, STRICT_FLOG2,		STRICT_FEXP, STRICT_FEXP2, STRICT_FLOG, STRICT_FLOG10, STRICT_FLOG2,
STRICT_FRINT, STRICT_FNEARBYINT, STRICT_FMAXNUM, STRICT_FMINNUM,		STRICT_FRINT, STRICT_FNEARBYINT, STRICT_FMAXNUM, STRICT_FMINNUM,
STRICT_FCEIL, STRICT_FFLOOR, STRICT_FROUND, STRICT_FTRUNC,		STRICT_FCEIL, STRICT_FFLOOR, STRICT_LROUND, STRICT_LLROUND, STRICT_FROUND,
		andrew.w.kaylor (Not Done): What's the logic behind the ordering here? If there's no good reason to insert new nodes in the middle of the existing list, it would be kinder to people maintaining out-of-tree branches if you would append them to the end of this group.
		kpn (Author, Done): Eh, I was just lumping the rounding nodes together. It's not important. Avoiding pain downstream overwhelms that weak reason. I'll change it.
		STRICT_FTRUNC, STRICT_LRINT, STRICT_LLRINT,

/// X = STRICT_FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating		/// X = STRICT_FP_ROUND(Y, TRUNC) - Rounding 'Y' from a larger floating
/// point type down to the precision of the destination VT. TRUNC is a		/// point type down to the precision of the destination VT. TRUNC is a
/// flag, which is always an integer that is zero or one. If TRUNC is 0,		/// flag, which is always an integer that is zero or one. If TRUNC is 0,
/// this is a normal rounding, if it is 1, this FP_ROUND is known to not		/// this is a normal rounding, if it is 1, this FP_ROUND is known to not
/// change the value of Y.		/// change the value of Y.
///		///
/// The TRUNC = 1 case is used in cases where we know that the value will		/// The TRUNC = 1 case is used in cases where we know that the value will

include/llvm/CodeGen/SelectionDAGNodes.h

Show First 20 Lines • Show All 690 Lines • ▼ Show 20 Lines	switch (NodeType) {
case ISD::STRICT_FPOWI:		case ISD::STRICT_FPOWI:
case ISD::STRICT_FSIN:		case ISD::STRICT_FSIN:
case ISD::STRICT_FCOS:		case ISD::STRICT_FCOS:
case ISD::STRICT_FEXP:		case ISD::STRICT_FEXP:
case ISD::STRICT_FEXP2:		case ISD::STRICT_FEXP2:
case ISD::STRICT_FLOG:		case ISD::STRICT_FLOG:
case ISD::STRICT_FLOG10:		case ISD::STRICT_FLOG10:
case ISD::STRICT_FLOG2:		case ISD::STRICT_FLOG2:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
return true;		return true;
}		}
}		}


include/llvm/CodeGen/TargetLowering.h

Show First 20 Lines • Show All 905 Lines • ▼ Show 20 Lines	switch (Op) {
case ISD::STRICT_FMA: EqOpc = ISD::FMA; break;		case ISD::STRICT_FMA: EqOpc = ISD::FMA; break;
case ISD::STRICT_FSIN: EqOpc = ISD::FSIN; break;		case ISD::STRICT_FSIN: EqOpc = ISD::FSIN; break;
case ISD::STRICT_FCOS: EqOpc = ISD::FCOS; break;		case ISD::STRICT_FCOS: EqOpc = ISD::FCOS; break;
case ISD::STRICT_FEXP: EqOpc = ISD::FEXP; break;		case ISD::STRICT_FEXP: EqOpc = ISD::FEXP; break;
case ISD::STRICT_FEXP2: EqOpc = ISD::FEXP2; break;		case ISD::STRICT_FEXP2: EqOpc = ISD::FEXP2; break;
case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break;		case ISD::STRICT_FLOG: EqOpc = ISD::FLOG; break;
case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break;		case ISD::STRICT_FLOG10: EqOpc = ISD::FLOG10; break;
case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break;		case ISD::STRICT_FLOG2: EqOpc = ISD::FLOG2; break;
		case ISD::STRICT_LRINT: EqOpc = ISD::LRINT; break;
		case ISD::STRICT_LLRINT: EqOpc = ISD::LLRINT; break;
case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;		case ISD::STRICT_FRINT: EqOpc = ISD::FRINT; break;
case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;		case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;
case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break;		case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break;
case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break;		case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break;
case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;		case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;
case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;		case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;
		case ISD::STRICT_LROUND: EqOpc = ISD::LROUND; break;
		case ISD::STRICT_LLROUND: EqOpc = ISD::LLROUND; break;
case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;		case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;
case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;		case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;
case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;		case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;
case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;		case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;
}		}

auto Action = getOperationAction(EqOpc, VT);		auto Action = getOperationAction(EqOpc, VT);
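
The switch above maps each STRICT_ opcode to its non-strict equivalent and then queries the action registered for that equivalent. A minimal toy model of that lookup is sketched here. ToyOpcode, ToyAction and ToyLowering are hypothetical stand-ins, not LLVM types; the value-type parameter is dropped for brevity, and the default of Legal for unregistered opcodes is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for ISD opcodes and legalization actions.
enum class ToyOpcode {
  LRINT, LLRINT, LROUND, LLROUND,
  STRICT_LRINT, STRICT_LLRINT, STRICT_LROUND, STRICT_LLROUND
};
enum class ToyAction { Legal, Expand };

// Map a strict opcode to its non-strict equivalent; other opcodes map to
// themselves.
ToyOpcode nonStrictEquivalent(ToyOpcode Op) {
  switch (Op) {
  case ToyOpcode::STRICT_LRINT:   return ToyOpcode::LRINT;
  case ToyOpcode::STRICT_LLRINT:  return ToyOpcode::LLRINT;
  case ToyOpcode::STRICT_LROUND:  return ToyOpcode::LROUND;
  case ToyOpcode::STRICT_LLROUND: return ToyOpcode::LLROUND;
  default:                        return Op;
  }
}

class ToyLowering {
public:
  void setOperationAction(ToyOpcode Op, ToyAction A) { Actions[Op] = A; }

  // Unregistered opcodes are treated as Legal (an assumption of this sketch).
  ToyAction getOperationAction(ToyOpcode Op) const {
    auto It = Actions.find(Op);
    return It == Actions.end() ? ToyAction::Legal : It->second;
  }

  // A strict opcode inherits whatever was registered for its equivalent.
  ToyAction getStrictFPOperationAction(ToyOpcode Op) const {
    return getOperationAction(nonStrictEquivalent(Op));
  }

private:
  std::map<ToyOpcode, ToyAction> Actions;
};

// Usage: mark LROUND as Expand and observe that STRICT_LROUND follows it.
void demoStrictFollowsNonStrict() {
  ToyLowering TLI;
  TLI.setOperationAction(ToyOpcode::LROUND, ToyAction::Expand);
  assert(TLI.getStrictFPOperationAction(ToyOpcode::STRICT_LROUND) ==
         ToyAction::Expand);
  assert(TLI.getStrictFPOperationAction(ToyOpcode::STRICT_LRINT) ==
         ToyAction::Legal);
}
```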


include/llvm/IR/IntrinsicInst.h

Show First 20 Lines • Show All 265 Lines • ▼ Show 20 Lines	static bool classof(const IntrinsicInst *I) {
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
		case Intrinsic::experimental_constrained_lrint:
		case Intrinsic::experimental_constrained_llrint:
case Intrinsic::experimental_constrained_rint:		case Intrinsic::experimental_constrained_rint:
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
case Intrinsic::experimental_constrained_maxnum:		case Intrinsic::experimental_constrained_maxnum:
case Intrinsic::experimental_constrained_minnum:		case Intrinsic::experimental_constrained_minnum:
case Intrinsic::experimental_constrained_ceil:		case Intrinsic::experimental_constrained_ceil:
case Intrinsic::experimental_constrained_floor:		case Intrinsic::experimental_constrained_floor:
		case Intrinsic::experimental_constrained_lround:
		case Intrinsic::experimental_constrained_llround:
case Intrinsic::experimental_constrained_round:		case Intrinsic::experimental_constrained_round:
case Intrinsic::experimental_constrained_trunc:		case Intrinsic::experimental_constrained_trunc:
return true;		return true;
default: return false;		default: return false;
}		}
}		}
static bool classof(const Value *V) {		static bool classof(const Value *V) {
return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));		return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));

include/llvm/IR/Intrinsics.td

Show First 20 Lines • Show All 670 Lines • ▼ Show 20 Lines	let IntrProperties = [IntrInaccessibleMemOnly] in {
def int_experimental_constrained_rint : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_rint : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_nearbyint : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_nearbyint : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
		def int_experimental_constrained_lrint : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
		def int_experimental_constrained_llrint : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty,
		llvm_metadata_ty ]>;
def int_experimental_constrained_maxnum : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_maxnum : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_minnum : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_minnum : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
LLVMMatchType<0>,		LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_ceil : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_ceil : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_floor : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_floor : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
		def int_experimental_constrained_lround : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;
		def int_experimental_constrained_llround : Intrinsic<[ llvm_anyint_ty ],
		[ llvm_anyfloat_ty,
		llvm_metadata_ty ]>;
def int_experimental_constrained_round : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_round : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;
def int_experimental_constrained_trunc : Intrinsic<[ llvm_anyfloat_ty ],		def int_experimental_constrained_trunc : Intrinsic<[ llvm_anyfloat_ty ],
[ LLVMMatchType<0>,		[ LLVMMatchType<0>,
llvm_metadata_ty,		llvm_metadata_ty,
llvm_metadata_ty ]>;		llvm_metadata_ty ]>;

include/llvm/Target/TargetSelectionDAG.td

def strict_fexp2 : SDNode<"ISD::STRICT_FEXP2",		def strict_fexp2 : SDNode<"ISD::STRICT_FEXP2",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fpow : SDNode<"ISD::STRICT_FPOW",		def strict_fpow : SDNode<"ISD::STRICT_FPOW",
SDTFPBinOp, [SDNPHasChain]>;		SDTFPBinOp, [SDNPHasChain]>;
def strict_flog2 : SDNode<"ISD::STRICT_FLOG2",		def strict_flog2 : SDNode<"ISD::STRICT_FLOG2",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
def strict_frint : SDNode<"ISD::STRICT_FRINT",		def strict_frint : SDNode<"ISD::STRICT_FRINT",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
		def strict_lrint : SDNode<"ISD::STRICT_LRINT",
		SDTFPToIntOp, [SDNPHasChain]>;
		def strict_llrint : SDNode<"ISD::STRICT_LLRINT",
		SDTFPToIntOp, [SDNPHasChain]>;
def strict_fnearbyint : SDNode<"ISD::STRICT_FNEARBYINT",		def strict_fnearbyint : SDNode<"ISD::STRICT_FNEARBYINT",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fceil : SDNode<"ISD::STRICT_FCEIL",		def strict_fceil : SDNode<"ISD::STRICT_FCEIL",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
def strict_ffloor : SDNode<"ISD::STRICT_FFLOOR",		def strict_ffloor : SDNode<"ISD::STRICT_FFLOOR",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
		def strict_lround : SDNode<"ISD::STRICT_LROUND",
		SDTFPToIntOp, [SDNPHasChain]>;
		def strict_llround : SDNode<"ISD::STRICT_LLROUND",
		SDTFPToIntOp, [SDNPHasChain]>;
def strict_fround : SDNode<"ISD::STRICT_FROUND",		def strict_fround : SDNode<"ISD::STRICT_FROUND",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
def strict_ftrunc : SDNode<"ISD::STRICT_FTRUNC",		def strict_ftrunc : SDNode<"ISD::STRICT_FTRUNC",
SDTFPUnaryOp, [SDNPHasChain]>;		SDTFPUnaryOp, [SDNPHasChain]>;
def strict_fminnum : SDNode<"ISD::STRICT_FMINNUM",		def strict_fminnum : SDNode<"ISD::STRICT_FMINNUM",
SDTFPBinOp, [SDNPHasChain,		SDTFPBinOp, [SDNPHasChain,
SDNPCommutative, SDNPAssociative]>;		SDNPCommutative, SDNPAssociative]>;
def strict_fmaxnum : SDNode<"ISD::STRICT_FMAXNUM",		def strict_fmaxnum : SDNode<"ISD::STRICT_FMAXNUM",
▲ Show 20 Lines • Show All 756 Lines • ▼ Show 20 Lines	def any_fpow : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fpow node:$lhs, node:$rhs),		[(strict_fpow node:$lhs, node:$rhs),
(fpow node:$lhs, node:$rhs)]>;		(fpow node:$lhs, node:$rhs)]>;
def any_flog2 : PatFrags<(ops node:$src),		def any_flog2 : PatFrags<(ops node:$src),
[(strict_flog2 node:$src),		[(strict_flog2 node:$src),
(flog2 node:$src)]>;		(flog2 node:$src)]>;
def any_frint : PatFrags<(ops node:$src),		def any_frint : PatFrags<(ops node:$src),
[(strict_frint node:$src),		[(strict_frint node:$src),
(frint node:$src)]>;		(frint node:$src)]>;
		def any_lrint : PatFrags<(ops node:$src),
		[(strict_lrint node:$src),
		(lrint node:$src)]>;
		def any_llrint : PatFrags<(ops node:$src),
		[(strict_llrint node:$src),
		(llrint node:$src)]>;
def any_fnearbyint : PatFrags<(ops node:$src),		def any_fnearbyint : PatFrags<(ops node:$src),
[(strict_fnearbyint node:$src),		[(strict_fnearbyint node:$src),
(fnearbyint node:$src)]>;		(fnearbyint node:$src)]>;
def any_fceil : PatFrags<(ops node:$src),		def any_fceil : PatFrags<(ops node:$src),
[(strict_fceil node:$src),		[(strict_fceil node:$src),
(fceil node:$src)]>;		(fceil node:$src)]>;
def any_ffloor : PatFrags<(ops node:$src),		def any_ffloor : PatFrags<(ops node:$src),
[(strict_ffloor node:$src),		[(strict_ffloor node:$src),
(ffloor node:$src)]>;		(ffloor node:$src)]>;
		def any_lround : PatFrags<(ops node:$src),
		[(strict_lround node:$src),
		(lround node:$src)]>;
		def any_llround : PatFrags<(ops node:$src),
		[(strict_llround node:$src),
		(llround node:$src)]>;
def any_fround : PatFrags<(ops node:$src),		def any_fround : PatFrags<(ops node:$src),
[(strict_fround node:$src),		[(strict_fround node:$src),
(fround node:$src)]>;		(fround node:$src)]>;
def any_ftrunc : PatFrags<(ops node:$src),		def any_ftrunc : PatFrags<(ops node:$src),
[(strict_ftrunc node:$src),		[(strict_ftrunc node:$src),
(ftrunc node:$src)]>;		(ftrunc node:$src)]>;
def any_fmaxnum : PatFrags<(ops node:$lhs, node:$rhs),		def any_fmaxnum : PatFrags<(ops node:$lhs, node:$rhs),
[(strict_fmaxnum node:$lhs, node:$rhs),		[(strict_fmaxnum node:$lhs, node:$rhs),

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Show First 20 Lines • Show All 199 Lines • ▼ Show 20 Lines	void ReplacedNode(SDNode *N) {
if (UpdatedNodes)		if (UpdatedNodes)
UpdatedNodes->insert(N);		UpdatedNodes->insert(N);
}		}

void ReplaceNode(SDNode *Old, SDNode *New) {		void ReplaceNode(SDNode *Old, SDNode *New) {
LLVM_DEBUG(dbgs() << " ... replacing: "; Old->dump(&DAG);		LLVM_DEBUG(dbgs() << " ... replacing: "; Old->dump(&DAG);
dbgs() << " with: "; New->dump(&DAG));		dbgs() << " with: "; New->dump(&DAG));

assert(Old->getNumValues() == New->getNumValues() &&		assert(Old->getNumValues() <= New->getNumValues() &&
		andrew.w.kaylor (Not Done): Why are you changing this assertion?
		kpn (Author, Done): Because it is stronger than SelectionDAG::ReplaceAllUsesWith(SDNode *From, SDNode *To) needs it to be (unless I'm wrong), and because without this change the test case cannot pass. The conversion to a libcall results in a node with a result value, a chain, and glue. There was no glue before, so the added result would trigger the assert.
"Replacing one node with another that produces a different number "		"Replacing one node with another that produces a different number "
"of values!");		"of values!");
DAG.ReplaceAllUsesWith(Old, New);		DAG.ReplaceAllUsesWith(Old, New);
if (UpdatedNodes)		if (UpdatedNodes)
UpdatedNodes->insert(New);		UpdatedNodes->insert(New);
ReplacedNode(Old);		ReplacedNode(Old);
}		}
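
kpn's counting argument for relaxing the assertion can be illustrated with a toy model: every value of the old node needs a same-numbered value on the new node, so the new node may have more values but not fewer. ToyNode, ToyUse, canReplace and replaceAllUses are hypothetical names and this is not LLVM's SelectionDAG API. Whether the relaxed check is actually safe in LLVM is the open question in this thread; the sketch only shows the counting.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a DAG node: only the number of results matters.
struct ToyNode {
  unsigned NumValues;
};

// Hypothetical stand-in for a use of result number ResNo of a node.
struct ToyUse {
  const ToyNode *Node;
  unsigned ResNo;
};

// The relaxed precondition: New must provide at least as many values as Old.
bool canReplace(const ToyNode &Old, const ToyNode &New) {
  return Old.NumValues <= New.NumValues;
}

// Redirect every use of Old to the same-numbered result of New.
void replaceAllUses(std::vector<ToyUse> &Uses, const ToyNode &Old,
                    const ToyNode &New) {
  assert(canReplace(Old, New) &&
         "Replacing a node with one that produces fewer values!");
  for (ToyUse &U : Uses)
    if (U.Node == &Old)
      U.Node = &New; // ResNo is unchanged, so it must exist on New.
}
```

In the case described in the comment, the strict node has two results (value and chain) and the libcall node has three (value, chain and glue), so the stricter equality check fires while the relaxed one does not.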

▲ Show 20 Lines • Show All 908 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
// These pseudo-ops get legalized as if they were their non-strict		// These pseudo-ops get legalized as if they were their non-strict
// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT		// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
// is also legal, but if ISD::FSQRT requires expansion then so does		// is also legal, but if ISD::FSQRT requires expansion then so does
// ISD::STRICT_FSQRT.		// ISD::STRICT_FSQRT.
Action = TLI.getStrictFPOperationAction(Node->getOpcode(),		Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
break;		break;
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
		// These pseudo-ops are the same as the other STRICT_ ops except
		// they are registered with setOperationAction() using the input type
		// instead of the output type.
		Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
		Node->getOperand(1).getValueType());
		break;
case ISD::SADDSAT:		case ISD::SADDSAT:
case ISD::UADDSAT:		case ISD::UADDSAT:
case ISD::SSUBSAT:		case ISD::SSUBSAT:
case ISD::USUBSAT: {		case ISD::USUBSAT: {
Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));		Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
break;		break;
}		}
case ISD::SMULFIX:		case ISD::SMULFIX:
▲ Show 20 Lines • Show All 881 Lines • ▼ Show 20 Lines
}		}

// Expand a node into a call to a libcall. If the result value		// Expand a node into a call to a libcall. If the result value
// does not fit into a register, return the lo part and set the hi part to the		// does not fit into a register, return the lo part and set the hi part to the
// by-reg argument. If it does fit into a single register, return the result		// by-reg argument. If it does fit into a single register, return the result
// and leave the Hi part unset.		// and leave the Hi part unset.
SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, SDNode *Node,		SDValue SelectionDAGLegalize::ExpandLibCall(RTLIB::Libcall LC, SDNode *Node,
bool isSigned) {		bool isSigned) {
		SDValue CurInChain;
TargetLowering::ArgListTy Args;		TargetLowering::ArgListTy Args;
TargetLowering::ArgListEntry Entry;		TargetLowering::ArgListEntry Entry;
for (const SDValue &Op : Node->op_values()) {		for (const SDValue &Op : Node->op_values()) {
EVT ArgVT = Op.getValueType();		EVT ArgVT = Op.getValueType();
		if (ArgVT.isSimple() && ArgVT.getSimpleVT() == MVT::Other) {
		andrew.w.kaylor (Not Done): It's not clear to me why this is necessary now but hasn't been before. Since we're expanding to a library call, I don't think we need to preserve the chain that the strict FP node was using.
		kpn (Author, Done): So we don't need the chain to preserve ordering since it's a function call? That makes things simpler. Here, at least. I don't know what happens in ReplaceNode when a Chain result gets swapped out with a Glue result.
		CurInChain = Op;
		continue;
		}
Type *ArgTy = ArgVT.getTypeForEVT(*DAG.getContext());		Type *ArgTy = ArgVT.getTypeForEVT(*DAG.getContext());
Entry.Node = Op;		Entry.Node = Op;
Entry.Ty = ArgTy;		Entry.Ty = ArgTy;
Entry.IsSExt = TLI.shouldSignExtendTypeInLibCall(ArgVT, isSigned);		Entry.IsSExt = TLI.shouldSignExtendTypeInLibCall(ArgVT, isSigned);
Entry.IsZExt = !TLI.shouldSignExtendTypeInLibCall(ArgVT, isSigned);		Entry.IsZExt = !TLI.shouldSignExtendTypeInLibCall(ArgVT, isSigned);
Args.push_back(Entry);		Args.push_back(Entry);
}		}
SDValue Callee = DAG.getExternalSymbol(TLI.getLibcallName(LC),		SDValue Callee = DAG.getExternalSymbol(TLI.getLibcallName(LC),
TLI.getPointerTy(DAG.getDataLayout()));		TLI.getPointerTy(DAG.getDataLayout()));

EVT RetVT = Node->getValueType(0);		EVT RetVT = Node->getValueType(0);
Type *RetTy = RetVT.getTypeForEVT(*DAG.getContext());		Type *RetTy = RetVT.getTypeForEVT(*DAG.getContext());

// By default, the input chain to this libcall is the entry node of the		// By default, the input chain to this libcall is the entry node of the
		andrew.w.kaylor (Not Done): If these changes do need to stay, you'll need to update this comment.
		kpn (Author, Done): I'd love to see a test case for this. You'll notice the assert() I added below because I don't have one. That would make it much easier to update the comment. It does need to be updated.
// function. If the libcall is going to be emitted as a tail call then		// function. If the libcall is going to be emitted as a tail call then
// TLI.isUsedByReturnOnly will change it to the right chain if the return		// TLI.isUsedByReturnOnly will change it to the right chain if the return
// node which is being folded has a non-entry input chain.		// node which is being folded has a non-entry input chain.
SDValue InChain = DAG.getEntryNode();		SDValue InChain;
		if (Node->isStrictFPOpcode())
		InChain = CurInChain;
		else
		InChain = DAG.getEntryNode();

// isTailCall may be true since the callee does not reference caller stack		// isTailCall may be true since the callee does not reference caller stack
// frame. Check if it's in the right position and that the return types match.		// frame. Check if it's in the right position and that the return types match.
SDValue TCChain = InChain;		SDValue TCChain = InChain;
const Function &F = DAG.getMachineFunction().getFunction();		const Function &F = DAG.getMachineFunction().getFunction();
bool isTailCall =		bool isTailCall =
TLI.isInTailCallPosition(DAG, Node, TCChain) &&		TLI.isInTailCallPosition(DAG, Node, TCChain) &&
(RetTy == F.getReturnType() || F.getReturnType()->isVoidTy());		(RetTy == F.getReturnType() || F.getReturnType()->isVoidTy());
if (isTailCall)		if (isTailCall)
InChain = TCChain;		InChain = TCChain;
		assert(!(isTailCall && Node->isStrictFPOpcode()) &&
		"Constrained FP tail calls are untested.");

TargetLowering::CallLoweringInfo CLI(DAG);		TargetLowering::CallLoweringInfo CLI(DAG);
bool signExtend = TLI.shouldSignExtendTypeInLibCall(RetVT, isSigned);		bool signExtend = TLI.shouldSignExtendTypeInLibCall(RetVT, isSigned);
CLI.setDebugLoc(SDLoc(Node))		CLI.setDebugLoc(SDLoc(Node))
.setChain(InChain)		.setChain(InChain)
.setLibCallee(TLI.getLibcallCallingConv(LC), RetTy, Callee,		.setLibCallee(TLI.getLibcallCallingConv(LC), RetTy, Callee,
std::move(Args))		std::move(Args))
.setTailCall(isTailCall)		.setTailCall(isTailCall)
▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines
/// lround and its variant).		/// lround and its variant).
SDValue SelectionDAGLegalize::ExpandArgFPLibCall(SDNode* Node,		SDValue SelectionDAGLegalize::ExpandArgFPLibCall(SDNode* Node,
RTLIB::Libcall Call_F32,		RTLIB::Libcall Call_F32,
RTLIB::Libcall Call_F64,		RTLIB::Libcall Call_F64,
RTLIB::Libcall Call_F80,		RTLIB::Libcall Call_F80,
RTLIB::Libcall Call_F128,		RTLIB::Libcall Call_F128,
RTLIB::Libcall Call_PPCF128) {		RTLIB::Libcall Call_PPCF128) {
RTLIB::Libcall LC;		RTLIB::Libcall LC;
switch (Node->getOperand(0).getValueType().getSimpleVT().SimpleTy) {		unsigned OpNum = Node->isStrictFPOpcode() ? 1 : 0;
		switch (Node->getOperand(OpNum).getValueType().getSimpleVT().SimpleTy) {
default: llvm_unreachable("Unexpected request for libcall!");		default: llvm_unreachable("Unexpected request for libcall!");
case MVT::f32: LC = Call_F32; break;		case MVT::f32: LC = Call_F32; break;
case MVT::f64: LC = Call_F64; break;		case MVT::f64: LC = Call_F64; break;
case MVT::f80: LC = Call_F80; break;		case MVT::f80: LC = Call_F80; break;
case MVT::f128: LC = Call_F128; break;		case MVT::f128: LC = Call_F128; break;
case MVT::ppcf128: LC = Call_PPCF128; break;		case MVT::ppcf128: LC = Call_PPCF128; break;
}		}
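
The OpNum selection above reflects the operand layout of strict FP nodes: the chain occupies operand 0, so the floating-point input moves to operand 1. A small toy model of that indexing is sketched here; ToyNode and the string type tags are hypothetical and only stand in for SDNode operands and their value types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a DAG node. Operand types are plain strings and
// "chain" marks the chain operand that strict nodes carry first.
struct ToyNode {
  bool IsStrict;
  std::vector<std::string> OperandTypes;
};

// Strict nodes keep the chain in operand 0, so the FP input is operand 1.
unsigned fpOperandIndex(const ToyNode &N) { return N.IsStrict ? 1u : 0u; }

// Return the type tag of the FP input, which is what selects the libcall.
std::string fpOperandType(const ToyNode &N) {
  const unsigned OpNum = fpOperandIndex(N);
  assert(OpNum < N.OperandTypes.size() && "node has no FP operand");
  return N.OperandTypes[OpNum];
}
```

With this layout a strict node with operands {"chain", "f64"} and a non-strict node with operands {"f64"} both resolve to the f64 variant of the library call.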

▲ Show 20 Lines • Show All 726 Lines • ▼ Show 20 Lines	if (TLI.expandFP_TO_UINT(Node, Tmp1, DAG))
Results.push_back(Tmp1);		Results.push_back(Tmp1);
break;		break;
case ISD::LROUND:		case ISD::LROUND:
Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LROUND_F32,		Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LROUND_F32,
RTLIB::LROUND_F64, RTLIB::LROUND_F80,		RTLIB::LROUND_F64, RTLIB::LROUND_F80,
RTLIB::LROUND_F128,		RTLIB::LROUND_F128,
RTLIB::LROUND_PPCF128));		RTLIB::LROUND_PPCF128));
break;		break;
		case ISD::STRICT_LROUND:
		Tmp1 = ExpandArgFPLibCall(Node, RTLIB::LROUND_F32,
		RTLIB::LROUND_F64, RTLIB::LROUND_F80,
		RTLIB::LROUND_F128,
		RTLIB::LROUND_PPCF128);
		andrew.w.kaylor (Not Done): Shouldn't something be getting pushed to Results here?
		kpn (Author, Done): No. If we go that route then we end up in ReplaceNode(SDNode *Old, const SDValue *New) with multiple SDValues to replace but only one of them being given. This results in ReplaceNode() running off into some other memory and typically crashing. We need to instead call ReplaceNode(SDNode *Old, SDNode *New) so all values that need to be looked at are present. That's why I call ReplaceNode() myself here, and since we're bypassing the rest of the function I mostly-duplicated the debug message and the return statement.
		ReplaceNode(Node, Tmp1.getNode());
		LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_LROUND node\n");
		return true;
		break;
case ISD::LLROUND:		case ISD::LLROUND:
Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32,		Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32,
RTLIB::LLROUND_F64, RTLIB::LLROUND_F80,		RTLIB::LLROUND_F64, RTLIB::LLROUND_F80,
RTLIB::LLROUND_F128,		RTLIB::LLROUND_F128,
RTLIB::LLROUND_PPCF128));		RTLIB::LLROUND_PPCF128));
break;		break;
		case ISD::STRICT_LLROUND:
		Tmp1 = ExpandArgFPLibCall(Node, RTLIB::LLROUND_F32,
		RTLIB::LLROUND_F64, RTLIB::LLROUND_F80,
		RTLIB::LLROUND_F128,
		RTLIB::LLROUND_PPCF128);
		ReplaceNode(Node, Tmp1.getNode());
		LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_LLROUND node\n");
		return true;
		break;
case ISD::LRINT:		case ISD::LRINT:
Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LRINT_F32,		Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LRINT_F32,
RTLIB::LRINT_F64, RTLIB::LRINT_F80,		RTLIB::LRINT_F64, RTLIB::LRINT_F80,
RTLIB::LRINT_F128,		RTLIB::LRINT_F128,
RTLIB::LRINT_PPCF128));		RTLIB::LRINT_PPCF128));
break;		break;
		case ISD::STRICT_LRINT:
		Tmp1 = ExpandArgFPLibCall(Node, RTLIB::LRINT_F32,
		RTLIB::LRINT_F64, RTLIB::LRINT_F80,
		RTLIB::LRINT_F128,
		RTLIB::LRINT_PPCF128);
		ReplaceNode(Node, Tmp1.getNode());
		LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_LRINT node\n");
		return true;
		break;
case ISD::LLRINT:		case ISD::LLRINT:
Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLRINT_F32,		Results.push_back(ExpandArgFPLibCall(Node, RTLIB::LLRINT_F32,
RTLIB::LLRINT_F64, RTLIB::LLRINT_F80,		RTLIB::LLRINT_F64, RTLIB::LLRINT_F80,
RTLIB::LLRINT_F128,		RTLIB::LLRINT_F128,
RTLIB::LLRINT_PPCF128));		RTLIB::LLRINT_PPCF128));
break;		break;
		case ISD::STRICT_LLRINT:
		Tmp1 = ExpandArgFPLibCall(Node, RTLIB::LLRINT_F32,
		RTLIB::LLRINT_F64, RTLIB::LLRINT_F80,
		RTLIB::LLRINT_F128,
		RTLIB::LLRINT_PPCF128);
		ReplaceNode(Node, Tmp1.getNode());
		LLVM_DEBUG(dbgs() << "Successfully expanded STRICT_LLRINT node\n");
		return true;
		break;
case ISD::VAARG:		case ISD::VAARG:
Results.push_back(DAG.expandVAArg(Node));		Results.push_back(DAG.expandVAArg(Node));
Results.push_back(Results[0].getValue(1));		Results.push_back(Results[0].getValue(1));
break;		break;
case ISD::VACOPY:		case ISD::VACOPY:
Results.push_back(DAG.expandVACopy(Node));		Results.push_back(DAG.expandVACopy(Node));
break;		break;
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
▲ Show 20 Lines • Show All 1,620 Lines • ▼ Show 20 Lines	for (auto NI = allnodes_end(); NI != allnodes_begin();) {

SDNode *N = &*NI;		SDNode *N = &*NI;
if (N->use_empty() && N != getRoot().getNode()) {		if (N->use_empty() && N != getRoot().getNode()) {
++NI;		++NI;
DeleteNode(N);		DeleteNode(N);
continue;		continue;
}		}

		LLVM_DEBUG(dbgs() << "\nExamining: "; N->dump(this));
if (LegalizedNodes.insert(N).second) {		if (LegalizedNodes.insert(N).second) {
AnyLegalized = true;		AnyLegalized = true;
Legalizer.LegalizeOp(N);		Legalizer.LegalizeOp(N);

if (N->use_empty() && N != getRoot().getNode()) {		if (N->use_empty() && N != getRoot().getNode()) {
++NI;		++NI;
DeleteNode(N);		DeleteNode(N);
}		}

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

Show First 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	#endif

case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;		case ISD::ANY_EXTEND: Res = PromoteIntRes_INT_EXTEND(N); break;

case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;		case ISD::FP_TO_UINT: Res = PromoteIntRes_FP_TO_XINT(N); break;

		case ISD::STRICT_LRINT:
		craig.topper (Not Done): Why do these need to be handled here, but the non-strict versions aren't?
		case ISD::STRICT_LLRINT:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND: Res = PromoteIntRes_CHAINED(N); break;

case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;		case ISD::FP_TO_FP16: Res = PromoteIntRes_FP_TO_FP16(N); break;

case ISD::FLT_ROUNDS_: Res = PromoteIntRes_FLT_ROUNDS(N); break;		case ISD::FLT_ROUNDS_: Res = PromoteIntRes_FLT_ROUNDS(N); break;

case ISD::AND:		case ISD::AND:
case ISD::OR:		case ISD::OR:
case ISD::XOR:		case ISD::XOR:
case ISD::ADD:		case ISD::ADD:
▲ Show 20 Lines • Show All 384 Lines • ▼ Show 20 Lines

SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_FP_TO_FP16(SDNode *N) {
EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDLoc dl(N);		SDLoc dl(N);

return DAG.getNode(N->getOpcode(), dl, NVT, N->getOperand(0));		return DAG.getNode(N->getOpcode(), dl, NVT, N->getOperand(0));
}		}

		SDValue DAGTypeLegalizer::PromoteIntRes_CHAINED(SDNode *N) {
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
		unsigned NewOpc = N->getOpcode();
		SmallVector<SDValue, 4> Opers;
		SDLoc dl(N);

		for (unsigned i = 0; i < N->getNumOperands(); ++i)
		Opers.push_back(N->getOperand(i));

		SDValue Result = DAG.getNode(NewOpc, dl, { NVT, MVT::Other }, Opers);

		// Legalize the chain result - switch anything that used the old chain to
		// use the new one.
		ReplaceValueWith(SDValue(N, 1), Result.getValue(1));
		return Result;
		}

SDValue DAGTypeLegalizer::PromoteIntRes_FLT_ROUNDS(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_FLT_ROUNDS(SDNode *N) {
EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
SDLoc dl(N);		SDLoc dl(N);

return DAG.getNode(N->getOpcode(), dl, NVT);		return DAG.getNode(N->getOpcode(), dl, NVT);
}		}

SDValue DAGTypeLegalizer::PromoteIntRes_INT_EXTEND(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_INT_EXTEND(SDNode *N) {
▲ Show 20 Lines • Show All 3,656 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/LegalizeTypes.h

Show First 20 Lines • Show All 314 Lines • ▼ Show 20 Lines	private:
SDValue PromoteIntRes_BUILD_PAIR(SDNode *N);		SDValue PromoteIntRes_BUILD_PAIR(SDNode *N);
SDValue PromoteIntRes_Constant(SDNode *N);		SDValue PromoteIntRes_Constant(SDNode *N);
SDValue PromoteIntRes_CTLZ(SDNode *N);		SDValue PromoteIntRes_CTLZ(SDNode *N);
SDValue PromoteIntRes_CTPOP(SDNode *N);		SDValue PromoteIntRes_CTPOP(SDNode *N);
SDValue PromoteIntRes_CTTZ(SDNode *N);		SDValue PromoteIntRes_CTTZ(SDNode *N);
SDValue PromoteIntRes_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue PromoteIntRes_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue PromoteIntRes_FP_TO_XINT(SDNode *N);		SDValue PromoteIntRes_FP_TO_XINT(SDNode *N);
SDValue PromoteIntRes_FP_TO_FP16(SDNode *N);		SDValue PromoteIntRes_FP_TO_FP16(SDNode *N);
		SDValue PromoteIntRes_CHAINED(SDNode *N);
SDValue PromoteIntRes_INT_EXTEND(SDNode *N);		SDValue PromoteIntRes_INT_EXTEND(SDNode *N);
SDValue PromoteIntRes_LOAD(LoadSDNode *N);		SDValue PromoteIntRes_LOAD(LoadSDNode *N);
SDValue PromoteIntRes_MLOAD(MaskedLoadSDNode *N);		SDValue PromoteIntRes_MLOAD(MaskedLoadSDNode *N);
SDValue PromoteIntRes_MGATHER(MaskedGatherSDNode *N);		SDValue PromoteIntRes_MGATHER(MaskedGatherSDNode *N);
SDValue PromoteIntRes_Overflow(SDNode *N);		SDValue PromoteIntRes_Overflow(SDNode *N);
SDValue PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_SADDSUBO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_SELECT(SDNode *N);		SDValue PromoteIntRes_SELECT(SDNode *N);
SDValue PromoteIntRes_VSELECT(SDNode *N);		SDValue PromoteIntRes_VSELECT(SDNode *N);
▲ Show 20 Lines • Show All 649 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

Show First 20 Lines • Show All 327 Lines • ▼ Show 20 Lines	SDValue VectorLegalizer::LegalizeOp(SDValue Op) {
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
// These pseudo-ops get legalized as if they were their non-strict		// These pseudo-ops get legalized as if they were their non-strict
// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT		// equivalent. For instance, if ISD::FSQRT is legal then ISD::STRICT_FSQRT
// is also legal, but if ISD::FSQRT requires expansion then so does		// is also legal, but if ISD::FSQRT requires expansion then so does
// ISD::STRICT_FSQRT.		// ISD::STRICT_FSQRT.
Action = TLI.getStrictFPOperationAction(Node->getOpcode(),		Action = TLI.getStrictFPOperationAction(Node->getOpcode(),
Node->getValueType(0));		Node->getValueType(0));
▲ Show 20 Lines • Show All 495 Lines • ▼ Show 20 Lines	SDValue VectorLegalizer::Expand(SDValue Op) {
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
return ExpandStrictFPOp(Op);		return ExpandStrictFPOp(Op);
case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
case ISD::VECREDUCE_SMAX:		case ISD::VECREDUCE_SMAX:
case ISD::VECREDUCE_SMIN:		case ISD::VECREDUCE_SMIN:
▲ Show 20 Lines • Show All 566 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Show First 20 Lines • Show All 165 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_LROUND:
		craig.topper (Done): We don't handle the non-vector version of these here. So I don't think we need to handle these yet.
		case ISD::STRICT_LLROUND:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
R = ScalarizeVecRes_StrictFPOp(N);		R = ScalarizeVecRes_StrictFPOp(N);
break;		break;
case ISD::UADDO:		case ISD::UADDO:
case ISD::SADDO:		case ISD::SADDO:
case ISD::USUBO:		case ISD::USUBO:
case ISD::SSUBO:		case ISD::SSUBO:
case ISD::UMULO:		case ISD::UMULO:
▲ Show 20 Lines • Show All 777 Lines • ▼ Show 20 Lines	#endif
case ISD::STRICT_FRINT:		case ISD::STRICT_FRINT:
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
SplitVecRes_StrictFPOp(N, Lo, Hi);		SplitVecRes_StrictFPOp(N, Lo, Hi);
break;		break;
case ISD::UADDO:		case ISD::UADDO:
case ISD::SADDO:		case ISD::SADDO:
case ISD::USUBO:		case ISD::USUBO:
case ISD::SSUBO:		case ISD::SSUBO:
case ISD::UMULO:		case ISD::UMULO:
case ISD::SMULO:		case ISD::SMULO:
▲ Show 20 Lines • Show All 1,009 Lines • ▼ Show 20 Lines	#endif
case ISD::CTPOP:		case ISD::CTPOP:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::SIGN_EXTEND:		case ISD::SIGN_EXTEND:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
case ISD::ANY_EXTEND:		case ISD::ANY_EXTEND:
case ISD::FTRUNC:		case ISD::FTRUNC:
case ISD::FCANONICALIZE:		case ISD::FCANONICALIZE:
		case ISD::LRINT:
		case ISD::STRICT_LRINT:
		case ISD::LLRINT:
		case ISD::STRICT_LLRINT:
		case ISD::LROUND:
		case ISD::STRICT_LROUND:
		case ISD::LLROUND:
		case ISD::STRICT_LLROUND:
Res = SplitVecOp_UnaryOp(N);		Res = SplitVecOp_UnaryOp(N);
break;		break;

case ISD::ANY_EXTEND_VECTOR_INREG:		case ISD::ANY_EXTEND_VECTOR_INREG:
case ISD::SIGN_EXTEND_VECTOR_INREG:		case ISD::SIGN_EXTEND_VECTOR_INREG:
case ISD::ZERO_EXTEND_VECTOR_INREG:		case ISD::ZERO_EXTEND_VECTOR_INREG:
Res = SplitVecOp_ExtVecInRegOp(N);		Res = SplitVecOp_ExtVecInRegOp(N);
break;		break;
▲ Show 20 Lines • Show All 785 Lines • ▼ Show 20 Lines	#endif
case ISD::TRUNCATE:		case ISD::TRUNCATE:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::ZERO_EXTEND:		case ISD::ZERO_EXTEND:
Res = WidenVecRes_Convert(N);		Res = WidenVecRes_Convert(N);
break;		break;

case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::STRICT_FP_ROUND:		case ISD::STRICT_FP_ROUND:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
Res = WidenVecRes_Convert_StrictFP(N);		Res = WidenVecRes_Convert_StrictFP(N);
break;		break;

case ISD::FABS:		case ISD::FABS:
case ISD::FCEIL:		case ISD::FCEIL:
case ISD::FCOS:		case ISD::FCOS:
case ISD::FEXP:		case ISD::FEXP:
case ISD::FEXP2:		case ISD::FEXP2:
▲ Show 20 Lines • Show All 1,292 Lines • ▼ Show 20 Lines	#endif

case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::STRICT_FP_EXTEND:		case ISD::STRICT_FP_EXTEND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
		case ISD::STRICT_LRINT:
		case ISD::STRICT_LLRINT:
		case ISD::STRICT_LROUND:
		case ISD::STRICT_LLROUND:
Res = WidenVecOp_Convert(N);		Res = WidenVecOp_Convert(N);
break;		break;

case ISD::VECREDUCE_FADD:		case ISD::VECREDUCE_FADD:
case ISD::VECREDUCE_FMUL:		case ISD::VECREDUCE_FMUL:
case ISD::VECREDUCE_ADD:		case ISD::VECREDUCE_ADD:
case ISD::VECREDUCE_MUL:		case ISD::VECREDUCE_MUL:
case ISD::VECREDUCE_AND:		case ISD::VECREDUCE_AND:
▲ Show 20 Lines • Show All 941 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Show First 20 Lines • Show All 7,760 Lines • ▼ Show 20 Lines	SDNode* SelectionDAG::mutateStrictFPToFP(SDNode *Node) {
case ISD::STRICT_FPOWI: NewOpc = ISD::FPOWI; break;		case ISD::STRICT_FPOWI: NewOpc = ISD::FPOWI; break;
case ISD::STRICT_FSIN: NewOpc = ISD::FSIN; break;		case ISD::STRICT_FSIN: NewOpc = ISD::FSIN; break;
case ISD::STRICT_FCOS: NewOpc = ISD::FCOS; break;		case ISD::STRICT_FCOS: NewOpc = ISD::FCOS; break;
case ISD::STRICT_FEXP: NewOpc = ISD::FEXP; break;		case ISD::STRICT_FEXP: NewOpc = ISD::FEXP; break;
case ISD::STRICT_FEXP2: NewOpc = ISD::FEXP2; break;		case ISD::STRICT_FEXP2: NewOpc = ISD::FEXP2; break;
case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; break;		case ISD::STRICT_FLOG: NewOpc = ISD::FLOG; break;
case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; break;		case ISD::STRICT_FLOG10: NewOpc = ISD::FLOG10; break;
case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; break;		case ISD::STRICT_FLOG2: NewOpc = ISD::FLOG2; break;
		case ISD::STRICT_LRINT: NewOpc = ISD::LRINT; break;
		case ISD::STRICT_LLRINT: NewOpc = ISD::LLRINT; break;
case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; break;		case ISD::STRICT_FRINT: NewOpc = ISD::FRINT; break;
case ISD::STRICT_FNEARBYINT: NewOpc = ISD::FNEARBYINT; break;		case ISD::STRICT_FNEARBYINT: NewOpc = ISD::FNEARBYINT; break;
case ISD::STRICT_FMAXNUM: NewOpc = ISD::FMAXNUM; break;		case ISD::STRICT_FMAXNUM: NewOpc = ISD::FMAXNUM; break;
case ISD::STRICT_FMINNUM: NewOpc = ISD::FMINNUM; break;		case ISD::STRICT_FMINNUM: NewOpc = ISD::FMINNUM; break;
case ISD::STRICT_FCEIL: NewOpc = ISD::FCEIL; break;		case ISD::STRICT_FCEIL: NewOpc = ISD::FCEIL; break;
case ISD::STRICT_FFLOOR: NewOpc = ISD::FFLOOR; break;		case ISD::STRICT_FFLOOR: NewOpc = ISD::FFLOOR; break;
		case ISD::STRICT_LROUND: NewOpc = ISD::LROUND; break;
		case ISD::STRICT_LLROUND: NewOpc = ISD::LLROUND; break;
case ISD::STRICT_FROUND: NewOpc = ISD::FROUND; break;		case ISD::STRICT_FROUND: NewOpc = ISD::FROUND; break;
case ISD::STRICT_FTRUNC: NewOpc = ISD::FTRUNC; break;		case ISD::STRICT_FTRUNC: NewOpc = ISD::FTRUNC; break;
case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; break;		case ISD::STRICT_FP_ROUND: NewOpc = ISD::FP_ROUND; break;
case ISD::STRICT_FP_EXTEND: NewOpc = ISD::FP_EXTEND; break;		case ISD::STRICT_FP_EXTEND: NewOpc = ISD::FP_EXTEND; break;
}		}

assert(Node->getNumValues() == 2 && "Unexpected number of results!");		assert(Node->getNumValues() == 2 && "Unexpected number of results!");

▲ Show 20 Lines • Show All 1,812 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

Show First 20 Lines • Show All 6,069 Lines • ▼ Show 20 Lines	void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
		case Intrinsic::experimental_constrained_lrint:
		case Intrinsic::experimental_constrained_llrint:
case Intrinsic::experimental_constrained_rint:		case Intrinsic::experimental_constrained_rint:
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
case Intrinsic::experimental_constrained_maxnum:		case Intrinsic::experimental_constrained_maxnum:
case Intrinsic::experimental_constrained_minnum:		case Intrinsic::experimental_constrained_minnum:
case Intrinsic::experimental_constrained_ceil:		case Intrinsic::experimental_constrained_ceil:
case Intrinsic::experimental_constrained_floor:		case Intrinsic::experimental_constrained_floor:
		case Intrinsic::experimental_constrained_lround:
		case Intrinsic::experimental_constrained_llround:
case Intrinsic::experimental_constrained_round:		case Intrinsic::experimental_constrained_round:
case Intrinsic::experimental_constrained_trunc:		case Intrinsic::experimental_constrained_trunc:
visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(I));		visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(I));
return;		return;
case Intrinsic::fmuladd: {		case Intrinsic::fmuladd: {
EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());		EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());
if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&		if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&
TLI.isFMAFasterThanFMulAndFAdd(VT)) {		TLI.isFMAFasterThanFMulAndFAdd(VT)) {
▲ Show 20 Lines • Show All 771 Lines • ▼ Show 20 Lines	case Intrinsic::experimental_constrained_log:
Opcode = ISD::STRICT_FLOG;		Opcode = ISD::STRICT_FLOG;
break;		break;
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
Opcode = ISD::STRICT_FLOG10;		Opcode = ISD::STRICT_FLOG10;
break;		break;
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
Opcode = ISD::STRICT_FLOG2;		Opcode = ISD::STRICT_FLOG2;
break;		break;
		case Intrinsic::experimental_constrained_lrint:
		Opcode = ISD::STRICT_LRINT;
		break;
		case Intrinsic::experimental_constrained_llrint:
		Opcode = ISD::STRICT_LLRINT;
		break;
case Intrinsic::experimental_constrained_rint:		case Intrinsic::experimental_constrained_rint:
Opcode = ISD::STRICT_FRINT;		Opcode = ISD::STRICT_FRINT;
break;		break;
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
Opcode = ISD::STRICT_FNEARBYINT;		Opcode = ISD::STRICT_FNEARBYINT;
break;		break;
case Intrinsic::experimental_constrained_maxnum:		case Intrinsic::experimental_constrained_maxnum:
Opcode = ISD::STRICT_FMAXNUM;		Opcode = ISD::STRICT_FMAXNUM;
break;		break;
case Intrinsic::experimental_constrained_minnum:		case Intrinsic::experimental_constrained_minnum:
Opcode = ISD::STRICT_FMINNUM;		Opcode = ISD::STRICT_FMINNUM;
break;		break;
case Intrinsic::experimental_constrained_ceil:		case Intrinsic::experimental_constrained_ceil:
Opcode = ISD::STRICT_FCEIL;		Opcode = ISD::STRICT_FCEIL;
break;		break;
case Intrinsic::experimental_constrained_floor:		case Intrinsic::experimental_constrained_floor:
Opcode = ISD::STRICT_FFLOOR;		Opcode = ISD::STRICT_FFLOOR;
break;		break;
		case Intrinsic::experimental_constrained_lround:
		Opcode = ISD::STRICT_LROUND;
		break;
		case Intrinsic::experimental_constrained_llround:
		Opcode = ISD::STRICT_LLROUND;
		break;
case Intrinsic::experimental_constrained_round:		case Intrinsic::experimental_constrained_round:
Opcode = ISD::STRICT_FROUND;		Opcode = ISD::STRICT_FROUND;
break;		break;
case Intrinsic::experimental_constrained_trunc:		case Intrinsic::experimental_constrained_trunc:
Opcode = ISD::STRICT_FTRUNC;		Opcode = ISD::STRICT_FTRUNC;
break;		break;
}		}
const TargetLowering &TLI = DAG.getTargetLoweringInfo();		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
▲ Show 20 Lines • Show All 3,549 Lines • Show Last 20 Lines

lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 325 Lines • ▼ Show 20 Lines	#endif
case ISD::UINT_TO_FP: return "uint_to_fp";		case ISD::UINT_TO_FP: return "uint_to_fp";
case ISD::FP_TO_SINT: return "fp_to_sint";		case ISD::FP_TO_SINT: return "fp_to_sint";
case ISD::FP_TO_UINT: return "fp_to_uint";		case ISD::FP_TO_UINT: return "fp_to_uint";
case ISD::BITCAST: return "bitcast";		case ISD::BITCAST: return "bitcast";
case ISD::ADDRSPACECAST: return "addrspacecast";		case ISD::ADDRSPACECAST: return "addrspacecast";
case ISD::FP16_TO_FP: return "fp16_to_fp";		case ISD::FP16_TO_FP: return "fp16_to_fp";
case ISD::FP_TO_FP16: return "fp_to_fp16";		case ISD::FP_TO_FP16: return "fp_to_fp16";
case ISD::LROUND: return "lround";		case ISD::LROUND: return "lround";
		case ISD::STRICT_LROUND: return "strict_lround";
case ISD::LLROUND: return "llround";		case ISD::LLROUND: return "llround";
		case ISD::STRICT_LLROUND: return "strict_llround";
case ISD::LRINT: return "lrint";		case ISD::LRINT: return "lrint";
		case ISD::STRICT_LRINT: return "strict_lrint";
case ISD::LLRINT: return "llrint";		case ISD::LLRINT: return "llrint";
		case ISD::STRICT_LLRINT: return "strict_llrint";

// Control flow instructions		// Control flow instructions
case ISD::BR: return "br";		case ISD::BR: return "br";
case ISD::BRIND: return "brind";		case ISD::BRIND: return "brind";
case ISD::BR_JT: return "br_jt";		case ISD::BR_JT: return "br_jt";
case ISD::BRCOND: return "brcond";		case ISD::BRCOND: return "brcond";
case ISD::BR_CC: return "br_cc";		case ISD::BR_CC: return "br_cc";
case ISD::CALLSEQ_START: return "callseq_start";		case ISD::CALLSEQ_START: return "callseq_start";
▲ Show 20 Lines • Show All 612 Lines • Show Last 20 Lines

lib/CodeGen/TargetLoweringBase.cpp

Show First 20 Lines • Show All 674 Lines • ▼ Show 20 Lines	for (MVT VT : MVT::all_valuetypes()) {
setOperationAction(ISD::STRICT_FPOWI, VT, Expand);		setOperationAction(ISD::STRICT_FPOWI, VT, Expand);
setOperationAction(ISD::STRICT_FSIN, VT, Expand);		setOperationAction(ISD::STRICT_FSIN, VT, Expand);
setOperationAction(ISD::STRICT_FCOS, VT, Expand);		setOperationAction(ISD::STRICT_FCOS, VT, Expand);
setOperationAction(ISD::STRICT_FEXP, VT, Expand);		setOperationAction(ISD::STRICT_FEXP, VT, Expand);
setOperationAction(ISD::STRICT_FEXP2, VT, Expand);		setOperationAction(ISD::STRICT_FEXP2, VT, Expand);
setOperationAction(ISD::STRICT_FLOG, VT, Expand);		setOperationAction(ISD::STRICT_FLOG, VT, Expand);
setOperationAction(ISD::STRICT_FLOG10, VT, Expand);		setOperationAction(ISD::STRICT_FLOG10, VT, Expand);
setOperationAction(ISD::STRICT_FLOG2, VT, Expand);		setOperationAction(ISD::STRICT_FLOG2, VT, Expand);
		setOperationAction(ISD::STRICT_LRINT, VT, Expand);
		setOperationAction(ISD::STRICT_LLRINT, VT, Expand);
setOperationAction(ISD::STRICT_FRINT, VT, Expand);		setOperationAction(ISD::STRICT_FRINT, VT, Expand);
setOperationAction(ISD::STRICT_FNEARBYINT, VT, Expand);		setOperationAction(ISD::STRICT_FNEARBYINT, VT, Expand);
setOperationAction(ISD::STRICT_FCEIL, VT, Expand);		setOperationAction(ISD::STRICT_FCEIL, VT, Expand);
setOperationAction(ISD::STRICT_FFLOOR, VT, Expand);		setOperationAction(ISD::STRICT_FFLOOR, VT, Expand);
		setOperationAction(ISD::STRICT_LROUND, VT, Expand);
		setOperationAction(ISD::STRICT_LLROUND, VT, Expand);
setOperationAction(ISD::STRICT_FROUND, VT, Expand);		setOperationAction(ISD::STRICT_FROUND, VT, Expand);
setOperationAction(ISD::STRICT_FTRUNC, VT, Expand);		setOperationAction(ISD::STRICT_FTRUNC, VT, Expand);
setOperationAction(ISD::STRICT_FMAXNUM, VT, Expand);		setOperationAction(ISD::STRICT_FMAXNUM, VT, Expand);
setOperationAction(ISD::STRICT_FMINNUM, VT, Expand);		setOperationAction(ISD::STRICT_FMINNUM, VT, Expand);
setOperationAction(ISD::STRICT_FP_ROUND, VT, Expand);		setOperationAction(ISD::STRICT_FP_ROUND, VT, Expand);
setOperationAction(ISD::STRICT_FP_EXTEND, VT, Expand);		setOperationAction(ISD::STRICT_FP_EXTEND, VT, Expand);

// For most targets @llvm.get.dynamic.area.offset just returns 0.		// For most targets @llvm.get.dynamic.area.offset just returns 0.
▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines	for (MVT VT : {MVT::f32, MVT::f64, MVT::f128}) {
setOperationAction(ISD::FTRUNC, VT, Expand);		setOperationAction(ISD::FTRUNC, VT, Expand);
setOperationAction(ISD::FROUND, VT, Expand);		setOperationAction(ISD::FROUND, VT, Expand);
setOperationAction(ISD::LROUND, VT, Expand);		setOperationAction(ISD::LROUND, VT, Expand);
setOperationAction(ISD::LLROUND, VT, Expand);		setOperationAction(ISD::LLROUND, VT, Expand);
setOperationAction(ISD::LRINT, VT, Expand);		setOperationAction(ISD::LRINT, VT, Expand);
setOperationAction(ISD::LLRINT, VT, Expand);		setOperationAction(ISD::LLRINT, VT, Expand);
}		}

		// These are likely to be library calls so vectors need to be unrolled.
		// All types of vector need to be marked since sometimes we check using
		// the floating point type and other times we check on the result which
		// is an integer vector.
		for (unsigned I = MVT::FIRST_VECTOR_VALUETYPE;
		craig.topper (Not Done): What places check the result? Can they be fixed?
		kpn (Author, Not Done): Hmmm, in LegalizeDAG.cpp we're checking the value in LegalizeOp around line 1153. We're checking the value in ExpandNode around 3692. If these aren't libcalls then we'll do the mutation in SelectionDAGISel.cpp, but that checks the value as well. And it comes too late for mutating if it will become a libcall. That's what I'm seeing: a failure during instruction selection because mutation to LRINT happened too late. I'm not sure, though, that they are broken. Operation actions don't seem to be defined exclusively for operands or exclusively for values. And there's oodles of code checking operands, but oodles of code checking values. Some by definition, since, for example, the Expand code must be looking at values. Anything that isn't marked will default to Legal and cause us problems. I'm not sure we want to be playing whack-a-mole forever adding exceptions here and there. This is a bit of a shotgun approach, but it does cover all the bases we need covered. I'll experiment a bit more and see if I can get this thing to behave without this block of code here.
		I <= MVT::LAST_VECTOR_VALUETYPE;
		++I) {
		MVT VT = MVT::SimpleValueType(I);
		setOperationAction(ISD::LROUND, VT, Expand);
		setOperationAction(ISD::LLROUND, VT, Expand);
		setOperationAction(ISD::LRINT, VT, Expand);
		setOperationAction(ISD::LLRINT, VT, Expand);
		}

// Default ISD::TRAP to expand (which turns it into abort).		// Default ISD::TRAP to expand (which turns it into abort).
setOperationAction(ISD::TRAP, MVT::Other, Expand);		setOperationAction(ISD::TRAP, MVT::Other, Expand);

// On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"		// On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"
// here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.		// here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.
setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);		setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);
}		}

▲ Show 20 Lines • Show All 1,179 Lines • Show Last 20 Lines
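
Note on the TargetLoweringBase.cpp thread above: kpn's point that anything left unmarked defaults to Legal can be illustrated with a small standalone model. This is only a sketch and does not use LLVM headers; `ActionTable`, `Opcode`, `ValueType`, `Action`, and `markFPVectorOnly` are hypothetical stand-ins, not LLVM's real types.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes, MVT value types, and legalize actions.
enum class Opcode { LRINT, LLRINT };
enum class ValueType { f64, i64, v2f64, v2i64 };
enum class Action { Legal, Expand };

// Toy action table: any (opcode, type) pair that was never marked is
// reported as Legal, mirroring the default described in the comment above.
class ActionTable {
  std::map<std::pair<Opcode, ValueType>, Action> Actions;

public:
  void setOperationAction(Opcode Op, ValueType VT, Action A) {
    Actions[std::make_pair(Op, VT)] = A;
  }

  Action getOperationAction(Opcode Op, ValueType VT) const {
    auto It = Actions.find(std::make_pair(Op, VT));
    return It == Actions.end() ? Action::Legal : It->second;
  }
};

// Marks only the floating-point vector type, as a narrower alternative to
// marking every vector type.
inline ActionTable markFPVectorOnly() {
  ActionTable T;
  T.setOperationAction(Opcode::LRINT, ValueType::v2f64, Action::Expand);
  return T;
}
```

In this model, a query keyed on the integer result type (`v2i64`) still reports Legal when only the floating-point operand type was marked. That is the kind of mismatch the loop over all vector types is meant to rule out.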

lib/IR/IntrinsicInst.cpp

Show First 20 Lines • Show All 193 Lines • ▼ Show 20 Lines	switch (getIntrinsicID()) {
case Intrinsic::experimental_constrained_sqrt:		case Intrinsic::experimental_constrained_sqrt:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
		case Intrinsic::experimental_constrained_lrint:
		case Intrinsic::experimental_constrained_llrint:
case Intrinsic::experimental_constrained_rint:		case Intrinsic::experimental_constrained_rint:
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
case Intrinsic::experimental_constrained_ceil:		case Intrinsic::experimental_constrained_ceil:
case Intrinsic::experimental_constrained_floor:		case Intrinsic::experimental_constrained_floor:
		case Intrinsic::experimental_constrained_lround:
		case Intrinsic::experimental_constrained_llround:
case Intrinsic::experimental_constrained_round:		case Intrinsic::experimental_constrained_round:
case Intrinsic::experimental_constrained_trunc:		case Intrinsic::experimental_constrained_trunc:
return true;		return true;
}		}
}		}

bool ConstrainedFPIntrinsic::isTernaryOp() const {		bool ConstrainedFPIntrinsic::isTernaryOp() const {
switch (getIntrinsicID()) {		switch (getIntrinsicID()) {
▲ Show 20 Lines • Show All 46 Lines • Show Last 20 Lines

lib/IR/Verifier.cpp

Show First 20 Lines • Show All 4,241 Lines • ▼ Show 20 Lines	void Verifier::visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call) {
case Intrinsic::experimental_constrained_powi:		case Intrinsic::experimental_constrained_powi:
case Intrinsic::experimental_constrained_sin:		case Intrinsic::experimental_constrained_sin:
case Intrinsic::experimental_constrained_cos:		case Intrinsic::experimental_constrained_cos:
case Intrinsic::experimental_constrained_exp:		case Intrinsic::experimental_constrained_exp:
case Intrinsic::experimental_constrained_exp2:		case Intrinsic::experimental_constrained_exp2:
case Intrinsic::experimental_constrained_log:		case Intrinsic::experimental_constrained_log:
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
		case Intrinsic::experimental_constrained_lrint:
		craig.topper (Done): Probably need to ensure these don't get used with vectors to match their non-constrained counterparts.
		case Intrinsic::experimental_constrained_llrint:
case Intrinsic::experimental_constrained_rint:		case Intrinsic::experimental_constrained_rint:
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
case Intrinsic::experimental_constrained_maxnum:		case Intrinsic::experimental_constrained_maxnum:
case Intrinsic::experimental_constrained_minnum:		case Intrinsic::experimental_constrained_minnum:
case Intrinsic::experimental_constrained_ceil:		case Intrinsic::experimental_constrained_ceil:
case Intrinsic::experimental_constrained_floor:		case Intrinsic::experimental_constrained_floor:
		case Intrinsic::experimental_constrained_lround:
		case Intrinsic::experimental_constrained_llround:
case Intrinsic::experimental_constrained_round:		case Intrinsic::experimental_constrained_round:
case Intrinsic::experimental_constrained_trunc:		case Intrinsic::experimental_constrained_trunc:
visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(Call));		visitConstrainedFPIntrinsic(cast<ConstrainedFPIntrinsic>(Call));
break;		break;
case Intrinsic::dbg_declare: // llvm.dbg.declare		case Intrinsic::dbg_declare: // llvm.dbg.declare
Assert(isa<MetadataAsValue>(Call.getArgOperand(0)),		Assert(isa<MetadataAsValue>(Call.getArgOperand(0)),
"invalid llvm.dbg.declare intrinsic call 1", Call);		"invalid llvm.dbg.declare intrinsic call 1", Call);
visitDbgIntrinsic("declare", cast<DbgVariableIntrinsic>(Call));		visitDbgIntrinsic("declare", cast<DbgVariableIntrinsic>(Call));
▲ Show 20 Lines • Show All 429 Lines • ▼ Show 20 Lines	void Verifier::visitConstrainedFPIntrinsic(ConstrainedFPIntrinsic &FPI) {
case Intrinsic::experimental_constrained_log10:		case Intrinsic::experimental_constrained_log10:
case Intrinsic::experimental_constrained_log2:		case Intrinsic::experimental_constrained_log2:
case Intrinsic::experimental_constrained_rint:		case Intrinsic::experimental_constrained_rint:
case Intrinsic::experimental_constrained_nearbyint:		case Intrinsic::experimental_constrained_nearbyint:
case Intrinsic::experimental_constrained_ceil:		case Intrinsic::experimental_constrained_ceil:
case Intrinsic::experimental_constrained_floor:		case Intrinsic::experimental_constrained_floor:
case Intrinsic::experimental_constrained_round:		case Intrinsic::experimental_constrained_round:
case Intrinsic::experimental_constrained_trunc:		case Intrinsic::experimental_constrained_trunc:
		case Intrinsic::experimental_constrained_lrint:
		case Intrinsic::experimental_constrained_llrint:
Assert((NumOperands == 3), "invalid arguments for constrained FP intrinsic",		Assert((NumOperands == 3), "invalid arguments for constrained FP intrinsic",
&FPI);		&FPI);
HasExceptionMD = true;		HasExceptionMD = true;
HasRoundingMD = true;		HasRoundingMD = true;
break;		break;

		case Intrinsic::experimental_constrained_lround:
		case Intrinsic::experimental_constrained_llround:
		Assert((NumOperands == 2), "invalid arguments for constrained FP intrinsic",
		&FPI);
		HasExceptionMD = true;
		break;

case Intrinsic::experimental_constrained_fma:		case Intrinsic::experimental_constrained_fma:
Assert((NumOperands == 5), "invalid arguments for constrained FP intrinsic",		Assert((NumOperands == 5), "invalid arguments for constrained FP intrinsic",
&FPI);		&FPI);
HasExceptionMD = true;		HasExceptionMD = true;
HasRoundingMD = true;		HasRoundingMD = true;
		craig.topper (Not Done): Should this break be inside the curly braces? I don't think I've seen the style used here anywhere else.
  break;

case Intrinsic::experimental_constrained_fadd:
case Intrinsic::experimental_constrained_fsub:
case Intrinsic::experimental_constrained_fmul:
case Intrinsic::experimental_constrained_fdiv:
case Intrinsic::experimental_constrained_frem:
case Intrinsic::experimental_constrained_pow:
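The operand counts asserted in the hunk above can be summarised in a small standalone sketch. It is not LLVM code: `ConstrainedOp`, `Expectation`, `expectationFor`, and `valueOperands` are hypothetical names introduced only for illustration, and it assumes that `HasRoundingMD` stays false for the `lround`/`llround` case because the hunk never sets it there.

```cpp
// Hypothetical stand-ins for the intrinsic IDs in the verifier hunk above.
enum class ConstrainedOp { LRint, LLRint, LRound, LLRound, FMA };

// What the verifier switch expects of a call, as read from the hunk.
struct Expectation {
  unsigned NumOperands; // value operands plus metadata operands
  bool HasRoundingMD;   // assumed false when the hunk does not set it
  bool HasExceptionMD;
};

constexpr Expectation expectationFor(ConstrainedOp Op) {
  switch (Op) {
  case ConstrainedOp::LRint:
  case ConstrainedOp::LLRint:
    return {3, true, true};  // value, rounding mode, exception behavior
  case ConstrainedOp::LRound:
  case ConstrainedOp::LLRound:
    return {2, false, true}; // value, exception behavior
  case ConstrainedOp::FMA:
    return {5, true, true};  // three values, rounding mode, exception behavior
  }
  return {0, false, false};
}

// Number of operands left once the metadata operands are subtracted.
constexpr unsigned valueOperands(ConstrainedOp Op) {
  const Expectation E = expectationFor(Op);
  return E.NumOperands - (E.HasRoundingMD ? 1u : 0u) -
         (E.HasExceptionMD ? 1u : 0u);
}
```

Read this way, the difference between the two new groups is only the rounding-mode metadata: both take a single floating-point value, which `valueOperands` returns as 1 for all four conversions.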

test/CodeGen/X86/fp-intrinsics.ll

	; COMMON: cvtss2sd
	define double @f22(float %x) {
	entry:
	%result = call double @llvm.experimental.constrained.fpext.f64.f32(float %x,
	metadata !"fpexcept.strict")
	ret double %result
	}

				; CHECK-LABEL: f23
				; COMMON: callq lrint
				define i32 @f23(double %x) {
				entry:
				%result = call i32 @llvm.experimental.constrained.lrint.i32.f64(double %x,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; CHECK-LABEL: f24
				; COMMON: callq lrintf
				define i32 @f24(float %x) {
				entry:
				%result = call i32 @llvm.experimental.constrained.lrint.i32.f32(float %x,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; CHECK-LABEL: f25
				; COMMON: callq llrint
				define i64 @f25(double %x) {
				entry:
				%result = call i64 @llvm.experimental.constrained.llrint.i64.f64(double %x,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i64 %result
				}

				; CHECK-LABEL: f26
				; COMMON: callq llrintf
				define i64 @f26(float %x) {
				entry:
				%result = call i64 @llvm.experimental.constrained.llrint.i64.f32(float %x,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i64 %result
				}

				; CHECK-LABEL: f27
				; COMMON: callq lround
				define i32 @f27(double %x) {
				entry:
				%result = call i32 @llvm.experimental.constrained.lround.i32.f64(double %x,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; CHECK-LABEL: f28
				; COMMON: callq lroundf
				define i32 @f28(float %x) {
				entry:
				%result = call i32 @llvm.experimental.constrained.lround.i32.f32(float %x,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; CHECK-LABEL: f29
				; COMMON: callq llround
				define i64 @f29(double %x) {
				entry:
				%result = call i64 @llvm.experimental.constrained.llround.i64.f64(double %x,
				metadata !"fpexcept.strict")
				ret i64 %result
				}

				; CHECK-LABEL: f30
				; COMMON: callq llroundf
				define i64 @f30(float %x) {
				entry:
				%result = call i64 @llvm.experimental.constrained.llround.i64.f32(float %x,
				metadata !"fpexcept.strict")
				ret i64 %result
				}
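The tests above pass `round.dynamic` to the `lrint`/`llrint` intrinsics but give `lround`/`llround` no rounding operand. That matches the C library functions they are expected to call: `lrint` honours the current rounding mode, while `lround` always rounds halfway cases away from zero. A minimal sketch of that difference follows; `lrintUnder` and `lroundUnder` are hypothetical helper names, and it assumes a platform that defines `FE_UPWARD` and `FE_DOWNWARD`.

```cpp
#include <cfenv>
#include <cmath>

// Calls lrint() with the given rounding mode installed, then restores the
// previous mode. The volatile accesses keep the compiler from folding the
// call or moving it across the fesetround() calls.
long lrintUnder(int Mode, double X) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile double In = X;
  volatile long Result = std::lrint(In);
  std::fesetround(Saved);
  return Result;
}

// Same wrapper around lround(), whose result does not depend on the mode.
long lroundUnder(int Mode, double X) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  volatile double In = X;
  volatile long Result = std::lround(In);
  std::fesetround(Saved);
  return Result;
}
```

For example, `lrintUnder(FE_UPWARD, 42.1)` is expected to give 43, whereas `lroundUnder(FE_UPWARD, 42.1)` stays at 42.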

	@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
	declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.frem.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)
	declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
	declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)
	declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
	declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
				declare i32 @llvm.experimental.constrained.lrint.i32.f64(double, metadata, metadata)
				declare i32 @llvm.experimental.constrained.lrint.i32.f32(float, metadata, metadata)
				declare i64 @llvm.experimental.constrained.llrint.i64.f64(double, metadata, metadata)
				declare i64 @llvm.experimental.constrained.llrint.i64.f32(float, metadata, metadata)
				declare i32 @llvm.experimental.constrained.lround.i32.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)
				declare i64 @llvm.experimental.constrained.llround.i64.f64(double, metadata)
				declare i64 @llvm.experimental.constrained.llround.i64.f32(float, metadata)

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll


	entry:
	%trunc = call <3 x double> @llvm.experimental.constrained.trunc.v3f64(
	<3 x double> <double 1.1, double 1.9, double 1.5>,
	metadata !"round.dynamic",
	metadata !"fpexcept.strict")
	ret <3 x double> %trunc
	}

				define <1 x i32> @constrained_vector_lrint_v1f32() {
				; CHECK-LABEL: constrained_vector_lrint_v1f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v1f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i32> @llvm.experimental.constrained.lrint.v1i32.v1f32(
				<1 x float> <float 42.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <1 x i32> %result
				}

				define <2 x i32> @constrained_vector_lrint_v2f32() {
				; CHECK-LABEL: constrained_vector_lrint_v2f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v2f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: addq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i32> @llvm.experimental.constrained.lrint.v2i32.v2f32(
				<2 x float> <float 42.0, float 43.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x i32> %result
				}

				define <3 x i32> @constrained_vector_lrint_v3f32() {
				; CHECK-LABEL: constrained_vector_lrint_v3f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm1
				; CHECK-NEXT: movdqa (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v3f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i32> @llvm.experimental.constrained.lrint.v3i32.v3f32(
				<3 x float><float 42.0, float 43.0,
				float 44.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <3 x i32> %result
				}

				define <4 x i32> @constrained_vector_lrint_v4f32() {
				; CHECK-LABEL: constrained_vector_lrint_v4f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lrintf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v4f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lrintf
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i32> @llvm.experimental.constrained.lrint.v4i32.v4f32(
				<4 x float><float 42.0, float 43.0,
				float 44.0, float 45.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <4 x i32> %result
				}
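The CHECK lines for the vector tests above show one `callq lrintf` per element, with the results reassembled into a vector afterwards. In source-level terms that lowering behaves roughly like the loop below. This is only an illustration of the observable behaviour, not of the legalizer; `lrintPerLane` is a hypothetical name.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Converts each lane with its own scalar libm call, mirroring the one
// call-per-element pattern in the generated code above.
template <std::size_t N>
std::array<long, N> lrintPerLane(const std::array<float, N> &In) {
  std::array<long, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = std::lrint(In[I]); // float overload; the C name is lrintf
  return Out;
}
```

Applied to the `<4 x float>` constant used in `constrained_vector_lrint_v4f32` (42.0, 43.0, 44.0, 45.0), each lane is already integral, so the result is 42, 43, 44, 45 under any rounding mode.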

				define <1 x i64> @constrained_vector_llrint_v1f32() {
				; CHECK-LABEL: constrained_vector_llrint_v1f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v1f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i64> @llvm.experimental.constrained.llrint.v1i64.v1f32(
				<1 x float> <float 42.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <1 x i64> %result
				}

				define <2 x i64> @constrained_vector_llrint_v2f32() {
				; CHECK-LABEL: constrained_vector_llrint_v2f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v2f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: addq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i64> @llvm.experimental.constrained.llrint.v2i64.v2f32(
				<2 x float> <float 42.0, float 43.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x i64> %result
				}

				define <3 x i64> @constrained_vector_llrint_v3f32() {
				; CHECK-LABEL: constrained_vector_llrint_v3f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: pushq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: .cfi_offset %rbx, -24
				; CHECK-NEXT: .cfi_offset %r14, -16
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %r14
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %rbx
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rbx, %rdx
				; CHECK-NEXT: movq %r14, %rcx
				; CHECK-NEXT: addq $8, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: popq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: popq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v3f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 64
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqu %ymm0, (%rsp) # 32-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: vzeroupper
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovups (%rsp), %ymm1 # 32-byte Reload
				; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
				; AVX-NEXT: addq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i64> @llvm.experimental.constrained.llrint.v3i64.v3f32(
				<3 x float><float 42.0, float 43.0,
				float 44.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <3 x i64> %result
				}

				define <4 x i64> @constrained_vector_llrint_v4f32() {
				; CHECK-LABEL: constrained_vector_llrint_v4f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llrintf
				; CHECK-NEXT: movq %rax, %xmm1
				; CHECK-NEXT: punpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm1 = xmm1[0],mem[0]
				; CHECK-NEXT: movaps (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v4f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 48
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llrintf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vinsertf128 $1, (%rsp), %ymm0, %ymm0 # 16-byte Folded Reload
				; AVX-NEXT: addq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i64> @llvm.experimental.constrained.llrint.v4i64.v4f32(
				<4 x float><float 42.0, float 43.0,
				float 44.0, float 45.0>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <4 x i64> %result
				}

				define <1 x i32> @constrained_vector_lrint_v1f64() {
				; CHECK-LABEL: constrained_vector_lrint_v1f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v1f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i32> @llvm.experimental.constrained.lrint.v1i32.v1f64(
				<1 x double> <double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <1 x i32> %result
				}

				define <2 x i32> @constrained_vector_lrint_v2f64() {
				; CHECK-LABEL: constrained_vector_lrint_v2f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v2f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; AVX-NEXT: popq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i32> @llvm.experimental.constrained.lrint.v2i32.v2f64(
				<2 x double> <double 42.1, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x i32> %result
				}

				define <3 x i32> @constrained_vector_lrint_v3f64() {
				; CHECK-LABEL: constrained_vector_lrint_v3f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movd %eax, %xmm1
				; CHECK-NEXT: movdqa (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v3f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i32> @llvm.experimental.constrained.lrint.v3i32.v3f64(
				<3 x double><double 42.1, double 42.2,
				double 42.3>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <3 x i32> %result
				}

				define <4 x i32> @constrained_vector_lrint_v4f64() {
				; CHECK-LABEL: constrained_vector_lrint_v4f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: shufps $136, (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0,2],mem[0,2]
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lrint_v4f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lrint
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i32> @llvm.experimental.constrained.lrint.v4i32.v4f64(
				<4 x double><double 42.1, double 42.2,
				double 42.3, double 42.4>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <4 x i32> %result
				}

				define <1 x i64> @constrained_vector_llrint_v1f64() {
				; CHECK-LABEL: constrained_vector_llrint_v1f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v1f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i64> @llvm.experimental.constrained.llrint.v1i64.v1f64(
				<1 x double> <double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <1 x i64> %result
				}

				define <2 x i64> @constrained_vector_llrint_v2f64() {
				; CHECK-LABEL: constrained_vector_llrint_v2f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v2f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; AVX-NEXT: popq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i64> @llvm.experimental.constrained.llrint.v2i32.v2f64(
				<2 x double> <double 42.1, double 42.1>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <2 x i64> %result
				}

				define <3 x i64> @constrained_vector_llrint_v3f64() {
				; CHECK-LABEL: constrained_vector_llrint_v3f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: pushq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: .cfi_offset %rbx, -24
				; CHECK-NEXT: .cfi_offset %r14, -16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %r14
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %rbx
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rbx, %rdx
				; CHECK-NEXT: movq %r14, %rcx
				; CHECK-NEXT: addq $8, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: popq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: popq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v3f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 64
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqu %ymm0, (%rsp) # 32-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: vzeroupper
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovups (%rsp), %ymm1 # 32-byte Reload
				; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
				; AVX-NEXT: addq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i64> @llvm.experimental.constrained.llrint.v3i64.v3f64(
				<3 x double><double 42.1, double 42.2,
				double 42.3>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <3 x i64> %result
				}

				define <4 x i64> @constrained_vector_llrint_v4f64() {
				; CHECK-LABEL: constrained_vector_llrint_v4f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llrint
				; CHECK-NEXT: movq %rax, %xmm1
				; CHECK-NEXT: punpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm1 = xmm1[0],mem[0]
				; CHECK-NEXT: movaps (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llrint_v4f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 48
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llrint
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vinsertf128 $1, (%rsp), %ymm0, %ymm0 # 16-byte Folded Reload
				; AVX-NEXT: addq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i64> @llvm.experimental.constrained.llrint.v4i64.v4f64(
				<4 x double><double 42.1, double 42.2,
				double 42.3, double 42.4>,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret <4 x i64> %result
				}

				define <1 x i32> @constrained_vector_lround_v1f32() {
				; CHECK-LABEL: constrained_vector_lround_v1f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v1f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i32> @llvm.experimental.constrained.lround.v1i32.v1f32(
				<1 x float> <float 42.0>,
				metadata !"fpexcept.strict")
				ret <1 x i32> %result
				}

				define <2 x i32> @constrained_vector_lround_v2f32() {
				; CHECK-LABEL: constrained_vector_lround_v2f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v2f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: addq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i32> @llvm.experimental.constrained.lround.v2i32.v2f32(
				<2 x float> <float 42.0, float 43.0>,
				metadata !"fpexcept.strict")
				ret <2 x i32> %result
				}

				define <3 x i32> @constrained_vector_lround_v3f32() {
				; CHECK-LABEL: constrained_vector_lround_v3f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm1
				; CHECK-NEXT: movdqa (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v3f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i32> @llvm.experimental.constrained.lround.v3i32.v3f32(
				<3 x float><float 42.0, float 43.0,
				float 44.0>,
				metadata !"fpexcept.strict")
				ret <3 x i32> %result
				}

				define <4 x i32> @constrained_vector_lround_v4f32() {
				; CHECK-LABEL: constrained_vector_lround_v4f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq lroundf
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v4f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq lroundf
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i32> @llvm.experimental.constrained.lround.v4i32.v4f32(
				<4 x float><float 42.0, float 43.0,
				float 44.0, float 45.0>,
				metadata !"fpexcept.strict")
				ret <4 x i32> %result
				}

				define <1 x i64> @constrained_vector_llround_v1f32() {
				; CHECK-LABEL: constrained_vector_llround_v1f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v1f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i64> @llvm.experimental.constrained.llround.v1i64.v1f32(
				<1 x float> <float 42.0>,
				metadata !"fpexcept.strict")
				ret <1 x i64> %result
				}

				define <2 x i64> @constrained_vector_llround_v2f32() {
				; CHECK-LABEL: constrained_vector_llround_v2f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v2f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: addq $24, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i64> @llvm.experimental.constrained.llround.v2i32.v2f32(
				<2 x float> <float 42.0, float 43.0>,
				metadata !"fpexcept.strict")
				ret <2 x i64> %result
				}

				define <3 x i64> @constrained_vector_llround_v3f32() {
				; CHECK-LABEL: constrained_vector_llround_v3f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: pushq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: .cfi_offset %rbx, -24
				; CHECK-NEXT: .cfi_offset %r14, -16
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %r14
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %rbx
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rbx, %rdx
				; CHECK-NEXT: movq %r14, %rcx
				; CHECK-NEXT: addq $8, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: popq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: popq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v3f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 64
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqu %ymm0, (%rsp) # 32-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: vzeroupper
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovups (%rsp), %ymm1 # 32-byte Reload
				; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
				; AVX-NEXT: addq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i64> @llvm.experimental.constrained.llround.v3i64.v3f32(
				<3 x float><float 42.0, float 43.0,
				float 44.0>,
				metadata !"fpexcept.strict")
				ret <3 x i64> %result
				}

				define <4 x i64> @constrained_vector_llround_v4f32() {
				; CHECK-LABEL: constrained_vector_llround_v4f32:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; CHECK-NEXT: callq llroundf
				; CHECK-NEXT: movq %rax, %xmm1
				; CHECK-NEXT: punpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm1 = xmm1[0],mem[0]
				; CHECK-NEXT: movaps (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v4f32:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 48
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; AVX-NEXT: vmovss {{.*#+}} xmm0 = mem[0],zero,zero,zero
				; AVX-NEXT: callq llroundf
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vinsertf128 $1, (%rsp), %ymm0, %ymm0 # 16-byte Folded Reload
				; AVX-NEXT: addq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i64> @llvm.experimental.constrained.llround.v4i64.v4f32(
				<4 x float><float 42.0, float 43.0,
				float 44.0, float 45.0>,
				metadata !"fpexcept.strict")
				ret <4 x i64> %result
				}


				define <1 x i32> @constrained_vector_lround_v1f64() {
				; CHECK-LABEL: constrained_vector_lround_v1f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v1f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i32> @llvm.experimental.constrained.lround.v1i32.v1f64(
				<1 x double> <double 42.1>,
				metadata !"fpexcept.strict")
				ret <1 x i32> %result
				}

				define <2 x i32> @constrained_vector_lround_v2f64() {
				; CHECK-LABEL: constrained_vector_lround_v2f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v2f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; AVX-NEXT: popq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i32> @llvm.experimental.constrained.lround.v2i32.v2f64(
				<2 x double> <double 42.1, double 42.1>,
				metadata !"fpexcept.strict")
				ret <2 x i32> %result
				}

				define <3 x i32> @constrained_vector_lround_v3f64() {
				; CHECK-LABEL: constrained_vector_lround_v3f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movd %eax, %xmm0
				; CHECK-NEXT: punpckldq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0],xmm0[1],mem[1]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movd %eax, %xmm1
				; CHECK-NEXT: movdqa (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
				; CHECK-NEXT: addq $24, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v3f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i32> @llvm.experimental.constrained.lround.v3i32.v3f64(
				<3 x double><double 42.1, double 42.2,
				double 42.3>,
				metadata !"fpexcept.strict")
				ret <3 x i32> %result
				}

				define <4 x i32> @constrained_vector_lround_v4f64() {
				; CHECK-LABEL: constrained_vector_lround_v4f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq lround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: shufps $136, (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0,2],mem[0,2]
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_lround_v4f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: subq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 32
				; AVX-NEXT: .cfi_offset %rbx, -16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: movl %eax, %ebx
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: vmovd %eax, %xmm0
				; AVX-NEXT: vpinsrd $1, %ebx, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $2, %eax, %xmm0, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq lround
				; AVX-NEXT: vmovdqa (%rsp), %xmm0 # 16-byte Reload
				; AVX-NEXT: vpinsrd $3, %eax, %xmm0, %xmm0
				; AVX-NEXT: addq $16, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: popq %rbx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i32> @llvm.experimental.constrained.lround.v4i32.v4f64(
				<4 x double><double 42.1, double 42.2,
				double 42.3, double 42.4>,
				metadata !"fpexcept.strict")
				ret <4 x i32> %result
				}

				define <1 x i64> @constrained_vector_llround_v1f64() {
				; CHECK-LABEL: constrained_vector_llround_v1f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: popq %rcx
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v1f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: popq %rcx
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <1 x i64> @llvm.experimental.constrained.llround.v1i64.v1f64(
				<1 x double> <double 42.1>,
				metadata !"fpexcept.strict")
				ret <1 x i64> %result
				}

				define <2 x i64> @constrained_vector_llround_v2f64() {
				; CHECK-LABEL: constrained_vector_llround_v2f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: pshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v2f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: pushq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 16
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,1,0,1]
				; AVX-NEXT: popq %rax
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <2 x i64> @llvm.experimental.constrained.llround.v2i32.v2f64(
				<2 x double> <double 42.1, double 42.1>,
				metadata !"fpexcept.strict")
				ret <2 x i64> %result
				}

				define <3 x i64> @constrained_vector_llround_v3f64() {
				; CHECK-LABEL: constrained_vector_llround_v3f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: pushq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 32
				; CHECK-NEXT: .cfi_offset %rbx, -24
				; CHECK-NEXT: .cfi_offset %r14, -16
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %r14
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %rbx
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rbx, %rdx
				; CHECK-NEXT: movq %r14, %rcx
				; CHECK-NEXT: addq $8, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 24
				; CHECK-NEXT: popq %rbx
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: popq %r14
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v3f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 64
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqu %ymm0, (%rsp) # 32-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: vzeroupper
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovups (%rsp), %ymm1 # 32-byte Reload
				; AVX-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
				; AVX-NEXT: addq $56, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <3 x i64> @llvm.experimental.constrained.llround.v3i64.v3f64(
				<3 x double><double 42.1, double 42.2,
				double 42.3>,
				metadata !"fpexcept.strict")
				ret <3 x i64> %result
				}

				define <4 x i64> @constrained_vector_llround_v4f64() {
				; CHECK-LABEL: constrained_vector_llround_v4f64:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: subq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 48
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: punpcklqdq (%rsp), %xmm0 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm0 = xmm0[0],mem[0]
				; CHECK-NEXT: movdqa %xmm0, (%rsp) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %xmm0
				; CHECK-NEXT: movdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; CHECK-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
				; CHECK-NEXT: callq llround
				; CHECK-NEXT: movq %rax, %xmm1
				; CHECK-NEXT: punpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm1 # 16-byte Folded Reload
				; CHECK-NEXT: # xmm1 = xmm1[0],mem[0]
				; CHECK-NEXT: movaps (%rsp), %xmm0 # 16-byte Reload
				; CHECK-NEXT: addq $40, %rsp
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
				;
				; AVX-LABEL: constrained_vector_llround_v4f64:
				; AVX: # %bb.0: # %entry
				; AVX-NEXT: subq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 48
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq (%rsp), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vmovdqa %xmm0, (%rsp) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vmovdqa %xmm0, {{[-0-9]+}}(%r{{[sb]}}p) # 16-byte Spill
				; AVX-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
				; AVX-NEXT: callq llround
				; AVX-NEXT: vmovq %rax, %xmm0
				; AVX-NEXT: vpunpcklqdq {{[-0-9]+}}(%r{{[sb]}}p), %xmm0, %xmm0 # 16-byte Folded Reload
				; AVX-NEXT: # xmm0 = xmm0[0],mem[0]
				; AVX-NEXT: vinsertf128 $1, (%rsp), %ymm0, %ymm0 # 16-byte Folded Reload
				; AVX-NEXT: addq $40, %rsp
				; AVX-NEXT: .cfi_def_cfa_offset 8
				; AVX-NEXT: retq
				entry:
				%result = call <4 x i64> @llvm.experimental.constrained.llround.v4i64.v4f64(
				<4 x double><double 42.1, double 42.2,
				double 42.3, double 42.4>,
				metadata !"fpexcept.strict")
				ret <4 x i64> %result
				}
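The expected assembly for the v3f64 and v4f64 cases calls `llround` once per vector element and then reassembles the results into a vector. A minimal Python sketch of that per-lane behaviour follows. The helper names `llround_model` and `scalarize_llround` are hypothetical, and the rounding model is a simplification of C `llround` (nearest integer, ties away from zero) that ignores overflow and exception flags.

```python
import math


def llround_model(x: float) -> int:
    """Simplified model of C llround: nearest integer, ties away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def scalarize_llround(vec):
    """Apply the scalar model once per lane, like the per-element libcalls."""
    return [llround_model(x) for x in vec]


# Same constants as constrained_vector_llround_v4f64.
print(scalarize_llround([42.1, 42.2, 42.3, 42.4]))
```

This only models the values. The spill and reload instructions in the CHECK lines exist because each libcall may clobber the vector registers.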

	; Single width declarations			; Single width declarations
	declare <2 x double> @llvm.experimental.constrained.fadd.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.fadd.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.fsub.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.fsub.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.fmul.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.fmul.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.fdiv.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.fdiv.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.frem.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.frem.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.sqrt.v2f64(<2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.sqrt.v2f64(<2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.maxnum.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.maxnum.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.minnum.v2f64(<2 x double>, <2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.minnum.v2f64(<2 x double>, <2 x double>, metadata, metadata)
	declare <2 x float> @llvm.experimental.constrained.fptrunc.v2f32.v2f64(<2 x double>, metadata, metadata)			declare <2 x float> @llvm.experimental.constrained.fptrunc.v2f32.v2f64(<2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.fpext.v2f64.v2f32(<2 x float>, metadata)			declare <2 x double> @llvm.experimental.constrained.fpext.v2f64.v2f32(<2 x float>, metadata)
	declare <2 x double> @llvm.experimental.constrained.ceil.v2f64(<2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.ceil.v2f64(<2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.floor.v2f64(<2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.floor.v2f64(<2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.round.v2f64(<2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.round.v2f64(<2 x double>, metadata, metadata)
	declare <2 x double> @llvm.experimental.constrained.trunc.v2f64(<2 x double>, metadata, metadata)			declare <2 x double> @llvm.experimental.constrained.trunc.v2f64(<2 x double>, metadata, metadata)
				declare <2 x i32> @llvm.experimental.constrained.lrint.v2i32.v2f32(<2 x float>, metadata, metadata)
				declare <2 x i64> @llvm.experimental.constrained.llrint.v2i64.v2f32(<2 x float>, metadata, metadata)
				declare <2 x i32> @llvm.experimental.constrained.lrint.v2i32.v2f64(<2 x double>, metadata, metadata)
				declare <2 x i64> @llvm.experimental.constrained.llrint.v2i64.v2f64(<2 x double>, metadata, metadata)
				declare <2 x i32> @llvm.experimental.constrained.lround.v2i32.v2f32(<2 x float>, metadata)
				declare <2 x i64> @llvm.experimental.constrained.llround.v2i64.v2f32(<2 x float>, metadata)
				declare <2 x i32> @llvm.experimental.constrained.lround.v2i32.v2f64(<2 x double>, metadata)
				declare <2 x i64> @llvm.experimental.constrained.llround.v2i64.v2f64(<2 x double>, metadata)

	; Scalar width declarations			; Scalar width declarations
	declare <1 x float> @llvm.experimental.constrained.fadd.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.fadd.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.fsub.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.fsub.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.fmul.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.fmul.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.fdiv.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.fdiv.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.frem.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.frem.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.sqrt.v1f32(<1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.sqrt.v1f32(<1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.maxnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.maxnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.minnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.minnum.v1f32(<1 x float>, <1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.fptrunc.v1f32.v1f64(<1 x double>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.fptrunc.v1f32.v1f64(<1 x double>, metadata, metadata)
	declare <1 x double> @llvm.experimental.constrained.fpext.v1f64.v1f32(<1 x float>, metadata)			declare <1 x double> @llvm.experimental.constrained.fpext.v1f64.v1f32(<1 x float>, metadata)
	declare <1 x float> @llvm.experimental.constrained.ceil.v1f32(<1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.ceil.v1f32(<1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.floor.v1f32(<1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.floor.v1f32(<1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.round.v1f32(<1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.round.v1f32(<1 x float>, metadata, metadata)
	declare <1 x float> @llvm.experimental.constrained.trunc.v1f32(<1 x float>, metadata, metadata)			declare <1 x float> @llvm.experimental.constrained.trunc.v1f32(<1 x float>, metadata, metadata)
				declare <1 x i32> @llvm.experimental.constrained.lrint.v1i32.v1f32(<1 x float>, metadata, metadata)
				declare <1 x i64> @llvm.experimental.constrained.llrint.v1i64.v1f32(<1 x float>, metadata, metadata)
				declare <1 x i32> @llvm.experimental.constrained.lround.v1i32.v1f32(<1 x float>, metadata)
				declare <1 x i64> @llvm.experimental.constrained.llround.v1i64.v1f32(<1 x float>, metadata)
				declare <1 x i32> @llvm.experimental.constrained.lrint.v1i32.v1f64(<1 x double>, metadata, metadata)
				declare <1 x i64> @llvm.experimental.constrained.llrint.v1i64.v1f64(<1 x double>, metadata, metadata)
				declare <1 x i32> @llvm.experimental.constrained.lround.v1i32.v1f64(<1 x double>, metadata)
				declare <1 x i64> @llvm.experimental.constrained.llround.v1i64.v1f64(<1 x double>, metadata)

	; Illegal width declarations			; Illegal width declarations
	declare <3 x float> @llvm.experimental.constrained.fadd.v3f32(<3 x float>, <3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.fadd.v3f32(<3 x float>, <3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.fadd.v3f64(<3 x double>, <3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.fadd.v3f64(<3 x double>, <3 x double>, metadata, metadata)
	declare <3 x float> @llvm.experimental.constrained.fsub.v3f32(<3 x float>, <3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.fsub.v3f32(<3 x float>, <3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.fsub.v3f64(<3 x double>, <3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.fsub.v3f64(<3 x double>, <3 x double>, metadata, metadata)
	declare <3 x float> @llvm.experimental.constrained.fmul.v3f32(<3 x float>, <3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.fmul.v3f32(<3 x float>, <3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.fmul.v3f64(<3 x double>, <3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.fmul.v3f64(<3 x double>, <3 x double>, metadata, metadata)
	declare <3 x float> @llvm.experimental.constrained.ceil.v3f32(<3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.ceil.v3f32(<3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.ceil.v3f64(<3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.ceil.v3f64(<3 x double>, metadata, metadata)
	declare <3 x float> @llvm.experimental.constrained.floor.v3f32(<3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.floor.v3f32(<3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.floor.v3f64(<3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.floor.v3f64(<3 x double>, metadata, metadata)
	declare <3 x float> @llvm.experimental.constrained.round.v3f32(<3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.round.v3f32(<3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.round.v3f64(<3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.round.v3f64(<3 x double>, metadata, metadata)
	declare <3 x float> @llvm.experimental.constrained.trunc.v3f32(<3 x float>, metadata, metadata)			declare <3 x float> @llvm.experimental.constrained.trunc.v3f32(<3 x float>, metadata, metadata)
	declare <3 x double> @llvm.experimental.constrained.trunc.v3f64(<3 x double>, metadata, metadata)			declare <3 x double> @llvm.experimental.constrained.trunc.v3f64(<3 x double>, metadata, metadata)
				declare <3 x i32> @llvm.experimental.constrained.lrint.v3i32.v3f32(<3 x float>, metadata, metadata)
				declare <3 x i64> @llvm.experimental.constrained.llrint.v3i64.v3f32(<3 x float>, metadata, metadata)
				declare <3 x i32> @llvm.experimental.constrained.lrint.v3i32.v3f64(<3 x double>, metadata, metadata)
				declare <3 x i64> @llvm.experimental.constrained.llrint.v3i64.v3f64(<3 x double>, metadata, metadata)
				declare <3 x i32> @llvm.experimental.constrained.lround.v3i32.v3f32(<3 x float>, metadata)
				declare <3 x i64> @llvm.experimental.constrained.llround.v3i64.v3f32(<3 x float>, metadata)
				declare <3 x i32> @llvm.experimental.constrained.lround.v3i32.v3f64(<3 x double>, metadata)
				declare <3 x i64> @llvm.experimental.constrained.llround.v3i64.v3f64(<3 x double>, metadata)

	; Double width declarations			; Double width declarations
	declare <4 x double> @llvm.experimental.constrained.fadd.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.fadd.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.fsub.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.fsub.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.fmul.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.fmul.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.fdiv.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.fdiv.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.frem.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.frem.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.sqrt.v4f64(<4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.sqrt.v4f64(<4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.maxnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.maxnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.minnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.minnum.v4f64(<4 x double>, <4 x double>, metadata, metadata)
	declare <4 x float> @llvm.experimental.constrained.fptrunc.v4f32.v4f64(<4 x double>, metadata, metadata)			declare <4 x float> @llvm.experimental.constrained.fptrunc.v4f32.v4f64(<4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.fpext.v4f64.v4f32(<4 x float>, metadata)			declare <4 x double> @llvm.experimental.constrained.fpext.v4f64.v4f32(<4 x float>, metadata)
	declare <4 x double> @llvm.experimental.constrained.ceil.v4f64(<4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.ceil.v4f64(<4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.floor.v4f64(<4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.floor.v4f64(<4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.round.v4f64(<4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.round.v4f64(<4 x double>, metadata, metadata)
	declare <4 x double> @llvm.experimental.constrained.trunc.v4f64(<4 x double>, metadata, metadata)			declare <4 x double> @llvm.experimental.constrained.trunc.v4f64(<4 x double>, metadata, metadata)
				declare <4 x i32> @llvm.experimental.constrained.lrint.v4i32.v4f32(<4 x float>, metadata, metadata)
				declare <4 x i64> @llvm.experimental.constrained.llrint.v4i64.v4f32(<4 x float>, metadata, metadata)
				declare <4 x i32> @llvm.experimental.constrained.lrint.v4i32.v4f64(<4 x double>, metadata, metadata)
				declare <4 x i64> @llvm.experimental.constrained.llrint.v4i64.v4f64(<4 x double>, metadata, metadata)
				declare <4 x i32> @llvm.experimental.constrained.lround.v4i32.v4f32(<4 x float>, metadata)
				declare <4 x i64> @llvm.experimental.constrained.llround.v4i64.v4f32(<4 x float>, metadata)
				declare <4 x i32> @llvm.experimental.constrained.lround.v4i32.v4f64(<4 x double>, metadata)
				declare <4 x i64> @llvm.experimental.constrained.llround.v4i64.v4f64(<4 x double>, metadata)
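The overloaded intrinsics in these declarations carry their return and operand types as name suffixes, as in `llvm.experimental.constrained.llround.v4i64.v4f64`. The sketch below rebuilds such a name from the two IR types so the suffixes can be checked against the declared signature. `type_suffix` and `constrained_name` are hypothetical helpers written for illustration, not part of LLVM, and they cover only the scalar and fixed-vector types that appear in this test.

```python
import re

# Suffixes for the floating-point types used in this test; integer types
# such as i32 and i64 already match their suffix.
SCALAR_SUFFIX = {"float": "f32", "double": "f64"}


def type_suffix(ty: str) -> str:
    """Map an IR type like '<4 x double>' or 'i64' to its name suffix."""
    ty = ty.strip()
    match = re.fullmatch(r"<\s*(\d+)\s*x\s*(\w+)\s*>", ty)
    if match:
        return "v" + match.group(1) + type_suffix(match.group(2))
    return SCALAR_SUFFIX.get(ty, ty)


def constrained_name(op: str, ret_ty: str, arg_ty: str) -> str:
    """Build the expected name of a constrained intrinsic overloaded on two types."""
    return (
        "llvm.experimental.constrained."
        f"{op}.{type_suffix(ret_ty)}.{type_suffix(arg_ty)}"
    )


print(constrained_name("llround", "<4 x i64>", "<4 x double>"))
```

A check like this flags a declaration whose suffix disagrees with its return type, for example a `<2 x i64>` result paired with a `v2i32` suffix.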

test/Feature/fp-intrinsics.ll

	; CHECK: call double @llvm.experimental.constrained.fpext			; CHECK: call double @llvm.experimental.constrained.fpext
	define double @f21() {			define double @f21() {
	entry:			entry:
	%result = call double @llvm.experimental.constrained.fpext.f64.f32(float 42.0,			%result = call double @llvm.experimental.constrained.fpext.f64.f32(float 42.0,
	metadata !"fpexcept.strict")			metadata !"fpexcept.strict")
	ret double %result			ret double %result
	}			}

				; Verify that lrint(42.1) isn't simplified when the rounding mode is unknown.
				; CHECK-LABEL: f22
				; CHECK: call i32 @llvm.experimental.constrained.lrint
				define i32 @f22() {
				entry:
				%result = call i32 @llvm.experimental.constrained.lrint.i32.f64(double 42.1,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that lrintf(42.0) isn't simplified when the rounding mode is unknown.
				; CHECK-LABEL: f23
				; CHECK: call i32 @llvm.experimental.constrained.lrint
				define i32 @f23() {
				entry:
				%result = call i32 @llvm.experimental.constrained.lrint.i32.f32(float 42.0,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that llrint(42.1) isn't simplified when the rounding mode is unknown.
				; CHECK-LABEL: f24
				; CHECK: call i64 @llvm.experimental.constrained.llrint
				define i64 @f24() {
				entry:
				%result = call i64 @llvm.experimental.constrained.llrint.i64.f64(double 42.1,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i64 %result
				}

				; Verify that llrintf(42.0) isn't simplified when the rounding mode is unknown.
				; CHECK-LABEL: f25
				; CHECK: call i64 @llvm.experimental.constrained.llrint
				define i64 @f25() {
				entry:
				%result = call i64 @llvm.experimental.constrained.llrint.i64.f32(float 42.0,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict")
				ret i64 %result
				}

				; Verify that lround(42.1) isn't simplified when exception behavior is strict.
				; CHECK-LABEL: f26
				; CHECK: call i32 @llvm.experimental.constrained.lround
				define i32 @f26() {
				entry:
				%result = call i32 @llvm.experimental.constrained.lround.i32.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that lroundf(42.0) isn't simplified when exception behavior is strict.
				; CHECK-LABEL: f27
				; CHECK: call i32 @llvm.experimental.constrained.lround
				define i32 @f27() {
				entry:
				%result = call i32 @llvm.experimental.constrained.lround.i32.f32(float 42.0,
				metadata !"fpexcept.strict")
				ret i32 %result
				}

				; Verify that llround(42.1) isn't simplified when exception behavior is strict.
				; CHECK-LABEL: f28
				; CHECK: call i64 @llvm.experimental.constrained.llround
				define i64 @f28() {
				entry:
				%result = call i64 @llvm.experimental.constrained.llround.i64.f64(double 42.1,
				metadata !"fpexcept.strict")
				ret i64 %result
				}

				; Verify that llroundf(42.0) isn't simplified when exception behavior is strict.
				; CHECK-LABEL: f29
				; CHECK: call i64 @llvm.experimental.constrained.llround
				define i64 @f29() {
				entry:
				%result = call i64 @llvm.experimental.constrained.llround.i64.f32(float 42.0,
				metadata !"fpexcept.strict")
				ret i64 %result
				}

	@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"			@llvm.fp.env = thread_local global i8 zeroinitializer, section "llvm.metadata"
	declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fmul.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)			declare double @llvm.experimental.constrained.pow.f64(double, double, metadata, metadata)
	declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)			declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)
	declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.sin.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.exp2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log10.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.log2.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)			declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)			declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)
	declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)			declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
	declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)			declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
				declare i32 @llvm.experimental.constrained.lrint.i32.f64(double, metadata, metadata)
				declare i32 @llvm.experimental.constrained.lrint.i32.f32(float, metadata, metadata)
				declare i64 @llvm.experimental.constrained.llrint.i64.f64(double, metadata, metadata)
				declare i64 @llvm.experimental.constrained.llrint.i64.f32(float, metadata, metadata)
				declare i32 @llvm.experimental.constrained.lround.i32.f64(double, metadata)
				declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)
				declare i64 @llvm.experimental.constrained.llround.i64.f64(double, metadata)
				declare i64 @llvm.experimental.constrained.llround.i64.f32(float, metadata)
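The f22 through f29 tests check that these calls survive optimization. One reason a call such as lrint(42.1) cannot be folded under `round.dynamic` is that its result depends on the rounding mode in effect at run time, whereas lround always rounds to nearest with ties away from zero. The sketch below models this with Python's `decimal` module. The mapping from metadata strings to `decimal` rounding modes and the helper names are assumptions made for illustration, and floating-point exception flags are not modelled.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)

# Assumed correspondence between rounding-mode metadata and decimal modes.
ROUNDING_MODES = {
    "round.tonearest": ROUND_HALF_EVEN,
    "round.upward": ROUND_CEILING,
    "round.downward": ROUND_FLOOR,
    "round.towardzero": ROUND_DOWN,
}


def model_lrint(x: float, mode: str) -> int:
    """Round x to an integer using the given rounding mode."""
    return int(Decimal(x).quantize(Decimal(1), rounding=ROUNDING_MODES[mode]))


def model_lround(x: float) -> int:
    """Round x to nearest, ties away from zero, regardless of rounding mode."""
    return int(Decimal(x).quantize(Decimal(1), rounding=ROUND_HALF_UP))


def possible_lrint_results(x: float) -> set:
    """All results lrint could produce when the mode is only known at run time."""
    return {model_lrint(x, mode) for mode in ROUNDING_MODES}


print(sorted(possible_lrint_results(42.1)))
print(model_lround(42.1))
```

In this model 42.1 has two possible lrint results, so no single constant can replace the call. For the lround tests the value does not vary with the mode, so preserving the call is a matter of the strict exception semantics rather than of rounding.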

Add constrained intrinsics for lrint and lround (Closed, Public)

Revision Contents

Diff 209864

docs/LangRef.rst

include/llvm/CodeGen/ISDOpcodes.h

include/llvm/CodeGen/SelectionDAGNodes.h

include/llvm/CodeGen/TargetLowering.h

include/llvm/IR/IntrinsicInst.h

include/llvm/IR/Intrinsics.td

include/llvm/Target/TargetSelectionDAG.td

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

lib/CodeGen/SelectionDAG/LegalizeTypes.h

lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

lib/CodeGen/SelectionDAG/SelectionDAG.cpp

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

lib/CodeGen/TargetLoweringBase.cpp

lib/IR/IntrinsicInst.cpp

lib/IR/Verifier.cpp

test/CodeGen/X86/fp-intrinsics.ll

test/CodeGen/X86/vector-constrained-fp-intrinsics.ll

test/Feature/fp-intrinsics.ll