This is an archive of the discontinued LLVM Phabricator instance.

Differential D72930

[FEnv] Constfold some unary constrained operations
ClosedPublic

Authored by sepavloff on Jan 17 2020, 8:29 AM.

Details

Reviewers

andrew.w.kaylor
evandro
arsenm
kpn
craig.topper

Commits

rGf39873915294: [FEnv] Constfold some unary constrained operations

Summary

This change implements constant folding for constrained versions of
intrinsics, implementing rounding: floor, ceil, trunc, round, rint and
nearbyint.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sepavloff created this revision. Jan 17 2020, 8:29 AM

Herald added a project: Restricted Project. Jan 17 2020, 8:29 AM

Herald added subscribers: hiraditya, wdng.

Harbormaster completed remote builds in B44282: Diff 238789. Jan 17 2020, 8:34 AM

craig.topper added inline comments. Jan 17 2020, 8:45 AM

llvm/lib/IR/FPEnv.cpp
98	rmToNearest should map to APFloat::rmNearestTiesToEven.

sepavloff marked an inline comment as done. Jan 17 2020, 9:29 AM

sepavloff added inline comments.

llvm/lib/IR/FPEnv.cpp
98	"rmToNearest should map to APFloat::rmNearestTiesToEven"

Why? It seems that the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) does not specify which of the two IEEE-754 rounding modes corresponds to 'nearest'. The description of `round`, however, contains a definite requirement (7.12.9.6):

"The round functions round their argument to the nearest integer value in floating-point format, rounding halfway cases away from zero, regardless of the current rounding direction."

In this case the rounding mode is `roundTiesToAway`. Why must it be `roundTiesToEven` in other cases?

kpn added inline comments. Jan 17 2020, 10:25 AM

llvm/lib/Analysis/ConstantFolding.cpp
1786	I thought rint could raise an inexact exception? Also, what happens if we don't know the floating point environment because of FENV_ACCESS=ON and no other math flags or math #pragmas have been given? Shouldn't the infrastructure for that go into clang first?

craig.topper added inline comments. Jan 17 2020, 10:28 AM

llvm/lib/IR/FPEnv.cpp
98	As far as I know the default environment is supposed to be TiesToEven. That's what all of the constant folding for non-strict FP assumes. The round function itself is weird and specifies a different behavior.

sepavloff marked 2 inline comments as done. Jan 17 2020, 11:29 AM

sepavloff added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
1786	`rint` may raise the inexact floating-point exception but does not have to. The result of a rounding operation never itself requires rounding, so it is always exact. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm for related ideas.

If we don't know the rounding mode, the corresponding intrinsic should have "round.dynamic" as its argument; `getAPFloatRoundingMode` returns `None` in this case and the expression is not folded. Yes, it is clang that must put "round.dynamic". In D69272 there is a test that checks such behavior.

llvm/lib/IR/FPEnv.cpp
98	Yes, I see that ConstantFolding.cpp mainly uses `roundTiesToEven`, and I will change the patch. It would be nice to understand why this is so. What worries me is the unusual behavior of `roundTiesToEven`. For instance, 10.5 is rounded to 10.0. The behavior of `round` is more familiar.

andrew.w.kaylor added inline comments. Jan 17 2020, 11:51 AM

llvm/lib/IR/FPEnv.cpp
98	The IEEE 754-2019 specification says this in section 4.3.3: "The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats." Although it describes roundTiesToAway, it says that rounding mode isn't required for binary format implementations. More practically, I believe most hardware rounds ties to even when the round-to-nearest mode is selected. I know this is the case for x86 architecture processors. This is likely the behavior for all architectures that don't support both tie-breaking methods.

craig.topper added inline comments. Jan 17 2020, 11:59 AM

llvm/lib/IR/FPEnv.cpp
98	roundTiesToAway is more familiar because it's what we're taught in school when we're young. But it introduces bias in the results. Rounding to even can cancel out some bias. See the "round half to even" section here: https://en.wikipedia.org/wiki/Rounding

Andrew, Craig, thank you for explanations!

Changed rounding to nearest from roundTiesToAway to roundTiesToEven
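
The difference between the two "nearest" tie-breaking rules discussed above can be seen with the standard C++ library alone. The following is a minimal sketch, not code from the patch: `roundTiesToEven` and `roundTiesToAway` are hypothetical helper names, and the sketch assumes the host's `FE_TONEAREST` mode breaks ties to even, as is the case on IEEE 754 binary hardware.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: rounds X to an integral value with the rounding mode
// forced to FE_TONEAREST (ties-to-even on IEEE 754 binary formats).
double roundTiesToEven(double X) {
  const int Saved = std::fegetround();
  std::fesetround(FE_TONEAREST);
  volatile double In = X; // volatile discourages compile-time folding
  volatile double Out = std::nearbyint(In);
  std::fesetround(Saved);
  const double Result = Out;
  return Result;
}

// Hypothetical helper: std::round always rounds halfway cases away from
// zero, regardless of the current rounding direction.
double roundTiesToAway(double X) {
  volatile double In = X;
  const double Result = std::round(In);
  return Result;
}
```

With these helpers, `roundTiesToEven(10.5)` yields 10.0 while `roundTiesToAway(10.5)` yields 11.0, which is the case sepavloff found surprising. For 11.5 both rules give 12.0, because the even neighbour is also the one farther from zero.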

Harbormaster completed remote builds in B44342: Diff 238952. Jan 18 2020, 8:53 AM

craig.topper added inline comments. Jan 23 2020, 9:43 PM

llvm/lib/Analysis/ConstantFolding.cpp
1771	Does moving this up change the behavior of non-constrained intrinsics?

Updated patch

Implemented constfolding for non-finite numbers
Do not affect processing of non-constrained intrinsics.

Harbormaster completed remote builds in B44949: Diff 240428. Jan 26 2020, 3:56 AM

sepavloff marked an inline comment as done. Jan 26 2020, 3:59 AM

sepavloff added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
1771	Changed the patch so that behavior of non-constrained intrinsics remains unaffected.

Rebase and ping.

Harbormaster completed remote builds in B45655: Diff 242254. Feb 3 2020, 9:16 PM

kpn added inline comments. Feb 11 2020, 9:29 AM

llvm/lib/Analysis/ConstantFolding.cpp
1786	But the latest C2x standard still has the same description of rint from C99. The standard hasn't changed. So, are we allowed to simply drop an exception? I thought we weren't? What's the definition of "may"? Is it:

1. If conditions warrant, an exception _shall_ be issued, OR
2. If conditions warrant, the decision to raise an exception is _implementation defined_.

If it's #1 then we're dropping an exception we should be issuing.

sepavloff marked an inline comment as done. Feb 11 2020, 10:30 AM

sepavloff added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
1786	The `inexact` exception is raised when the result of an operation cannot be represented without loss of precision. The result of any rounding operation is an integer, so it can always be represented without loss of significant digits. The reason `rint` exists is that it can be implemented faster than `nearbyint` in some cases. I don't know these particular cases, but the cited defect report mentions them. So `rint` is just a looser version of `nearbyint`. There are no cases where this exception would make sense.

kpn added inline comments. Feb 12 2020, 5:40 AM

llvm/lib/Analysis/ConstantFolding.cpp
1786	After sleeping on it I agree. Having rint() raise Inexact in the exact circumstance where you needed to call rint() anyway isn't helpful. So I'd guess that this is case #2 above and we can elide the exception. Objection withdrawn, and sorry I wasn't quicker to come to this conclusion.

evandro added inline comments. Feb 13 2020, 12:09 PM

llvm/lib/Analysis/ConstantFolding.cpp
1402–1403	Should these cases be true even when `isStrictFP()` is true?
1481–1502	Or should `return !Call->isStrictFP();` be inserted here?
2647	Again, not sure about the impact of doing this...

sepavloff marked 2 inline comments as done. Feb 13 2020, 10:24 PM

sepavloff added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
1402–1403	`fabs` is allowed irrespective of `isStrictFP()`, so it should be processed here. As for functions like `ceil`, they cannot be found in a strictfp function; the corresponding operations are represented by constrained intrinsics.
2647	The same applies here. If an operation depends on or changes the current floating-point environment, it is represented by the corresponding constrained intrinsic in a strictfp function.

Ping.

andrew.w.kaylor added inline comments. Feb 24 2020, 10:45 AM

llvm/lib/Analysis/ConstantFolding.cpp
1402–1403	I originally added the Call->isStrictFP() check here before we had constrained versions of the primary FP intrinsics. Now that we have them it should be OK to let those pass. However, we aren't yet enforcing the rule that all calls within a strictfp function must be the constrained versions, so it might be too soon to make that assumption here. Also, I don't think we're planning to have constrained versions of the target specific intrinsics like Intrinsic::x86_avx512_vcvtss2usi32. Those will probably only have the strictfp attribute on the callsite and possibly an operand bundle to specify the specific details. The other intrinsics might revert to behavior like that too some day.
1481–1502	This is a reasonable suggestion.
1840	For all of these except nearbyint, you need to check the return code and only remove the call if no flags are raised or exceptionBehavior is ignore. The rest of the functions raise the inexact flag if the value is not already an integer. We could both keep the call and RAUW the return value. We need the exception to be raised, but the constant folding could unlock additional optimizations. It might be useful to introduce new intrinsics like llvm.raiseinexact() so that we could later combine multiple calls raising the same flag.
2647	This is probably OK. I think these changes will always end up going through the other function for FP calls.
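
The rule suggested in the comment on line 1840 above (remove the call only if no flags are raised or the exception behavior is "ignore") can be sketched as a small predicate. This is an illustrative sketch only: the `ExceptionBehavior` and `OpStatus` types below are hypothetical stand-ins defined locally, not LLVM's `fp::ExceptionBehavior` or `APFloat::opStatus`, and `canRemoveFoldedCall` is not a function from the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the exception behavior attached to a
// constrained intrinsic call.
enum class ExceptionBehavior { Ignore, MayTrap, Strict };

// Hypothetical status bits reported by the folding arithmetic.
enum OpStatus : std::uint8_t {
  OpOK = 0,
  OpInvalid = 1 << 0,
  OpInexact = 1 << 4,
};

// Returns true when replacing the call with the folded constant cannot hide
// a floating-point exception that the program is allowed to observe.
bool canRemoveFoldedCall(std::uint8_t Status, ExceptionBehavior EB) {
  if (EB == ExceptionBehavior::Ignore)
    return true;         // exceptions are unobservable by contract
  return Status == OpOK; // otherwise fold only if no flag was raised
}
```

Under this sketch, a fold that reports `OpInexact` is kept as a call when the behavior is `Strict`, which matches the reviewer's point that the constant could still be propagated to users while the call itself remains.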

Updated patch according to review notes

Reorganized canConstantFoldCallTo.

Harbormaster completed remote builds in B47225: Diff 246496. Feb 25 2020, 10:19 AM

sepavloff marked 4 inline comments as done. Feb 25 2020, 10:40 AM

sepavloff added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
1402–1403	Reorganized this function.
1840	"For all of these except nearbyint, you need to check the return code"

None of the rounding functions raises an exception if the argument is not an SNaN. All big numbers are already integers, so overflow cannot occur. The result of rounding is an integer, so underflow and inexact exceptions are also absent. The mention of the inexact exception in the description of `rint` in the C standard is somewhat misleading; it is discussed earlier: https://reviews.llvm.org/D72930#inline-676354.

I may have overstated. The exception behavior of the

llvm/lib/Analysis/ConstantFolding.cpp
1840	My thoughts here:

(1) We aren't implementing the C standard. LLVM IR is language-independent and needs to be suitable for any language. We should clearly define the semantics of the constrained intrinsics (ideally more clearly than the C standard does for the corresponding functions), and those semantics should be suitable for any language. Depending on what we decide, that may mean languages that want stricter behavior might need to avoid using the intrinsics and instead provide a library call that does what they need.

(2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version.

(3) The point of the strict exception semantics mode is that any exception that would be raised by a literal translation of the code will be raised when the code is optimized in the strict mode. If a call to one of these functions (in the case where this was a function call) or the instructions to which they would be lowered in the case where the intrinsic is lowered directly to instructions, would raise an exception, then we should not fold the exception away in this mode.

(4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x86-64 GNU math library only raises the exception for rint. Other implementations may behave differently.

I certainly understand why we would want to be able to constant fold in all these cases. However, I'm not convinced that doing so is always the right thing to do. At the very least, I think we'd need either some option to control this behavior or perhaps a new interface in TTI to query the behavior of these functions in the target math library.

Updated patch according to reviewer's notes

rint now can raise inexact exception.

Harbormaster completed remote builds in B47436: Diff 246960. Feb 27 2020, 8:44 AM

sepavloff marked an inline comment as done. Feb 27 2020, 8:52 AM

sepavloff added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
1840	This is a rather complicated topic. Let's consider it in detail.

IEEE 754

This standard defines the cases when an operation should raise the inexact exception (7.6):

"Unless stated otherwise, if the rounded result of an operation is inexact - that is, it differs from what would have been computed were both exponent range and precision unbounded - then the inexact exception shall be signaled."

The result of a rounding operation is an integer value in the same floating-point format as its argument. Integers are always exact, so one could expect that the inexact exception is never raised by rounding operations. However, the standard treats inexactness in a wider sense when dealing with rounding.

The standard defines a set of rounding operations that produce values in floating-point formats in 5.9:

roundToIntegralTiesToEven
roundToIntegralTowardZero
roundToIntegralTowardPositive
roundToIntegralTowardNegative
roundToIntegralTiesToAway

It states that:

"These operations shall not signal any exception except for signaling NaN input."

It also defines the operation:

"For the following operation, the rounding direction is the applicable rounding-direction attribute. This operation signals the invalid operation exception for a signaling NaN operand, and for a numerical operand, signals the inexact exception if the result does not have the same numerical value as x.
- sourceFormat roundToIntegralExact(source)
roundToIntegralExact(x) rounds x to an integral value according to the applicable rounding direction attribute."

C standard

The most recent draft (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2454.pdf) defines the functions `ceil`, `floor`, `round`, `roundeven` and `trunc`, which correspond strictly to the rounding operations defined in IEEE Std 754-2008. `nearbyint` does not have a strict counterpart in IEEE 754, but dynamically selects one of the standard operations depending on the current rounding mode. As for `rint`, its description in 7.12.9.4 states that it "may signal" and does not specify when it may signal. In the recent draft cited above (but not in previous versions of the C standard) it is stated (F.10p6):

"The functions bound to operations in IEC 60559 (F.3) are fully specified by IEC 60559, including rounding behaviors and floating-point exceptions."

F.3p1 specifies that `rint` is bound to `roundToIntegralExact`, `ceil` to `roundToIntegralTowardPositive`, and so on. So depending on the viewpoint, `rint` either may raise the `inexact` exception (because of 7.12.9.4) or must do so (because of F.10p6). The former reflects the perception of rounding as an operation, see the defect report http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm, although it is probably outdated.

LLVM

LLVM IR is language-independent. The intrinsic functions `ceil`, `floor`, `round`, `roundeven` and `trunc` are defined by the C standard, but they correspond to rounding operations defined by IEEE 754, which is also language-independent. It looks reasonable to implement them according to their IEEE 754 counterparts. This means that all these intrinsics except `rint` do not raise the `inexact` exception; `rint` raises the `inexact` exception if its result differs from the argument.

"(2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version."

Exactly. In C11 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) there is a statement about `ceil` (F.10.6.1):

"The ceil functions may, but are not required to, raise the ‘‘inexact’’ floating-point exception for finite non-integer arguments, as this implementation does."

Similar statements existed for `floor`, `round` and `trunc`. The recent draft does not provide such an option; `inexact` is not allowed for these functions.

"(4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x86-64 GNU math library only raises the exception for rint. Other implementations may behave differently."

I believe this is caused by the evolution of the C standard. GCC implemented a special option, `-fno-fp-int-builtin-inexact`, to deal with compatibility issues (https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02060.html). We could implement this option in clang if necessary.
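
The "writing a program and seeing what happened" experiment mentioned in this thread can be reproduced with `<cfenv>`. The sketch below is an assumption-laden illustration, not code from the review: `probe`, `callRint` and `callNearbyint` are hypothetical names, and the sketch relies on `volatile` rather than `#pragma STDC FENV_ACCESS ON` (which not every compiler honours) to keep the calls from being folded at compile time.

```cpp
#include <cfenv>
#include <cmath>

struct RoundResult {
  double Value; // rounded result
  bool Inexact; // whether FE_INEXACT was set by the call
};

// Hypothetical helper: clears all FP exception flags, calls Fn on X, and
// reports whether the inexact flag is set afterwards.
RoundResult probe(double (*Fn)(double), double X) {
  volatile double In = X; // discourage compile-time evaluation
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double Out = Fn(In);
  const bool Inexact = std::fetestexcept(FE_INEXACT) != 0;
  const double Value = Out;
  return {Value, Inexact};
}

// Thin wrappers so that a plain function pointer can be passed to probe().
double callRint(double X) { return std::rint(X); }
double callNearbyint(double X) { return std::nearbyint(X); }
```

`probe(callNearbyint, 1.5)` should never report inexact, since the C standard says nearbyint does not raise it, and `probe(callRint, 2.0)` should not either, because the argument is already integral. Whether `probe(callRint, 1.5)` reports inexact depends on the math library, which is the point under discussion here.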

sepavloff added a parent revision: D75246: Make IEEEFloat::roundToIntegral more standard conformant. Feb 27 2020, 9:45 AM

andrew.w.kaylor added inline comments. Feb 27 2020, 1:46 PM

llvm/lib/Analysis/ConstantFolding.cpp
1840	Thank you for the very thorough answer!

Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions. That would satisfy my objection. I have verified that the latest Fortran standard also defines these functions in terms of their IEEE 754 counterparts, and I think that's a reasonable expectation for any front end that wants strict floating point semantics.

The more precisely specified exception semantics do present a potential issue for backends. The older "may, but are not required to, raise inexact" allows for a wider variety of implementations. Semantics that do not permit exceptions may require special handling. For example, the x87 FRNDINT instruction raises the inexact exception and so could only be used with rint.

Because the non-constrained versions of these functions are defined to assume a floating point environment with no side effects and the constrained versions are still considered experimental, I think we are free to just declare their semantics to be bound to the IEEE-754 operations. If we are agreed on this, then only the rint folding here would need to handle the inexact condition.
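
The binding agreed on above can be summarized as a lookup table. This is a hedged sketch for illustration, not part of the patch: `RoundingOpInfo` and `lookupRoundingOp` are hypothetical names, and the pairings follow the IEEE 754 operation names listed earlier in this thread (`nearbyint` is omitted because, as noted above, it has no strict IEEE 754 counterpart).

```cpp
#include <string_view>

// Hypothetical record pairing a rounding function with the IEEE 754
// operation it is bound to and whether that operation signals inexact.
struct RoundingOpInfo {
  std::string_view Function;
  std::string_view IEEEOperation;
  bool SignalsInexact;
};

constexpr RoundingOpInfo RoundingOps[] = {
    {"ceil", "roundToIntegralTowardPositive", false},
    {"floor", "roundToIntegralTowardNegative", false},
    {"trunc", "roundToIntegralTowardZero", false},
    {"round", "roundToIntegralTiesToAway", false},
    {"roundeven", "roundToIntegralTiesToEven", false},
    {"rint", "roundToIntegralExact", true},
};

// Returns the table entry for Name, or nullptr if Name is not listed.
constexpr const RoundingOpInfo *lookupRoundingOp(std::string_view Name) {
  for (const RoundingOpInfo &Op : RoundingOps)
    if (Op.Function == Name)
      return &Op;
  return nullptr;
}
```

Only the `rint` entry has `SignalsInexact` set, which mirrors the conclusion that only the rint folding needs to handle the inexact condition.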

kpn added inline comments. Feb 28 2020, 5:43 AM

llvm/lib/Analysis/ConstantFolding.cpp
1840	"Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions."

I like this idea. +1

Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions.

I started preparing a documentation patch but ran into a problem with the description of rint and nearbyint: we should describe the non-constrained variants as returning the argument rounded to nearest, ties to even, because they are usable only when the default FP environment is set. I asked about this on the mailing list.

If we are agreed on this, then only the rint folding here would need to handle the inexact condition.

Great!

Is there something that prevents this patch from approval?

lgtm

Sorry for the delay in approval.

This revision is now accepted and ready to land. Mar 26 2020, 1:41 PM

Added changes for PowerPC tests

Rounding intrinsics now can be constfolded, even if they are represented
by constrained intrinsics. As a result a PowerPC test that checks code
generation for constrained intrinsics needs update.

@kpn Could you check the changes in the test? The checks were regenerated
using update_llc_test_checks.py.

Herald added a subscriber: nemanjai. Mar 27 2020, 8:42 AM

Harbormaster failed remote builds in B50693: Diff 253132! Mar 27 2020, 9:07 AM

Herald added a subscriber: wuzish. Mar 27 2020, 9:07 AM

In D72930#1946234, @sepavloff wrote:

@kpn Could you check the changes in the test? The checks were regenerated
using update_llc_test_checks.py.

I'll take a look. I'm not a PowerPC expert but I'll do what I can.

It looks like a whole bunch of instructions got converted into simple loads. Which is exactly what was expected.

LGTM.

llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll
6444	Now that I look at this, it might have been nice if all along we'd been writing test cases that gave some different values for the vector elements. Oh well. I doubt it's worth changing now.

Thank you!

Closed by commit rGf39873915294: [FEnv] Constfold some unary constrained operations (authored by sepavloff). Mar 27 2020, 10:32 PM

This revision was automatically updated to reflect the committed changes.

qiucf mentioned this in D64193: [PowerPC] Add exception constraint to FP rounding operations. May 18 2020, 8:46 PM

Revision Contents

Path                                                            Size
llvm/include/llvm/IR/FPEnv.h                                    4 lines
llvm/lib/Analysis/ConstantFolding.cpp                           129 lines
llvm/lib/Analysis/InstructionSimplify.cpp                       5 lines
llvm/lib/IR/FPEnv.cpp                                           16 lines
llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll   340 lines
llvm/test/Transforms/InstSimplify/constfold-constrained.ll      244 lines

Diff 253132

llvm/include/llvm/IR/FPEnv.h

[9 lines not shown]
 /// This file contains the declarations of entities that describe floating
 /// point environment and related functions.
 //
 //===----------------------------------------------------------------------===//

 #ifndef LLVM_IR_FLOATINGPOINT_H
 #define LLVM_IR_FLOATINGPOINT_H

+#include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/StringRef.h"
 #include <stdint.h>

 namespace llvm {

 namespace fp {

[35 lines not shown]
 /// Returns a valid ExceptionBehavior enumerator when given a string
 /// valid as input in constrained intrinsic exception behavior metadata.
 Optional<fp::ExceptionBehavior> StrToExceptionBehavior(StringRef);

 /// For any ExceptionBehavior enumerator, returns a string valid as
 /// input in constrained intrinsic exception behavior metadata.
 Optional<StringRef> ExceptionBehaviorToStr(fp::ExceptionBehavior);

+/// Converts rounding mode represented by fp::RoundingMode to the rounding mode
+/// index used by APFloat. For fp::rmDynamic it returns None.
+Optional<APFloatBase::roundingMode> getAPFloatRoundingMode(fp::RoundingMode);
 }
 #endif
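
The idea behind the new `getAPFloatRoundingMode` declaration (a known rounding mode maps to a concrete mode, a dynamic one maps to "nothing" and blocks folding) can be sketched without LLVM. The following is a standalone illustration under stated assumptions: the `RoundingMode` enum, `toFenvRoundingMode` and `foldNearbyint` are hypothetical names, `<cfenv>` constants stand in for APFloat rounding modes, and this is not the implementation in FPEnv.cpp, which is not shown in this excerpt.

```cpp
#include <cfenv>
#include <cmath>
#include <optional>

// Hypothetical stand-in for fp::RoundingMode; not LLVM's definition.
enum class RoundingMode { Dynamic, ToNearest, Downward, Upward, TowardZero };

// Maps a statically known mode to a <cfenv> constant. A dynamic mode has no
// compile-time answer, so the result is empty.
std::optional<int> toFenvRoundingMode(RoundingMode RM) {
  switch (RM) {
  case RoundingMode::Dynamic:
    return std::nullopt;
  case RoundingMode::ToNearest:
    return FE_TONEAREST; // ties-to-even, as settled in the review
  case RoundingMode::Downward:
    return FE_DOWNWARD;
  case RoundingMode::Upward:
    return FE_UPWARD;
  case RoundingMode::TowardZero:
    return FE_TOWARDZERO;
  }
  return std::nullopt;
}

// Evaluates nearbyint(X) under the given mode, or refuses when the mode is
// only known at run time.
std::optional<double> foldNearbyint(double X, RoundingMode RM) {
  const std::optional<int> Mode = toFenvRoundingMode(RM);
  if (!Mode)
    return std::nullopt;
  const int Saved = std::fegetround();
  std::fesetround(*Mode);
  volatile double In = X; // discourage compile-time evaluation
  volatile double Out = std::nearbyint(In);
  std::fesetround(Saved);
  const double Result = Out;
  return Result;
}
```

For example, `foldNearbyint(2.5, RoundingMode::Upward)` produces 3.0, while `foldNearbyint(2.5, RoundingMode::Dynamic)` produces an empty optional, mirroring the "round.dynamic is not folded" rule described in the review.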

llvm/lib/Analysis/ConstantFolding.cpp

Show All 32 Lines
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalValue.h"		#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Intrinsics.h"		#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsAMDGPU.h"		#include "llvm/IR/IntrinsicsAMDGPU.h"
#include "llvm/IR/IntrinsicsX86.h"		#include "llvm/IR/IntrinsicsX86.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
▲ Show 20 Lines • Show All 1,342 Lines • ▼ Show 20 Lines	llvm::ConstantFoldLoadThroughGEPIndices(Constant *C,
return C;		return C;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Constant Folding for Calls		// Constant Folding for Calls
//		//

bool llvm::canConstantFoldCallTo(const CallBase Call, const Function F) {		bool llvm::canConstantFoldCallTo(const CallBase Call, const Function F) {
if (Call->isNoBuiltin() \|\| Call->isStrictFP())		if (Call->isNoBuiltin())
return false;		return false;
switch (F->getIntrinsicID()) {		switch (F->getIntrinsicID()) {
case Intrinsic::fabs:		// Operations that do not operate floating-point numbers and do not depend on
		evandroUnsubmitted Not Done Reply Inline Actions Should these cases be true even when `isStrictFP()` is true? evandro: Should these cases be true even when `isStrictFP()` is true?
		sepavloffAuthorUnsubmitted Done Reply Inline Actions `fabs` is allowed irrespective of `isStrictFP()` , so it should be processed here. As for functions like `ceil`, they cannot be found in strictfp function, corresponding operations are represented by constrained intrinsics. sepavloff: `fabs` is allowed irrespective of `isStrictFP()` , so it should be processed here. As for…
		andrew.w.kaylorUnsubmitted Not Done Reply Inline Actions I originally added the Call->isStrictFP() check here before we had constrained versions of the primary FP intrinsics. Now that we have them it should be OK to let those pass. However, we aren't yet enforcing the rule that all calls within a strictfp function must be the constrained versions, so it might be too soon to make that assumption here. Also, I don't think we're planning to have constrained versions of the target specific intrinsics like Intrinsic::x86_avx512_vcvtss2usi32. Those will probably only have the strictfp attribute on the callsite and possibly an operand bundle to specify the specific details. The other intrinsics might revert to behavior like that too some day. andrew.w.kaylor: I originally added the Call->isStrictFP() check here before we had constrained versions of the…
		sepavloffAuthorUnsubmitted Done Reply Inline Actions Reorganized this function. sepavloff: Reorganized this function.
case Intrinsic::minnum:		// FP environment can be folded even in strictfp functions.
case Intrinsic::maxnum:
case Intrinsic::minimum:
case Intrinsic::maximum:
case Intrinsic::log:
case Intrinsic::log2:
case Intrinsic::log10:
case Intrinsic::exp:
case Intrinsic::exp2:
case Intrinsic::floor:
case Intrinsic::ceil:
case Intrinsic::sqrt:
case Intrinsic::sin:
case Intrinsic::cos:
case Intrinsic::trunc:
case Intrinsic::rint:
case Intrinsic::nearbyint:
case Intrinsic::pow:
case Intrinsic::powi:
case Intrinsic::bswap:		case Intrinsic::bswap:
case Intrinsic::ctpop:		case Intrinsic::ctpop:
case Intrinsic::ctlz:		case Intrinsic::ctlz:
case Intrinsic::cttz:		case Intrinsic::cttz:
case Intrinsic::fshl:		case Intrinsic::fshl:
case Intrinsic::fshr:		case Intrinsic::fshr:
case Intrinsic::fma:
case Intrinsic::fmuladd:
case Intrinsic::copysign:
case Intrinsic::launder_invariant_group:		case Intrinsic::launder_invariant_group:
case Intrinsic::strip_invariant_group:		case Intrinsic::strip_invariant_group:
case Intrinsic::round:
case Intrinsic::masked_load:		case Intrinsic::masked_load:
case Intrinsic::sadd_with_overflow:		case Intrinsic::sadd_with_overflow:
case Intrinsic::uadd_with_overflow:		case Intrinsic::uadd_with_overflow:
case Intrinsic::ssub_with_overflow:		case Intrinsic::ssub_with_overflow:
case Intrinsic::usub_with_overflow:		case Intrinsic::usub_with_overflow:
case Intrinsic::smul_with_overflow:		case Intrinsic::smul_with_overflow:
case Intrinsic::umul_with_overflow:		case Intrinsic::umul_with_overflow:
case Intrinsic::sadd_sat:		case Intrinsic::sadd_sat:
case Intrinsic::uadd_sat:		case Intrinsic::uadd_sat:
case Intrinsic::ssub_sat:		case Intrinsic::ssub_sat:
case Intrinsic::usub_sat:		case Intrinsic::usub_sat:
case Intrinsic::smul_fix:		case Intrinsic::smul_fix:
case Intrinsic::smul_fix_sat:		case Intrinsic::smul_fix_sat:
		case Intrinsic::bitreverse:
		case Intrinsic::is_constant:
		return true;

		// Floating point operations cannot be folded in strictfp functions in
		// general case. They can be folded if FP environment is known to compiler.
		case Intrinsic::minnum:
		case Intrinsic::maxnum:
		case Intrinsic::minimum:
		case Intrinsic::maximum:
		case Intrinsic::log:
		case Intrinsic::log2:
		case Intrinsic::log10:
		case Intrinsic::exp:
		case Intrinsic::exp2:
		case Intrinsic::sqrt:
		case Intrinsic::sin:
		case Intrinsic::cos:
		case Intrinsic::pow:
		case Intrinsic::powi:
		case Intrinsic::fma:
		case Intrinsic::fmuladd:
case Intrinsic::convert_from_fp16:		case Intrinsic::convert_from_fp16:
case Intrinsic::convert_to_fp16:		case Intrinsic::convert_to_fp16:
case Intrinsic::bitreverse:		// The intrinsics below depend on rounding mode in MXCSR.
case Intrinsic::amdgcn_cubeid:		case Intrinsic::amdgcn_cubeid:
case Intrinsic::amdgcn_cubema:		case Intrinsic::amdgcn_cubema:
case Intrinsic::amdgcn_cubesc:		case Intrinsic::amdgcn_cubesc:
case Intrinsic::amdgcn_cubetc:		case Intrinsic::amdgcn_cubetc:
case Intrinsic::amdgcn_fmul_legacy:		case Intrinsic::amdgcn_fmul_legacy:
case Intrinsic::amdgcn_fract:		case Intrinsic::amdgcn_fract:
case Intrinsic::x86_sse_cvtss2si:		case Intrinsic::x86_sse_cvtss2si:
case Intrinsic::x86_sse_cvtss2si64:		case Intrinsic::x86_sse_cvtss2si64:
Show All 14 Lines	bool llvm::canConstantFoldCallTo(const CallBase Call, const Function F) {
case Intrinsic::x86_avx512_vcvtss2usi32:		case Intrinsic::x86_avx512_vcvtss2usi32:
case Intrinsic::x86_avx512_vcvtss2usi64:		case Intrinsic::x86_avx512_vcvtss2usi64:
case Intrinsic::x86_avx512_cvttss2usi:		case Intrinsic::x86_avx512_cvttss2usi:
case Intrinsic::x86_avx512_cvttss2usi64:		case Intrinsic::x86_avx512_cvttss2usi64:
case Intrinsic::x86_avx512_vcvtsd2usi32:		case Intrinsic::x86_avx512_vcvtsd2usi32:
case Intrinsic::x86_avx512_vcvtsd2usi64:		case Intrinsic::x86_avx512_vcvtsd2usi64:
case Intrinsic::x86_avx512_cvttsd2usi:		case Intrinsic::x86_avx512_cvttsd2usi:
case Intrinsic::x86_avx512_cvttsd2usi64:		case Intrinsic::x86_avx512_cvttsd2usi64:
case Intrinsic::is_constant:		return !Call->isStrictFP();

		// Sign operations are actually bitwise operations, they do not raise
		// exceptions even for SNANs.
		case Intrinsic::fabs:
		case Intrinsic::copysign:
		// Non-constrained variants of rounding operations means default FP
		// environment, they can be folded in any case.
		case Intrinsic::ceil:
		case Intrinsic::floor:
		case Intrinsic::round:
		case Intrinsic::trunc:
		case Intrinsic::nearbyint:
		case Intrinsic::rint:
		// Constrained intrinsics can be folded if FP environment is known
		// to compiler.
		case Intrinsic::experimental_constrained_ceil:
		case Intrinsic::experimental_constrained_floor:
		case Intrinsic::experimental_constrained_round:
		case Intrinsic::experimental_constrained_trunc:
		case Intrinsic::experimental_constrained_nearbyint:
		case Intrinsic::experimental_constrained_rint:
evandro: Or should `return !Call->isStrictFP();` be inserted here?
andrew.w.kaylor: This is a reasonable suggestion.
return true;		return true;
default:		default:
return false;		return false;
case Intrinsic::not_intrinsic: break;		case Intrinsic::not_intrinsic: break;
}		}

if (!F->hasName())		if (!F->hasName() || Call->isStrictFP())
return false;		return false;

// In these cases, the check of the length is required. We don't want to		// In these cases, the check of the length is required. We don't want to
// return true for a name like "cos\0blah" which strcmp would return equal to		// return true for a name like "cos\0blah" which strcmp would return equal to
// "cos", but has length 8.		// "cos", but has length 8.
StringRef Name = F->getName();		StringRef Name = F->getName();
switch (Name[0]) {		switch (Name[0]) {
default:		default:
▲ Show 20 Lines • Show All 245 Lines • ▼ Show 20 Lines	if (IntrinsicID == Intrinsic::convert_to_fp16) {
Val.convert(APFloat::IEEEhalf(), APFloat::rmNearestTiesToEven, &lost);		Val.convert(APFloat::IEEEhalf(), APFloat::rmNearestTiesToEven, &lost);

return ConstantInt::get(Ty->getContext(), Val.bitcastToAPInt());		return ConstantInt::get(Ty->getContext(), Val.bitcastToAPInt());
}		}

if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy())		if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy())
return nullptr;		return nullptr;

// Use internal versions of these intrinsics.		// Use internal versions of these intrinsics.
craig.topper: Does moving this up change the behavior of non-constrained intrinsics?
sepavloff: Changed the patch so that the behavior of non-constrained intrinsics remains unaffected.
APFloat U = Op->getValueAPF();		APFloat U = Op->getValueAPF();

if (IntrinsicID == Intrinsic::nearbyint || IntrinsicID == Intrinsic::rint) {		if (IntrinsicID == Intrinsic::nearbyint || IntrinsicID == Intrinsic::rint) {
U.roundToIntegral(APFloat::rmNearestTiesToEven);		U.roundToIntegral(APFloat::rmNearestTiesToEven);
return ConstantFP::get(Ty->getContext(), U);		return ConstantFP::get(Ty->getContext(), U);
}		}

if (IntrinsicID == Intrinsic::round) {		if (IntrinsicID == Intrinsic::round) {
U.roundToIntegral(APFloat::rmNearestTiesToAway);		U.roundToIntegral(APFloat::rmNearestTiesToAway);
return ConstantFP::get(Ty->getContext(), U);		return ConstantFP::get(Ty->getContext(), U);
}		}

if (IntrinsicID == Intrinsic::ceil) {		if (IntrinsicID == Intrinsic::ceil) {
U.roundToIntegral(APFloat::rmTowardPositive);		U.roundToIntegral(APFloat::rmTowardPositive);
return ConstantFP::get(Ty->getContext(), U);		return ConstantFP::get(Ty->getContext(), U);
kpn: I thought rint could raise an inexact exception? Also, what happens if we don't know the floating point environment because of FENV_ACCESS=ON and no other math flags or math #pragmas have been given? Shouldn't the infrastructure for that go into clang first?
sepavloff: `rint` may raise the inexact floating-point exception but does not have to. The result of a rounding operation never needs further rounding, so it is always exact. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm for related ideas. If we don't know the rounding mode, the corresponding intrinsic should have "round.dynamic" as its argument; `getAPFloatRoundingMode` returns `None` in this case and the expression is not folded. Yes, it is clang that must emit "round.dynamic". In D69272 there is a test that checks such behavior.
kpn: But the latest C2x standard still has the same description of rint from C99. The standard hasn't changed. So, are we allowed to simply drop an exception? I thought we weren't? What's the definition of "may"? Is it (1) if conditions warrant, an exception _shall_ be issued, or (2) if conditions warrant, the decision to raise an exception is _implementation defined_? If it's #1 then we're dropping an exception we should be issuing.
sepavloff: The `Inexact` exception is raised when the result of an operation cannot be represented without loss of precision. The result of any rounding operation is an integer, which can always be represented without loss of significant digits. The reason `rint` exists is that it allows a faster implementation than `nearbyint` in some cases. I don't know these particular cases, but the cited defect report mentions them. So `rint` is just a loose version of `nearbyint`. There are no cases in which this exception would make sense.
kpn: After sleeping on it I agree. Having rint() raise Inexact in the exact circumstance where you needed to call rint() anyway isn't helpful. So I'd guess that this is case #2 above and we can elide the exception. Objection withdrawn, and sorry I wasn't quicker to come to this conclusion.
}		}

if (IntrinsicID == Intrinsic::floor) {		if (IntrinsicID == Intrinsic::floor) {
U.roundToIntegral(APFloat::rmTowardNegative);		U.roundToIntegral(APFloat::rmTowardNegative);
return ConstantFP::get(Ty->getContext(), U);		return ConstantFP::get(Ty->getContext(), U);
}		}

if (IntrinsicID == Intrinsic::trunc) {		if (IntrinsicID == Intrinsic::trunc) {
Show All 14 Lines	if (IntrinsicID == Intrinsic::amdgcn_fract) {
APFloat FloorU(U);		APFloat FloorU(U);
FloorU.roundToIntegral(APFloat::rmTowardNegative);		FloorU.roundToIntegral(APFloat::rmTowardNegative);
APFloat FractU(U - FloorU);		APFloat FractU(U - FloorU);
APFloat AlmostOne(U.getSemantics(), 1);		APFloat AlmostOne(U.getSemantics(), 1);
AlmostOne.next(/*nextDown*/ true);		AlmostOne.next(/*nextDown*/ true);
return ConstantFP::get(Ty->getContext(), minimum(FractU, AlmostOne));		return ConstantFP::get(Ty->getContext(), minimum(FractU, AlmostOne));
}		}

		// Rounding operations (floor, trunc, ceil, round and nearbyint) do not
		// raise FP exceptions, unless the argument is signaling NaN.

		Optional<APFloat::roundingMode> RM;
		switch (IntrinsicID) {
		default:
		break;
		case Intrinsic::experimental_constrained_nearbyint:
		case Intrinsic::experimental_constrained_rint: {
		auto CI = cast<ConstrainedFPIntrinsic>(Call);
		Optional<fp::RoundingMode> RMOp = CI->getRoundingMode();
		if (RMOp)
		RM = getAPFloatRoundingMode(*RMOp);
		if (!RM)
		return nullptr;
		break;
		}
		case Intrinsic::experimental_constrained_round:
		RM = APFloat::rmNearestTiesToAway;
		break;
		case Intrinsic::experimental_constrained_ceil:
		RM = APFloat::rmTowardPositive;
		break;
		case Intrinsic::experimental_constrained_floor:
andrew.w.kaylor: For all of these except nearbyint, you need to check the return code and only remove the call if no flags are raised or exceptionBehavior is ignore. The rest of the functions raise the inexact flag if the value is not already an integer. We could both keep the call and RAUW the return value. We need the exception to be raised, but the constant folding could unlock additional optimizations. It might be useful to introduce new intrinsics like llvm.raiseinexact() so that we could later combine multiple calls raising the same flag.
sepavloff:
> For all of these except nearbyint, you need to check the return code
None of the rounding functions raises an exception if the argument is not an SNaN. All big numbers are already integers, so overflow cannot occur. The result of rounding is an integer, so underflow and inexact exceptions are also absent. The mention of the inexact exception in the description of `rint` in the C standard is somewhat misleading; it was discussed earlier: https://reviews.llvm.org/D72930#inline-676354.
andrew.w.kaylor: My thoughts here:
(1) We aren't implementing the C standard. LLVM IR is language-independent and needs to be suitable for any language. We should clearly define the semantics of the constrained intrinsics (ideally more clearly than the C standard does for the corresponding functions), and those semantics should be suitable for any language. Depending on what we decide, that may mean languages that want stricter behavior might need to avoid using the intrinsics and instead provide a library call that does what they need.
(2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version.
(3) The point of the strict exception semantics mode is that any exception that would be raised by a literal translation of the code will be raised when the code is optimized in the strict mode. If a call to one of these functions (in the case where this was a function call), or the instructions to which it would be lowered (in the case where the intrinsic is lowered directly to instructions), would raise an exception, then we should not fold the exception away in this mode.
(4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x86-64 GNU math library only raises the exception for rint. Other implementations may behave differently.
I certainly understand why we would want to be able to constant fold in all these cases. However, I'm not convinced that doing so is always the right thing to do. At the very least, I think we'd need either some option to control this behavior or perhaps a new interface in TTI to query the behavior of these functions in the target math library.
sepavloff: This is a rather complicated topic. Let's consider it in detail.

IEEE 754

This standard defines the cases in which an operation should raise the inexact exception (7.6):
> Unless stated otherwise, if the rounded result of an operation is inexact - that is, it differs from what would have been computed were both exponent range and precision unbounded - then the inexact exception shall be signaled.
The result of a rounding operation is an integer value in the same floating-point format as its argument. Integers are always exact, so one could expect that inexact exceptions are never raised by rounding operations. However, the standard treats inexactness in a wider sense when dealing with rounding.
The standard defines a set of rounding operations that produce values in floating-point formats in 5.9:
- roundToIntegralTiesToEven
- roundToIntegralTowardZero
- roundToIntegralTowardPositive
- roundToIntegralTowardNegative
- roundToIntegralTiesToAway
It states that:
> These operations shall not signal any exception except for signaling NaN input.
It also defines one more operation:
> For the following operation, the rounding direction is the applicable rounding-direction attribute. This operation signals the invalid operation exception for a signaling NaN operand, and for a numerical operand, signals the inexact exception if the result does not have the same numerical value as x.
> - sourceFormat roundToIntegralExact(source)
> roundToIntegralExact(x) rounds x to an integral value according to the applicable rounding direction attribute.

C standard

The most recent draft (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2454.pdf) defines the functions `ceil`, `floor`, `round`, `roundeven` and `trunc`, which correspond strictly to the rounding operations defined in IEEE Std 754-2008. `nearbyint` does not have a strict counterpart in IEEE 754, but dynamically selects one of the standard operations depending on the current rounding mode.
As for `rint`, its description in 7.12.9.4 states that it `may signal` and does not specify when it may signal. In the recent draft cited above (but not in the previous versions of the C standard) it is stated (F.10p6):
> The functions bound to operations in IEC 60559 (F.3) are fully specified by IEC 60559, including rounding behaviors and floating-point exceptions.
F.3p1 specifies that `rint` is bound to `roundToIntegralExact`, `ceil` to `roundToIntegralTowardPositive`, and so on. So depending on the viewpoint, `rint` either may raise the `inexact` exception (because of 7.12.9.4) or must do it (because of F.10p6). The former reflects the perception of rounding as an operation, see the defect report http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_291.htm, although it is probably outdated.

LLVM

LLVM IR is language-independent. The intrinsic functions `ceil`, `floor`, `round`, `roundeven` and `trunc` are defined by the C standard, but they correspond to rounding operations defined by IEEE 754, which is also language-independent. It looks reasonable to implement them according to their IEEE 754 counterparts. This means that all these intrinsics except `rint` do not raise the `inexact` exception, and `rint` raises the `inexact` exception if its result differs from the argument.

> (2) The C standard seems to be evolving on this point. The C99 standard says the rint function may raise inexact and nearbyint will not. It is silent on the other functions here. I believe the C11 standard says the rint functions "do" raise the inexact exception, and that ceil, floor, round, and trunc functions "may, but are not required to," raise inexact. Lacking information about which version of the standard the user requested, we need behavior that is suitable to any version.
Exactly. In C11 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) there is a statement about `ceil` (F.10.6.1):
> The ceil functions may, but are not required to, raise the ‘‘inexact’’ floating-point exception for finite non-integer arguments, as this implementation does.
Similar statements existed for `floor`, `round` and `trunc`. The recent draft does not provide such an option; `inexact` is not allowed for these functions.

> (4) I know at least one math library implementation that raises inexact for rint, ceil, floor, round, and trunc. My earlier claim was based on writing a program and seeing what happened. The Intel math library raises exceptions for each of these functions. The x86-64 GNU math library only raises the exception for rint. Other implementations may behave differently.
I believe this is caused by the evolution of the C standard. GCC implemented a special option, `-fno-fp-int-builtin-inexact`, to deal with compatibility issues (https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02060.html). We could implement this option in clang if necessary.
andrew.w.kaylor: Thank you for the very thorough answer! Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions. That would satisfy my objection. I have verified that the latest Fortran standard also defines these functions in terms of their IEEE 754 counterparts, and I think that's a reasonable expectation for any front end that wants strict floating point semantics.
The more precisely specified exception semantics do present a potential issue for backends. The older "may, but are not required to, raise inexact" allows for a wider variety of implementations. Semantics that do not permit exceptions may require special handling. For example, the x87 FRNDINT instruction raises the inexact exception and so could only be used with rint.
Because the non-constrained versions of these functions are defined to assume a floating point environment with no side effects and the constrained versions are still considered experimental, I think we are free to just declare their semantics to be bound to the IEEE-754 operations.
If we are agreed on this, then only the rint folding here would need to handle the inexact condition.
kpn:
> Perhaps we should document the LLVM floating point intrinsics in terms of the IEEE 754-2019 operations rather than the libm functions.
I like this idea. +1
		RM = APFloat::rmTowardNegative;
		break;
		case Intrinsic::experimental_constrained_trunc:
		RM = APFloat::rmTowardZero;
		break;
		}
		if (RM) {
		auto CI = cast<ConstrainedFPIntrinsic>(Call);
		if (U.isFinite()) {
		APFloat::opStatus St = U.roundToIntegral(*RM);
		if (IntrinsicID == Intrinsic::experimental_constrained_rint &&
		St == APFloat::opInexact) {
		Optional<fp::ExceptionBehavior> EB = CI->getExceptionBehavior();
		if (EB && *EB == fp::ebStrict)
		return nullptr;
		}
		} else if (U.isSignaling()) {
		Optional<fp::ExceptionBehavior> EB = CI->getExceptionBehavior();
		if (EB && *EB != fp::ebIgnore)
		return nullptr;
		U = APFloat::getQNaN(U.getSemantics());
		}
		return ConstantFP::get(Ty->getContext(), U);
		}

/// We only fold functions with finite arguments. Folding NaN and inf is		/// We only fold functions with finite arguments. Folding NaN and inf is
/// likely to be aborted with an exception anyway, and some host libms		/// likely to be aborted with an exception anyway, and some host libms
/// have known errors raising exceptions.		/// have known errors raising exceptions.
if (!U.isFinite())		if (!U.isFinite())
return nullptr;		return nullptr;

/// Currently APFloat versions of these functions do not exist, so we use		/// Currently APFloat versions of these functions do not exist, so we use
/// the host native double versions. Float versions are not called		/// the host native double versions. Float versions are not called
▲ Show 20 Lines • Show All 765 Lines • ▼ Show 20 Lines	static Constant *ConstantFoldVectorCall(StringRef Name,
return ConstantVector::get(Result);		return ConstantVector::get(Result);
}		}

} // end anonymous namespace		} // end anonymous namespace

Constant *llvm::ConstantFoldCall(const CallBase *Call, Function *F,		Constant *llvm::ConstantFoldCall(const CallBase *Call, Function *F,
ArrayRef<Constant *> Operands,		ArrayRef<Constant *> Operands,
const TargetLibraryInfo *TLI) {		const TargetLibraryInfo *TLI) {
if (Call->isNoBuiltin() || Call->isStrictFP())		if (Call->isNoBuiltin())
evandro: Again, not sure about the impact of doing this...
sepavloff: The same thing. If an operation depends on or changes the current floating point environment, it is represented by the corresponding constrained intrinsic in a strictfp function.
andrew.w.kaylor: This is probably OK. I think these changes will always end up going through the other function for FP calls.
return nullptr;		return nullptr;
if (!F->hasName())		if (!F->hasName())
return nullptr;		return nullptr;
StringRef Name = F->getName();		StringRef Name = F->getName();

Type *Ty = F->getReturnType();		Type *Ty = F->getReturnType();

if (auto *VTy = dyn_cast<VectorType>(Ty))		if (auto *VTy = dyn_cast<VectorType>(Ty))
▲ Show 20 Lines • Show All 155 Lines • Show Last 20 Lines
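
The folding rule the reviewers converge on above can be sketched outside of LLVM as follows. This is only an illustration under stated assumptions, not the code from the patch: `RoundOp`, `ExceptionBehavior` and `foldRounding` are hypothetical names, `double` stands in for `APFloat`, the signaling-NaN bit is passed in explicitly because `<cmath>` has no portable test for it, and `nearbyint`/`rint` are assumed to run under round-to-nearest. It requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical stand-ins for fp::ExceptionBehavior and the intrinsic IDs.
enum class ExceptionBehavior { Ignore, MayTrap, Strict };
enum class RoundOp { Ceil, Floor, Trunc, Round, NearbyInt, Rint };

// Returns the folded value, or std::nullopt when the call has to be kept.
std::optional<double> foldRounding(RoundOp Op, double X, bool IsSignalingNaN,
                                   ExceptionBehavior EB) {
  if (std::isfinite(X)) {
    double R = X;
    switch (Op) {
    case RoundOp::Ceil:  R = std::ceil(X);  break;
    case RoundOp::Floor: R = std::floor(X); break;
    case RoundOp::Trunc: R = std::trunc(X); break;
    case RoundOp::Round: R = std::round(X); break;
    case RoundOp::NearbyInt:
    case RoundOp::Rint:
      // Assumption: the current rounding mode is the default (to nearest).
      R = std::nearbyint(X);
      break;
    }
    // Only rint is treated as roundToIntegralExact, so only rint can signal
    // inexact; under strict exception semantics that side effect must stay.
    if (Op == RoundOp::Rint && R != X && EB == ExceptionBehavior::Strict)
      return std::nullopt;
    return R;
  }
  if (std::isnan(X) && IsSignalingNaN) {
    // A signaling NaN raises invalid; fold only if exceptions are ignored.
    if (EB != ExceptionBehavior::Ignore)
      return std::nullopt;
    return std::nan("");
  }
  return X; // quiet NaN or infinity passes through unchanged
}
```

With this sketch, `foldRounding(RoundOp::Rint, 2.5, false, ExceptionBehavior::Strict)` declines to fold, while the same call with `RoundOp::Ceil` folds to 3.0, which mirrors the agreement that only the `rint` folding needs to handle the inexact condition.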

llvm/lib/Analysis/InstructionSimplify.cpp

Show First 20 Lines • Show All 5,362 Lines • ▼ Show 20 Lines	Value *llvm::SimplifyCall(CallBase *Call, const SimplifyQuery &Q) {
if (!canConstantFoldCallTo(Call, F))		if (!canConstantFoldCallTo(Call, F))
return nullptr;		return nullptr;

SmallVector<Constant *, 4> ConstantArgs;		SmallVector<Constant *, 4> ConstantArgs;
unsigned NumArgs = Call->getNumArgOperands();		unsigned NumArgs = Call->getNumArgOperands();
ConstantArgs.reserve(NumArgs);		ConstantArgs.reserve(NumArgs);
for (auto &Arg : Call->args()) {		for (auto &Arg : Call->args()) {
Constant *C = dyn_cast<Constant>(&Arg);		Constant *C = dyn_cast<Constant>(&Arg);
if (!C)		if (!C) {
		if (isa<MetadataAsValue>(Arg.get()))
		continue;
return nullptr;		return nullptr;
		}
ConstantArgs.push_back(C);		ConstantArgs.push_back(C);
}		}

return ConstantFoldCall(Call, F, ConstantArgs, Q.TLI);		return ConstantFoldCall(Call, F, ConstantArgs, Q.TLI);
}		}

/// Given operands for a Freeze, see if we can fold the result.		/// Given operands for a Freeze, see if we can fold the result.
static Value *SimplifyFreezeInst(Value *Op0, const SimplifyQuery &Q) {		static Value *SimplifyFreezeInst(Value *Op0, const SimplifyQuery &Q) {
▲ Show 20 Lines • Show All 301 Lines • Show Last 20 Lines

llvm/lib/IR/FPEnv.cpp

Show First 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	case fp::ebIgnore:
break;		break;
case fp::ebMayTrap:		case fp::ebMayTrap:
ExceptStr = "fpexcept.maytrap";		ExceptStr = "fpexcept.maytrap";
break;		break;
}		}
return ExceptStr;		return ExceptStr;
}		}

		Optional<APFloatBase::roundingMode>
		getAPFloatRoundingMode(fp::RoundingMode RM) {
		switch (RM) {
		case fp::rmDynamic:
		return None;
		case fp::rmToNearest:
		return APFloat::rmNearestTiesToEven;
		case fp::rmDownward:
		return APFloat::rmTowardNegative;
		case fp::rmUpward:
		return APFloat::rmTowardPositive;
		case fp::rmTowardZero:
		return APFloat::rmTowardZero;
		}
		llvm_unreachable("Unexpected rounding mode");
		}
}		}
craig.topper: rmToNearest should map to APFloat::rmNearestTiesToEven.
sepavloff:
> rmToNearest should map to APFloat::rmNearestTiesToEven
Why? It seems that the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) does not specify which of the two IEEE-754 rounding modes corresponds to 'nearest'. The description of `round`, however, contains a definite requirement (7.12.9.6):
> The round functions round their argument to the nearest integer value in floating-point format, rounding halfway cases away from zero, regardless of the current rounding direction.
In this case the rounding mode is `roundTiesToAway`. Why must it be `roundTiesToEven` in other cases?
craig.topper: As far as I know the default environment is supposed to be TiesToEven. That's what all of the constant folding for non-strict FP assumes. The round function itself is weird and specifies a different behavior.
sepavloff: Yes, I see that ConstantFolding.cpp mainly uses `roundTiesToEven`, and I will change the patch. It would be nice to understand why this is so. What worries me is the unusual behavior of `roundTiesToEven`. For instance, 10.5 is rounded to 10.0. The behavior of `round` is more familiar.
andrew.w.kaylor: The IEEE 754-2019 specification says this in section 4.3.3: "The roundTiesToEven rounding-direction attribute shall be the default rounding-direction attribute for results in binary formats." Although it describes roundTiesToAway, it says that rounding mode isn't required for binary format implementations.
More practically, I believe most hardware rounds ties to even when the round-to-nearest mode is selected. I know this is the case for x86 architecture processors. This is likely the behavior for all architectures that don't support both tie-breaking methods.
craig.topper: roundTiesToAway is more familiar because it's what we're taught in school when we're young. But it introduces bias in the results. Rounding to even can cancel out some bias. See the "round half to even" section here https://en.wikipedia.org/wiki/Rounding
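
The bias argument in this thread can be checked with a small standalone program. The sketch below is an illustration only and is not part of the patch: `sumTiesToEven` and `sumTiesAway` are hypothetical helper names, and `std::nearbyint` is assumed to run under the default round-to-nearest mode, where it breaks ties to even, while `std::round` always breaks ties away from zero.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of the inputs after rounding each one to nearest, ties to even.
// Assumption: the current rounding mode is the default (to nearest).
double sumTiesToEven(const std::vector<double> &Values) {
  double Sum = 0.0;
  for (double V : Values)
    Sum += std::nearbyint(V);
  return Sum;
}

// Sum of the inputs after rounding each one to nearest, ties away from zero.
double sumTiesAway(const std::vector<double> &Values) {
  double Sum = 0.0;
  for (double V : Values)
    Sum += std::round(V);
  return Sum;
}
```

For the inputs 0.5, 1.5, 2.5 and 3.5, whose exact sum is 8, rounding ties to even gives 0 + 2 + 2 + 4 = 8, while rounding ties away from zero gives 1 + 2 + 3 + 4 = 10. The upward drift in the second sum is the bias mentioned above.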

llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll

Show First 20 Lines • Show All 6,367 Lines • ▼ Show 20 Lines	%result = call <4 x double> @llvm.experimental.constrained.fpext.v4f64.v4f32(
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <4 x double> %result		ret <4 x double> %result
}		}

define <1 x float> @constrained_vector_ceil_v1f32() #0 {		define <1 x float> @constrained_vector_ceil_v1f32() #0 {
; PC64LE-LABEL: constrained_vector_ceil_v1f32:		; PC64LE-LABEL: constrained_vector_ceil_v1f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI103_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI103_0@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI103_0@toc@l(3)		; PC64LE-NEXT: addi 3, 3, .LCPI103_0@toc@l
; PC64LE-NEXT: xsrdpip 0, 0		; PC64LE-NEXT: lfiwzx 0, 0, 3
; PC64LE-NEXT: xscvdpspn 0, 0		; PC64LE-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_ceil_v1f32:		; PC64LE9-LABEL: constrained_vector_ceil_v1f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI103_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI103_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI103_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI103_0@toc@l
; PC64LE9-NEXT: xsrdpip 0, 0		; PC64LE9-NEXT: lfiwzx 0, 0, 3
; PC64LE9-NEXT: xscvdpspn 0, 0		; PC64LE9-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE9-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%ceil = call <1 x float> @llvm.experimental.constrained.ceil.v1f32(		%ceil = call <1 x float> @llvm.experimental.constrained.ceil.v1f32(
<1 x float> <float 1.5>,		<1 x float> <float 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <1 x float> %ceil		ret <1 x float> %ceil
}		}

define <2 x double> @constrained_vector_ceil_v2f64() #0 {		define <2 x double> @constrained_vector_ceil_v2f64() #0 {
; PC64LE-LABEL: constrained_vector_ceil_v2f64:		; PC64LE-LABEL: constrained_vector_ceil_v2f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI104_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI104_0@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI104_0@toc@l		; PC64LE-NEXT: addi 3, 3, .LCPI104_0@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: xxswapd 34, 0
; PC64LE-NEXT: xvrdpip 34, 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_ceil_v2f64:		; PC64LE9-LABEL: constrained_vector_ceil_v2f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI104_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI104_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI104_0@toc@l		; PC64LE9-NEXT: addi 3, 3, .LCPI104_0@toc@l
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: xvrdpip 34, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%ceil = call <2 x double> @llvm.experimental.constrained.ceil.v2f64(		%ceil = call <2 x double> @llvm.experimental.constrained.ceil.v2f64(
<2 x double> <double 1.1, double 1.9>,		<2 x double> <double 1.1, double 1.9>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <2 x double> %ceil		ret <2 x double> %ceil
}		}

define <3 x float> @constrained_vector_ceil_v3f32() #0 {		define <3 x float> @constrained_vector_ceil_v3f32() #0 {
; PC64LE-LABEL: constrained_vector_ceil_v3f32:		; PC64LE-LABEL: constrained_vector_ceil_v3f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI105_2@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI105_1@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI105_2@toc@l(3)
; PC64LE-NEXT: lfs 1, .LCPI105_1@toc@l(4)
; PC64LE-NEXT: addis 3, 2, .LCPI105_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI105_0@toc@ha
; PC64LE-NEXT: xsrdpip 0, 0		; PC64LE-NEXT: addi 3, 3, .LCPI105_0@toc@l
; PC64LE-NEXT: lfs 2, .LCPI105_0@toc@l(3)		; PC64LE-NEXT: lvx 2, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI105_3@toc@ha
; PC64LE-NEXT: xsrdpip 1, 1
; PC64LE-NEXT: addi 3, 3, .LCPI105_3@toc@l
; PC64LE-NEXT: xsrdpip 2, 2
; PC64LE-NEXT: xscvdpspn 0, 0
; PC64LE-NEXT: xscvdpspn 1, 1
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: xscvdpspn 0, 2
; PC64LE-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE-NEXT: vmrglw 2, 3, 2
; PC64LE-NEXT: lvx 3, 0, 3
; PC64LE-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE-NEXT: vperm 2, 4, 2, 3
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_ceil_v3f32:		; PC64LE9-LABEL: constrained_vector_ceil_v3f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI105_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI105_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI105_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI105_0@toc@l
; PC64LE9-NEXT: addis 3, 2, .LCPI105_1@toc@ha		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: lfs 1, .LCPI105_1@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI105_2@toc@ha
; PC64LE9-NEXT: xsrdpip 0, 0
; PC64LE9-NEXT: lfs 2, .LCPI105_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI105_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI105_3@toc@l
; PC64LE9-NEXT: xsrdpip 1, 1
; PC64LE9-NEXT: xsrdpip 2, 2
; PC64LE9-NEXT: xscvdpspn 0, 0
; PC64LE9-NEXT: xscvdpspn 1, 1
; PC64LE9-NEXT: xscvdpspn 2, 2
; PC64LE9-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE9-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE9-NEXT: xxsldwi 34, 2, 2, 1
; PC64LE9-NEXT: vmrglw 2, 3, 2
; PC64LE9-NEXT: lxvx 35, 0, 3
; PC64LE9-NEXT: vperm 2, 4, 2, 3
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%ceil = call <3 x float> @llvm.experimental.constrained.ceil.v3f32(		%ceil = call <3 x float> @llvm.experimental.constrained.ceil.v3f32(
<3 x float> <float 1.5, float 2.5, float 3.5>,		<3 x float> <float 1.5, float 2.5, float 3.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x float> %ceil		ret <3 x float> %ceil
}		}

define <3 x double> @constrained_vector_ceil_v3f64() #0 {		define <3 x double> @constrained_vector_ceil_v3f64() #0 {
; PC64LE-LABEL: constrained_vector_ceil_v3f64:		; PC64LE-LABEL: constrained_vector_ceil_v3f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI106_1@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI106_1@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI106_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI106_0@toc@ha
; PC64LE-NEXT: lfs 1, .LCPI106_0@toc@l(3)		; PC64LE-NEXT: lfs 1, .LCPI106_0@toc@l(3)
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: fmr 2, 1
; PC64LE-NEXT: xsrdpip 3, 1		; PC64LE-NEXT: fmr 3, 1
		kpn: Now that I look at this, it might have been nice if all along we'd been writing test cases that gave some different values for the vector elements. Oh well. I doubt it's worth changing now.
; PC64LE-NEXT: xvrdpip 2, 0
; PC64LE-NEXT: xxswapd 1, 2
; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_ceil_v3f64:		; PC64LE9-LABEL: constrained_vector_ceil_v3f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI106_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI106_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI106_0@toc@l(3)		; PC64LE9-NEXT: lfs 1, .LCPI106_0@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI106_1@toc@ha		; PC64LE9-NEXT: fmr 2, 1
; PC64LE9-NEXT: addi 3, 3, .LCPI106_1@toc@l		; PC64LE9-NEXT: fmr 3, 1
; PC64LE9-NEXT: xsrdpip 3, 0
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: xvrdpip 2, 0
; PC64LE9-NEXT: xxswapd 1, 2
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%ceil = call <3 x double> @llvm.experimental.constrained.ceil.v3f64(		%ceil = call <3 x double> @llvm.experimental.constrained.ceil.v3f64(
<3 x double> <double 1.1, double 1.9, double 1.5>,		<3 x double> <double 1.1, double 1.9, double 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x double> %ceil		ret <3 x double> %ceil
}		}

define <1 x float> @constrained_vector_floor_v1f32() #0 {		define <1 x float> @constrained_vector_floor_v1f32() #0 {
; PC64LE-LABEL: constrained_vector_floor_v1f32:		; PC64LE-LABEL: constrained_vector_floor_v1f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI107_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI107_0@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI107_0@toc@l(3)		; PC64LE-NEXT: addi 3, 3, .LCPI107_0@toc@l
; PC64LE-NEXT: xsrdpim 0, 0		; PC64LE-NEXT: lfiwzx 0, 0, 3
; PC64LE-NEXT: xscvdpspn 0, 0		; PC64LE-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_floor_v1f32:		; PC64LE9-LABEL: constrained_vector_floor_v1f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI107_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI107_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI107_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI107_0@toc@l
; PC64LE9-NEXT: xsrdpim 0, 0		; PC64LE9-NEXT: lfiwzx 0, 0, 3
; PC64LE9-NEXT: xscvdpspn 0, 0		; PC64LE9-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE9-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%floor = call <1 x float> @llvm.experimental.constrained.floor.v1f32(		%floor = call <1 x float> @llvm.experimental.constrained.floor.v1f32(
<1 x float> <float 1.5>,		<1 x float> <float 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <1 x float> %floor		ret <1 x float> %floor
}		}


define <2 x double> @constrained_vector_floor_v2f64() #0 {		define <2 x double> @constrained_vector_floor_v2f64() #0 {
; PC64LE-LABEL: constrained_vector_floor_v2f64:		; PC64LE-LABEL: constrained_vector_floor_v2f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI108_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI108_0@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI108_0@toc@l		; PC64LE-NEXT: addi 3, 3, .LCPI108_0@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: xxswapd 34, 0
; PC64LE-NEXT: xvrdpim 34, 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_floor_v2f64:		; PC64LE9-LABEL: constrained_vector_floor_v2f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI108_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI108_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI108_0@toc@l		; PC64LE9-NEXT: addi 3, 3, .LCPI108_0@toc@l
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: xvrdpim 34, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%floor = call <2 x double> @llvm.experimental.constrained.floor.v2f64(		%floor = call <2 x double> @llvm.experimental.constrained.floor.v2f64(
<2 x double> <double 1.1, double 1.9>,		<2 x double> <double 1.1, double 1.9>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <2 x double> %floor		ret <2 x double> %floor
}		}

define <3 x float> @constrained_vector_floor_v3f32() #0 {		define <3 x float> @constrained_vector_floor_v3f32() #0 {
; PC64LE-LABEL: constrained_vector_floor_v3f32:		; PC64LE-LABEL: constrained_vector_floor_v3f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI109_2@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI109_1@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI109_2@toc@l(3)
; PC64LE-NEXT: lfs 1, .LCPI109_1@toc@l(4)
; PC64LE-NEXT: addis 3, 2, .LCPI109_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI109_0@toc@ha
; PC64LE-NEXT: xsrdpim 0, 0		; PC64LE-NEXT: addi 3, 3, .LCPI109_0@toc@l
; PC64LE-NEXT: lfs 2, .LCPI109_0@toc@l(3)		; PC64LE-NEXT: lvx 2, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI109_3@toc@ha
; PC64LE-NEXT: xsrdpim 1, 1
; PC64LE-NEXT: addi 3, 3, .LCPI109_3@toc@l
; PC64LE-NEXT: xsrdpim 2, 2
; PC64LE-NEXT: xscvdpspn 0, 0
; PC64LE-NEXT: xscvdpspn 1, 1
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: xscvdpspn 0, 2
; PC64LE-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE-NEXT: vmrglw 2, 3, 2
; PC64LE-NEXT: lvx 3, 0, 3
; PC64LE-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE-NEXT: vperm 2, 4, 2, 3
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_floor_v3f32:		; PC64LE9-LABEL: constrained_vector_floor_v3f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI109_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI109_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI109_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI109_0@toc@l
; PC64LE9-NEXT: addis 3, 2, .LCPI109_1@toc@ha		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: lfs 1, .LCPI109_1@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI109_2@toc@ha
; PC64LE9-NEXT: xsrdpim 0, 0
; PC64LE9-NEXT: lfs 2, .LCPI109_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI109_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI109_3@toc@l
; PC64LE9-NEXT: xsrdpim 1, 1
; PC64LE9-NEXT: xsrdpim 2, 2
; PC64LE9-NEXT: xscvdpspn 0, 0
; PC64LE9-NEXT: xscvdpspn 1, 1
; PC64LE9-NEXT: xscvdpspn 2, 2
; PC64LE9-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE9-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE9-NEXT: xxsldwi 34, 2, 2, 1
; PC64LE9-NEXT: vmrglw 2, 3, 2
; PC64LE9-NEXT: lxvx 35, 0, 3
; PC64LE9-NEXT: vperm 2, 4, 2, 3
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%floor = call <3 x float> @llvm.experimental.constrained.floor.v3f32(		%floor = call <3 x float> @llvm.experimental.constrained.floor.v3f32(
<3 x float> <float 1.5, float 2.5, float 3.5>,		<3 x float> <float 1.5, float 2.5, float 3.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x float> %floor		ret <3 x float> %floor
}		}

define <3 x double> @constrained_vector_floor_v3f64() #0 {		define <3 x double> @constrained_vector_floor_v3f64() #0 {
; PC64LE-LABEL: constrained_vector_floor_v3f64:		; PC64LE-LABEL: constrained_vector_floor_v3f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI110_1@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI110_1@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI110_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI110_0@toc@ha
; PC64LE-NEXT: lfs 1, .LCPI110_0@toc@l(3)		; PC64LE-NEXT: lfs 1, .LCPI110_0@toc@l(3)
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: fmr 2, 1
; PC64LE-NEXT: xsrdpim 3, 1		; PC64LE-NEXT: fmr 3, 1
; PC64LE-NEXT: xvrdpim 2, 0
; PC64LE-NEXT: xxswapd 1, 2
; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_floor_v3f64:		; PC64LE9-LABEL: constrained_vector_floor_v3f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI110_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI110_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI110_0@toc@l(3)		; PC64LE9-NEXT: lfs 1, .LCPI110_0@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI110_1@toc@ha		; PC64LE9-NEXT: fmr 2, 1
; PC64LE9-NEXT: addi 3, 3, .LCPI110_1@toc@l		; PC64LE9-NEXT: fmr 3, 1
; PC64LE9-NEXT: xsrdpim 3, 0
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: xvrdpim 2, 0
; PC64LE9-NEXT: xxswapd 1, 2
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%floor = call <3 x double> @llvm.experimental.constrained.floor.v3f64(		%floor = call <3 x double> @llvm.experimental.constrained.floor.v3f64(
<3 x double> <double 1.1, double 1.9, double 1.5>,		<3 x double> <double 1.1, double 1.9, double 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x double> %floor		ret <3 x double> %floor
}		}

define <1 x float> @constrained_vector_round_v1f32() #0 {		define <1 x float> @constrained_vector_round_v1f32() #0 {
; PC64LE-LABEL: constrained_vector_round_v1f32:		; PC64LE-LABEL: constrained_vector_round_v1f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI111_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI111_0@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI111_0@toc@l(3)		; PC64LE-NEXT: addi 3, 3, .LCPI111_0@toc@l
; PC64LE-NEXT: xsrdpi 0, 0		; PC64LE-NEXT: lfiwzx 0, 0, 3
; PC64LE-NEXT: xscvdpspn 0, 0		; PC64LE-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_round_v1f32:		; PC64LE9-LABEL: constrained_vector_round_v1f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI111_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI111_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI111_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI111_0@toc@l
; PC64LE9-NEXT: xsrdpi 0, 0		; PC64LE9-NEXT: lfiwzx 0, 0, 3
; PC64LE9-NEXT: xscvdpspn 0, 0		; PC64LE9-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE9-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%round = call <1 x float> @llvm.experimental.constrained.round.v1f32(		%round = call <1 x float> @llvm.experimental.constrained.round.v1f32(
<1 x float> <float 1.5>,		<1 x float> <float 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <1 x float> %round		ret <1 x float> %round
}		}

define <2 x double> @constrained_vector_round_v2f64() #0 {		define <2 x double> @constrained_vector_round_v2f64() #0 {
; PC64LE-LABEL: constrained_vector_round_v2f64:		; PC64LE-LABEL: constrained_vector_round_v2f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI112_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI112_0@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI112_0@toc@l		; PC64LE-NEXT: addi 3, 3, .LCPI112_0@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: xxswapd 34, 0
; PC64LE-NEXT: xvrdpi 34, 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_round_v2f64:		; PC64LE9-LABEL: constrained_vector_round_v2f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI112_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI112_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI112_0@toc@l		; PC64LE9-NEXT: addi 3, 3, .LCPI112_0@toc@l
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: xvrdpi 34, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%round = call <2 x double> @llvm.experimental.constrained.round.v2f64(		%round = call <2 x double> @llvm.experimental.constrained.round.v2f64(
<2 x double> <double 1.1, double 1.9>,		<2 x double> <double 1.1, double 1.9>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <2 x double> %round		ret <2 x double> %round
}		}

define <3 x float> @constrained_vector_round_v3f32() #0 {		define <3 x float> @constrained_vector_round_v3f32() #0 {
; PC64LE-LABEL: constrained_vector_round_v3f32:		; PC64LE-LABEL: constrained_vector_round_v3f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI113_2@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI113_1@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI113_2@toc@l(3)
; PC64LE-NEXT: lfs 1, .LCPI113_1@toc@l(4)
; PC64LE-NEXT: addis 3, 2, .LCPI113_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI113_0@toc@ha
; PC64LE-NEXT: xsrdpi 0, 0		; PC64LE-NEXT: addi 3, 3, .LCPI113_0@toc@l
; PC64LE-NEXT: lfs 2, .LCPI113_0@toc@l(3)		; PC64LE-NEXT: lvx 2, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI113_3@toc@ha
; PC64LE-NEXT: xsrdpi 1, 1
; PC64LE-NEXT: addi 3, 3, .LCPI113_3@toc@l
; PC64LE-NEXT: xsrdpi 2, 2
; PC64LE-NEXT: xscvdpspn 0, 0
; PC64LE-NEXT: xscvdpspn 1, 1
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: xscvdpspn 0, 2
; PC64LE-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE-NEXT: vmrglw 2, 3, 2
; PC64LE-NEXT: lvx 3, 0, 3
; PC64LE-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE-NEXT: vperm 2, 4, 2, 3
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_round_v3f32:		; PC64LE9-LABEL: constrained_vector_round_v3f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI113_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI113_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI113_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI113_0@toc@l
; PC64LE9-NEXT: addis 3, 2, .LCPI113_1@toc@ha		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: lfs 1, .LCPI113_1@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI113_2@toc@ha
; PC64LE9-NEXT: xsrdpi 0, 0
; PC64LE9-NEXT: lfs 2, .LCPI113_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI113_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI113_3@toc@l
; PC64LE9-NEXT: xsrdpi 1, 1
; PC64LE9-NEXT: xsrdpi 2, 2
; PC64LE9-NEXT: xscvdpspn 0, 0
; PC64LE9-NEXT: xscvdpspn 1, 1
; PC64LE9-NEXT: xscvdpspn 2, 2
; PC64LE9-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE9-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE9-NEXT: xxsldwi 34, 2, 2, 1
; PC64LE9-NEXT: vmrglw 2, 3, 2
; PC64LE9-NEXT: lxvx 35, 0, 3
; PC64LE9-NEXT: vperm 2, 4, 2, 3
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%round = call <3 x float> @llvm.experimental.constrained.round.v3f32(		%round = call <3 x float> @llvm.experimental.constrained.round.v3f32(
<3 x float> <float 1.5, float 2.5, float 3.5>,		<3 x float> <float 1.5, float 2.5, float 3.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x float> %round		ret <3 x float> %round
}		}


define <3 x double> @constrained_vector_round_v3f64() #0 {		define <3 x double> @constrained_vector_round_v3f64() #0 {
; PC64LE-LABEL: constrained_vector_round_v3f64:		; PC64LE-LABEL: constrained_vector_round_v3f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI114_1@toc@ha		; PC64LE-NEXT: addis 4, 2, .LCPI114_1@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI114_1@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI114_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI114_0@toc@ha
		; PC64LE-NEXT: lfs 2, .LCPI114_1@toc@l(4)
; PC64LE-NEXT: lfs 1, .LCPI114_0@toc@l(3)		; PC64LE-NEXT: lfs 1, .LCPI114_0@toc@l(3)
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: fmr 3, 2
; PC64LE-NEXT: xsrdpi 3, 1
; PC64LE-NEXT: xvrdpi 2, 0
; PC64LE-NEXT: xxswapd 1, 2
; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_round_v3f64:		; PC64LE9-LABEL: constrained_vector_round_v3f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI114_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI114_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI114_0@toc@l(3)		; PC64LE9-NEXT: lfs 1, .LCPI114_0@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI114_1@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI114_1@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI114_1@toc@l		; PC64LE9-NEXT: lfs 2, .LCPI114_1@toc@l(3)
; PC64LE9-NEXT: xsrdpi 3, 0		; PC64LE9-NEXT: fmr 3, 2
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: xvrdpi 2, 0
; PC64LE9-NEXT: xxswapd 1, 2
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%round = call <3 x double> @llvm.experimental.constrained.round.v3f64(		%round = call <3 x double> @llvm.experimental.constrained.round.v3f64(
<3 x double> <double 1.1, double 1.9, double 1.5>,		<3 x double> <double 1.1, double 1.9, double 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x double> %round		ret <3 x double> %round
}		}

define <1 x float> @constrained_vector_trunc_v1f32() #0 {		define <1 x float> @constrained_vector_trunc_v1f32() #0 {
; PC64LE-LABEL: constrained_vector_trunc_v1f32:		; PC64LE-LABEL: constrained_vector_trunc_v1f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI115_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI115_0@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI115_0@toc@l(3)		; PC64LE-NEXT: addi 3, 3, .LCPI115_0@toc@l
; PC64LE-NEXT: xsrdpiz 0, 0		; PC64LE-NEXT: lfiwzx 0, 0, 3
; PC64LE-NEXT: xscvdpspn 0, 0		; PC64LE-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_trunc_v1f32:		; PC64LE9-LABEL: constrained_vector_trunc_v1f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI115_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI115_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI115_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI115_0@toc@l
; PC64LE9-NEXT: xsrdpiz 0, 0		; PC64LE9-NEXT: lfiwzx 0, 0, 3
; PC64LE9-NEXT: xscvdpspn 0, 0		; PC64LE9-NEXT: xxpermdi 34, 0, 0, 2
; PC64LE9-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%trunc = call <1 x float> @llvm.experimental.constrained.trunc.v1f32(		%trunc = call <1 x float> @llvm.experimental.constrained.trunc.v1f32(
<1 x float> <float 1.5>,		<1 x float> <float 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <1 x float> %trunc		ret <1 x float> %trunc
}		}

define <2 x double> @constrained_vector_trunc_v2f64() #0 {		define <2 x double> @constrained_vector_trunc_v2f64() #0 {
; PC64LE-LABEL: constrained_vector_trunc_v2f64:		; PC64LE-LABEL: constrained_vector_trunc_v2f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI116_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI116_0@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI116_0@toc@l		; PC64LE-NEXT: addi 3, 3, .LCPI116_0@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3		; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: xxswapd 34, 0
; PC64LE-NEXT: xvrdpiz 34, 0
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_trunc_v2f64:		; PC64LE9-LABEL: constrained_vector_trunc_v2f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI116_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI116_0@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI116_0@toc@l		; PC64LE9-NEXT: addi 3, 3, .LCPI116_0@toc@l
; PC64LE9-NEXT: lxvx 0, 0, 3		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: xvrdpiz 34, 0
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%trunc = call <2 x double> @llvm.experimental.constrained.trunc.v2f64(		%trunc = call <2 x double> @llvm.experimental.constrained.trunc.v2f64(
<2 x double> <double 1.1, double 1.9>,		<2 x double> <double 1.1, double 1.9>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <2 x double> %trunc		ret <2 x double> %trunc
}		}

define <3 x float> @constrained_vector_trunc_v3f32() #0 {		define <3 x float> @constrained_vector_trunc_v3f32() #0 {
; PC64LE-LABEL: constrained_vector_trunc_v3f32:		; PC64LE-LABEL: constrained_vector_trunc_v3f32:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI117_2@toc@ha
; PC64LE-NEXT: addis 4, 2, .LCPI117_1@toc@ha
; PC64LE-NEXT: lfs 0, .LCPI117_2@toc@l(3)
; PC64LE-NEXT: lfs 1, .LCPI117_1@toc@l(4)
; PC64LE-NEXT: addis 3, 2, .LCPI117_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI117_0@toc@ha
; PC64LE-NEXT: xsrdpiz 0, 0		; PC64LE-NEXT: addi 3, 3, .LCPI117_0@toc@l
; PC64LE-NEXT: lfs 2, .LCPI117_0@toc@l(3)		; PC64LE-NEXT: lvx 2, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI117_3@toc@ha
; PC64LE-NEXT: xsrdpiz 1, 1
; PC64LE-NEXT: addi 3, 3, .LCPI117_3@toc@l
; PC64LE-NEXT: xsrdpiz 2, 2
; PC64LE-NEXT: xscvdpspn 0, 0
; PC64LE-NEXT: xscvdpspn 1, 1
; PC64LE-NEXT: xxsldwi 34, 0, 0, 1
; PC64LE-NEXT: xscvdpspn 0, 2
; PC64LE-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE-NEXT: vmrglw 2, 3, 2
; PC64LE-NEXT: lvx 3, 0, 3
; PC64LE-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE-NEXT: vperm 2, 4, 2, 3
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_trunc_v3f32:		; PC64LE9-LABEL: constrained_vector_trunc_v3f32:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI117_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI117_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI117_0@toc@l(3)		; PC64LE9-NEXT: addi 3, 3, .LCPI117_0@toc@l
; PC64LE9-NEXT: addis 3, 2, .LCPI117_1@toc@ha		; PC64LE9-NEXT: lxvx 34, 0, 3
; PC64LE9-NEXT: lfs 1, .LCPI117_1@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI117_2@toc@ha
; PC64LE9-NEXT: xsrdpiz 0, 0
; PC64LE9-NEXT: lfs 2, .LCPI117_2@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI117_3@toc@ha
; PC64LE9-NEXT: addi 3, 3, .LCPI117_3@toc@l
; PC64LE9-NEXT: xsrdpiz 1, 1
; PC64LE9-NEXT: xsrdpiz 2, 2
; PC64LE9-NEXT: xscvdpspn 0, 0
; PC64LE9-NEXT: xscvdpspn 1, 1
; PC64LE9-NEXT: xscvdpspn 2, 2
; PC64LE9-NEXT: xxsldwi 36, 0, 0, 1
; PC64LE9-NEXT: xxsldwi 35, 1, 1, 1
; PC64LE9-NEXT: xxsldwi 34, 2, 2, 1
; PC64LE9-NEXT: vmrglw 2, 3, 2
; PC64LE9-NEXT: lxvx 35, 0, 3
; PC64LE9-NEXT: vperm 2, 4, 2, 3
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%trunc = call <3 x float> @llvm.experimental.constrained.trunc.v3f32(		%trunc = call <3 x float> @llvm.experimental.constrained.trunc.v3f32(
<3 x float> <float 1.5, float 2.5, float 3.5>,		<3 x float> <float 1.5, float 2.5, float 3.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x float> %trunc		ret <3 x float> %trunc
}		}

define <3 x double> @constrained_vector_trunc_v3f64() #0 {		define <3 x double> @constrained_vector_trunc_v3f64() #0 {
; PC64LE-LABEL: constrained_vector_trunc_v3f64:		; PC64LE-LABEL: constrained_vector_trunc_v3f64:
; PC64LE: # %bb.0: # %entry		; PC64LE: # %bb.0: # %entry
; PC64LE-NEXT: addis 3, 2, .LCPI118_1@toc@ha
; PC64LE-NEXT: addi 3, 3, .LCPI118_1@toc@l
; PC64LE-NEXT: lxvd2x 0, 0, 3
; PC64LE-NEXT: addis 3, 2, .LCPI118_0@toc@ha		; PC64LE-NEXT: addis 3, 2, .LCPI118_0@toc@ha
; PC64LE-NEXT: lfs 1, .LCPI118_0@toc@l(3)		; PC64LE-NEXT: lfs 1, .LCPI118_0@toc@l(3)
; PC64LE-NEXT: xxswapd 0, 0		; PC64LE-NEXT: fmr 2, 1
; PC64LE-NEXT: xsrdpiz 3, 1		; PC64LE-NEXT: fmr 3, 1
; PC64LE-NEXT: xvrdpiz 2, 0
; PC64LE-NEXT: xxswapd 1, 2
; PC64LE-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE-NEXT: blr		; PC64LE-NEXT: blr
;		;
; PC64LE9-LABEL: constrained_vector_trunc_v3f64:		; PC64LE9-LABEL: constrained_vector_trunc_v3f64:
; PC64LE9: # %bb.0: # %entry		; PC64LE9: # %bb.0: # %entry
; PC64LE9-NEXT: addis 3, 2, .LCPI118_0@toc@ha		; PC64LE9-NEXT: addis 3, 2, .LCPI118_0@toc@ha
; PC64LE9-NEXT: lfs 0, .LCPI118_0@toc@l(3)		; PC64LE9-NEXT: lfs 1, .LCPI118_0@toc@l(3)
; PC64LE9-NEXT: addis 3, 2, .LCPI118_1@toc@ha		; PC64LE9-NEXT: fmr 2, 1
; PC64LE9-NEXT: addi 3, 3, .LCPI118_1@toc@l		; PC64LE9-NEXT: fmr 3, 1
; PC64LE9-NEXT: xsrdpiz 3, 0
; PC64LE9-NEXT: lxvx 0, 0, 3
; PC64LE9-NEXT: xvrdpiz 2, 0
; PC64LE9-NEXT: xxswapd 1, 2
; PC64LE9-NEXT: # kill: def $f1 killed $f1 killed $vsl1
; PC64LE9-NEXT: # kill: def $f2 killed $f2 killed $vsl2
; PC64LE9-NEXT: blr		; PC64LE9-NEXT: blr
entry:		entry:
%trunc = call <3 x double> @llvm.experimental.constrained.trunc.v3f64(		%trunc = call <3 x double> @llvm.experimental.constrained.trunc.v3f64(
<3 x double> <double 1.1, double 1.9, double 1.5>,		<3 x double> <double 1.1, double 1.9, double 1.5>,
metadata !"fpexcept.strict") #1		metadata !"fpexcept.strict") #1
ret <3 x double> %trunc		ret <3 x double> %trunc
}		}
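
In the post-patch (right-hand) column of the `v3f64` tests above, ceil, floor and trunc each load one constant and copy it with `fmr`, while round loads two constants. A minimal sketch of why, assuming the folder evaluates the intrinsic lane by lane; `Vec3`, `foldLanes` and the `folded*` helpers are hypothetical names, not LLVM code:

```cpp
#include <array>
#include <cmath>

// Hypothetical helper type for illustration only; this is not LLVM code.
using Vec3 = std::array<double, 3>;

// Applies a scalar rounding function to every lane, the way a constant
// folder would evaluate a vector intrinsic element by element.
template <typename Fn>
Vec3 foldLanes(const Vec3 &v, Fn fn) {
  Vec3 out = {{fn(v[0]), fn(v[1]), fn(v[2])}};
  return out;
}

// Operand shared by the *_v3f64 tests: <double 1.1, double 1.9, double 1.5>.
const Vec3 kOperand = {{1.1, 1.9, 1.5}};

Vec3 foldedCeil() {
  return foldLanes(kOperand, [](double x) { return std::ceil(x); });
}

Vec3 foldedFloor() {
  return foldLanes(kOperand, [](double x) { return std::floor(x); });
}

Vec3 foldedTrunc() {
  return foldLanes(kOperand, [](double x) { return std::trunc(x); });
}

// std::round rounds halfway cases away from zero, so 1.5 becomes 2.0.
Vec3 foldedRound() {
  return foldLanes(kOperand, [](double x) { return std::round(x); });
}
```

Under that assumption ceil yields {2, 2, 2}, floor and trunc yield {1, 1, 1}, and round yields {1, 2, 2}. Only round needs two distinct constants, which is consistent with the two `lfs` loads in `constrained_vector_round_v3f64` and with kpn's remark that differing lane values would make these tests more telling.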


llvm/test/Transforms/InstSimplify/constfold-constrained.ll

This file was added.

				; RUN: opt < %s -instsimplify -S | FileCheck %s


				; Verify that floor(10.1) is folded to 10.0 when the exception behavior is 'ignore'.
				define double @floor_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.floor.f64(
				double 1.010000e+01,
				metadata !"fpexcept.ignore") #0
				ret double %result
				; CHECK-LABEL: @floor_01
				; CHECK: ret double 1.000000e+01
				}

				; Verify that floor(-10.1) is folded to -11.0 when the exception behavior is not 'ignore'.
				define double @floor_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.floor.f64(
				double -1.010000e+01,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @floor_02
				; CHECK: ret double -1.100000e+01
				}

				; Verify that ceil(10.1) is folded to 11.0 when the exception behavior is 'ignore'.
				define double @ceil_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.ceil.f64(
				double 1.010000e+01,
				metadata !"fpexcept.ignore") #0
				ret double %result
				; CHECK-LABEL: @ceil_01
				; CHECK: ret double 1.100000e+01
				}

				; Verify that ceil(-10.1) is folded to -10.0 when the exception behavior is not 'ignore'.
				define double @ceil_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.ceil.f64(
				double -1.010000e+01,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @ceil_02
				; CHECK: ret double -1.000000e+01
				}

				; Verify that trunc(10.1) is folded to 10.0 when the exception behavior is 'ignore'.
				define double @trunc_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.trunc.f64(
				double 1.010000e+01,
				metadata !"fpexcept.ignore") #0
				ret double %result
				; CHECK-LABEL: @trunc_01
				; CHECK: ret double 1.000000e+01
				}

				; Verify that trunc(-10.1) is folded to -10.0 when the exception behavior is NOT 'ignore'.
				define double @trunc_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.trunc.f64(
				double -1.010000e+01,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @trunc_02
				; CHECK: ret double -1.000000e+01
				}

				; Verify that round(10.5) is folded to 11.0 when the exception behavior is 'ignore'.
				define double @round_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.round.f64(
				double 1.050000e+01,
				metadata !"fpexcept.ignore") #0
				ret double %result
				; CHECK-LABEL: @round_01
				; CHECK: ret double 1.100000e+01
				}

				; Verify that round(-10.5) is folded to -11.0 when the exception behavior is NOT 'ignore'.
				define double @round_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.round.f64(
				double -1.050000e+01,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @round_02
				; CHECK: ret double -1.100000e+01
				}

				; Verify that nearbyint(10.5) is folded to 11.0 when the rounding mode is 'upward'.
				define double @nearbyint_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.nearbyint.f64(
				double 1.050000e+01,
				metadata !"round.upward",
				metadata !"fpexcept.ignore") #0
				ret double %result
				; CHECK-LABEL: @nearbyint_01
				; CHECK: ret double 1.100000e+01
				}

				; Verify that nearbyint(10.5) is folded to 10.0 when the rounding mode is 'downward'.
				define double @nearbyint_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.nearbyint.f64(
				double 1.050000e+01,
				metadata !"round.downward",
				metadata !"fpexcept.maytrap") #0
				ret double %result
				; CHECK-LABEL: @nearbyint_02
				; CHECK: ret double 1.000000e+01
				}

				; Verify that nearbyint(10.5) is folded to 10.0 when the rounding mode is 'towardzero'.
				define double @nearbyint_03() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.nearbyint.f64(
				double 1.050000e+01,
				metadata !"round.towardzero",
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @nearbyint_03
				; CHECK: ret double 1.000000e+01
				}

				; Verify that nearbyint(10.5) is folded to 10.0 when the rounding mode is 'tonearest'.
				define double @nearbyint_04() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.nearbyint.f64(
				double 1.050000e+01,
				metadata !"round.tonearest",
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @nearbyint_04
				; CHECK: ret double 1.000000e+01
				}

				; Verify that nearbyint(10.5) is NOT folded if the rounding mode is 'dynamic'.
				define double @nearbyint_05() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.nearbyint.f64(
				double 1.050000e+01,
				metadata !"round.dynamic",
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @nearbyint_05
				; CHECK: [[VAL:%.+]] = {{.*}}call double @llvm.experimental.constrained.nearbyint
				; CHECK: ret double [[VAL]]
				}
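
The nearbyint tests above expect 10.5 to fold to 11.0 for 'upward', to 10.0 for 'downward', 'towardzero' and 'tonearest', and not to fold at all for 'dynamic'. A minimal sketch of that mapping, assuming 'tonearest' means ties-to-even (which is what the expected 10.0 implies); `Rounding` and `roundToIntegral` are hypothetical names that mirror the `round.*` metadata strings, not LLVM's API:

```cpp
#include <cmath>

// Hypothetical enum for illustration; it mirrors the "round.*" metadata
// strings used in the tests, not LLVM's own rounding-mode type.
enum class Rounding { ToNearest, Downward, Upward, TowardZero, Dynamic };

// Rounds x to an integral value under a statically known rounding mode.
// Returns false for Dynamic: the mode is unknown at compile time, so there
// is nothing to fold. On success the rounded value is stored in `out`.
bool roundToIntegral(double x, Rounding rm, double &out) {
  switch (rm) {
  case Rounding::Upward:
    out = std::ceil(x);
    return true;
  case Rounding::Downward:
    out = std::floor(x);
    return true;
  case Rounding::TowardZero:
    out = std::trunc(x);
    return true;
  case Rounding::ToNearest: {
    // Halfway cases go to the even neighbour (IEEE-754 roundTiesToEven).
    double lo = std::floor(x);
    double diff = x - lo;
    double r;
    if (diff < 0.5)
      r = lo;
    else if (diff > 0.5)
      r = lo + 1.0;
    else
      r = (std::fmod(lo, 2.0) == 0.0) ? lo : lo + 1.0;
    // Keep the sign of the operand so that, e.g., -0.25 rounds to -0.0.
    out = std::copysign(r, x);
    return true;
  }
  case Rounding::Dynamic:
    return false;
  }
  return false;
}
```

By contrast, `round_01` expects round(10.5) to fold to 11.0, because `round` breaks ties away from zero regardless of the rounding mode; `std::round` behaves the same way.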

				; Verify that trunc(SNAN) is NOT folded if the exception behavior mode is not 'ignore'.
				define double @nonfinite_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.trunc.f64(
				double 0x7ff4000000000000,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @nonfinite_01
				; CHECK: [[VAL:%.+]] = {{.*}}call double @llvm.experimental.constrained.trunc
				; CHECK: ret double [[VAL]]
				}

				; Verify that trunc(SNAN) is folded to QNAN if the exception behavior mode is 'ignore'.
				define double @nonfinite_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.trunc.f64(
				double 0x7ff4000000000000,
				metadata !"fpexcept.ignore") #0
				ret double %result
				; CHECK-LABEL: @nonfinite_02
				; CHECK: ret double 0x7FF8000000000000
				}

				; Verify that trunc(QNAN) is folded even if the exception behavior mode is not 'ignore'.
				define double @nonfinite_03() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.trunc.f64(
				double 0x7ff8000000000000,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @nonfinite_03
				; CHECK: ret double 0x7FF8000000000000
				}

				; Verify that trunc(+Inf) is folded even if the exception behavior mode is not 'ignore'.
				define double @nonfinite_04() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.trunc.f64(
				double 0x7ff0000000000000,
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @nonfinite_04
				; CHECK: ret double 0x7FF0000000000000
				}

				; Verify that rint(10) is folded to 10.0 when the rounding mode is 'tonearest'.
				define double @rint_01() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.rint.f64(
				double 1.000000e+01,
				metadata !"round.tonearest",
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @rint_01
				; CHECK: ret double 1.000000e+01
				}

				; Verify that rint(10.1) is NOT folded to 10.0 when the exception behavior is 'strict', since the inexact result would raise an exception.
				define double @rint_02() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.rint.f64(
				double 1.010000e+01,
				metadata !"round.tonearest",
				metadata !"fpexcept.strict") #0
				ret double %result
				; CHECK-LABEL: @rint_02
				; CHECK: [[VAL:%.+]] = {{.*}}call double @llvm.experimental.constrained.rint
				; CHECK: ret double [[VAL]]
				}

				; Verify that rint(10.1) is folded to 10.0 when the exception behavior is not 'strict'.
				define double @rint_03() #0 {
				entry:
				%result = call double @llvm.experimental.constrained.rint.f64(
				double 1.010000e+01,
				metadata !"round.tonearest",
				metadata !"fpexcept.maytrap") #0
				ret double %result
				; CHECK-LABEL: @rint_03
				; CHECK: ret double 1.000000e+01
				}


				attributes #0 = { strictfp }

				declare double @llvm.experimental.constrained.nearbyint.f64(double, metadata, metadata)
				declare double @llvm.experimental.constrained.floor.f64(double, metadata)
				declare double @llvm.experimental.constrained.ceil.f64(double, metadata)
				declare double @llvm.experimental.constrained.trunc.f64(double, metadata)
				declare double @llvm.experimental.constrained.round.f64(double, metadata)
				declare double @llvm.experimental.constrained.rint.f64(double, metadata, metadata)
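
The `nonfinite_*` tests above pass their operands as raw bit patterns. As a reading aid, the following is a minimal sketch (not part of the patch) that decodes those patterns. It assumes IEEE-754 binary64 with the common convention that a set top mantissa bit marks a quiet NaN; the helper name `classifyBits` is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: classify an IEEE-754 binary64 bit pattern.
// Assumption: the most significant mantissa bit set means "quiet NaN".
std::string classifyBits(std::uint64_t bits) {
  const std::uint64_t expMask  = 0x7ff0000000000000ULL; // 11 exponent bits
  const std::uint64_t fracMask = 0x000fffffffffffffULL; // 52 mantissa bits
  const std::uint64_t quietBit = 0x0008000000000000ULL; // top mantissa bit

  if ((bits & expMask) != expMask)
    return "finite";                       // exponent is not all ones
  if ((bits & fracMask) == 0)
    return (bits >> 63) ? "-inf" : "+inf"; // all-ones exponent, zero mantissa
  return (bits & quietBit) ? "qnan" : "snan";
}
```

With this helper, `0x7ff4000000000000` (the operand of `nonfinite_01` and `nonfinite_02`) classifies as a signaling NaN, `0x7ff8000000000000` as a quiet NaN, and `0x7ff0000000000000` as positive infinity. Note that `nonfinite_02` expects the folded result `0x7FF8000000000000`, not the operand with only its quiet bit set (which would be `0x7ffc000000000000`).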

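The `rint_*` tests depend on the C library rule that `rint` raises the inexact exception when its result differs from its argument, while `nearbyint` does not. The sketch below (not part of the patch) observes this at run time with the standard `<cfenv>` interface. The helper names are hypothetical, and it assumes a host libm and compiler that maintain the floating-point status flags faithfully; `volatile` is used only to keep the compiler from folding or moving the calls.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: does std::rint(x) raise FE_INEXACT?
bool rintRaisesInexact(double x) {
  volatile double in = x;              // block compile-time folding
  std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean flag state
  volatile double out = std::rint(in); // round using the current rounding mode
  (void)out;
  return std::fetestexcept(FE_INEXACT) != 0;
}

// Hypothetical helper: does std::nearbyint(x) raise FE_INEXACT?
bool nearbyintRaisesInexact(double x) {
  volatile double in = x;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double out = std::nearbyint(in);
  (void)out;
  return std::fetestexcept(FE_INEXACT) != 0;
}
```

On a conforming implementation, `rintRaisesInexact(10.0)` should be false and `rintRaisesInexact(10.1)` true, which matches why `rint_01` can be folded under `fpexcept.strict` while `rint_02` cannot. `nearbyintRaisesInexact(10.1)` should be false.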