This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/include/llvm/IR/FPEnv.h
llvm/lib/Analysis/InstructionSimplify.cpp
llvm/test/Transforms/InstSimplify/strictfp-fadd.ll

Differential D106362

[FPEnv][InstSimplify] Enable more folds for constrained fadd
Abandoned, Public

Authored by kpn on Jul 20 2021, 7:24 AM.

Details

Reviewers

sepavloff
spatel
nlopes
efriedma
scanon
jcranmer
lebedev.ri

Summary

Currently there are optimizations for the fadd instruction that do not fire for a constrained fadd. Add some of these optimizations.

Diff Detail

Unit Tests: Failed

	Time	Test
	80 ms	x64 debian > ORC-x86_64-linux.TestCases/Linux/x86-64::trivial-cxa-atexit.S
	120 ms	x64 debian > ORC-x86_64-linux.TestCases/Linux/x86-64::trivial-static-initializer.S
	160 ms	x64 debian > ORC-x86_64-linux.TestCases/Linux/x86-64::trivial-tls.S

Event Timeline

kpn created this revision.Jul 20 2021, 7:24 AM

Herald added a subscriber: hiraditya.Jul 20 2021, 7:24 AM

kpn requested review of this revision.Jul 20 2021, 7:24 AM

Herald added a project: Restricted Project.Jul 20 2021, 7:24 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B115095: Diff 360121.Jul 20 2021, 8:05 AM

Pre-commit the tests, so we just show diffs in this patch?

Tests have been precommitted.

Harbormaster completed remote builds in B116437: Diff 362043.Jul 27 2021, 9:52 AM

sepavloff added inline comments.Jul 29 2021, 3:24 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	Even if `ExBehavior != fp::ebStrict` this transformation is invalid if `X` is SNaN, the latter must be converted to QNaN.
4950	The same about `X==SNaN`.
4955	What about making such transformation in non-default mode?

kpn added inline comments.Jul 29 2021, 6:53 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4955	It requires adding support to the IR matchers like m_FSub(), and those are used elsewhere. Which implies testing in places in addition to here. So I'm saving that for a subsequent patch. Small steps.

kpn added inline comments.Aug 2 2021, 9:22 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	Can I put this in a different patch? We have the same issue for constrained and non-constrained cases, and we don't have a good way to distinguish them -- nor should we. Most of the required code is present in APFloat, but not all. Would it be OK for me to add the needed bits to APFloat and use them from InstSimplify in a different ticket?

sepavloff added inline comments.Aug 2 2021, 10:57 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	I am not sure I understand what you are going to put into another patch. The transformation:
fadd X, -0 ==> X
is valid only if `X != SNaN`; otherwise we have:
fadd SNaN, -0 ==> QNaN
This does not depend on whether the FP environment is default or not; the code before your changes was already incorrect. Such a transformation is valid if the operation has the `nnan` flag, but not in the general case. Code in APFloat can hardly help: constant folding is done earlier, so if `X` is a constant here, it means it could not be folded.

spatel added reviewers: efriedma, scanon.Aug 6 2021, 7:01 AM

spatel added inline comments.

llvm/lib/Analysis/InstructionSimplify.cpp
4944	IIUC, you are saying SNaN vs. QNaN is more than just a part of the exception state. But that's not how I interpret the current LangRef: https://llvm.org/docs/LangRef.html#floating-point-environment ...and that's why none of the transforms here are intentionally SNaN preserving/clearing. If we are going to change the behavior of the default FP env, the LangRef must be updated to make that clear. I don't see the motivation yet.

sepavloff added inline comments.Aug 6 2021, 10:43 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	That's true for default FP environment. A difference between SNaN and QNaN is that operations on SNaN raise invalid exception. In default FP environment exceptions are ignored so SNaN and QNaN behave identically. But this code works in non-default FP environment as well. `fadd` here designates both regular IR node used in default environment as well as its constrained counterpart. So SNaN here must be handled more carefully.
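
The SNaN/QNaN distinction argued about in these comments can be made concrete at the bit level. The following is only a minimal sketch, assuming IEEE-754 binary64 with the common convention (x86, ARM, RISC-V) that the most significant fraction bit is the quiet bit, and assuming round-to-nearest. All helper names are hypothetical and are not LLVM APIs. It works on raw bit patterns so that neither compiler folding nor hardware quieting can interfere with what is shown.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs). Assumes IEEE-754 binary64 where the
// most significant fraction bit (bit 51) acts as the "quiet" bit.
constexpr std::uint64_t ExponentMask = 0x7FF0000000000000ULL;
constexpr std::uint64_t FractionMask = 0x000FFFFFFFFFFFFFULL;
constexpr std::uint64_t QuietBit     = 0x0008000000000000ULL;

// A NaN has an all-ones exponent and a non-zero fraction.
constexpr bool isNaNBits(std::uint64_t Bits) {
  return (Bits & ExponentMask) == ExponentMask && (Bits & FractionMask) != 0;
}

// A signaling NaN is a NaN with the quiet bit clear.
constexpr bool isSignalingNaNBits(std::uint64_t Bits) {
  return isNaNBits(Bits) && (Bits & QuietBit) == 0;
}

// A quiet NaN is a NaN with the quiet bit set.
constexpr bool isQuietNaNBits(std::uint64_t Bits) {
  return isNaNBits(Bits) && (Bits & QuietBit) != 0;
}

// Models the result of an executed `fadd X, -0` under round-to-nearest:
// a NaN operand comes back quieted (payload kept, which is an assumption
// about typical hardware); any other X comes back unchanged.
constexpr std::uint64_t executedFAddNegZero(std::uint64_t XBits) {
  return isNaNBits(XBits) ? (XBits | QuietBit) : XBits;
}

// Models the folded form `fadd X, -0 ==> X`: X is passed through untouched.
constexpr std::uint64_t foldedFAddNegZero(std::uint64_t XBits) {
  return XBits;
}
```

With the signaling pattern `0x7FF0000000000001`, `executedFAddNegZero` yields a quiet NaN while `foldedFAddNegZero` still yields a signaling one. That difference is the observable effect the reviewers disagree about.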

We could fold in the "strict" exception behavior cases if we had a matcher for a QNaN. But instructions still wouldn't be removed since "strict" makes them be !isInstructionTriviallyDead(). A TODO note is probably sufficient should a m_QNaN() be added in the future.

llvm/lib/Analysis/InstructionSimplify.cpp
4944	We won't see a NaN here if 'nnan' is specified since simplifyFPOp() will have already turned it to poison and returned. So I think the check for FMF.noNaNs() should be removed since it can't happen and is therefore misleading. We won't fold in the 'strict' exception case because we check for it and reject it. I suspect the real objection is that we're not turning a SNaN into a QNaN here. But with the constant folding that Serge recently added we won't even be here in the "ignore" or "maytrap" cases at all. There won't be a "fadd NaN, -0" case except in the "strict" exception case which we decline to fold. Aside from the unneeded check for FMF.noNaNs() I don't see a problem here. We aren't returning the wrong result here or below. And I do now have code for the APFloat and ConstantFP classes to make it easy for simplifyFPOp() to convert SNaN->QNaN. That's what I was planning on submitting in a different ticket.

spatel added inline comments.Aug 6 2021, 11:50 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	This seems to be getting fuzzy when the exception state is "MayTrap", so we probably need to clarify that definition in the LangRef. Is there a regression test below that shows where you think this patch is wrong? I think the fold is correct with MayTrap (assuming the rounding mode is known suitable) because "passes are not required to preserve all exceptions that are implied by the original code"; presumably some other operation is eventually going to generate the invalid exception when it sees the SNaN?

kpn added inline comments.Aug 6 2021, 1:19 PM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	Actually, the check for FMF.noNaNs() is required to catch non-constants. We can still fold "fadd X, -0" if we know X is not a NaN. We still have the "trivially dead" issue where the instruction would be hanging around needlessly, but still...

kpn added inline comments.Aug 6 2021, 2:16 PM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	Rereading @spatel's comment: what about the "X is a variable that happens to have the value of an SNaN" case: it's true that we'll be removing an instruction that would have converted the SNaN to a QNaN. So the removal of the instruction would be observable. I don't see a way around that without eliminating the ability to optimize at all.

andrew.w.kaylor added a subscriber: andrew.w.kaylor.Aug 23 2021, 1:05 PM

andrew.w.kaylor added inline comments.

llvm/lib/Analysis/InstructionSimplify.cpp
4944	I don't think you can make this transformation in the "maytrap" case. It would be ok to optimize an SNaN to a QNaN, but you can't eliminate an instruction that would raise the exception while performing the same conversion. Consider this example:

double foo(double x, double y) {
  #pragma clang fp exceptions(maytrap)
  double temp;
  feclearexcept(FE_ALL_EXCEPT);
  temp = x + -0.0;
  // Check the x = SNaN case
  if (fetestexcept(FE_INEXACT))
    return -1.0;
  temp = temp + y;
  // Check the y = SNaN case
  if (fetestexcept(FE_INEXACT))
    return -2.0;
  return temp;
}

Eliminating the fadd x, -0 causes the exception to be raised in the wrong place.
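
The point of this example can be reproduced without relying on any particular compiler or hardware. The sketch below is a pure software model, not the author's code: `Value`, `FPEnvModel`, `modelAdd` and `fooModel` are hypothetical names, and the single sticky flag stands for the invalid-operation flag (`FE_INVALID` in C), which is the flag IEEE-754 assigns to operations on a signaling NaN. It only shows how removing the first add moves the point at which the flag becomes visible.

```cpp
// Coarse classification of a floating-point value; enough for this model.
enum class Value { Number, QNaN, SNaN };

// Hypothetical stand-in for the FP status word: one sticky "invalid" flag.
struct FPEnvModel {
  bool Invalid = false;
};

// Models an IEEE-754 add: an SNaN operand raises "invalid" and the result is
// a QNaN; a QNaN operand propagates quietly; otherwise the result is a number.
Value modelAdd(FPEnvModel &Env, Value A, Value B) {
  if (A == Value::SNaN || B == Value::SNaN) {
    Env.Invalid = true;
    return Value::QNaN;
  }
  if (A == Value::QNaN || B == Value::QNaN)
    return Value::QNaN;
  return Value::Number;
}

// Mirrors the control flow of foo(): returns -1 if the flag is seen after the
// first add, -2 if it is seen after the second add, and 0 otherwise.
// FoldFirstAdd == true models the transformation `fadd x, -0 ==> x`.
int fooModel(Value X, Value Y, bool FoldFirstAdd) {
  FPEnvModel Env;
  Value Temp = FoldFirstAdd ? X : modelAdd(Env, X, Value::Number);
  if (Env.Invalid)
    return -1;
  Temp = modelAdd(Env, Temp, Y);
  if (Env.Invalid)
    return -2;
  return 0;
}
```

For `X = SNaN` and an ordinary `Y`, the unfolded model returns -1 and the folded model returns -2: the exception is still raised, but by a different operation.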

kpn added inline comments.Sep 21 2021, 7:42 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4955	Actually, the check for the default FP environment isn't needed. No matcher support has been done yet. The matchers won't match so there's no need for the guard.

Update for review comments. Avoid changes that move the location of traps unless we know they won't ever trap due to fast math flags.

Harbormaster completed remote builds in B124915: Diff 373930.Sep 21 2021, 8:46 AM

sepavloff added inline comments.Sep 29 2021, 12:23 AM

llvm/lib/Analysis/InstructionSimplify.cpp
4944	The transformation `fadd X, -0 ==> X` is wrong in two cases:
When the rounding mode is `downward` and `X` is +0: `+0 + (-0) ==> -0`. If FMF has the `nsz` flag, I think we can apply this transformation.
When `X` is a signaling NaN: `SNaN + (-0) ==> QNaN`. At run time the operation would produce a quiet NaN, but if this transformation is done, the result would be a signaling NaN. It is not correct to produce an SNaN instead of a QNaN. This result might be stored somewhere and processed further by other code that operates in strict mode, and the transformation would cause that code to trap. The behavior of the code with and without this transformation would differ. So this transformation is safe only if `FMF.noNaNs()`.
The proposed condition is:
if ((!RoundingModeMayBe(Rounding, RoundingMode::TowardNegative) || FMF.noSignedZeros()) && FMF.noNaNs())
4955	According to https://en.wikipedia.org/wiki/Signed_zero: "x-x=x+(-x)=+0 (for any finite x, -0 when rounding toward negative)" So the check for the rounding mode is also necessary, unless `FMF.noSignedZeros()`.
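
The condition proposed for `fadd X, -0 ==> X` can be written out as a standalone predicate to make its truth table easy to check. This is a sketch under stated assumptions: `RoundingModeSketch`, `FastMathFlagsSketch`, `roundingModeMayBe` and `canFoldFAddNegZero` are simplified, hypothetical stand-ins for the LLVM types and helpers named in the comment, and a dynamic rounding mode is assumed to count as "may be" any mode.

```cpp
// Simplified stand-in for llvm::RoundingMode (hypothetical, for illustration).
enum class RoundingModeSketch {
  NearestTiesToEven,
  TowardZero,
  TowardPositive,
  TowardNegative,
  Dynamic
};

// Simplified stand-in for the two fast-math flags that matter here.
struct FastMathFlagsSketch {
  bool NoNaNs;
  bool NoSignedZeros;
};

// Assumption: a dynamic rounding mode could turn out to be any mode at run
// time, so it "may be" whatever mode is asked about.
constexpr bool roundingModeMayBe(RoundingModeSketch RM,
                                 RoundingModeSketch Query) {
  return RM == Query || RM == RoundingModeSketch::Dynamic;
}

// The condition proposed in the review comment: the fold is allowed only if
// the sign of zero cannot go wrong (rounding is not toward negative, or nsz
// is set) and X is known not to be a NaN.
constexpr bool canFoldFAddNegZero(RoundingModeSketch RM,
                                  FastMathFlagsSketch FMF) {
  return (!roundingModeMayBe(RM, RoundingModeSketch::TowardNegative) ||
          FMF.NoSignedZeros) &&
         FMF.NoNaNs;
}
```

Under this predicate, `nnan` alone is enough in round-to-nearest, while a dynamic or toward-negative rounding mode additionally needs `nsz`.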

Changing the way the existing transforms check for FMF.noNaNs() sounds like a different ticket that needs to be done before this one can progress. Unless I misunderstood?

In D106362#3030790, @kpn wrote:

Changing the way the existing transforms check for FMF.noNaNs() sounds like a different ticket that needs to be done before this one can progress. Unless I misunderstood?

I know nothing about such change.

In D106362#3030856, @sepavloff wrote:

I know nothing about such change.

Your proposed changes around SNaN will affect the non-constrained cases.

In D106362#3030871, @kpn wrote:

Your proposed changes around SNaN will affect the non-constrained cases.

I think the optimization fadd X, -0 ==> X in the general case (not FMF.noNaNs()) is incorrect. The obtained values must be identical to what hardware would produce; otherwise it is not an optimization. If you think that such a change deserves a separate patch, no problem. It seems to me that this patch could establish correct folding even if it changes the non-constrained operation as well.

kpn added a reviewer: jcranmer.Sep 29 2021, 11:50 AM

In D106362#3031030, @sepavloff wrote:

I think the optimization fadd X, -0 ==> X in the general case (not FMF.noNaNs()) is incorrect. The obtained values must be identical to what hardware would produce; otherwise it is not an optimization. If you think that such a change deserves a separate patch, no problem. It seems to me that this patch could establish correct folding even if it changes the non-constrained operation as well.

I really don't think we should be changing how LLVM handles the ebIgnore cases. LLVM currently treats SNaN and QNaN as the same except where it doesn't, and I believe we should keep that behavior unchanged when ignoring exceptions. At the very least we shouldn't be disabling any currently enabled optimizations when ignoring exceptions. @lebedev.ri or @jcranmer ?

Our closest approximation to a strict IEEE-754 mode currently is with ebStrict / "fpexcept.strict" usage, but that isn't documented as being strictly IEEE-754 compliant. If someone wants to go through and add a strict -754 mode at some point then they can. But that's a different project.

In D106362#3031370, @kpn wrote:

I really don't think we should be changing how LLVM handles the ebIgnore cases. LLVM currently treats SNaN and QNaN as the same except where it doesn't, and I believe we should keep that behavior unchanged when ignoring exceptions. At the very least we shouldn't be disabling any currently enabled optimizations when ignoring exceptions. @lebedev.ri or @jcranmer ?

It is actually a bug. No operation in an IEEE-754 compatible implementation may produce an SNaN. If code that would produce a QNaN at run time produces an SNaN due to constant folding, the optimization is invalid. A user might use the expression x + 0.0 just to convert an SNaN to a QNaN; that will not work if this optimization is applied.

Our closest approximation to a strict IEEE-754 mode currently is with ebStrict / "fpexcept.strict" usage, but that isn't documented as being strictly IEEE-754 compliant. If someone wants to go through and add a strict -754 mode at some point then they can. But that's a different project.

Most hardware FP implementations adhere to IEEE-754. As LLVM IR is target-independent by design, a non-IEEE-754 implementation is hardly interesting. Also, ebIgnore is only an optimization hint; it should not change the semantics of the add operation.

In D106362#3032895, @sepavloff wrote:

In D106362#3031370, @kpn wrote:

I really don't think we should be changing how LLVM handles the ebIgnore cases. LLVM currently treats SNaN and QNaN as the same except where it doesn't, and I believe we should keep that behavior unchanged when ignoring exceptions. At the very least we shouldn't be disabling any currently enabled optimizations when ignoring exceptions. @lebedev.ri or @jcranmer ?

I agree.

It is actually a bug. No operation in an IEEE-754 compatible implementation may produce an SNaN. If code that would produce a QNaN at run time produces an SNaN due to constant folding, the optimization is invalid. A user might use the expression x + 0.0 just to convert an SNaN to a QNaN; that will not work if this optimization is applied.

Our closest approximation to a strict IEEE-754 mode currently is with ebStrict / "fpexcept.strict" usage, but that isn't documented as being strictly IEEE-754 compliant. If someone wants to go through and add a strict -754 mode at some point then they can. But that's a different project.

Most hardware FP implementations adhere to IEEE-754. As LLVM IR is target-independent by design, a non-IEEE-754 implementation is hardly interesting. Also, ebIgnore is only an optimization hint; it should not change the semantics of the add operation.

fadd X, -0 ==> X is *NOT* a miscompile, at least given the current LLVM IR semantics: https://alive2.llvm.org/ce/z/TuTiSQ
I would personally strongly suggest not reasoning about semantics via hand-waving, but actually modeling them in alive2, if that isn't done already.
Honestly, I'm quite worried that this is repeating the same approach as in the isnan threads.
Some might interpret it as being dismissive/intentionally ignoring documented semantics.

In D106362#3032900, @lebedev.ri wrote:

fadd X, -0 ==> X is *NOT* a miscompile, at least given the current LLVM IR semantics: https://alive2.llvm.org/ce/z/TuTiSQ
I would personally strongly suggest not reasoning about semantics via hand-waving, but actually modeling them in alive2, if that isn't done already.
Honestly, I'm quite worried that this is repeating the same approach as in the isnan threads.
Some might interpret it as being dismissive/intentionally ignoring documented semantics.

Here is the sample program: https://godbolt.org/z/ssYs6ez91
Hardware converts SNaN + 0 into QNaN.

In D106362#3033346, @sepavloff wrote:

Here is the sample program: https://godbolt.org/z/ssYs6ez91
Hardware converts SNaN + 0 into QNaN.

Could you please write slightly longer arguments?
Your point being? Pick one of:

the optimization is incorrect as per the llvm langref
the optimization is correct as per the llvm langref, which is itself incorrect
???

Which one is it?

In D106362#3033377, @lebedev.ri wrote:

Could you please write slightly longer arguments?
Your point being? Pick one of:

the optimization is incorrect as per the llvm langref

the optimization is correct as per the llvm langref, which is itself incorrect

???

Which one is it?

This optimization is incorrect because it violates IEEE-754, which states (6.2):

Under default exception handling, any operation signaling an invalid operation exception and for which a
floating-point result is to be delivered shall deliver a quiet NaN.

As a result it is inconsistent with the main hardware implementations, including at least X86, RISC-V and ARM. If it matters I can provide the exact references.

As for llvm langref, I think the following statement is cited here: https://llvm.org/docs/LangRef.html#floating-point-environment:

The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. No floating-point exception state is maintained in this environment. Therefore, there is no attempt to create or preserve invalid operation (SNaN) or division-by-zero exceptions.

This sentence is about exceptions but not the results of operations. So actually nothing in the langref justifies this transformation.

CC @nlopes / @aqjune

In D106362#3033555, @sepavloff wrote:

Here is the sample program: https://godbolt.org/z/ssYs6ez91
Hardware converts SNaN + 0 into QNaN.

This is cherry-picking the example to try to prove your point; gcc does not behave as you are suggesting in general. Here is a counter-example based on the first transform (fadd X, -0.0) affected by this patch:
https://godbolt.org/z/qj9Meavnd

As for llvm langref, I think the following statement is cited here: https://llvm.org/docs/LangRef.html#floating-point-environment:
The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. No floating-point exception state is maintained in this environment. Therefore, there is no attempt to create or preserve invalid operation (SNaN) or division-by-zero exceptions.
This sentence is about exceptions but not the results of operations. So actually nothing in the langref justifies this transformation.

The wording could be made more explicit, but as I suggested in my comment from Aug 6 in this review: the default LLVM FP env ignores exceptions, so citing the IEEE-754 clause that begins with "under default exception handling" to describe SNAN->QNAN behavior is not applicable.

But this has gone out-of-bounds for this patch review - this was just trying to nail down the behavior in a subset of a strict FP env. Please start an llvm-dev discussion if you want to change the LangRef to formalize the behavior you are proposing for default FP.

@kpn - might be worth re-starting this patch as "X + -0.0" only, so we can make progress?

In D106362#3033765, @spatel wrote:

This is cherry-picking the example to try to prove your point; gcc does not behave as you are suggesting in general. Here is a counter-example based on the first transform (fadd X, -0.0) affected by this patch:
https://godbolt.org/z/qj9Meavnd

In this case no add instruction is generated, so it does not demonstrate hardware behavior. It is (incorrect) constant folding. ICC does not do such transformation. Could you please file a bug in GCC bug tracker?

As for llvm langref, I think the following statement is cited here: https://llvm.org/docs/LangRef.html#floating-point-environment:
The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. No floating-point exception state is maintained in this environment. Therefore, there is no attempt to create or preserve invalid operation (SNaN) or division-by-zero exceptions.
This sentence is about exceptions but not the results of operations. So actually nothing in the langref justifies this transformation.
The wording could be made more explicit, but as I suggested in my comment from Aug 6 in this review: the default LLVM FP env ignores exceptions, so citing the IEEE-754 clause that begins with "under default exception handling" to describe SNAN->QNAN behavior is not applicable.

This is a peculiarity of IEEE-754 language. Exception handling in this context means what to do when an exceptional situation occurs. Clause 7.1 describes the default exception handling:

This clause also specifies default non-stop exception handling for exception signals, which is to deliver a
default result, continue execution, and raise the corresponding status flag (except in the case of exact
underflow, see 7.5).

This is exactly the way the exceptions are handled in the default FP environment. Exception must be handled in some way because hardware always signals them if exceptional situation occurs.

I agree with @sepavloff's interpretation of IEEE-754. It's a bit jarring switching between -754 and the LangRef or Unix because they use the same words but with different meanings.

However, my understanding is that strict IEEE-754 compliance is _not_ a goal of the LLVM project. And having different behavior than the hardware in the case of NaN handling in the non-trapping case is generally accepted because of the large performance boost in total. LLVM implements its own IR language and NaN handling (non-trapping case) is one of those times where the IR language does not promise to behave like the hardware.

I'll do what @spatel asked and narrow the changes down to get this moving.

In D106362#3034000, @sepavloff wrote:

In this case no add instruction is generated, so it does not demonstrate hardware behavior. It is (incorrect) constant folding. ICC does not do such transformation. Could you please file a bug in GCC bug tracker?

Apologies to this patch review for being off-topic again, but the answer is no.
I think you know that's not a standalone folding bug. It's intentional behavior in both compilers which you can easily show with a number of similar examples...
https://godbolt.org/z/fT1j8YPK6

In D106362#3034097, @spatel wrote:

Apologies to this patch review for being off-topic again, but the answer is no.
I think you know that's not a standalone folding bug. It's intentional behavior in both compilers which you can easily show with a number of similar examples...
https://godbolt.org/z/fT1j8YPK6

I tried opening a bug against GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102549) and got a recommendation to use -fsignaling-nans. There is no such option in clang, but the gcc manual says that it implies -ftrapping-math, and the latter is a synonym of -ffp-exception-behavior=strict in clang.

So, you are right. Ignoring peculiarities of SNaN is by design in GCC and IEEE-754 conformant behavior is only possible in strict FP mode. Conformant behavior of GCC for X + 0.0 looks accidental. As this behavior opens optimization possibilities it is reasonable to use it.

Thank you for the discussion!

Ouch, this is a long thread.

If I understand correctly, an important missing feature in Alive2 is support for multiple rounding modes. I haven't implemented that so far, mostly because I was under the impression that rounding modes were still a work in progress. Is this true, or is it mostly set in stone?
I would appreciate it if someone could point me to the functions that can change the rounding mode (e.g., a function call may change the rounding mode, right? is the rounding mode global per thread or per process?). If I get to understand how this thing works I'm happy to implement it in Alive2 :)

In D106362#3035988, @nlopes wrote:

Ouch, this is a long thread.

If I understand correctly, an important missing feature in Alive2 is support for multiple rounding modes. I haven't implemented that so far, mostly because I was under the impression that rounding modes were still a work in progress. Is this true, or is it mostly set in stone?

The support of non-default rounding modes is still not as mature as it should be.

I would appreciate if someone could point me to the functions that can change the rounding mode (e.g., a function call may change the rounding mode, right? is the rounding mode global per thread or per process?). If I get to understand how this thing works I'm happy to implement it in Alive2 :)

There are two kinds of rounding mode, static and dynamic. Their availability depends on the target. The dynamic rounding mode is more universal; most targets support it. This kind of rounding mode is set by a write to a special hardware register. In IR this is done by the intrinsic `llvm.set.rounding`. Another intrinsic, `llvm.flt.rounds`, can be used to read the current dynamic rounding mode. The dynamic rounding mode is defined per thread. Operations that use the dynamic rounding mode are represented in IR by constrained intrinsics like:

%add = call float @llvm.experimental.constrained.fadd.f32(float %a, float 0.0, metadata !"round.dynamic", metadata !"fpexcept.strict")

Support of the dynamic rounding mode is more or less ready.
The static rounding mode is available on some targets, for example on RISC-V. In this case the rounding mode is encoded as part of the processor instruction; it is not kept in a separate register. In IR the static rounding mode is represented by special metadata operands like:

%add = call float @llvm.experimental.constrained.fadd.f32(float %a, float 0.0, metadata !"round.upward", metadata !"fpexcept.strict")

Support of the static rounding mode is still weak, mainly because the support of constrained intrinsics for such targets is limited.
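
To make the effect of the dynamic rounding mode concrete, here is a minimal C++ sketch that uses the standard `<cfenv>` functions `std::fesetround` and `std::fegetround`, which play roughly the same role at the C library level as the intrinsics mentioned above. The helper name `sum_has_sign_bit` is invented for illustration, the `volatile` objects are only there to force the addition to happen at run time, and the sketch assumes the target defines all four `FE_*` rounding macros.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Adds a and b under the requested dynamic rounding mode and reports
// whether the result has its sign bit set. The previous rounding mode is
// restored before returning.
bool sum_has_sign_bit(double a, double b, int rounding_mode) {
  const int saved = std::fegetround();
  std::fesetround(rounding_mode);
  volatile double lhs = a;
  volatile double rhs = b;
  volatile double sum = lhs + rhs;  // evaluated at run time
  std::fesetround(saved);
  const double result = sum;
  return std::signbit(result);
}
```

Under IEEE-754 rules an exact zero sum of operands with opposite signs is +0 in every rounding mode except rounding toward negative, where it is -0. So `assert(sum_has_sign_bit(0.0, -0.0, FE_DOWNWARD))` is expected to hold while the same call with `FE_TONEAREST` should return false, which is why a fold of `X + -0.0` to `X` has to know that the rounding mode cannot be downward.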

kpn mentioned this in rG770c57898e12: [FPEnv][InstSimplify] Prepush more tests for D106362. Oct 4 2021, 10:49 AM

kpn mentioned this in D111085: [FPEnv][InstSimplify] Fold constrained X + -0.0 ==> X. Oct 4 2021, 11:50 AM

kpn mentioned this in rGf86c930cc967: [FPEnv][InstSimplify] Fold constrained X + -0.0 ==> X. Oct 6 2021, 10:52 AM

kpn mentioned this in D111450: [FPEnv][InstSimplify] Fold fadd X, 0 ==> X, when we know X is not -0. Oct 8 2021, 11:38 AM

kpn mentioned this in rG727a891ec8c4: [FPEnv][InstSimplify] Fold fadd X, 0 ==> X, when we know X is not -0. Oct 14 2021, 9:33 AM

Everything that can be tested is now pushed in other tickets. I'll keep this around as a reminder to come back when m_FSub() is updated, but it doesn't need any action from anyone right now.

This review seems to be stuck/dead, consider abandoning if no longer relevant.

Herald added a project: Restricted Project. Jan 12 2023, 4:49 PM

Herald added a subscriber: StephenFan.

This is no longer needed as the pieces have all made it into the tree already.

Revision Contents

Path	Size
llvm/include/llvm/IR/FPEnv.h	6 lines
llvm/lib/Analysis/InstructionSimplify.cpp	18 lines
llvm/test/Transforms/InstSimplify/strictfp-fadd.ll	38 lines

Diff 373930

llvm/include/llvm/IR/FPEnv.h

	[52 lines not shown]
	/// input in constrained intrinsic exception behavior metadata.			/// input in constrained intrinsic exception behavior metadata.
	Optional<StringRef> ExceptionBehaviorToStr(fp::ExceptionBehavior);			Optional<StringRef> ExceptionBehaviorToStr(fp::ExceptionBehavior);

	/// Returns true if the exception handling behavior and rounding mode			/// Returns true if the exception handling behavior and rounding mode
	/// match what is used in the default floating point environment.			/// match what is used in the default floating point environment.
	inline bool isDefaultFPEnvironment(fp::ExceptionBehavior EB, RoundingMode RM) {			inline bool isDefaultFPEnvironment(fp::ExceptionBehavior EB, RoundingMode RM) {
	return EB == fp::ebIgnore && RM == RoundingMode::NearestTiesToEven;			return EB == fp::ebIgnore && RM == RoundingMode::NearestTiesToEven;
	}			}

				/// Returns true if the rounding mode RM may be QRM at compile time or
				/// at run time.
				inline bool RoundingModeMayBe(RoundingMode RM, RoundingMode QRM) {
				Lint: Pre-merge checks: clang-tidy: warning: invalid case style for function 'RoundingModeMayBe' [readability-identifier-naming]
				return RM == QRM || RM == RoundingMode::Dynamic;
				}
	}			}
	#endif			#endif

llvm/lib/Analysis/InstructionSimplify.cpp

[4,932 lines not shown]	SimplifyFAddInst(Value *Op0, Value *Op1, FastMathFlags FMF,
RoundingMode Rounding = RoundingMode::NearestTiesToEven) {		RoundingMode Rounding = RoundingMode::NearestTiesToEven) {
if (isDefaultFPEnvironment(ExBehavior, Rounding))		if (isDefaultFPEnvironment(ExBehavior, Rounding))
if (Constant *C = foldOrCommuteConstant(Instruction::FAdd, Op0, Op1, Q))		if (Constant *C = foldOrCommuteConstant(Instruction::FAdd, Op0, Op1, Q))
return C;		return C;

if (Constant *C = simplifyFPOp({Op0, Op1}, FMF, Q, ExBehavior, Rounding))		if (Constant *C = simplifyFPOp({Op0, Op1}, FMF, Q, ExBehavior, Rounding))
return C;		return C;

if (!isDefaultFPEnvironment(ExBehavior, Rounding))
return nullptr;

// fadd X, -0 ==> X		// fadd X, -0 ==> X
		if (!RoundingModeMayBe(Rounding, RoundingMode::TowardNegative) &&
		(ExBehavior == fp::ebIgnore ||
		(ExBehavior == fp::ebMayTrap && FMF.noNaNs())))
		sepavloff: Even if `ExBehavior != fp::ebStrict` this transformation is invalid if `X` is SNaN, the latter must be converted to QNaN.
		kpn: Can I put this in a different patch? We have the same issue for constrained and non-constrained cases, and we don't have a good way to distinguish them -- nor should we. Most of the required code is present in APFloat, but not all. Would it be OK for me to add the needed bits to APFloat and use them from InstSimplify in a different ticket?
		sepavloff: I am not sure I understand what you are going to put into another patch. The transformation: fadd X, -0 ==> X is valid only if `X != SNaN`, otherwise we have: fadd SNaN, -0 ==> QNaN It does not depend on whether FP environment is default or not, the code before your changes was already incorrect. Such transformation is valid if the operation has flag `nnan`, but not in general case. Code in APFloat hardly can help, as constant folding is made previously, if `X` is a constant here, it means it cannot be folded.
		spatel: IIUC, you are saying SNaN vs. QNaN is more than just a part of the exception state. But that's not how I interpret the current LangRef: https://llvm.org/docs/LangRef.html#floating-point-environment ...and that's why none of the transforms here are intentionally SNaN preserving/clearing. If we are going to change the behavior of the default FP env, the LangRef must be updated to make that clear. I don't see the motivation yet.
		sepavloff: That's true for default FP environment. A difference between SNaN and QNaN is that operations on SNaN raise invalid exception. In default FP environment exceptions are ignored so SNaN and QNaN behave identically. But this code works in non-default FP environment as well. `fadd` here designates both regular IR node used in default environment as well as its constrained counterpart. So SNaN here must be handled more carefully.
		spatel: This seems to be getting fuzzy when the exception state is "MayTrap", so we probably need to clarify that definition in the LangRef. Is there a regression test below that shows where you think this patch is wrong? I think the fold is correct with MayTrap (assuming the rounding mode is known suitable) because "passes are not required to preserve all exceptions that are implied by the original code"; presumably some other operation is eventually going to generate the invalid exception when it sees the SNaN?
		kpn: We won't see a NaN here if 'nnan' is specified since simplifyFPOp() will have already turned it to poison and returned. So I think the check for FMF.noNaNs() should be removed since it can't happen and is therefore misleading. We won't fold in the 'strict' exception case because we check for it and reject it. I suspect the real objection is that we're not turning a SNaN into a QNaN here. But with the constant folding that Serge recently added we won't even be here in the "ignore" or "maytrap" cases at all. There won't be a "fadd NaN, -0" case except in the "strict" exception case which we decline to fold. Aside from the unneeded check for FMF.noNaNs() I don't see a problem here. We aren't returning the wrong result here or below. And I do now have code for the APFloat and ConstantFP classes to make it easy for simplifyFPOp() to convert SNaN->QNaN. That's what I was planning on submitting in a different ticket.
		kpn: Actually, the check for FMF.noNaNs() is required to catch non-constants. We can still fold "fadd X, -0" if we know X is not a NaN. We still have the "trivially dead" issue where the instruction would be hanging around needlessly, but still...
		kpn: Rereading @spatel's comment: what about the "X is a variable that happens to have the value of an SNaN" case: it's true that we'll be removing an instruction that would have converted the SNaN to a QNaN. So the removal of the instruction would be observable. I don't see a way around that without eliminating the ability to optimize at all.
		andrew.w.kaylor: I don't think you can make this transformation in the "maytrap" case. It would be ok to optimize an SNaN to a QNaN, but you can't eliminate an instruction that would raise the exception while performing the same conversion. Consider this example:

		double foo(double x, double y) {
		#pragma clang fp exceptions(maytrap)
		  double temp;
		  feclearexcept(FE_ALL_EXCEPT);
		  temp = x + -0.0;
		  // Check the x = SNaN case
		  if (fetestexcept(FE_INEXACT))
		    return -1.0;
		  temp = temp + y;
		  // Check the y = SNaN case
		  if (fetestexcept(FE_INEXACT))
		    return -2.0;
		  return temp;
		}

		Eliminating the fadd x, -0 causes the exception to be raised in the wrong place.
		sepavloff: The transformation `fadd X, -0 ==> X` is wrong in two cases:
		1. When the rounding mode is `downward` and `X` is +0: `+0 + (-0) ==> -0`. If FMF has the flag `nsz`, I think we can apply this transformation.
		2. When `X` is a signaling NaN: `SNaN + (-0) ==> QNaN`. At runtime the operation would produce a quiet NaN, but if this transformation is done, the result would be a signaling NaN. It is not correct to produce SNaN instead of QNaN. This result might be stored somewhere and further processed by other code that operates in strict mode, and the transformation would cause that code to trap. Behavior of the code with and without this transformation would be different. So this transformation is safe only if `FMF.noNaNs()`.
		The proposed condition is:
		if ((!RoundingModeMayBe(Rounding, RoundingMode::TowardNegative) || FMF.noSignedZeros()) && FMF.noNaNs())
if (match(Op1, m_NegZeroFP()))		if (match(Op1, m_NegZeroFP()))
return Op0;		return Op0;

// fadd X, 0 ==> X, when we know X is not -0		// fadd X, 0 ==> X, when we know X is not -0
		if (ExBehavior == fp::ebIgnore ||
		(ExBehavior == fp::ebMayTrap && FMF.noNaNs()))
		sepavloff: The same about `X==SNaN`.
if (match(Op1, m_PosZeroFP()) &&		if (match(Op1, m_PosZeroFP()) &&
(FMF.noSignedZeros() || CannotBeNegativeZero(Op0, Q.TLI)))		(FMF.noSignedZeros() || CannotBeNegativeZero(Op0, Q.TLI)))
return Op0;		return Op0;

// With nnan: -X + X --> 0.0 (and commuted variant)		// With nnan: -X + X --> 0.0 (and commuted variant)
		sepavloff: What about making such transformation in non-default mode?
		kpn: It requires adding support to the IR matchers like m_FSub(), and those are used elsewhere. Which implies testing in places in addition to here. So I'm saving that for a subsequent patch. Small steps.
		kpn: Actually, the check for the default FP environment isn't needed. No matcher support has been done yet. The matchers won't match so there's no need for the guard.
		sepavloff: According to https://en.wikipedia.org/wiki/Signed_zero: "x-x=x+(-x)=+0 (for any finite x, -0 when rounding toward negative)" So the check for rounding mode is also necessary, unless `FMF.noSignedZeros()`.
// We don't have to explicitly exclude infinities (ninf): INF + -INF == NaN.		// We don't have to explicitly exclude infinities (ninf): INF + -INF == NaN.
// Negative zeros are allowed because we always end up with positive zero:		// Negative zeros are allowed because we always end up with positive zero:
// X = -0.0: (-0.0 - (-0.0)) + (-0.0) == ( 0.0) + (-0.0) == 0.0		// X = -0.0: (-0.0 - (-0.0)) + (-0.0) == ( 0.0) + (-0.0) == 0.0
// X = -0.0: ( 0.0 - (-0.0)) + (-0.0) == ( 0.0) + (-0.0) == 0.0		// X = -0.0: ( 0.0 - (-0.0)) + (-0.0) == ( 0.0) + (-0.0) == 0.0
// X = 0.0: (-0.0 - ( 0.0)) + ( 0.0) == (-0.0) + ( 0.0) == 0.0		// X = 0.0: (-0.0 - ( 0.0)) + ( 0.0) == (-0.0) + ( 0.0) == 0.0
// X = 0.0: ( 0.0 - ( 0.0)) + ( 0.0) == ( 0.0) + ( 0.0) == 0.0		// X = 0.0: ( 0.0 - ( 0.0)) + ( 0.0) == ( 0.0) + ( 0.0) == 0.0
if (FMF.noNaNs()) {		if (FMF.noNaNs()) {
if (match(Op0, m_FSub(m_AnyZeroFP(), m_Specific(Op1))) ||		if (match(Op0, m_FSub(m_AnyZeroFP(), m_Specific(Op1))) ||
[1,444 lines not shown]
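
The guard proposed in the inline comments for `fadd X, -0 ==> X` can be modeled outside LLVM as a small truth function. The sketch below is not the LLVM API: the `RoundingMode` enum and `FastMathFlags` struct are simplified, hypothetical stand-ins for the real classes, `roundingModeMayBe` mirrors the helper added to FPEnv.h in this diff, and `mayFoldAddNegZero` is an invented name for the condition suggested by sepavloff.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::RoundingMode.
enum class RoundingMode {
  NearestTiesToEven,
  TowardPositive,
  TowardNegative,
  TowardZero,
  Dynamic
};

// Hypothetical stand-in for the two fast-math flags used by the guard.
struct FastMathFlags {
  bool NoNaNs;
  bool NoSignedZeros;
};

// True if RM may be QRM at compile time or at run time: either it is
// exactly QRM, or it is dynamic and therefore unknown.
constexpr bool roundingModeMayBe(RoundingMode RM, RoundingMode QRM) {
  return RM == QRM || RM == RoundingMode::Dynamic;
}

// Condition proposed in the review: the fold is allowed only when the
// result cannot differ in the sign of zero (rounding is known not to be
// downward, or nsz is set) and X is known not to be a NaN (nnan is set).
constexpr bool mayFoldAddNegZero(RoundingMode RM, FastMathFlags FMF) {
  return (!roundingModeMayBe(RM, RoundingMode::TowardNegative) ||
          FMF.NoSignedZeros) &&
         FMF.NoNaNs;
}
```

With this model, `mayFoldAddNegZero(RoundingMode::Dynamic, FastMathFlags{true, false})` is false because a dynamic mode might be downward, while setting both flags makes the fold acceptable under any rounding mode. Note that this is stricter than the condition in the diff itself, which also allows the fold whenever the exception behavior is `ebIgnore`.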

llvm/test/Transforms/InstSimplify/strictfp-fadd.ll

[95 lines not shown]	;
%ret = call <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float><float -0.0, float -0.0>, metadata !"round.dynamic", metadata !"fpexcept.ignore") #0		%ret = call <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float><float -0.0, float -0.0>, metadata !"round.dynamic", metadata !"fpexcept.ignore") #0
ret <2 x float> %ret		ret <2 x float> %ret
}		}

; The rounding mode here needs to not be { tonearest, downward, dynamic }.		; The rounding mode here needs to not be { tonearest, downward, dynamic }.
; Test one of the remaining rounding modes and the rest will be fine.		; Test one of the remaining rounding modes and the rest will be fine.
define float @fadd_x_n0_towardzero(float %a) #0 {		define float @fadd_x_n0_towardzero(float %a) #0 {
; CHECK-LABEL: @fadd_x_n0_towardzero(		; CHECK-LABEL: @fadd_x_n0_towardzero(
; CHECK-NEXT: [[RET:%.*]] = call float @llvm.experimental.constrained.fadd.f32(float [[A:%.*]], float -0.000000e+00, metadata !"round.towardzero", metadata !"fpexcept.ignore") #[[ATTR0]]		; CHECK-NEXT: ret float [[A:%.*]]
; CHECK-NEXT: ret float [[RET]]
;		;
%ret = call float @llvm.experimental.constrained.fadd.f32(float %a, float -0.0, metadata !"round.towardzero", metadata !"fpexcept.ignore") #0		%ret = call float @llvm.experimental.constrained.fadd.f32(float %a, float -0.0, metadata !"round.towardzero", metadata !"fpexcept.ignore") #0
ret float %ret		ret float %ret
}		}

; The rounding mode here needs to not be { tonearest, downward, dynamic }.		; The rounding mode here needs to not be { tonearest, downward, dynamic }.
; Test one of the remaining rounding modes and the rest will be fine.		; Test one of the remaining rounding modes and the rest will be fine.
define <2 x float> @fadd_vec_x_n0_towardzero(<2 x float> %a) #0 {		define <2 x float> @fadd_vec_x_n0_towardzero(<2 x float> %a) #0 {
; CHECK-LABEL: @fadd_vec_x_n0_towardzero(		; CHECK-LABEL: @fadd_vec_x_n0_towardzero(
; CHECK-NEXT: [[RET:%.*]] = call <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> [[A:%.*]], <2 x float> <float -0.000000e+00, float -0.000000e+00>, metadata !"round.towardzero", metadata !"fpexcept.ignore") #[[ATTR0]]		; CHECK-NEXT: ret <2 x float> [[A:%.*]]
; CHECK-NEXT: ret <2 x float> [[RET]]
;		;
%ret = call <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float><float -0.0, float -0.0>, metadata !"round.towardzero", metadata !"fpexcept.ignore") #0		%ret = call <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float><float -0.0, float -0.0>, metadata !"round.towardzero", metadata !"fpexcept.ignore") #0
ret <2 x float> %ret		ret <2 x float> %ret
}		}

		define float @fadd_nnan_x_n0_ebmaytrap(float %a) #0 {
		; CHECK-LABEL: @fadd_nnan_x_n0_ebmaytrap(
		; CHECK-NEXT: ret float [[A:%.*]]
		;
		%ret = call nnan float @llvm.experimental.constrained.fadd.f32(float %a, float -0.0, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
		ret float %ret
		}

		define <2 x float> @fadd_vec_nnan_x_n0_ebmaytrap(<2 x float> %a) #0 {
		; CHECK-LABEL: @fadd_vec_nnan_x_n0_ebmaytrap(
		; CHECK-NEXT: ret <2 x float> [[A:%.*]]
		;
		%ret = call nnan <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float><float -0.0, float -0.0>, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
		ret <2 x float> %ret
		}

define float @fadd_nnan_x_n0_ebstrict(float %a) #0 {		define float @fadd_nnan_x_n0_ebstrict(float %a) #0 {
; CHECK-LABEL: @fadd_nnan_x_n0_ebstrict(		; CHECK-LABEL: @fadd_nnan_x_n0_ebstrict(
; CHECK-NEXT: [[RET:%.*]] = call nnan float @llvm.experimental.constrained.fadd.f32(float [[A:%.*]], float -0.000000e+00, metadata !"round.tonearest", metadata !"fpexcept.strict") #[[ATTR0]]		; CHECK-NEXT: [[RET:%.*]] = call nnan float @llvm.experimental.constrained.fadd.f32(float [[A:%.*]], float -0.000000e+00, metadata !"round.tonearest", metadata !"fpexcept.strict") #[[ATTR0]]
; CHECK-NEXT: ret float [[RET]]		; CHECK-NEXT: ret float [[RET]]
;		;
%ret = call nnan float @llvm.experimental.constrained.fadd.f32(float %a, float -0.0, metadata !"round.tonearest", metadata !"fpexcept.strict") #0		%ret = call nnan float @llvm.experimental.constrained.fadd.f32(float %a, float -0.0, metadata !"round.tonearest", metadata !"fpexcept.strict") #0
ret float %ret		ret float %ret
}		}
[96 lines not shown]
; CHECK-LABEL: @fold_fadd_vec_nsz_x_0_ebmaytrap(		; CHECK-LABEL: @fold_fadd_vec_nsz_x_0_ebmaytrap(
; CHECK-NEXT: [[ADD:%.*]] = call nsz <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> [[A:%.*]], <2 x float> zeroinitializer, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #[[ATTR0]]		; CHECK-NEXT: [[ADD:%.*]] = call nsz <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> [[A:%.*]], <2 x float> zeroinitializer, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #[[ATTR0]]
; CHECK-NEXT: ret <2 x float> [[ADD]]		; CHECK-NEXT: ret <2 x float> [[ADD]]
;		;
%add = call nsz <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float> zeroinitializer, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0		%add = call nsz <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float> zeroinitializer, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
ret <2 x float> %add		ret <2 x float> %add
}		}

		define float @fold_fadd_nnan_nsz_x_0_ebmaytrap(float %a) #0 {
		; CHECK-LABEL: @fold_fadd_nnan_nsz_x_0_ebmaytrap(
		; CHECK-NEXT: ret float [[A:%.*]]
		;
		%add = call nnan nsz float @llvm.experimental.constrained.fadd.f32(float %a, float 0.0, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
		ret float %add
		}

		define <2 x float> @fold_fadd_vec_nnan_nsz_x_0_ebmaytrap(<2 x float> %a) #0 {
		; CHECK-LABEL: @fold_fadd_vec_nnan_nsz_x_0_ebmaytrap(
		; CHECK-NEXT: ret <2 x float> [[A:%.*]]
		;
		%add = call nnan nsz <2 x float> @llvm.experimental.constrained.fadd.v2f32(<2 x float> %a, <2 x float> zeroinitializer, metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
		ret <2 x float> %add
		}

define float @fold_fadd_nsz_x_0_ebstrict(float %a) #0 {		define float @fold_fadd_nsz_x_0_ebstrict(float %a) #0 {
; CHECK-LABEL: @fold_fadd_nsz_x_0_ebstrict(		; CHECK-LABEL: @fold_fadd_nsz_x_0_ebstrict(
; CHECK-NEXT: [[ADD:%.*]] = call nsz float @llvm.experimental.constrained.fadd.f32(float [[A:%.*]], float 0.000000e+00, metadata !"round.tonearest", metadata !"fpexcept.strict") #[[ATTR0]]		; CHECK-NEXT: [[ADD:%.*]] = call nsz float @llvm.experimental.constrained.fadd.f32(float [[A:%.*]], float 0.000000e+00, metadata !"round.tonearest", metadata !"fpexcept.strict") #[[ATTR0]]
; CHECK-NEXT: ret float [[ADD]]		; CHECK-NEXT: ret float [[ADD]]
;		;
%add = call nsz float @llvm.experimental.constrained.fadd.f32(float %a, float 0.0, metadata !"round.tonearest", metadata !"fpexcept.strict") #0		%add = call nsz float @llvm.experimental.constrained.fadd.f32(float %a, float 0.0, metadata !"round.tonearest", metadata !"fpexcept.strict") #0
ret float %add		ret float %add
}		}
[105 lines not shown]