This is an archive of the discontinued LLVM Phabricator instance.

Differential D137811

InstCombine: Perform basic isnan combines on llvm.is.fpclass
ClosedPublic

Authored by arsenm on Nov 10 2022, 7:13 PM.

Details

Reviewers

sepavloff
spatel
kpn
andrew.w.kaylor
efriedma
cameron.mcinally
jcranmer
jyknight
foad

Summary

is.fpclass(x, qnan|snan) -> fcmp uno x, 0.0
is.fpclass(nnan x, qnan|snan|other) -> is.fpclass(x, other)

Start porting the existing combines from llvm.amdgcn.class to the
generic intrinsic. Start with the ones which aren't dependent on the
FP mode.
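
To make the two rewrites in the summary concrete, here is a minimal standalone C++ sketch (it is not the LLVM implementation). The bit values follow the test-mask layout documented in the LangRef for llvm.is.fpclass. The helper names `classify`, `isFPClass`, `fcmpUno` and `stripNanForNnan` are hypothetical, and the quiet-NaN check assumes the common IEEE 754-2008 binary64 encoding.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Test-mask bits of llvm.is.fpclass, following the layout in the LangRef.
enum FPClass : unsigned {
  fcSNan = 1u << 0,
  fcQNan = 1u << 1,
  fcNegInf = 1u << 2,
  fcNegNormal = 1u << 3,
  fcNegSubnormal = 1u << 4,
  fcNegZero = 1u << 5,
  fcPosZero = 1u << 6,
  fcPosSubnormal = 1u << 7,
  fcPosNormal = 1u << 8,
  fcPosInf = 1u << 9,
  fcNan = fcSNan | fcQNan
};

// Hypothetical reference classifier: returns the single class bit for x.
unsigned classify(double x) {
  const bool neg = std::signbit(x);
  switch (std::fpclassify(x)) {
  case FP_NAN: {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    // Assumption: the top mantissa bit marks a quiet NaN (IEEE 754-2008).
    return (bits & (1ull << 51)) ? fcQNan : fcSNan;
  }
  case FP_INFINITE:  return neg ? fcNegInf : fcPosInf;
  case FP_ZERO:      return neg ? fcNegZero : fcPosZero;
  case FP_SUBNORMAL: return neg ? fcNegSubnormal : fcPosSubnormal;
  default:           return neg ? fcNegNormal : fcPosNormal;
  }
}

// Models is.fpclass(x, mask): true if x falls in any class named by mask.
bool isFPClass(double x, unsigned mask) { return (classify(x) & mask) != 0; }

// Models `fcmp uno x, y`: true iff either operand is a NaN.
bool fcmpUno(double x, double y) { return std::isunordered(x, y); }

// Models the second rewrite: if the source is known not to be NaN (nnan),
// the NaN bits of the mask can never match and may be dropped.
unsigned stripNanForNnan(unsigned mask) {
  return mask & ~static_cast<unsigned>(fcNan);
}

// Usage: both rewrites agree with the reference model on a few inputs.
void checkSummaryRewrites() {
  const double inf = std::numeric_limits<double>::infinity();
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  const double samples[] = {0.0, -0.0, 1.5, -2.0, inf, -inf, qnan};
  const unsigned mask = fcNan | fcPosInf;
  for (double x : samples) {
    // is.fpclass(x, qnan|snan) -> fcmp uno x, 0.0
    assert(isFPClass(x, fcNan) == fcmpUno(x, 0.0));
    // For non-NaN x, dropping the NaN bits does not change the answer.
    if (!std::isnan(x)) {
      assert(isFPClass(x, mask) == isFPClass(x, stripNanForNnan(mask)));
    }
  }
}
```

The sketch only models result values; it says nothing about FP exception flags, which is the point the review discussion goes on to debate.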

Event Timeline

arsenm created this revision. Nov 10 2022, 7:13 PM

arsenm requested review of this revision. Nov 10 2022, 7:13 PM

arsenm added a parent revision: D135447: [AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU. Nov 10 2022, 7:13 PM

Harbormaster completed remote builds in B197155: Diff 474648. Nov 10 2022, 7:14 PM

sepavloff added inline comments. Nov 10 2022, 10:06 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
818–829	The second argument is described with `ImmArg`. It must always be a constant.
842	This replacement is not valid in the general case, only if FP exceptions are ignored. If the argument is a signaling NaN, the compare instruction raises the `Invalid` exception.
849	This is also not always valid, only if the argument is not a signaling NaN or FP exceptions are ignored.
881	Constant folding should be done in `ConstantFolding.cpp`.

arsenm added inline comments. Nov 10 2022, 10:30 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
842	Isn't that implied by not being a constrained intrinsic?
849	Isn't that implied by not being a constrained intrinsic?

sepavloff added inline comments. Nov 10 2022, 11:30 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
842	`is_fpclass` does not depend on the FP environment and does not change it, so it does not have a constrained variant.

Rebase on added strictfp checks, and split code between InstSimplify and ConstantFolding

Harbormaster completed remote builds in B197252: Diff 474788. Nov 11 2022, 9:05 AM

foad added inline comments. Nov 13 2022, 11:24 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
822	Seems like you should handle `Mask == (fcAllFlags & ~fcNan)` here too. Same for the other `Mask ==` cases below. And allow InstCombine to "freely invert" this intrinsic by flipping all bits in the mask.
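
The "freely invert" suggestion in the comment on line 822 rests on a simple mask identity, sketched here in standalone C++. Every value belongs to exactly one class, so flipping all ten valid bits yields the mask for the logical negation of the test. `invertMask`, `inversionHolds`, `NanFold` and `selectNanFold` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cassert>

constexpr unsigned fcNan = 0x3;        // snan | qnan
constexpr unsigned fcAllFlags = 0x3ff; // all ten class bits

// Flipping every valid bit gives the mask that tests the complementary classes.
constexpr unsigned invertMask(unsigned mask) { return ~mask & fcAllFlags; }

// For each single class bit, the inverted mask must give the opposite answer.
bool inversionHolds(unsigned mask) {
  for (unsigned bit = 1; bit <= fcAllFlags; bit <<= 1) {
    const bool inMask = (bit & mask) != 0;
    const bool inInverse = (bit & invertMask(mask)) != 0;
    if (inMask == inInverse)
      return false;
  }
  return true;
}

enum class NanFold { Uno, Ord, None };

// Hypothetical selector: which comparison a NaN-only or non-NaN-only mask
// would map to. Any other mask is left alone in this sketch.
NanFold selectNanFold(unsigned mask) {
  if (mask == fcNan)
    return NanFold::Uno; // is.fpclass(x, nan)  -> fcmp uno x, 0.0
  if (mask == invertMask(fcNan))
    return NanFold::Ord; // is.fpclass(x, ~nan) -> fcmp ord x, 0.0
  return NanFold::None;
}

// Usage: the inverse of the NaN mask is every non-NaN class.
void checkInversion() {
  assert(invertMask(fcNan) == 0x3fc);
  assert(inversionHolds(fcNan));
}
```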
845	What's this `!CVal` check for?

sepavloff added inline comments. Nov 13 2022, 11:43 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
824	Is it profitable to make such a replacement early? Is there any advantage, at least a hypothetical one?
846–850	Deleted code?

jcranmer-intel added a subscriber: jcranmer-intel. Nov 14 2022, 12:22 PM

jcranmer-intel added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
819	It feels like you should be able to use ninf/nnan/nsz flags to modify the mask here to make it more likely to fall into one of these categories, although this may provoke some backlash from those who argue that `nnan isnan(x)` => unconditional `false` shouldn't be an allowable optimization.
837	You're also missing cases for `Mask == fcPosInf` and `Mask == fcNegInf`, which can similarly be lowered to quiet comparisons.
842	`Builder.CreateFCmp*` functions look like they create quiet comparison instructions in FP-constrained mode, and you need to use `CreateFCmpS` to generate one of the signalling comparisons. So replacing them even in strict mode is legal.

arsenm added inline comments. Nov 14 2022, 12:26 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
818–829	I'm debating removing this restriction
819	I think this is debatable, and beyond the scope of this patch where I'd like to simply move what we already do for the target specific intrinsic
824	I think an fcmp should be more canonical and better handled by existing optimizations, than class which is handled ~nowhere
837	Ditto, for now I'd like to just move the code
845	I moved this part to InstSimplify, will drop

sepavloff added inline comments. Nov 14 2022, 8:26 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
842	Quiet comparison instructions do not raise an FP exception if an argument is a quiet NaN. They still raise exceptions for signaling NaNs.

Remove dead code, stop turning poison into undef

Harbormaster completed remote builds in B198118: Diff 475996. Nov 16 2022, 8:33 PM

arsenm added inline comments. Nov 16 2022, 9:23 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
819	we can't actually put fast math flags here, because it's only allowed for calls with FP return types

arsenm added a child revision: D138180: InstCombine: Fold negations of is_fpclass intrinsics. Nov 16 2022, 9:36 PM

ping

LGTM modulo inline comments.

llvm/lib/Analysis/InstructionSimplify.cpp
6058–6066	Poison/undef checks should probably go first, before other simplifications?
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
822	Please add a TODO for this if you're not going to do it now.
846	Might be simpler to modify the instruction in place with setOperand, and then Worklist.pushUsersToWorkList. But perhaps that is less clear.

arsenm added inline comments. Nov 29 2022, 10:04 AM

llvm/lib/Analysis/InstructionSimplify.cpp
6058–6066	I specifically had these later. The first operand shouldn't matter based on the test. I'm thinking about the interaction between fast math optimizations, and class uses used to guard regions where nnan/ninf is expected.

foad added inline comments. Nov 30 2022, 1:37 AM

llvm/lib/Analysis/InstructionSimplify.cpp
6058–6066	It's maybe not worth arguing about, but... LangRef says "Most instructions return ‘poison’ when one of their arguments is ‘poison’". If you're saying is.fpclass is an exception to that rule then it at least ought to be documented. But I'm not sure why simplifying to poison would be a problem for the kind of code you're talking about - do you have an example?

arsenm added inline comments. Nov 30 2022, 6:26 AM

llvm/lib/Analysis/InstructionSimplify.cpp
6058–6066	Cases like this, if you violate nnan/ninf but you have a test of the special cases:

div = fdiv nan x / nan
if (!is_fpclass(div, nan)) {
// do fast stuff
}

I was viewing this as similar to select of poison. select with a poison condition propagates poison, but the value operands can be unobserved. Similarly here, a test of 0 shouldn't need to observe the actual compared value. I was planning on revisiting this as I get further in optimizing all the special case checks in the math libraries. For now, this is the more conservative direction. The main thing I'm worried about is the asymmetry between equivalent fcmps and the class if we make this the expected behavior.

Add todo to handle inverted masks

arsenm added a child revision: D139012: InstCombine: Fold out is_fpclass inf checks from test mask for known finite sources. Nov 30 2022, 6:58 AM

Fix stale test

sepavloff added inline comments. Nov 30 2022, 7:24 AM

llvm/lib/Analysis/InstructionSimplify.cpp
6058–6066	In the example you presented, the operation producing NaN can be evaluated at compile time, so is_fpclass can be evaluated during compilation. There is no reason to create a poison value here. In the presence of nnan/ninf it is also safer to evaluate is_fpclass instead of poisoning it, because `fdiv nan, nan` and `is_fpclass(div, nan)` can come from different functions, compiled with different fast-math settings and merged, for example, by LTO. Poisoning can make a correct program non-working.

nlopes added inline comments. Nov 30 2022, 7:36 AM

llvm/lib/Analysis/InstructionSimplify.cpp
6058–6066	The issue of special casing handling of poison is that then you cannot easily convert to/from other instructions. Blocking propagation of poison is similar to freeze, which is sticky. Select is an exception, as it blocks poison propagation, but the caveat is that we had to remove this transformation: select X, true, Y -> or X, Y (as the or propagates more poison than the select did). So I would suggest to not special case poison anywhere unless it's really important for some reason.

Propagate poison. Leave undef alone, I still think test all/test none should win for undef

jyknight added inline comments. Nov 30 2022, 8:28 AM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
824	This seems OK. I'm not sure it's the best choice -- if a CPU actually has an fpclassify instruction, is it really a good idea to canonicalize in generic code to fcmp? But I think that's fine to revisit later if it becomes a problem.
839	These removals of Mask bits should come before the `Mask == $X` tests, shouldn't they?
848	Why are unknown bits even accepted? ISTM it should be an error in Verifier::visitIntrinsicCall to pass invalid bits.

Also: commit description should mention the instcombine/etc changes; it makes it sound like just an amdgpu change.

Harbormaster completed remote builds in B200276: Diff 478950. Nov 30 2022, 8:39 AM

foad added inline comments. Nov 30 2022, 8:56 AM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
839	I assume that this function will be called again to revisit the modified intrinsic.

arsenm added inline comments. Nov 30 2022, 9:01 AM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
824	I've been thinking if we can do a classify in <= 2 IR instructions, fcmp+fabs is probably a better canonical form. If a class pattern is 3-4+ instructions, the class is probably better. FCmp + fneg + fabs are always going to be more broadly understood. Fcmp also supports fast math flags, unlike class (I guess we could fix that though)
839	Right, it's revisited. I don't expect the bit removal part to ever actually happen
848	I don't know. Really this should be an i10 argument. I've been debating whether to add a verifier check, or make it an i10. I'd prefer to do that separately and clean up the bits here for now.

arsenm mentioned this in D139032: InstCombine: Handle folding some negated is_fpclass mask test cases. Nov 30 2022, 10:09 AM

arsenm added a child revision: D139032: InstCombine: Handle folding some negated is_fpclass mask test cases.

jcranmer-intel added inline comments. Nov 30 2022, 12:27 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
848	The LangRef documentation of `llvm.is.fpclass` doesn't pin down the handling for noncanonical values well. It's plausible they could be handled by extension of extra bits, but existing code seems to ignore them for `ppc_fp128` and treat them as NaNs that are neither qNaN nor sNaN for `x86_fp80`. Not that it makes any difference to this patch, but it suggests that making it a verifier check instead of an `i10` is the better path, as it is slightly better future-proofed.

arsenm added inline comments. Nov 30 2022, 3:17 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
848	Another reason is the operand doesn't really need to be immarg. The AMDGPU class instruction handles a register just fine (in fact nearly every test mask needs to be materialized). In that case we would want folds to a canonicalizable constant value

arsenm mentioned this in D139761: Verifier: Enforce value of llvm.is.fpclass test mask. Dec 9 2022, 6:20 PM

Break up patches into 4 pieces. Assume excess bits in test mask is a verifier error, and don't handle the FP mode dependent transforms here
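
The two options weighed in the comments on line 848 (reject excess mask bits in the verifier, or silently clear them in the combine) can be sketched as follows. This is a hypothetical standalone illustration, not the code from D139761; `isValidTestMask` and `clearExcessBits` are invented names, and the 32-bit operand width is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t fcAllFlags = 0x3ff; // the ten defined class bits

// Verifier-style predicate: reject any bit outside the ten defined classes.
constexpr bool isValidTestMask(std::uint32_t mask) {
  return (mask & ~fcAllFlags) == 0;
}

// The alternative: keep accepting such masks and drop the excess bits.
constexpr std::uint32_t clearExcessBits(std::uint32_t mask) {
  return mask & fcAllFlags;
}

// Usage: a mask with bit 10 set is rejected, or normalized to its low bits.
void checkMaskValidity() {
  assert(isValidTestMask(0x3ff));
  assert(!isValidTestMask(0x400));
  assert(clearExcessBits(0x403) == 0x3);
}
```

With the verifier-style check in place, a combine can assume the mask is already within range and does not need the clearing step.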

arsenm added a parent revision: D139761: Verifier: Enforce value of llvm.is.fpclass test mask. Dec 10 2022, 7:50 AM

Resplit patches to handle negated isnan case here

arsenm added a parent revision: D139772: InstSimplify: Add basic folding of llvm.is.fpclass intrinsic. Dec 10 2022, 8:42 AM

Harbormaster completed remote builds in B202395: Diff 481862. Dec 10 2022, 8:42 AM

arsenm mentioned this in rGe20a092838f3: Verifier: Enforce value of llvm.is.fpclass test mask. Dec 12 2022, 7:16 PM

arsenm mentioned this in rG82b31703013f: InstCombine: Add baseline tests for negated fpclass tests. Dec 13 2022, 6:58 AM

Rebase

llvm/test/Transforms/InstCombine/is_fpclass.ll
358	This will be recovered by D139130, depending on which lands first

Harbormaster completed remote builds in B203114: Diff 482842. Dec 14 2022, 7:40 AM

arsenm added a reviewer: foad. Dec 16 2022, 7:11 AM

arsenm added child revisions: D139903: InstCombine: Fold is.fpclass (fabs x), mask -> is.fpclass x, (fabs mask), D139895: InstCombine: Fold is.fpclass (fneg x) into the test mask.
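
The two child revisions named above fold a sign operation on the source into the test mask. A minimal standalone sketch of the mask arithmetic, assuming the LangRef bit layout (bits 2 to 5 are the negative classes, bits 6 to 9 their positive mirrors), might look like this. `mirrorSign`, `fnegMask` and `fabsMask` are hypothetical names and are not taken from those patches.

```cpp
#include <cassert>

constexpr unsigned fcNan = 0x003;      // snan | qnan
constexpr unsigned fcPositive = 0x3c0; // +zero, +subnormal, +normal, +inf

// Mirror the sign-dependent bits: bit i (2..9) maps to bit 11 - i,
// so -inf <-> +inf, -normal <-> +normal, and so on.
unsigned mirrorSign(unsigned mask) {
  unsigned out = 0;
  for (unsigned i = 2; i <= 9; ++i)
    if (mask & (1u << i))
      out |= 1u << (11 - i);
  return out;
}

// is.fpclass(fneg x, mask) == is.fpclass(x, fnegMask(mask)):
// negation keeps NaN-ness and swaps each class with its opposite sign.
unsigned fnegMask(unsigned mask) { return (mask & fcNan) | mirrorSign(mask); }

// is.fpclass(fabs x, mask) == is.fpclass(x, fabsMask(mask)):
// fabs x is never negative, so only the NaN and positive bits of mask can
// match, and each positive class is reachable from either sign of x.
unsigned fabsMask(unsigned mask) {
  const unsigned pos = mask & fcPositive;
  return (mask & fcNan) | pos | mirrorSign(pos);
}

// Usage: testing fneg x for +inf is testing x for -inf, and so on.
void checkSignFolds() {
  assert(fnegMask(0x200) == 0x004); // +inf -> -inf
  assert(fabsMask(0x200) == 0x204); // +inf -> +inf | -inf
  assert(fabsMask(0x004) == 0x000); // fabs x is never -inf
}
```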

is.fpclass(x, qnan|snan) -> fcmp uno x, 0.0
is.fpclass(nnan x, qnan|snan|other) -> is.fpclass(x, other)

The first one sounds good. Do you have a specific motivation for the second one?

In D137811#4001711, @foad wrote:

is.fpclass(x, qnan|snan) -> fcmp uno x, 0.0
is.fpclass(nnan x, qnan|snan|other) -> is.fpclass(x, other)

The first one sounds good. Do you have a specific motivation for the second one?

Eventually all the class-like tests should be merged into one class call. We still want to reduce the number of tests to perform in the end and also don't want to be ordering dependent

Rebase

Harbormaster completed remote builds in B203913: Diff 483955. Dec 19 2022, 8:59 AM

sepavloff mentioned this in D140294: clang: Replace implementation of __builtin_isnormal. Dec 20 2022, 12:28 AM

Early conversion is_fpclass->fcmp may result in incorrect semantics. It makes the code:

if (!isnan(x)) {
…
}

indistinguishable from:

if (x == x) {
…
}

But they have different behavior if x is signaling NaN. Although the replacement is proposed for strictfp functions only, inlining may require conversion to a form that uses constrained intrinsics. The resulting code would have different behavior than original.
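
The distinction drawn in this comment is about exception flags, not result values. A small standalone C++ sketch can make that visible: `sameValue` shows the two checks always agree on the boolean result, and `selfCompareRaisesInvalid` (a hypothetical helper) reports whether evaluating `x == x` set the Invalid flag. The flag check assumes the implementation supports `FE_INVALID` and that the compiler does not fold the volatile comparison.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Value-level view: the two checks always produce the same boolean.
bool sameValue(double x) { return (!std::isnan(x)) == (x == x); }

// Flag-level view: reports whether evaluating `x == x` set the Invalid flag.
// The volatile accesses keep the compiler from folding the comparison away.
bool selfCompareRaisesInvalid(double value) {
  volatile double x = value;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile bool same = (x == x);
  (void)same;
  return std::fetestexcept(FE_INVALID) != 0;
}

// Usage: only the quiet cases are asserted here.
void checkQuietCases() {
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  assert(sameValue(1.0) && sameValue(qnan));
  // A quiet compare on an ordinary number or a quiet NaN leaves Invalid clear.
  assert(!selfCompareRaisesInvalid(1.0));
  assert(!selfCompareRaisesInvalid(qnan));
}
```

The signaling-NaN case is deliberately left unasserted. Whether a signaling NaN survives being passed around as a `double`, and whether the flag is then observed, depends on the platform and compiler, which is exactly the behavior under debate in this thread.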

inlining may require conversion to a form that uses constrained intrinsics

Functions which are not strictfp are allowed to introduce "spurious" fp flag writes; for example, we can flatten control flow that contains floating-point ops. Inlining the function doesn't change that general rule. The inliner converts fp operations just to ensure that later optimizations don't move those operations around.

In D137811#4063159, @efriedma wrote:

inlining may require conversion to a form that uses constrained intrinsics

Functions which are not strictfp are allowed to introduce "spurious" fp flag writes; for example, we can flatten control flow that contains floating-point ops. Inlining the function doesn't change that general rule. The inliner converts fp operations just to ensure that later optimizations don't move those operations around.

I didn't think we had code in the tree to convert normal FP instructions into constrained intrinsics. Andy Kaylor had a ticket with code to do this, but I didn't think it ever went in. Where is this code used by the inliner?

I didn't think we had code in the tree to convert normal FP instructions into constrained intrinsics.

I'm not sure if that actually got merged; I was just assuming based on my memory of the discussions. In any case, it's not really relevant to the point I was trying to make.

In D137811#4063159, @efriedma wrote:

inlining may require conversion to a form that uses constrained intrinsics

Functions which are not strictfp are allowed to introduce "spurious" fp flag writes; for example, we can flatten control flow that contains floating-point ops. Inlining the function doesn't change that general rule. The inliner converts fp operations just to ensure that later optimizations don't move those operations around.

"Spurious" fp flag writes can appear only as a result of transformations; the source code provided to the compiler does not have them. The compiler should avoid such transformations at early stages of the IR pipeline, otherwise it produces a program with incorrect semantics.

In D137811#4063200, @kpn wrote:

In D137811#4063159, @efriedma wrote:

inlining may require conversion to a form that uses constrained intrinsics

Functions which are not strictfp are allowed to introduce "spurious" fp flag writes; for example, we can flatten control flow that contains floating-point ops. Inlining the function doesn't change that general rule. The inliner converts fp operations just to ensure that later optimizations don't move those operations around.

I didn't think we had code in the tree to convert normal FP instructions into constrained intrinsics. Andy Kaylor had a ticket with code to do this, but I didn't think it ever went in. Where is this code used by the inliner?

Here it is: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/CloneFunction.cpp#L390

In D137811#4063200, @kpn wrote:

In D137811#4063159, @efriedma wrote:

inlining may require conversion to a form that uses constrained intrinsics

Functions which are not strictfp are allowed to introduce "spurious" fp flag writes; for example, we can flatten control flow that contains floating-point ops. Inlining the function doesn't change that general rule. The inliner converts fp operations just to ensure that later optimizations don't move those operations around.

I didn't think we had code in the tree to convert normal FP instructions into constrained intrinsics. Andy Kaylor had a ticket with code to do this, but I didn't think it ever went in. Where is this code used by the inliner?

Here it is: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/CloneFunction.cpp#L390

Ah, excellent. I misremembered. Thanks for the correction!

But they have different behavior if x is signaling NaN. Although the replacement is proposed for strictfp functions only, inlining may require conversion to a form that uses constrained intrinsics. The resulting code would have different behavior than original.

A transformation that’s valid for !strictfp functions cannot be invalid because of the potential for being inlined into a strictfp function. If this isn’t possible, there is a representational issue the inliner would need to account for.

In D137811#4065720, @arsenm wrote:

But they have different behavior if x is signaling NaN. Although the replacement is proposed for strictfp functions only, inlining may require conversion to a form that uses constrained intrinsics. The resulting code would have different behavior than original.

A transformation that’s valid for !strictfp functions cannot be invalid because of the potential for being inlined into a strictfp function. If this isn’t possible, there is a representational issue the inliner would need to account for.

Thinking about this more, I think any situation where this could be an issue would be malformed to begin with. You had a piece of code that was expecting exceptions to be enabled calling into code where exceptions were disabled without turning exceptions off

In D137811#4066479, @arsenm wrote:

In D137811#4065720, @arsenm wrote:

But they have different behavior if x is signaling NaN. Although the replacement is proposed for strictfp functions only, inlining may require conversion to a form that uses constrained intrinsics. The resulting code would have different behavior than original.

A transformation that’s valid for !strictfp functions cannot be invalid because of the potential for being inlined into a strictfp function. If this isn’t possible, there is a representational issue the inliner would need to account for.

Thinking about this more, I think any situation where this could be an issue would be malformed to begin with. You had a piece of code that was expecting exceptions to be enabled calling into code where exceptions were disabled without turning exceptions off

If exceptions were turned off, the call was made, and exceptions were turned back on then there would be no correctness issue. Inlining the called function wouldn't change that. The toggling of the exception enablement would be invisible to LLVM. Thus we can't categorically say that a strictfp function calling a !strictfp function is malformed.

The constrained intrinsics that specify the default rounding and exceptions disabled are supposed to behave the same as the normal instructions. They're supposed to have the exact same meaning and behavior. So a conversion of isnan(x) to (x == x) for the constrained intrinsics in the default environment is just as correct as the instructions that do the same thing but aren't constrained intrinsics. Thus inlining of a !strictfp function into a strictfp function with the conversion of FP instructions into constrained intrinsics can't introduce any new correctness issues.

Matt added a subscriber: Matt. Jan 25 2023, 9:08 AM

ping, I have a lot of stuff blocked on this

In D137811#4068836, @kpn wrote:

In D137811#4066479, @arsenm wrote:

In D137811#4065720, @arsenm wrote:

But they have different behavior if x is signaling NaN. Although the replacement is proposed for strictfp functions only, inlining may require conversion to a form that uses constrained intrinsics. The resulting code would have different behavior than original.

A transformation that’s valid for !strictfp functions cannot be invalid because of the potential for being inlined into a strictfp function. If this isn’t possible, there is a representational issue the inliner would need to account for.

Thinking about this more, I think any situation where this could be an issue would be malformed to begin with. You had a piece of code that was expecting exceptions to be enabled calling into code where exceptions were disabled without turning exceptions off

If exceptions were turned off, the call was made, and exceptions were turned back on then there would be no correctness issue. Inlining the called function wouldn't change that. The toggling of the exception enablement would be invisible to LLVM. Thus we can't categorically say that a strictfp function calling a !strictfp function is malformed.

On most targets turning off FP exceptions means reading content of FP control register, changing value of mask bits and putting the modified register value back. It is expensive operation and cannot be made invisible to LLVM. Some targets (like RISCV) do not have possibility to mask FP exceptions at all, so at IR level there is no way to turn exception off. Default FP environment supposes that the exceptions are ignored, not disabled. In general case they are raised always.

The constrained intrinsics that specify the default rounding and exceptions disabled are supposed to behave the same as the normal instructions. They're supposed to have the exact same meaning and behavior. So a conversion of isnan(x) to (x == x) for the constrained intrinsics in the default environment is just as correct as the instructions that do the same thing but aren't constrained intrinsics. Thus inlining of a !strictfp function into a strictfp function with the conversion of FP instructions into constrained intrinsics can't introduce any new correctness issues.

No matter what exception behavior is specified in constrained intrinsics call, processor instruction will raise FP exceptions. If bits in FP environment specifies trap on Invalid exception, the check x != x may result in crash, if x is signaling NaN. The check isnan(x) in this case is safe. So replacement is_fpclass->fcmp changes semantics and cannot be used here.

Particular cores may implement is_fpclass as CMP instruction, it is made by default lowering, but making such replacement in IR is incorrect.

On most targets turning off FP exceptions means reading content of FP control register, changing value of mask bits and putting the modified register value back. It is expensive operation and cannot be made invisible to LLVM. Some targets (like RISCV) do not have possibility to mask FP exceptions at all, so at IR level there is no way to turn exception off. Default FP environment supposes that the exceptions are ignored, not disabled. In general case they are raised always.

Managing the FP mode is a user responsibility. If the operation is unconstrained we can assume the default floating point environment without preserved exceptions.

The constrained intrinsics that specify the default rounding and exceptions disabled are supposed to behave the same as the normal instructions. They're supposed to have the exact same meaning and behavior. So a conversion of isnan(x) to (x == x) for the constrained intrinsics in the default environment is just as correct as the instructions that do the same thing but aren't constrained intrinsics. Thus inlining of a !strictfp function into a strictfp function with the conversion of FP instructions into constrained intrinsics can't introduce any new correctness issues.

No matter what exception behavior is specified in constrained intrinsics call, processor instruction will raise FP exceptions. If bits in FP environment specifies trap on Invalid exception, the check x != x may result in crash, if x is signaling NaN. The check isnan(x) in this case is safe

Particular cores may implement is_fpclass as CMP instruction, it is made by default lowering, but making such replacement in IR is incorrect.

Floating point exception doesn’t mean trap, but that’s a possible mode. The IR semantics are not dependent on the possible set of lowerings. Ultimately between the user and codegen, this code must never have trapped and may never be transformed to a form that could trap. The replacement fcmp must not trap, but it’s not the middle end’s responsibility to ensure that. This is correct regardless of what any particular processor may do or whatever the source did.

In D137811#4081771, @sepavloff wrote:

In D137811#4068836, @kpn wrote:

In D137811#4066479, @arsenm wrote:

In D137811#4065720, @arsenm wrote:

But they have different behavior if x is signaling NaN. Although the replacement is proposed for strictfp functions only, inlining may require conversion to a form that uses constrained intrinsics. The resulting code would have different behavior than original.

A transformation that’s valid for !strictfp functions cannot be invalid because of the potential for being inlined into a strictfp function. If this isn’t possible, there is a representational issue the inliner would need to account for.

Thinking about this more, I think any situation where this could be an issue would be malformed to begin with. You had a piece of code that was expecting exceptions to be enabled calling into code where exceptions were disabled without turning exceptions off

If exceptions were turned off, the call was made, and exceptions were turned back on then there would be no correctness issue. Inlining the called function wouldn't change that. The toggling of the exception enablement would be invisible to LLVM. Thus we can't categorically say that a strictfp function calling a !strictfp function is malformed.

On most targets turning off FP exceptions means reading content of FP control register, changing value of mask bits and putting the modified register value back. It is expensive operation and cannot be made invisible to LLVM. Some targets (like RISCV) do not have possibility to mask FP exceptions at all, so at IR level there is no way to turn exception off. Default FP environment supposes that the exceptions are ignored, not disabled. In general case they are raised always.

Ok, true, it takes executing code to change the floating point environment. Yes, there would be inline assembly or a function call to change the FP environment. LLVM would see that code because there would be IR for it. All true.

But LLVM wouldn't know what the inline assembly was doing. It wouldn't recognize the function call and thus wouldn't know what it was doing. LLVM would not know the floating point environment had changed. Rephrased, the change in the floating point environment would, to LLVM, be invisible. That's the "invisible" I was referring to earlier.

The constrained intrinsics that specify the default rounding and exceptions disabled are supposed to behave the same as the normal instructions. They're supposed to have the exact same meaning and behavior. So a conversion of isnan(x) to (x == x) for the constrained intrinsics in the default environment is just as correct as the instructions that do the same thing but aren't constrained intrinsics. Thus inlining of a !strictfp function into a strictfp function with the conversion of FP instructions into constrained intrinsics can't introduce any new correctness issues.

No matter what exception behavior is specified in constrained intrinsics call, processor instruction will raise FP exceptions. If bits in FP environment specifies trap on Invalid exception, the check x != x may result in crash, if x is signaling NaN. The check isnan(x) in this case is safe. So replacement is_fpclass->fcmp changes semantics and cannot be used here.

Particular cores may implement is_fpclass as CMP instruction, it is made by default lowering, but making such replacement in IR is incorrect.

Be careful of how IEEE 754 uses the same terminology that a Unix person uses, but the words have different meanings. I'm going to use the term "754 trap" to mean a trap in the IEEE-754 document's use of the term. I'm going to say "Unix trap" when a trap involves transferring of control to the OS.

An FP instruction can "754 trap" but the result may just be changing the FP status bits in the environment to record that something happened. And if we are not using the constrained intrinsics, or we are using them with exceptions "ignore" and rounding "roundtoeven", then we are assumed to not be accessing the FP status bits.

A CPU is allowed to always "Unix trap" and transfer control to the OS. I think you are saying that RISCV does this. But the OS is allowed to fix things up so that the application doesn't observe the CPU's trap. Indeed, in the default FP environment the OS is _required_ to hide the CPU's "Unix trap" from the application. From the application's point of view this is the same as a CPU not doing a "Unix trap" at all. In this case we can treat it the same as a CPU that doesn't trap in the default FP environment.

It's true that a blind replacement of is_fpclass with fcmp would be incorrect _if_ the floating point environment is not the default environment. But if we are in the default FP environment we can assume that any CPU trap will be hidden from us by the OS and therefore is not a part of our discussion of IR correctness.

In D137811#4081771, @sepavloff wrote:

In D137811#4068836, @kpn wrote:

If exceptions were turned off, the call was made, and exceptions were turned back on then there would be no correctness issue. Inlining the called function wouldn't change that. The toggling of the exception enablement would be invisible to LLVM. Thus we can't categorically say that a strictfp function calling a !strictfp function is malformed.

On most targets turning off FP exceptions means reading content of FP control register, changing value of mask bits and putting the modified register value back. It is expensive operation and cannot be made invisible to LLVM. Some targets (like RISCV) do not have possibility to mask FP exceptions at all, so at IR level there is no way to turn exception off. Default FP environment supposes that the exceptions are ignored, not disabled. In general case they are raised always.

Non-strictfp functions are assumed to be in the default FP environment, which implies there's some form of undefined behavior if you call a non-strictfp function with a non-default FP environment. The precise, formal semantics that effect this rule is of course underdefined,

This is how we define default FP environment in the LangRef today:

The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. No floating-point exception state is maintained in this environment. Therefore, there is no attempt to create or preserve invalid operation (SNaN) or division-by-zero exceptions.

From this definition, it seems clear to me that we are legally allowed to insert instructions that would cause FP exceptions in non-strictfp functions.

To rephrase the rules (as I understand them) in somewhat more precise terms, if you call a non-strictfp function with a non-default FP environment, the values of any FP operation are unspecified, floating-point sticky bits have unspecified values, and if the dynamic FP environment is set to generate hardware traps, the act of *calling* the function is UB (that is, we are permitted to introduce FP operations that may trap in code-paths where none existed). In this understanding, converting a non-strictfp function into a strictfp function with bare FP instructions replaced with constrained intrinsics and round.tonearest and fpexcept.ignore metadata is a valid optimization, but one that narrows the possible semantics (since non-strictfp may introduce FP operations that strictfp may not).

In D137811#4083368, @kpn wrote:

Be careful of how IEEE 754 uses the same terminology that a Unix person uses, but the words have different meanings. I'm going to use the term "754 trap" to mean a trap in the IEEE-754 document's use of the term. I'm going to say "Unix trap" when a trap involves transferring of control to the OS.

An FP instruction can "754 trap" but the result may just be changing the FP status bits in the environment to record that something happened. And if we are not using the constrained intrinsics, or we are using them with exceptions "ignore" and rounding "roundtoeven", then we are assumed to not be accessing the FP status bits.

A CPU is allowed to always "Unix trap" and transfer control to the OS. I think you are saying that RISCV does this. But the OS is allowed to fix things up so that the application doesn't observe the CPU's trap. Indeed, in the default FP environment the OS is _required_ to hide the CPU's "Unix trap" from the application. From the application's point of view this is the same as a CPU not doing a "Unix trap" at all. In this case we can treat it the same as a CPU that doesn't trap in the default FP environment.

I went ahead and looked at the RISC-V specification to see what it does on FP exceptions. The RISC-V instructions only model 754 traps as sticky bits in the fcsr, and there's an explicit note that it provides no way to convert a 754 trap to a Unix trap (to use your terminology).

FWIW, C itself requires that implementations provide FP exception handling as sticky-bits (this is the IEEE 754 default exception handling), with the existence of a Unix trapping mode being an allowable extension (which it declines to specify).
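
The sticky-bit model mentioned above can be sketched with the standard `<cfenv>` interface. This is only an illustration: `divisionRaisesInvalid` is a hypothetical helper, not part of LLVM or this patch, and the `volatile` qualifiers are a pragmatic stand-in for compiling the code as FENV_ACCESS ON (strictfp in IR terms). Without that, the flag test is not guaranteed to observe anything.

```cpp
#include <cfenv>

// Hypothetical helper: reports whether a / b left the sticky FE_INVALID
// flag set. No control is transferred anywhere; the "754 trap" is only
// recorded in the floating-point status flags.
bool divisionRaisesInvalid(double a, double b) {
  std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean flag state
  volatile double lhs = a, rhs = b;    // keep the division at run time
  volatile double quotient = lhs / rhs;
  (void)quotient;
  return std::fetestexcept(FE_INVALID) != 0;
}
```

On typical IEEE 754 hardware one would expect `divisionRaisesInvalid(0.0, 0.0)` to report the flag, while an exact division such as `1.0 / 2.0` leaves it clear.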

In D137811#4082385, @arsenm wrote:

On most targets turning off FP exceptions means reading content of FP control register, changing value of mask bits and putting the modified register value back. It is expensive operation and cannot be made invisible to LLVM. Some targets (like RISCV) do not have possibility to mask FP exceptions at all, so at IR level there is no way to turn exception off. Default FP environment supposes that the exceptions are ignored, not disabled. In general case they are raised always.

Managing the FP mode is a user responsibility. If the operation is unconstrained we can assume the default floating point environment without preserved exceptions.

This is true for rounding direction, denormal behavior or any other FP control mode. They are represented by bits in some registers, may be set/read by appropriate API functions. Exception behavior is a very different thing. There is no register that keeps "current exception behavior". It is only a compiler hint that facilitates code generation. So user has limited means to control exception behavior.

Floating point exception doesn’t mean trap, but that’s a possible mode. The IR semantics are not dependent on the possible set of lowerings. Ultimately between the user and codegen, this code must never have trapped and may never be transformed to a form that could trap.

Absolutely.

The replacement fcmp must not trap, but it’s not the middle end’s responsibility to ensure that. This is correct regardless of what any particular processor may do or whatever the source did.

It is the middle end's responsibility to make only transformations that keep FP semantics. Replacement is_fpclass -> fcmp changes the semantics and can produce incorrect programs.

In D137811#4083368, @kpn wrote:

Ok, true, it takes executing code to change the floating point environment. Yes, there would be inline assembly or a function call to change the FP environment. LLVM would see that code because there would be IR for it. All true.

But LLVM wouldn't know what the inline assembly was doing. It wouldn't recognize the function call and thus wouldn't know what it was doing. LLVM would not know the floating point environment had changed. Rephrased, the change in the floating point environment would, to LLVM, be invisible. That's the "invisible" I was referring to earlier.

Ok, I see what is "invisible". Do you think inline assembly should be decorated as def/use of FP environment similar to function calls as is done in D111433 and D139549?

It's true that a blind replacement of is_fpclass with fcmp would be incorrect _if_ the floating point environment is not the default environment. But if we are in the default FP environment we can assume that any CPU trap will be hidden from us by the OS and therefore is not a part of our discussion of IR correctness.

What about such code?

int get_code(float x) {
  return isnan(x) ? 1 : 2;
}
void func1(float x) {
  int code1 = get_code(x);
  ...
}
#pragma STDC FENV_ACCESS ON
void func2(float x) {
  int code1 = get_code(x);
  ...
}

According to the C standard it must work as intended. If the compiler replaces isnan with a comparison, the code would be broken.

In D137811#4084248, @jcranmer-intel wrote:

Non-strictfp functions are assumed to be in the default FP environment, which implies there's some form of undefined behavior if you call a non-strictfp function with a non-default FP environment. The precise, formal semantics that effect this rule is of course underdefined,

This is how we define default FP environment in the LangRef today:

The default LLVM floating-point environment assumes that floating-point instructions do not have side effects. Results assume the round-to-nearest rounding mode. No floating-point exception state is maintained in this environment. Therefore, there is no attempt to create or preserve invalid operation (SNaN) or division-by-zero exceptions.

From this definition, it seems clear to me that we are legally allowed to insert instructions that would cause FP exceptions in non-strictfp functions.

LLVM must be able to represent semantics of the supported languages. The code snippet above represents a program that is valid from viewpoint of C standard but would be broken if the middle end makes replacement isnan(x)->x!=x. So such replacement must not be made.

In D137811#4084663, @sepavloff wrote:

It is the middle end's responsibility to make only transformations that keep FP semantics. Replacement is_fpclass -> fcmp changes the semantics and can produce incorrect programs.

It doesn't change the semantics because the FP exception that the machine fcmp may end up raising can be assumed to be invisible based on the IR semantics. An unconstrained fcmp does not have observable FP exceptions. For any regular non-constrained operation, you simply never need to be concerned with floating point exceptions. They can't be observed or relied on.
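
At the value level the two forms under discussion agree. The sketch below uses hypothetical helper names and assumes the code is not built with fast-math style flags. It only shows that a self-comparison yields the same boolean as a classification; it says nothing about status flags, which is the point of disagreement in this thread.

```cpp
#include <cmath>

// Same result as `fcmp uno x, 0.0`: true exactly when x is a NaN.
// On real hardware the comparison may set the invalid flag for a
// signaling NaN, which the classification below never does.
bool isNaNByCompare(float x) { return x != x; }

// Reference result from the standard classification function.
bool isNaNByClass(float x) { return std::isnan(x); }
```

For every input, including infinities and zeros, both helpers return the same value.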

In D137811#4083368, @kpn wrote:

Ok, true, it takes executing code to change the floating point environment. Yes, there would be inline assembly or a function call to change the FP environment. LLVM would see that code because there would be IR for it. All true.

But LLVM wouldn't know what the inline assembly was doing. It wouldn't recognize the function call and thus wouldn't know what it was doing. LLVM would not know the floating point environment had changed. Rephrased, the change in the floating point environment would, to LLVM, be invisible. That's the "invisible" I was referring to earlier.

Ok, I see what is "invisible". Do you think inline assembly should be decorated as def/use of FP environment similar to function calls as is done in D111433 and D139549?

It's true that a blind replacement of is_fpclass with fcmp would be incorrect _if_ the floating point environment is not the default environment. But if we are in the default FP environment we can assume that any CPU trap will be hidden from us by the OS and therefore is not a part of our discussion of IR correctness.

What about such code?
int get_code(float x) {
  return isnan(x) ? 1 : 2;
}
void func1(float x) {
  int code1 = get_code(x);
  ...
}
#pragma STDC FENV_ACCESS ON
void func2(float x) {
  int code1 = get_code(x);
  ...
}
According to the C standard it must work as intended. If the compiler replaces isnan with a comparison, the code would be broken.

This code didn't actually manipulate the floating point mode, just made it legal to do so for the scope of the pragma. If you were to insert code to modify the mode to enable FP exceptions inside func2, it would be illegal because you did not restore the mode before calling into code outside of the pragma scope. FENV_ACCESS doesn't require the compiler to fixup the floating point mode for you before calling into other code.

In D137811#4084663, @sepavloff wrote:
What about such code?
int get_code(float x) {
  return isnan(x) ? 1 : 2;
}
void func1(float x) {
  int code1 = get_code(x);
  ...
}
#pragma STDC FENV_ACCESS ON
void func2(float x) {
  int code1 = get_code(x);
  ...
}
According to the C standard it must work as intended. If the compiler replaces isnan with a comparison, the code would be broken.

From the C standard (7.6.1p2 from draft N3054):

(When execution passes from a part of the program translated with FENV_ACCESS "off" to a part translated with FENV_ACCESS "on", the state of the floating-point status flags is unspecified and the floating-point control modes have their default settings.)

The purpose of the FENV_ACCESS pragma is to allow certain optimizations that could subvert flag tests and mode changes (e.g., global common subexpression elimination, code motion, and constant folding). In general, if the state of FENV_ACCESS is "off", the translator can assume that the flags are not tested, and that default modes are in effect, except where specified otherwise by an FENV_ROUND pragma.

On a strict reading of the standard, on transition from get_code (which has FENV_ACCESS OFF) back to func2, the FP flags are unspecified, which means it is legal to transform the non-flag-setting isnan into flag-setting fcmp. There is a slight wording mismatch, as FENV_ACCESS is about testing, not setting flags, but the footnote seems to indicate that the correct reading is that FENV_ACCESS is necessary if you want guarantees about flags being set or not set.

If it would help, I could contact the CFP study group to see what they think is the correct reading of the standard.

In D137811#4085761, @jcranmer-intel wrote:

On a strict reading of the standard, on transition from get_code (which has FENV_ACCESS OFF) back to func2, the FP flags are unspecified, which means it is legal to transform the non-flag-setting isnan into flag-setting fcmp. There is a slight wording mismatch, as FENV_ACCESS is about testing, not setting flags, but the footnote seems to indicate that the correct reading is that FENV_ACCESS is necessary if you want guarantees about flags being set or not set.

If it would help, I could contact the CFP study group to see what they think is the correct reading of the standard.

For the purpose of this change, the exact definition of FENV_ACCESS doesn't matter. This is not something that the context of a single instruction can or should consider. This would have to be managed by implicit mode switches inserted by the frontend to maintain the invariant that non-strict functions don't have to be concerned about it. Our strictfp representation would be totally broken if we had to think about this in a non-strict function

In D137811#4085810, @arsenm wrote:

For the purpose of this change, the exact definition of FENV_ACCESS doesn't matter. This is not something that the context of a single instruction can or should consider. This would have to be managed by implicit mode switches inserted by the frontend to maintain the invariant that non-strict functions don't have to be concerned about it. Our strictfp representation would be totally broken if we had to think about this in a non-strict function

The underlying question is if it's legal for a compiler to insert a call to a flags-setting instruction in FENV_ACCESS OFF along code paths that never had them (with the answer almost certainly "yes"). We have these semantics for non-strictfp in LLVM.

In D137811#4085644, @arsenm wrote:
In D137811#4084663, @sepavloff wrote:
What about such code?
int get_code(float x) {
  return isnan(x) ? 1 : 2;
}
void func1(float x) {
  int code1 = get_code(x);
  ...
}
#pragma STDC FENV_ACCESS ON
void func2(float x) {
  int code1 = get_code(x);
  ...
}
According to the C standard it must work as intended. If the compiler replaces isnan with a comparison, the code would be broken.
This code didn't actually manipulate the floating point mode, just made it legal to do so for the scope of the pragma. If you were to insert code to modify the mode to enable FP exceptions inside func2, it would be illegal because you did not restore the mode before calling into code outside of the pragma scope. FENV_ACCESS doesn't require the compiler to fixup the floating point mode for you before calling into other code.

Pragma STDC FENV_ACCESS ON may be used not only for code that manipulates control modes, but also when exception status is examined (7.6.1p2):

The FENV_ACCESS pragma provides a means to inform the implementation when a program might
 access the floating-point environment to test floating-point status flags or run under non-default
 floating-point control modes.

A call to fetestexcept could be put somewhere inside func2 to access the FP environment, but it does not change the behavior of isnan inside get_code.

In D137811#4085761, @jcranmer-intel wrote:

From the C standard (7.6.1p2 from draft N3054):

(When execution passes from a part of the program translated with FENV_ACCESS "off" to a part translated with FENV_ACCESS "on", the state of the floating-point status flags is unspecified and the floating-point control modes have their default settings.)

The purpose of the FENV_ACCESS pragma is to allow certain optimizations that could subvert flag tests and mode changes (e.g., global common subexpression elimination, code motion, and constant folding). In general, if the state of FENV_ACCESS is "off", the translator can assume that the flags are not tested, and that default modes are in effect, except where specified otherwise by an FENV_ROUND pragma.

On a strict reading of the standard, on transition from get_code (which has FENV_ACCESS OFF) back to func2, the FP flags are unspecified, which means it is legal to transform the non-flag-setting isnan into flag-setting fcmp. There is a slight wording mismatch, as FENV_ACCESS is about testing, not setting flags, but the footnote seems to indicate that the correct reading is that FENV_ACCESS is necessary if you want guarantees about flags being set or not set.

This statement is not pertinent to this case because the only operation in get_code is a call to isnan, but it must not raise any exception according to the same standard (F3p6):

The C classification macros fpclassify, iscanonical, isfinite, isinf, isnan, isnormal,
issignaling, issubnormal, iszero, and signbit provide the IEC 60559 operations indicated
in the table above provided their arguments are in the format of their semantic type. Then these
macros raise no floating-point exceptions, even if an argument is a signaling NaN.

and this property must be kept no matter if exceptions are ignored or not.

In a broader context, it looks natural that a non-strictfp function may lose some exceptions that strictfp function would raise. But it would be counterintuitive if it raised new exceptions, which cannot be observed in strictfp function.

I do not believe there's any problem with an isnan to fcmp transform -- as long as we restrict it to non-strictfp functions. And I'm not sure why there's so much debate here. I thought all those sorts of semantic questions were already pretty well settled. Does the documentation about strictfp vs not-strictfp semantics need to be made clearer?

In D137811#4090256, @sepavloff wrote:

The C classification macros fpclassify, iscanonical, isfinite, isinf, isnan, isnormal,
issignaling, issubnormal, iszero, and signbit provide the IEC 60559 operations indicated
in the table above provided their arguments are in the format of their semantic type. Then these
macros raise no floating-point exceptions, even if an argument is a signaling NaN.

and this property must be kept no matter if exceptions are ignored or not.

No, the property does not need to be maintained if the exception flag state cannot be observed by correct code. And that's exactly the case at hand in FENV_ACCESS OFF code (and, correspondingly, non-strictfp in LLVM IR). In such code, you are not allowed to have trap-on-exception enabled, nor are you allowed to access the floating-point exception status flags, and, the state of the status flags is unspecified upon transitioning back to FENV_ACCESS ON code. Thus, within such code, we may set the status flags to any arbitrary value we wish -- nobody can (correctly) tell the difference.

In a broader context, it looks natural that a non-strictfp function may lose some exceptions that strictfp function would raise. But it would be counterintuitive if it raised new exceptions, which cannot be observed in strictfp function.

That's not a problem. It is definitely permissible for a non-strictfp function to raise FP exceptions that "should not" be raised. That this is permissible is one of the most important properties of the non-strict operations -- it allows code motion which is impermissible if the status-flag side effects must be taken into account. (e.g. speculating an "fadd" above a conditional).
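
The speculation mentioned above can be pictured with two source-level functions. This is a minimal sketch with hypothetical names, not output of any LLVM pass. Both return the same value for all inputs, but the second performs the addition even when its result is discarded, so it may set overflow, inexact, or invalid flags on paths where the first sets none.

```cpp
// Original form: the addition executes only when cond is true.
double guardedAdd(bool cond, double a, double b, double fallback) {
  if (cond)
    return a + b;
  return fallback;
}

// Speculated form: the addition is hoisted above the branch. The value
// returned is unchanged, but the add now always executes and may raise
// FP exceptions that the guarded form would not.
double speculatedAdd(bool cond, double a, double b, double fallback) {
  double sum = a + b;
  return cond ? sum : fallback;
}
```

The rewrite is only valid if the status flags are not observable, which is what non-strictfp code assumes.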

jyknight added inline comments.Jan 31 2023, 4:06 PM

llvm/include/llvm/IR/IRBuilder.h
2437 ↗	(On Diff #483955)	I don't think this function should be added -- it emits the wrong thing for strictfp mode, but also just doesn't seem useful. That we sometimes canonicalize to fcmp is just an implementation detail -- the CreateFCmpUNO call can just be done inline in foldIntrinsicIsFPClass.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
824	That may be true. However, on architectures that have an fpclass instruction, it could be profitable to canonicalize other operations into llvm.is.fpclass operations, especially if that then allows merging multiple llvm.is.fpclass calls into one. E.g. `bool a = isinf(x) || isnan(x)` can turn into a single instruction `llvm.is.fpclass(x, snan/qnan/pinf/ninf)`. (However, this isn't an objection to taking this patch for now -- revisiting the canonical form later is always possible.)

jyknight added inline comments.Jan 31 2023, 5:25 PM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
824	(Sorry, I just noticed you already have another patch which touches upon that already. OK!)

In D137811#4095037, @jyknight wrote:
In D137811#4090256, @sepavloff wrote:
The C classification macros fpclassify, iscanonical, isfinite, isinf, isnan, isnormal,
issignaling, issubnormal, iszero, and signbit provide the IEC 60559 operations indicated
in the table above provided their arguments are in the format of their semantic type. Then these
macros raise no floating-point exceptions, even if an argument is a signaling NaN.
and this property must be kept no matter if exceptions are ignored or not.
No, the property does not need to be maintained if the exception flag state cannot be observed by correct code. And that's exactly the case at hand in FENV_ACCESS OFF code (and, correspondingly, non-strictfp in LLVM IR). In such code, you are not allowed to have trap-on-exception enabled, nor are you allowed to access the floating-point exception status flags, and, the state of the status flags is unspecified upon transitioning back to FENV_ACCESS ON code. Thus, within such code, we may set the status flags to any arbitrary value we wish -- nobody can (correctly) tell the difference.

What about the original case, when isnan is used in inline function?

inline int get_code(float x) {
  return isnan(x) ? 1 : 2;
}
#pragma STDC FENV_ACCESS ON
void func2(float x) {
  int code1 = get_code(x);
  ...
}

Can this function, defined as non-strictfp, raise Invalid exception when inlined into strictfp function?

In a broader context, it looks natural that a non-strictfp function may lose some exceptions that strictfp function would raise. But it would be counterintuitive if it raised new exceptions, which cannot be observed in strictfp function.

That's not a problem. It is definitely permissible for a non-strictfp function to raise FP exceptions that "should not" be raised. That this is permissible is one of the most important properties of the non-strict operations -- it allows code motion which is impermissible if the status-flag side effects must be taken into account. (e.g. speculating an "fadd" above a conditional).

There is a use case, when a computation is made with traps allowed for errors (Invalid, Overflow, DivideByZero). If an error occurs, computation is stopped. It saves from checks in multiple places during computation. The computation runs in default mode, because rounding mode is standard and precise exception flag behavior is unimportant. If non-strictfp functions are allowed to throw spurious exceptions, such solution won't work. Running computations in strictfp mode is not an option because of poor performance.

In D137811#4095594, @sepavloff wrote:

Can this function, defined as non-strictfp, raise Invalid exception when inlined into strictfp function?

Inlining cannot change semantics. The function can, but not from the part derived from the inlined callee

Drop IRBuilder helper

Harbormaster completed remote builds in B211191: Diff 493895.Feb 1 2023, 5:48 AM

In D137811#4095594, @sepavloff wrote:
What about the original case, when isnan is used in inline function?
inline int get_code(float x) {
  return isnan(x) ? 1 : 2;
}
#pragma STDC FENV_ACCESS ON
void func2(float x) {
  int code1 = get_code(x);
  ...
}
Can this function, defined as non-strictfp, raise Invalid exception when inlined into strictfp function?

Since the function is non-strictfp, it's legal to optimize it in a way to raise spurious exceptions. If the isnan is retained at the time the function is inlined into a strictfp function, then it is no longer possible to optimize the copy of isnan in the strictfp function to raise spurious exceptions. But an optimization pass that runs before the inliner may still introduce spurious exceptions.

There is a use case, when a computation is made with traps allowed for errors (Invalid, Overflow, DivideByZero). If an error occurs, computation is stopped. It saves from checks in multiple places during computation. The computation runs in default mode, because rounding mode is standard and precise exception flag behavior is unimportant. If non-strictfp functions are allowed to throw spurious exceptions, such solution won't work. Running computations in strictfp mode is not an option because of poor performance.

Enabling hardware traps for FP exceptions makes the FP environment no longer in the default mode, which requires the computation to be done in a strictfp function to have any defined behavior.

arsenm added a child revision: D143264: InstCombine: Fold is.fpclass(x, fcZero) to fcmp oeq 0.Feb 3 2023, 6:38 AM

As far as I can tell, this conversation has run its course, and I don't think there were any more comments on the actual code, so I'm going to accept this revision.

Use-cases that require traps or otherwise observing floating-point status flags must generate strictfp functions and constrained intrinsics. (E.g. in Clang, you might wish to compile with -ffp-exception-behavior=maytrap). That was true before this change, too.

If the performance of strictfp code is not as good as desired, then I'm sure there's more work that can be done optimizing it -- but there is a reason it's a separate mode.

jyknight accepted this revision.Feb 4 2023, 6:56 PM

This revision is now accepted and ready to land.Feb 4 2023, 6:56 PM

e9f3034febc62d77caaa0746358332f4f4bead49

Revision Contents

Path                                                            Size
llvm/lib/Analysis/ConstantFolding.cpp                           21 lines
llvm/lib/Analysis/InstructionSimplify.cpp                       13 lines
llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp           81 lines
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp            47 lines
llvm/lib/Transforms/InstCombine/InstCombineInternal.h           1 line
llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll    30 lines
llvm/test/Transforms/InstCombine/is_fpclass.ll                  113 lines

Diff 475996

llvm/lib/Analysis/ConstantFolding.cpp

... (1,611 lines not shown; context: bool llvm::canConstantFoldCallTo(const CallBase *Call, const Function *F) {)
    case Intrinsic::x86_avx512_cvttsd2usi:
    case Intrinsic::x86_avx512_cvttsd2usi64:
      return !Call->isStrictFP();

    // Sign operations are actually bitwise operations, they do not raise
    // exceptions even for SNANs.
    case Intrinsic::fabs:
    case Intrinsic::copysign:
+   case Intrinsic::is_fpclass:
    // Non-constrained variants of rounding operations means default FP
    // environment, they can be folded in any case.
    case Intrinsic::ceil:
    case Intrinsic::floor:
    case Intrinsic::round:
    case Intrinsic::roundeven:
    case Intrinsic::trunc:
    case Intrinsic::nearbyint:
... (948 lines not shown; context: if (const auto *Op2 = dyn_cast<ConstantFP>(Operands[1])) {)
        [[fallthrough]];
      case LibFunc_atan2_finite:
      case LibFunc_atan2f_finite:
        if (TLI->has(Func))
          return ConstantFoldBinaryFP(atan2, Op1V, Op2V, Ty);
        break;
      }
    } else if (auto *Op2C = dyn_cast<ConstantInt>(Operands[1])) {
+     switch (IntrinsicID) {
+     case Intrinsic::is_fpclass: {
+       uint32_t Mask = Op2C->getZExtValue();
+       bool Result =
+         ((Mask & fcSNan) && Op1V.isNaN() && Op1V.isSignaling()) ||
+         ((Mask & fcQNan) && Op1V.isNaN() && !Op1V.isSignaling()) ||
+         ((Mask & fcNegInf) && Op1V.isInfinity() && Op1V.isNegative()) ||
+         ((Mask & fcNegNormal) && Op1V.isNormal() && Op1V.isNegative()) ||
+         ((Mask & fcNegSubnormal) && Op1V.isDenormal() && Op1V.isNegative()) ||
+         ((Mask & fcNegZero) && Op1V.isZero() && Op1V.isNegative()) ||
+         ((Mask & fcPosZero) && Op1V.isZero() && !Op1V.isNegative()) ||
+         ((Mask & fcPosSubnormal) && Op1V.isDenormal() && !Op1V.isNegative()) ||
+         ((Mask & fcPosNormal) && Op1V.isNormal() && !Op1V.isNegative()) ||
+         ((Mask & fcPosInf) && Op1V.isInfinity() && !Op1V.isNegative());
+       return ConstantInt::get(Ty, Result);
+     }
+     default:
+       break;
+     }
+
      if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy())
        return nullptr;
      if (IntrinsicID == Intrinsic::powi && Ty->isHalfTy())
        return ConstantFP::get(
            Ty->getContext(),
            APFloat((float)std::pow((float)Op1V.convertToDouble(),
                                    (int)Op2C->getZExtValue())));
      if (IntrinsicID == Intrinsic::powi && Ty->isFloatTy())
... (803 lines not shown)
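
The mask test that the constant folder performs can be mimicked outside LLVM. The following is a minimal sketch, assuming IEEE 754 binary64 `double` with the usual convention that the top mantissa bit marks a quiet NaN. The enumerators are local re-declarations that follow the bit layout documented for `llvm.is.fpclass` in the LangRef, and `classify` and `isFPClass` are hypothetical helpers rather than LLVM APIs.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Local copy of the llvm.is.fpclass mask bits (bit 0 = signaling NaN,
// ..., bit 9 = positive infinity).
enum FPClassBits : uint32_t {
  fcSNan = 1u << 0,         fcQNan = 1u << 1,
  fcNegInf = 1u << 2,       fcNegNormal = 1u << 3,
  fcNegSubnormal = 1u << 4, fcNegZero = 1u << 5,
  fcPosZero = 1u << 6,      fcPosSubnormal = 1u << 7,
  fcPosNormal = 1u << 8,    fcPosInf = 1u << 9,
};

// Hypothetical helper: returns the single class bit that describes v.
uint32_t classify(double v) {
  uint64_t bits;
  std::memcpy(&bits, &v, sizeof bits);  // inspect the encoding directly
  bool neg = (bits >> 63) != 0;
  switch (std::fpclassify(v)) {
  case FP_NAN:
    // Assumption: quiet NaNs have the most significant mantissa bit set.
    return (bits & (1ull << 51)) ? fcQNan : fcSNan;
  case FP_INFINITE:  return neg ? fcNegInf : fcPosInf;
  case FP_ZERO:      return neg ? fcNegZero : fcPosZero;
  case FP_SUBNORMAL: return neg ? fcNegSubnormal : fcPosSubnormal;
  default:           return neg ? fcNegNormal : fcPosNormal;
  }
}

// The class test succeeds when the value's class bit is in the mask.
bool isFPClass(double v, uint32_t mask) { return (classify(v) & mask) != 0; }
```

Because every value belongs to exactly one class, testing against a mask reduces to a single bitwise AND, which is the same disjunction the patch spells out term by term.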

llvm/lib/Analysis/InstructionSimplify.cpp

... (6,047 lines not shown; context: case Intrinsic::copysign:)
      if (Op0 == Op1)
        return Op0;
      // copysign -X, X --> X
      // copysign X, -X --> -X
      if (match(Op0, m_FNeg(m_Specific(Op1))) ||
          match(Op1, m_FNeg(m_Specific(Op0))))
        return Op1;
      break;
+   case Intrinsic::is_fpclass: {
+     uint64_t Mask = cast<ConstantInt>(Op1)->getZExtValue();
+     // If all tests are made, it doesn't matter what the value is.
+     if ((Mask & fcAllFlags) == fcAllFlags)
+       return ConstantInt::get(ReturnType, true);
+     if ((Mask & fcAllFlags) == 0)
+       return ConstantInt::get(ReturnType, false);
+     if (isa<PoisonValue>(Op0))
+       return PoisonValue::get(ReturnType);
+     if (Q.isUndefValue(Op0))
+       return UndefValue::get(ReturnType);

foad: Poison/undef checks should probably go first, before other simplifications?

arsenm (author): I specifically had these later. The first operand shouldn't matter based on the test. I'm thinking about the interaction between fast math optimizations, and class uses used to guard regions where nnan/ninf is expected.

foad: It's maybe not worth arguing about, but... LangRef says "Most instructions return ‘poison’ when one of their arguments is ‘poison’". If you're saying is.fpclass is an exception to that rule then it at least ought to be documented. But I'm not sure why simplifying to poison would be a problem for the kind of code you're talking about - do you have an example?

arsenm (author): Cases like this, if you violate nnan/ninf but you have a test of the special cases:

div = fdiv nan x / nan
if (!is_fpclass(div, nan)) {
  // do fast stuff
}

I was viewing this as similar to select of poison. select with a poison condition propagates poison, but the value operands can be unobserved. Similarly here, a test of 0 shouldn't need to observe the actual compared value. I was planning on revisiting this as I get further in optimizing all the special case checks in the math libraries. For now, this is the more conservative direction. The main thing I'm worried about is the asymmetry between equivalent fcmps and the class if we make this the expected behavior.

sepavloff: In the example you presented the operation producing NaN can be evaluated at compile time, so is_fpclass can be evaluated during compilation. There is no reason to create poison value here. In the presence of nnan/ninf it is also safer to evaluate is_fpclass instead of poisoning it, because `fdiv nan, nan` and `is_fpclass(div, nan)` can come from different functions compiled with different fast-math settings merged, for example, by LTO. Poisoning can make a correct program non-working.

nlopes: The issue of special casing handling of poison is that then you cannot easily convert to/from other instructions. Blocking propagation of poison is similar to freeze, which is sticky. Select is an exception, as it blocks poison propagation, but the caveat is that we had to remove this transformation: select X, true, Y -> or X, Y (as the or propagates more poison than the select did). So I would suggest to not special case poison anywhere unless it's really important for some reason.

		break;
		}
case Intrinsic::maxnum:		case Intrinsic::maxnum:
case Intrinsic::minnum:		case Intrinsic::minnum:
case Intrinsic::maximum:		case Intrinsic::maximum:
case Intrinsic::minimum: {		case Intrinsic::minimum: {
// If the arguments are the same, this is a no-op.		// If the arguments are the same, this is a no-op.
if (Op0 == Op1)		if (Op0 == Op1)
return Op0;		return Op0;

[588 lines hidden]

llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp

[416 lines hidden]	case Intrinsic::amdgcn_frexp_exp: {
}		}

break;		break;
}		}
case Intrinsic::amdgcn_class: {		case Intrinsic::amdgcn_class: {
Value *Src0 = II.getArgOperand(0);		Value *Src0 = II.getArgOperand(0);
Value *Src1 = II.getArgOperand(1);		Value *Src1 = II.getArgOperand(1);
const ConstantInt *CMask = dyn_cast<ConstantInt>(Src1);		const ConstantInt *CMask = dyn_cast<ConstantInt>(Src1);
if (!CMask) {		if (CMask) {
if (isa<UndefValue>(Src0)) {		II.setCalledOperand(Intrinsic::getDeclaration(
return IC.replaceInstUsesWith(II, UndefValue::get(II.getType()));		II.getModule(), Intrinsic::is_fpclass, Src0->getType()));
}		return &II;

if (isa<UndefValue>(Src1)) {
return IC.replaceInstUsesWith(II,
ConstantInt::get(II.getType(), false));
}
break;
}

uint32_t Mask = CMask->getZExtValue();

// If all tests are made, it doesn't matter what the value is.
if ((Mask & fcAllFlags) == fcAllFlags) {
return IC.replaceInstUsesWith(II, ConstantInt::get(II.getType(), true));
}

if ((Mask & fcAllFlags) == 0) {
return IC.replaceInstUsesWith(II, ConstantInt::get(II.getType(), false));
}

if (Mask == fcNan && !II.isStrictFP()) {
// Equivalent of isnan. Replace with standard fcmp.
Value *FCmp = IC.Builder.CreateFCmpUNO(Src0, Src0);
FCmp->takeName(&II);
return IC.replaceInstUsesWith(II, FCmp);
}

if (Mask == fcZero && !II.isStrictFP()) {
// Equivalent of == 0.
Value *FCmp =
IC.Builder.CreateFCmpOEQ(Src0, ConstantFP::get(Src0->getType(), 0.0));

FCmp->takeName(&II);
return IC.replaceInstUsesWith(II, FCmp);
}

// fp_class (nnan x), qnan|snan|other -> fp_class (nnan x), other
if ((Mask & fcNan) && isKnownNeverNaN(Src0, &IC.getTargetLibraryInfo())) {
return IC.replaceOperand(
II, 1, ConstantInt::get(Src1->getType(), Mask & ~fcNan));
}		}

const ConstantFP *CVal = dyn_cast<ConstantFP>(Src0);		if (isa<UndefValue>(Src0))
if (!CVal) {
if (isa<UndefValue>(Src0)) {
return IC.replaceInstUsesWith(II, UndefValue::get(II.getType()));		return IC.replaceInstUsesWith(II, UndefValue::get(II.getType()));
}

// Clamp mask to used bits
if ((Mask & fcAllFlags) != Mask) {
CallInst *NewCall = IC.Builder.CreateCall(
II.getCalledFunction(),
{Src0, ConstantInt::get(Src1->getType(), Mask & fcAllFlags)});

NewCall->takeName(&II);		if (isa<UndefValue>(Src1)) {
return IC.replaceInstUsesWith(II, NewCall);		return IC.replaceInstUsesWith(II, ConstantInt::get(II.getType(), false));
}		}

break;		break;
}		}

const APFloat &Val = CVal->getValueAPF();

bool Result =
((Mask & fcSNan) && Val.isNaN() && Val.isSignaling()) ||
((Mask & fcQNan) && Val.isNaN() && !Val.isSignaling()) ||
((Mask & fcNegInf) && Val.isInfinity() && Val.isNegative()) ||
((Mask & fcNegNormal) && Val.isNormal() && Val.isNegative()) ||
((Mask & fcNegSubnormal) && Val.isDenormal() && Val.isNegative()) ||
((Mask & fcNegZero) && Val.isZero() && Val.isNegative()) ||
((Mask & fcPosZero) && Val.isZero() && !Val.isNegative()) ||
((Mask & fcPosSubnormal) && Val.isDenormal() && !Val.isNegative()) ||
((Mask & fcPosNormal) && Val.isNormal() && !Val.isNegative()) ||
((Mask & fcPosInf) && Val.isInfinity() && !Val.isNegative());

return IC.replaceInstUsesWith(II, ConstantInt::get(II.getType(), Result));
}
case Intrinsic::amdgcn_cvt_pkrtz: {		case Intrinsic::amdgcn_cvt_pkrtz: {
Value *Src0 = II.getArgOperand(0);		Value *Src0 = II.getArgOperand(0);
Value *Src1 = II.getArgOperand(1);		Value *Src1 = II.getArgOperand(1);
if (const ConstantFP *C0 = dyn_cast<ConstantFP>(Src0)) {		if (const ConstantFP *C0 = dyn_cast<ConstantFP>(Src0)) {
if (const ConstantFP *C1 = dyn_cast<ConstantFP>(Src1)) {		if (const ConstantFP *C1 = dyn_cast<ConstantFP>(Src1)) {
const fltSemantics &HalfSem =		const fltSemantics &HalfSem =
II.getType()->getScalarType()->getFltSemantics();		II.getType()->getScalarType()->getFltSemantics();
bool LosesInfo;		bool LosesInfo;
[720 lines hidden]

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

[806 lines hidden]	InstCombinerImpl::foldIntrinsicWithOverflowCommon(IntrinsicInst *II) {
Value *OperationResult = nullptr;		Value *OperationResult = nullptr;
Constant *OverflowResult = nullptr;		Constant *OverflowResult = nullptr;
if (OptimizeOverflowCheck(WO->getBinaryOp(), WO->isSigned(), WO->getLHS(),		if (OptimizeOverflowCheck(WO->getBinaryOp(), WO->isSigned(), WO->getLHS(),
WO->getRHS(), *WO, OperationResult, OverflowResult))		WO->getRHS(), *WO, OperationResult, OverflowResult))
return createOverflowTuple(WO, OperationResult, OverflowResult);		return createOverflowTuple(WO, OperationResult, OverflowResult);
return nullptr;		return nullptr;
}		}

		Instruction *InstCombinerImpl::foldIntrinsicIsFPClass(IntrinsicInst &II) {
		Value *Src0 = II.getArgOperand(0);
		Value *Src1 = II.getArgOperand(1);
		const ConstantInt *CMask = cast<ConstantInt>(Src1);
		uint32_t Mask = CMask->getZExtValue();
		jcranmer-intel: It feels like you should be able to use ninf/nnan/nsz flags to modify the mask here to make it more likely to fall into one of these categories, although this may provoke some backlash from those who argue that `nnan isnan(x)` => unconditional `false` shouldn't be an allowable optimization.
		arsenm (author): I think this is debatable, and beyond the scope of this patch, where I'd like to simply move what we already do for the target-specific intrinsic.
		arsenm (author): We can't actually put fast math flags here, because they are only allowed on calls with FP return types.
		const bool IsStrict = II.isStrictFP();

		if (Mask == fcNan && !IsStrict) {
		foad: Seems like you should handle `Mask == (fcAllFlags & ~fcNan)` here too. Same for the other `Mask ==` cases below. And allow InstCombine to "freely invert" this intrinsic by flipping all bits in the mask.
		foad: Please add a TODO for this if you're not going to do it now.
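A minimal sketch of the "freely invert" idea from the comment above, assuming the ten class bits occupy the low bits of the mask (0x3FF), which is consistent with the `i32 3` (NaN) and `i32 96` (zero) tests on this page. The names `invertMask`, `testClass`, and `inversionHolds` are hypothetical and not part of LLVM; the sketch models only the mask algebra, not the intrinsic itself.

```cpp
#include <cstdint>

constexpr uint32_t kAllFlags = 0x3FF; // assumed: ten class bits
constexpr uint32_t kNan = 0x3;        // snan | qnan

// Complement a class mask within the ten valid bits.
constexpr uint32_t invertMask(uint32_t Mask) { return ~Mask & kAllFlags; }

// A value belongs to exactly one class, modelled here as a single set bit.
constexpr bool testClass(uint32_t ClassBit, uint32_t Mask) {
  return (ClassBit & Mask) != 0;
}

// For each of the ten classes, testing the inverted mask must give the
// opposite answer to testing the original mask.
bool inversionHolds(uint32_t Mask) {
  for (uint32_t Bit = 1; Bit <= kAllFlags; Bit <<= 1)
    if (testClass(Bit, invertMask(Mask)) == testClass(Bit, Mask))
      return false;
  return true;
}
```

Under this model the complement of the NaN mask is 0x3FC, i.e. "not NaN", which is the condition an ordered self-comparison (`fcmp ord x, x`) expresses.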
		// Equivalent of isnan. Replace with standard fcmp.
		Value *FCmp = Builder.CreateFCmpUNO(Src0, Src0);
		sepavloff: Is it profitable to make such a replacement early? Is there any advantage, at least hypothetical?
		arsenm (author): I think an fcmp should be more canonical and better handled by existing optimizations than class, which is handled ~nowhere.
		jyknight: This seems OK. I'm not sure it's the best choice -- if a CPU actually has an fpclassify instruction, is it really a good idea to canonicalize in generic code to fcmp? But I think that's fine to revisit later if it becomes a problem.
		arsenm (author): I've been thinking if we can do a classify in <= 2 IR instructions, fcmp+fabs is probably a better canonical form. If a class pattern is 3-4+ instructions, the class is probably better. FCmp + fneg + fabs are always going to be more broadly understood. Fcmp also supports fast math flags, unlike class (I guess we could fix that though).
		jyknight: That may be true. However, on architectures that have an fpclass instruction, it could be profitable to canonicalize other operations into llvm.is.fpclass operations, especially if that then allows merging multiple llvm.is.fpclass calls into one. E.g. `bool a = isinf(x) || isnan(x)` can turn into a single instruction `llvm.is.fpclass(x, snan/qnan/pinf/ninf)`. (However, this isn't an objection to taking this patch for now -- revisiting the canonical form later is always possible.)
		jyknight: (Sorry, I just noticed you already have another patch which touches upon that already. OK!)
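The merging argument in this thread can be modelled with plain mask arithmetic. This is only a sketch: it assumes the bit layout snan = 0x1, qnan = 0x2, ninf = 0x4, pinf = 0x200 (the NaN bits agree with the `i32 3` test on this page), and `mergeMasks` and `mergeHolds` are hypothetical helpers, not LLVM APIs.

```cpp
#include <cstdint>

constexpr uint32_t kAllFlags = 0x3FF;      // assumed: ten class bits
constexpr uint32_t kNan = 0x1 | 0x2;       // snan | qnan
constexpr uint32_t kInf = 0x4 | 0x200;     // ninf | pinf

// "test(A) || test(B)" on the same value becomes one test of A | B.
constexpr uint32_t mergeMasks(uint32_t A, uint32_t B) { return A | B; }

// Check the equivalence for every class, each modelled as one set bit.
bool mergeHolds(uint32_t A, uint32_t B) {
  for (uint32_t Bit = 1; Bit <= kAllFlags; Bit <<= 1) {
    const bool Separate = (Bit & A) != 0 || (Bit & B) != 0;
    const bool Merged = (Bit & mergeMasks(A, B)) != 0;
    if (Separate != Merged)
      return false;
  }
  return true;
}
```

With these assumptions, `isinf(x) || isnan(x)` corresponds to a single test with mask 0x207.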
		FCmp->takeName(&II);
		return replaceInstUsesWith(II, FCmp);
		}

		if (Mask == fcZero && !IsStrict) {
		sepavloff: The second argument is described with `ImmArg`. It must always be a constant.
		arsenm (author): I'm debating removing this restriction.
		// Equivalent of == 0.
		Value *FCmp =
		Builder.CreateFCmpOEQ(Src0, ConstantFP::get(Src0->getType(), 0.0));

		FCmp->takeName(&II);
		return replaceInstUsesWith(II, FCmp);
		}

		jcranmer-intel: You're also missing cases for `Mask == fcPosInf` and `Mask == fcNegInf`, which can similarly be lowered to quiet comparisons.
		arsenm (author): Ditto, for now I'd like to just move the code.
		// fp_class (nnan x), qnan|snan|other -> fp_class (nnan x), other
		if ((Mask & fcNan) && isKnownNeverNaN(Src0, &getTargetLibraryInfo())) {
		jyknight: These removals of Mask bits should come before the `Mask == $X` tests, shouldn't they?
		foad: I assume that this function will be called again to revisit the modified intrinsic.
		arsenm (author): Right, it's revisited. I don't expect the bit removal part to ever actually happen.
		return replaceOperand(II, 1,
		ConstantInt::get(Src1->getType(), Mask & ~fcNan));
		}
		sepavloff: This replacement is not valid in the general case, only if FP exceptions are ignored. If the argument is a signaling NaN, the compare instruction raises the `Invalid` exception.
		arsenm (author): Isn't that implied by not being a constrained intrinsic?
		sepavloff: `is_fpclass` does not depend on the FP environment and does not change it. So it does not have a constrained variant.
		jcranmer-intel: `Builder.CreateFCmp*` functions look like they create quiet comparison instructions in FP-constrained mode, and you need to use `CreateFCmpS` to generate one of the signalling comparisons. So replacing them even in strict mode is legal.
		sepavloff: Quiet comparison instructions do not raise an FP exception if an argument is a quiet NaN. They still generate exceptions for signaling NaNs.

		// Clamp mask to used bits
		if ((Mask & fcAllFlags) != Mask) {
		foad: What's this `!CVal` check for?
		arsenm (author): I moved this part to InstSimplify, will drop.
		CallInst *NewCall = Builder.CreateCall(
		foad: Might be simpler to modify the instruction in place with setOperand, and then Worklist.pushUsersToWorkList. But perhaps that is less clear.
		II.getCalledFunction(),
		{Src0, ConstantInt::get(Src1->getType(), Mask & fcAllFlags)});
		jyknight: Why are unknown bits even accepted? ISTM it should be an error in Verifier::visitIntrinsicCall to pass invalid bits.
		arsenm (author): I don't know. Really this should be an i10 argument. I've been debating whether to add a verifier check, or make it an i10. I'd prefer to do that separately and clean up the bits here for now.
		jcranmer-intel: The LangRef documentation of `llvm.is.fpclass` doesn't pin down the handling for noncanonical values well. It's plausible they could be handled by extension of extra bits, but existing code seems to ignore them for `ppc_fp128` and treat them as NaNs that are neither qNaN nor sNaN for `x86_fp80`. Not that it makes any difference to this patch, but it suggests that making it a verifier check instead of an `i10` is the better path, as it is slightly better future-proofed.
		arsenm (author): Another reason is the operand doesn't really need to be immarg. The AMDGPU class instruction handles a register just fine (in fact nearly every test mask needs to be materialized). In that case we would want folds to a canonicalizable constant value.

		sepavloff: It also is not always valid, only if the argument is not a signaling NaN or FP exceptions are ignored.
		arsenm (author): Isn't that implied by not being a constrained intrinsic?
		NewCall->takeName(&II);
		sepavloff: Deleted code?
		return replaceInstUsesWith(II, NewCall);
		}

		return nullptr;
		}
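
As a rough illustration of what the two fcmp folds in `foldIntrinsicIsFPClass` rely on, here is a standalone model of a class test for IEEE-754 binary32 values. It is a sketch under stated assumptions, not LLVM code: the names `classOf` and `isFPClass` and the `k...` constants are hypothetical, the bit order is chosen to match the masks used by the tests on this page (3 for NaN, 96 for zero), and the quiet bit is assumed to be the top mantissa bit, which does not hold on every target.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(uint32_t), "binary32 float assumed");

// Assumed class bits: 3 = NaN, 96 = zero, 0x3FF = every class.
enum : uint32_t {
  kSNan = 1u << 0, kQNan = 1u << 1, kNegInf = 1u << 2, kNegNormal = 1u << 3,
  kNegSubnormal = 1u << 4, kNegZero = 1u << 5, kPosZero = 1u << 6,
  kPosSubnormal = 1u << 7, kPosNormal = 1u << 8, kPosInf = 1u << 9,
  kNan = kSNan | kQNan, kZero = kNegZero | kPosZero, kAllFlags = 0x3FF
};

// Decode the bit pattern of X into exactly one class bit.
uint32_t classOf(float X) {
  uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  const bool Neg = (Bits >> 31) != 0;
  const uint32_t Exp = (Bits >> 23) & 0xFF;
  const uint32_t Frac = Bits & 0x7FFFFF;
  if (Exp == 0xFF) {
    if (Frac == 0)
      return Neg ? kNegInf : kPosInf;
    return (Frac & 0x400000) ? kQNan : kSNan; // top mantissa bit = quiet
  }
  if (Exp == 0) {
    if (Frac == 0)
      return Neg ? kNegZero : kPosZero;
    return Neg ? kNegSubnormal : kPosSubnormal;
  }
  return Neg ? kNegNormal : kPosNormal;
}

// Model of the class test: true if X falls in any class selected by Mask.
bool isFPClass(float X, uint32_t Mask) { return (classOf(X) & Mask) != 0; }
```

In this model `isFPClass(x, kNan)` agrees with the unordered self-comparison `x != x`, and `isFPClass(x, kZero)` agrees with `x == 0.0f` for both signed zeros. The model deliberately says nothing about FP exceptions, which is the point of contention in the review comments above.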

static Optional<bool> getKnownSign(Value *Op, Instruction *CxtI,		static Optional<bool> getKnownSign(Value *Op, Instruction *CxtI,
const DataLayout &DL, AssumptionCache *AC,		const DataLayout &DL, AssumptionCache *AC,
DominatorTree *DT) {		DominatorTree *DT) {
KnownBits Known = computeKnownBits(Op, DL, 0, AC, CxtI, DT);		KnownBits Known = computeKnownBits(Op, DL, 0, AC, CxtI, DT);
if (Known.isNonNegative())		if (Known.isNonNegative())
return false;		return false;
if (Known.isNegative())		if (Known.isNegative())
return true;		return true;

Value *X, *Y;		Value *X, *Y;
if (match(Op, m_NSWSub(m_Value(X), m_Value(Y))))		if (match(Op, m_NSWSub(m_Value(X), m_Value(Y))))
return isImpliedByDomCondition(ICmpInst::ICMP_SLT, X, Y, CxtI, DL);		return isImpliedByDomCondition(ICmpInst::ICMP_SLT, X, Y, CxtI, DL);

return isImpliedByDomCondition(		return isImpliedByDomCondition(
ICmpInst::ICMP_SLT, Op, Constant::getNullValue(Op->getType()), CxtI, DL);		ICmpInst::ICMP_SLT, Op, Constant::getNullValue(Op->getType()), CxtI, DL);
}		}

/// Try to canonicalize min/max(X + C0, C1) as min/max(X, C1 - C0) + C0. This		/// Try to canonicalize min/max(X + C0, C1) as min/max(X, C1 - C0) + C0. This
/// can trigger other combines.		/// can trigger other combines.
static Instruction *moveAddAfterMinMax(IntrinsicInst *II,		static Instruction *moveAddAfterMinMax(IntrinsicInst *II,
InstCombiner::BuilderTy &Builder) {		InstCombiner::BuilderTy &Builder) {
Intrinsic::ID MinMaxID = II->getIntrinsicID();		Intrinsic::ID MinMaxID = II->getIntrinsicID();
assert((MinMaxID == Intrinsic::smax || MinMaxID == Intrinsic::smin ||		assert((MinMaxID == Intrinsic::smax || MinMaxID == Intrinsic::smin ||
MinMaxID == Intrinsic::umax || MinMaxID == Intrinsic::umin) &&		MinMaxID == Intrinsic::umax || MinMaxID == Intrinsic::umin) &&
"Expected a min or max intrinsic");		"Expected a min or max intrinsic");
		sepavloff: Constant folding should be done in `ConstantFolding.cpp`.

// TODO: Match vectors with undef elements, but undef may not propagate.		// TODO: Match vectors with undef elements, but undef may not propagate.
Value *Op0 = II->getArgOperand(0), *Op1 = II->getArgOperand(1);		Value *Op0 = II->getArgOperand(0), *Op1 = II->getArgOperand(1);
Value *X;		Value *X;
const APInt *C0, *C1;		const APInt *C0, *C1;
if (!match(Op0, m_OneUse(m_Add(m_Value(X), m_APInt(C0)))) ||		if (!match(Op0, m_OneUse(m_Add(m_Value(X), m_APInt(C0)))) ||
!match(Op1, m_APInt(C1)))		!match(Op1, m_APInt(C1)))
return nullptr;		return nullptr;
[1,964 lines hidden]	case Intrinsic::vector_reduce_fmul: {
// Can remove shuffle iff just shuffled elements, no repeats, undefs, or		// Can remove shuffle iff just shuffled elements, no repeats, undefs, or
// other changes.		// other changes.
if (UsedIndices.all()) {		if (UsedIndices.all()) {
replaceUse(II->getOperandUse(ArgIdx), V);		replaceUse(II->getOperandUse(ArgIdx), V);
return nullptr;		return nullptr;
}		}
break;		break;
}		}
		case Intrinsic::is_fpclass: {
		if (Instruction *I = foldIntrinsicIsFPClass(*II))
		return I;
		break;
		}
default: {		default: {
// Handle target specific intrinsics		// Handle target specific intrinsics
Optional<Instruction *> V = targetInstCombineIntrinsic(*II);		Optional<Instruction *> V = targetInstCombineIntrinsic(*II);
if (V)		if (V)
return V.value();		return V.value();
break;		break;
}		}
}		}
[963 lines hidden]

llvm/lib/Transforms/InstCombine/InstCombineInternal.h

[363 lines hidden]	private:
Value *foldAndOrOfICmpsOfAndWithPow2(ICmpInst *LHS, ICmpInst *RHS,		Value *foldAndOrOfICmpsOfAndWithPow2(ICmpInst *LHS, ICmpInst *RHS,
Instruction *CxtI, bool IsAnd,		Instruction *CxtI, bool IsAnd,
bool IsLogical = false);		bool IsLogical = false);
Value *matchSelectFromAndOr(Value *A, Value *B, Value *C, Value *D);		Value *matchSelectFromAndOr(Value *A, Value *B, Value *C, Value *D);
Value *getSelectCondition(Value *A, Value *B);		Value *getSelectCondition(Value *A, Value *B);

Instruction *foldExtractOfOverflowIntrinsic(ExtractValueInst &EV);		Instruction *foldExtractOfOverflowIntrinsic(ExtractValueInst &EV);
Instruction *foldIntrinsicWithOverflowCommon(IntrinsicInst *II);		Instruction *foldIntrinsicWithOverflowCommon(IntrinsicInst *II);
		Instruction *foldIntrinsicIsFPClass(IntrinsicInst &II);
Instruction *foldFPSignBitOps(BinaryOperator &I);		Instruction *foldFPSignBitOps(BinaryOperator &I);
Instruction *foldFDivConstantDivisor(BinaryOperator &I);		Instruction *foldFDivConstantDivisor(BinaryOperator &I);

// Optimize one of these forms:		// Optimize one of these forms:
// and i1 Op, SI / select i1 Op, i1 SI, i1 false (if IsAnd = true)		// and i1 Op, SI / select i1 Op, i1 SI, i1 false (if IsAnd = true)
// or i1 Op, SI / select i1 Op, i1 true, i1 SI (if IsAnd = false)		// or i1 Op, SI / select i1 Op, i1 true, i1 SI (if IsAnd = false)
// into simplier select instruction using isImpliedCondition.		// into simplier select instruction using isImpliedCondition.
Instruction *foldAndOrOfSelectUsingImpliedCond(Value *Op, SelectInst &SI,		Instruction *foldAndOrOfSelectUsingImpliedCond(Value *Op, SelectInst &SI,
[443 lines hidden]

llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll


[60 lines hidden]
; CHECK-NEXT: ret double 0x3F97D05F417D05F4		; CHECK-NEXT: ret double 0x3F97D05F417D05F4
;		;
%val = call double @llvm.amdgcn.rcp.f64(double 4.300000e+01) nounwind readnone		%val = call double @llvm.amdgcn.rcp.f64(double 4.300000e+01) nounwind readnone
ret double %val		ret double %val
}		}

define float @test_constant_fold_rcp_f32_43_strictfp() nounwind strictfp {		define float @test_constant_fold_rcp_f32_43_strictfp() nounwind strictfp {
; CHECK-LABEL: @test_constant_fold_rcp_f32_43_strictfp(		; CHECK-LABEL: @test_constant_fold_rcp_f32_43_strictfp(
; CHECK-NEXT: [[VAL:%.*]] = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) #[[ATTR14:[0-9]+]]		; CHECK-NEXT: [[VAL:%.*]] = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) #[[ATTR15:[0-9]+]]
; CHECK-NEXT: ret float [[VAL]]		; CHECK-NEXT: ret float [[VAL]]
;		;
%val = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) strictfp nounwind readnone		%val = call float @llvm.amdgcn.rcp.f32(float 4.300000e+01) strictfp nounwind readnone
ret float %val		ret float %val
}		}

; --------------------------------------------------------------------		; --------------------------------------------------------------------
; llvm.amdgcn.sqrt		; llvm.amdgcn.sqrt
[24 lines hidden]
; CHECK-NEXT: ret double 0x7FF8000000000000		; CHECK-NEXT: ret double 0x7FF8000000000000
;		;
%val = call double @llvm.amdgcn.sqrt.f64(double undef) nounwind readnone		%val = call double @llvm.amdgcn.sqrt.f64(double undef) nounwind readnone
ret double %val		ret double %val
}		}

define half @test_constant_fold_sqrt_f16_0() nounwind {		define half @test_constant_fold_sqrt_f16_0() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f16_0(		; CHECK-LABEL: @test_constant_fold_sqrt_f16_0(
; CHECK-NEXT: [[VAL:%.*]] = call half @llvm.amdgcn.sqrt.f16(half 0xH0000) #[[ATTR15:[0-9]+]]		; CHECK-NEXT: [[VAL:%.*]] = call half @llvm.amdgcn.sqrt.f16(half 0xH0000) #[[ATTR16:[0-9]+]]
; CHECK-NEXT: ret half [[VAL]]		; CHECK-NEXT: ret half [[VAL]]
;		;
%val = call half @llvm.amdgcn.sqrt.f16(half 0.0) nounwind readnone		%val = call half @llvm.amdgcn.sqrt.f16(half 0.0) nounwind readnone
ret half %val		ret half %val
}		}

define float @test_constant_fold_sqrt_f32_0() nounwind {		define float @test_constant_fold_sqrt_f32_0() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f32_0(		; CHECK-LABEL: @test_constant_fold_sqrt_f32_0(
; CHECK-NEXT: [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float 0.000000e+00) #[[ATTR15]]		; CHECK-NEXT: [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float 0.000000e+00) #[[ATTR16]]
; CHECK-NEXT: ret float [[VAL]]		; CHECK-NEXT: ret float [[VAL]]
;		;
%val = call float @llvm.amdgcn.sqrt.f32(float 0.0) nounwind readnone		%val = call float @llvm.amdgcn.sqrt.f32(float 0.0) nounwind readnone
ret float %val		ret float %val
}		}

define double @test_constant_fold_sqrt_f64_0() nounwind {		define double @test_constant_fold_sqrt_f64_0() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f64_0(		; CHECK-LABEL: @test_constant_fold_sqrt_f64_0(
; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double 0.000000e+00) #[[ATTR15]]		; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double 0.000000e+00) #[[ATTR16]]
; CHECK-NEXT: ret double [[VAL]]		; CHECK-NEXT: ret double [[VAL]]
;		;
%val = call double @llvm.amdgcn.sqrt.f64(double 0.0) nounwind readnone		%val = call double @llvm.amdgcn.sqrt.f64(double 0.0) nounwind readnone
ret double %val		ret double %val
}		}

define half @test_constant_fold_sqrt_f16_neg0() nounwind {		define half @test_constant_fold_sqrt_f16_neg0() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f16_neg0(		; CHECK-LABEL: @test_constant_fold_sqrt_f16_neg0(
; CHECK-NEXT: [[VAL:%.*]] = call half @llvm.amdgcn.sqrt.f16(half 0xH8000) #[[ATTR15]]		; CHECK-NEXT: [[VAL:%.*]] = call half @llvm.amdgcn.sqrt.f16(half 0xH8000) #[[ATTR16]]
; CHECK-NEXT: ret half [[VAL]]		; CHECK-NEXT: ret half [[VAL]]
;		;
%val = call half @llvm.amdgcn.sqrt.f16(half -0.0) nounwind readnone		%val = call half @llvm.amdgcn.sqrt.f16(half -0.0) nounwind readnone
ret half %val		ret half %val
}		}

define float @test_constant_fold_sqrt_f32_neg0() nounwind {		define float @test_constant_fold_sqrt_f32_neg0() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f32_neg0(		; CHECK-LABEL: @test_constant_fold_sqrt_f32_neg0(
; CHECK-NEXT: [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float -0.000000e+00) #[[ATTR15]]		; CHECK-NEXT: [[VAL:%.*]] = call float @llvm.amdgcn.sqrt.f32(float -0.000000e+00) #[[ATTR16]]
; CHECK-NEXT: ret float [[VAL]]		; CHECK-NEXT: ret float [[VAL]]
;		;
%val = call float @llvm.amdgcn.sqrt.f32(float -0.0) nounwind readnone		%val = call float @llvm.amdgcn.sqrt.f32(float -0.0) nounwind readnone
ret float %val		ret float %val
}		}

define double @test_constant_fold_sqrt_f64_neg0() nounwind {		define double @test_constant_fold_sqrt_f64_neg0() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_f64_neg0(		; CHECK-LABEL: @test_constant_fold_sqrt_f64_neg0(
; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double -0.000000e+00) #[[ATTR15]]		; CHECK-NEXT: [[VAL:%.*]] = call double @llvm.amdgcn.sqrt.f64(double -0.000000e+00) #[[ATTR16]]
; CHECK-NEXT: ret double [[VAL]]		; CHECK-NEXT: ret double [[VAL]]
;		;
%val = call double @llvm.amdgcn.sqrt.f64(double -0.0) nounwind readnone		%val = call double @llvm.amdgcn.sqrt.f64(double -0.0) nounwind readnone
ret double %val		ret double %val
}		}

define double @test_constant_fold_sqrt_snan_f64() nounwind {		define double @test_constant_fold_sqrt_snan_f64() nounwind {
; CHECK-LABEL: @test_constant_fold_sqrt_snan_f64(		; CHECK-LABEL: @test_constant_fold_sqrt_snan_f64(
[400 lines hidden]
; CHECK-NEXT: ret i1 false		; CHECK-NEXT: ret i1 false
;		;
%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 undef)		%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 undef)
ret i1 %val		ret i1 %val
}		}

define i1 @test_class_over_max_mask_f32(float %x) nounwind {		define i1 @test_class_over_max_mask_f32(float %x) nounwind {
; CHECK-LABEL: @test_class_over_max_mask_f32(		; CHECK-LABEL: @test_class_over_max_mask_f32(
; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.amdgcn.class.f32(float [[X:%.*]], i32 1)		; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 1)
; CHECK-NEXT: ret i1 [[VAL]]		; CHECK-NEXT: ret i1 [[VAL]]
;		;
%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 1025)		%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 1025)
ret i1 %val		ret i1 %val
}		}

define i1 @test_class_no_mask_f32(float %x) nounwind {		define i1 @test_class_no_mask_f32(float %x) nounwind {
; CHECK-LABEL: @test_class_no_mask_f32(		; CHECK-LABEL: @test_class_no_mask_f32(
[58 lines hidden]
; CHECK-NEXT: ret i1 [[VAL]]		; CHECK-NEXT: ret i1 [[VAL]]
;		;
%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 3)		%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 3)
ret i1 %val		ret i1 %val
}		}

define i1 @test_class_isnan_f32_strict(float %x) nounwind {		define i1 @test_class_isnan_f32_strict(float %x) nounwind {
; CHECK-LABEL: @test_class_isnan_f32_strict(		; CHECK-LABEL: @test_class_isnan_f32_strict(
; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.amdgcn.class.f32(float [[X:%.*]], i32 3) #[[ATTR15:[0-9]+]]		; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 3) #[[ATTR17:[0-9]+]]
; CHECK-NEXT: ret i1 [[VAL]]		; CHECK-NEXT: ret i1 [[VAL]]
;		;
%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 3) strictfp		%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 3) strictfp
ret i1 %val		ret i1 %val
}		}

define i1 @test_class_is_p0_n0_f32(float %x) nounwind {		define i1 @test_class_is_p0_n0_f32(float %x) nounwind {
; CHECK-LABEL: @test_class_is_p0_n0_f32(		; CHECK-LABEL: @test_class_is_p0_n0_f32(
; CHECK-NEXT: [[VAL:%.*]] = fcmp oeq float [[X:%.*]], 0.000000e+00		; CHECK-NEXT: [[VAL:%.*]] = fcmp oeq float [[X:%.*]], 0.000000e+00
; CHECK-NEXT: ret i1 [[VAL]]		; CHECK-NEXT: ret i1 [[VAL]]
;		;
%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 96)		%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 96)
ret i1 %val		ret i1 %val
}		}

define i1 @test_class_is_p0_n0_f32_strict(float %x) nounwind {		define i1 @test_class_is_p0_n0_f32_strict(float %x) nounwind {
; CHECK-LABEL: @test_class_is_p0_n0_f32_strict(		; CHECK-LABEL: @test_class_is_p0_n0_f32_strict(
; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.amdgcn.class.f32(float [[X:%.*]], i32 96) #[[ATTR15]]		; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 96) #[[ATTR17]]
; CHECK-NEXT: ret i1 [[VAL]]		; CHECK-NEXT: ret i1 [[VAL]]
;		;
%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 96) strictfp		%val = call i1 @llvm.amdgcn.class.f32(float %x, i32 96) strictfp
ret i1 %val		ret i1 %val
}		}

define i1 @test_constant_class_snan_test_snan_f64() nounwind {		define i1 @test_constant_class_snan_test_snan_f64() nounwind {
; CHECK-LABEL: @test_constant_class_snan_test_snan_f64(		; CHECK-LABEL: @test_constant_class_snan_test_snan_f64(
[204 lines hidden]	;
%nnan = fadd nnan float %x, 1.0		%nnan = fadd nnan float %x, 1.0
%class = call i1 @llvm.amdgcn.class.f32(float %nnan, i32 3)		%class = call i1 @llvm.amdgcn.class.f32(float %nnan, i32 3)
ret i1 %class		ret i1 %class
}		}

define i1 @test_class_is_nan_other_nnan_src(float %x) {		define i1 @test_class_is_nan_other_nnan_src(float %x) {
; CHECK-LABEL: @test_class_is_nan_other_nnan_src(		; CHECK-LABEL: @test_class_is_nan_other_nnan_src(
; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00		; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00
; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.amdgcn.class.f32(float [[NNAN]], i32 264)		; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.is.fpclass.f32(float [[NNAN]], i32 264)
; CHECK-NEXT: ret i1 [[CLASS]]		; CHECK-NEXT: ret i1 [[CLASS]]
;		;
%nnan = fadd nnan float %x, 1.0		%nnan = fadd nnan float %x, 1.0
%class = call i1 @llvm.amdgcn.class.f32(float %nnan, i32 267)		%class = call i1 @llvm.amdgcn.class.f32(float %nnan, i32 267)
ret i1 %class		ret i1 %class
}		}

; --------------------------------------------------------------------		; --------------------------------------------------------------------
[893 lines not shown]
; CHECK-NEXT: ret i64 0		; CHECK-NEXT: ret i64 0
;		;
%result = call i64 @llvm.amdgcn.icmp.i64.i32(i32 9, i32 8, i32 32)		%result = call i64 @llvm.amdgcn.icmp.i64.i32(i32 9, i32 8, i32 32)
ret i64 %result		ret i64 %result
}		}

define i64 @icmp_constant_inputs_true() {		define i64 @icmp_constant_inputs_true() {
; CHECK-LABEL: @icmp_constant_inputs_true(		; CHECK-LABEL: @icmp_constant_inputs_true(
; CHECK-NEXT: [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0:![0-9]+]]) #[[ATTR16:[0-9]+]]		; CHECK-NEXT: [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0:![0-9]+]]) #[[ATTR18:[0-9]+]]
; CHECK-NEXT: ret i64 [[RESULT]]		; CHECK-NEXT: ret i64 [[RESULT]]
;		;
%result = call i64 @llvm.amdgcn.icmp.i64.i32(i32 9, i32 8, i32 34)		%result = call i64 @llvm.amdgcn.icmp.i64.i32(i32 9, i32 8, i32 34)
ret i64 %result		ret i64 %result
}		}

define i64 @icmp_constant_to_rhs_slt(i32 %x) {		define i64 @icmp_constant_to_rhs_slt(i32 %x) {
; CHECK-LABEL: @icmp_constant_to_rhs_slt(		; CHECK-LABEL: @icmp_constant_to_rhs_slt(
[690 lines not shown]
; CHECK-NEXT: ret i64 0		; CHECK-NEXT: ret i64 0
;		;
%result = call i64 @llvm.amdgcn.fcmp.i64.f32(float 2.0, float 4.0, i32 1)		%result = call i64 @llvm.amdgcn.fcmp.i64.f32(float 2.0, float 4.0, i32 1)
ret i64 %result		ret i64 %result
}		}

define i64 @fcmp_constant_inputs_true() {		define i64 @fcmp_constant_inputs_true() {
; CHECK-LABEL: @fcmp_constant_inputs_true(		; CHECK-LABEL: @fcmp_constant_inputs_true(
; CHECK-NEXT: [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0]]) #[[ATTR16]]		; CHECK-NEXT: [[RESULT:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0]]) #[[ATTR18]]
; CHECK-NEXT: ret i64 [[RESULT]]		; CHECK-NEXT: ret i64 [[RESULT]]
;		;
%result = call i64 @llvm.amdgcn.fcmp.i64.f32(float 2.0, float 4.0, i32 4)		%result = call i64 @llvm.amdgcn.fcmp.i64.f32(float 2.0, float 4.0, i32 4)
ret i64 %result		ret i64 %result
}		}

define i64 @fcmp_constant_to_rhs_olt(float %x) {		define i64 @fcmp_constant_to_rhs_olt(float %x) {
; CHECK-LABEL: @fcmp_constant_to_rhs_olt(		; CHECK-LABEL: @fcmp_constant_to_rhs_olt(
[25 lines not shown]
; CHECK-NEXT: ret i64 0		; CHECK-NEXT: ret i64 0
;		;
%b = call i64 @llvm.amdgcn.ballot.i64(i1 0)		%b = call i64 @llvm.amdgcn.ballot.i64(i1 0)
ret i64 %b		ret i64 %b
}		}

define i64 @ballot_one_64() {		define i64 @ballot_one_64() {
; CHECK-LABEL: @ballot_one_64(		; CHECK-LABEL: @ballot_one_64(
; CHECK-NEXT: [[B:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0]]) #[[ATTR16]]		; CHECK-NEXT: [[B:%.*]] = call i64 @llvm.read_register.i64(metadata [[META0]]) #[[ATTR18]]
; CHECK-NEXT: ret i64 [[B]]		; CHECK-NEXT: ret i64 [[B]]
;		;
%b = call i64 @llvm.amdgcn.ballot.i64(i1 1)		%b = call i64 @llvm.amdgcn.ballot.i64(i1 1)
ret i64 %b		ret i64 %b
}		}

define i32 @ballot_nocombine_32(i1 %i) {		define i32 @ballot_nocombine_32(i1 %i) {
; CHECK-LABEL: @ballot_nocombine_32(		; CHECK-LABEL: @ballot_nocombine_32(
[9 lines not shown]
; CHECK-NEXT: ret i32 0		; CHECK-NEXT: ret i32 0
;		;
%b = call i32 @llvm.amdgcn.ballot.i32(i1 0)		%b = call i32 @llvm.amdgcn.ballot.i32(i1 0)
ret i32 %b		ret i32 %b
}		}

define i32 @ballot_one_32() {		define i32 @ballot_one_32() {
; CHECK-LABEL: @ballot_one_32(		; CHECK-LABEL: @ballot_one_32(
; CHECK-NEXT: [[B:%.*]] = call i32 @llvm.read_register.i32(metadata [[META1:![0-9]+]]) #[[ATTR16]]		; CHECK-NEXT: [[B:%.*]] = call i32 @llvm.read_register.i32(metadata [[META1:![0-9]+]]) #[[ATTR18]]
; CHECK-NEXT: ret i32 [[B]]		; CHECK-NEXT: ret i32 [[B]]
;		;
%b = call i32 @llvm.amdgcn.ballot.i32(i1 1)		%b = call i32 @llvm.amdgcn.ballot.i32(i1 1)
ret i32 %b		ret i32 %b
}		}

; --------------------------------------------------------------------		; --------------------------------------------------------------------
; llvm.amdgcn.wqm.vote		; llvm.amdgcn.wqm.vote
[2,891 lines not shown]

llvm/test/Transforms/InstCombine/is_fpclass.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -S -mcpu=gfx1010 -passes=instcombine %s | FileCheck %s			; RUN: opt -S -mcpu=gfx1010 -passes=instcombine %s | FileCheck %s

	; --------------------------------------------------------------------			; --------------------------------------------------------------------
	; llvm.is.fpclass			; llvm.is.fpclass
	; --------------------------------------------------------------------			; --------------------------------------------------------------------

	; FIXME: Should this really be immarg?			; FIXME: Should this really be immarg?
	; define i1 @test_class_undef_mask_f32(float %x) nounwind {			; define i1 @test_class_undef_mask_f32(float %x) nounwind {
	; %val = call i1 @llvm.is.fpclass.f32(float %x, i32 undef)			; %val = call i1 @llvm.is.fpclass.f32(float %x, i32 undef)
	; ret i1 %val			; ret i1 %val
	; }			; }

	define i1 @test_class_over_max_mask_f32(float %x) nounwind {			define i1 @test_class_over_max_mask_f32(float %x) nounwind {
	; CHECK-LABEL: @test_class_over_max_mask_f32(			; CHECK-LABEL: @test_class_over_max_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 1025)			; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 1)
	; CHECK-NEXT: ret i1 [[VAL]]			; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 1025)			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 1025)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_no_mask_f32(float %x) nounwind {			define i1 @test_class_no_mask_f32(float %x) nounwind {
	; CHECK-LABEL: @test_class_no_mask_f32(			; CHECK-LABEL: @test_class_no_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 0)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 0)			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 0)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_full_mask_f32(float %x) nounwind {			define i1 @test_class_full_mask_f32(float %x) nounwind {
	; CHECK-LABEL: @test_class_full_mask_f32(			; CHECK-LABEL: @test_class_full_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 1023)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 1023)			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 1023)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_undef_no_mask_f32() nounwind {			define i1 @test_class_undef_no_mask_f32() nounwind {
	; CHECK-LABEL: @test_class_undef_no_mask_f32(			; CHECK-LABEL: @test_class_undef_no_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float undef, i32 0)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float undef, i32 0)			%val = call i1 @llvm.is.fpclass.f32(float undef, i32 0)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_undef_full_mask_f32() nounwind {			define i1 @test_class_undef_full_mask_f32() nounwind {
	; CHECK-LABEL: @test_class_undef_full_mask_f32(			; CHECK-LABEL: @test_class_undef_full_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float undef, i32 1023)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float undef, i32 1023)			%val = call i1 @llvm.is.fpclass.f32(float undef, i32 1023)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_poison_no_mask_f32() nounwind {			define i1 @test_class_poison_no_mask_f32() nounwind {
	; CHECK-LABEL: @test_class_poison_no_mask_f32(			; CHECK-LABEL: @test_class_poison_no_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float poison, i32 0)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float poison, i32 0)			%val = call i1 @llvm.is.fpclass.f32(float poison, i32 0)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_poison_full_mask_f32() nounwind {			define i1 @test_class_poison_full_mask_f32() nounwind {
	; CHECK-LABEL: @test_class_poison_full_mask_f32(			; CHECK-LABEL: @test_class_poison_full_mask_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float poison, i32 1023)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float poison, i32 1023)			%val = call i1 @llvm.is.fpclass.f32(float poison, i32 1023)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_undef_val_f32() nounwind {			define i1 @test_class_undef_val_f32() nounwind {
	; CHECK-LABEL: @test_class_undef_val_f32(			; CHECK-LABEL: @test_class_undef_val_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float undef, i32 4)			; CHECK-NEXT: ret i1 undef
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float undef, i32 4)			%val = call i1 @llvm.is.fpclass.f32(float undef, i32 4)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_poison_val_f32() nounwind {			define i1 @test_class_poison_val_f32() nounwind {
	; CHECK-LABEL: @test_class_poison_val_f32(			; CHECK-LABEL: @test_class_poison_val_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float poison, i32 4)			; CHECK-NEXT: ret i1 poison
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float poison, i32 4)			%val = call i1 @llvm.is.fpclass.f32(float poison, i32 4)
	ret i1 %val			ret i1 %val
	}			}

	; FIXME: Should this really be immarg?			; FIXME: Should this really be immarg?
	; define i1 @test_class_undef_undef_f32() nounwind {			; define i1 @test_class_undef_undef_f32() nounwind {
	; %val = call i1 @llvm.is.fpclass.f32(float undef, i32 undef)			; %val = call i1 @llvm.is.fpclass.f32(float undef, i32 undef)
	; ret i1 %val			; ret i1 %val
	; }			; }

	; FIXME: Should this really be immarg?			; FIXME: Should this really be immarg?
	; define i1 @test_class_var_mask_f32(float %x, i32 %mask) nounwind {			; define i1 @test_class_var_mask_f32(float %x, i32 %mask) nounwind {
	; %val = call i1 @llvm.is.fpclass.f32(float %x, i32 %mask)			; %val = call i1 @llvm.is.fpclass.f32(float %x, i32 %mask)
	; ret i1 %val			; ret i1 %val
	; }			; }

	define i1 @test_class_isnan_f32(float %x) nounwind {			define i1 @test_class_isnan_f32(float %x) nounwind {
	; CHECK-LABEL: @test_class_isnan_f32(			; CHECK-LABEL: @test_class_isnan_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 3)			; CHECK-NEXT: [[VAL:%.*]] = fcmp uno float [[X:%.*]], 0.000000e+00
	; CHECK-NEXT: ret i1 [[VAL]]			; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 3)			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 3)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_isnan_f32_strict(float %x) nounwind {			define i1 @test_class_isnan_f32_strict(float %x) nounwind {
	; CHECK-LABEL: @test_class_isnan_f32_strict(			; CHECK-LABEL: @test_class_isnan_f32_strict(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 3) #[[ATTR2:[0-9]+]]			; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 3) #[[ATTR2:[0-9]+]]
	; CHECK-NEXT: ret i1 [[VAL]]			; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 3) strictfp			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 3) strictfp
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_is_p0_n0_f32(float %x) nounwind {			define i1 @test_class_is_p0_n0_f32(float %x) nounwind {
	; CHECK-LABEL: @test_class_is_p0_n0_f32(			; CHECK-LABEL: @test_class_is_p0_n0_f32(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 96)			; CHECK-NEXT: [[VAL:%.*]] = fcmp oeq float [[X:%.*]], 0.000000e+00
	; CHECK-NEXT: ret i1 [[VAL]]			; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 96)			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 96)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_class_is_p0_n0_f32_strict(float %x) nounwind {			define i1 @test_class_is_p0_n0_f32_strict(float %x) nounwind {
	; CHECK-LABEL: @test_class_is_p0_n0_f32_strict(			; CHECK-LABEL: @test_class_is_p0_n0_f32_strict(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 96) #[[ATTR2]]			; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f32(float [[X:%.*]], i32 96) #[[ATTR2]]
	; CHECK-NEXT: ret i1 [[VAL]]			; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f32(float %x, i32 96) strictfp			%val = call i1 @llvm.is.fpclass.f32(float %x, i32 96) strictfp
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_snan_test_snan_f64() nounwind {			define i1 @test_constant_class_snan_test_snan_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_snan_test_snan_f64(			; CHECK-LABEL: @test_constant_class_snan_test_snan_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 1)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 1)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 1)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_qnan_test_qnan_f64() nounwind {			define i1 @test_constant_class_qnan_test_qnan_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_qnan_test_qnan_f64(			; CHECK-LABEL: @test_constant_class_qnan_test_qnan_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 2)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 2)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 2)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_qnan_test_snan_f64() nounwind {			define i1 @test_constant_class_qnan_test_snan_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_qnan_test_snan_f64(			; CHECK-LABEL: @test_constant_class_qnan_test_snan_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 1)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 1)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 1)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_ninf_test_ninf_f64() nounwind {			define i1 @test_constant_class_ninf_test_ninf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_ninf_test_ninf_f64(			; CHECK-LABEL: @test_constant_class_ninf_test_ninf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0xFFF0000000000000, i32 4)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0xFFF0000000000000, i32 4)			%val = call i1 @llvm.is.fpclass.f64(double 0xFFF0000000000000, i32 4)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_pinf_test_ninf_f64() nounwind {			define i1 @test_constant_class_pinf_test_ninf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_pinf_test_ninf_f64(			; CHECK-LABEL: @test_constant_class_pinf_test_ninf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000000, i32 4)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000000, i32 4)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000000, i32 4)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_qnan_test_ninf_f64() nounwind {			define i1 @test_constant_class_qnan_test_ninf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_qnan_test_ninf_f64(			; CHECK-LABEL: @test_constant_class_qnan_test_ninf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 4)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 4)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 4)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_snan_test_ninf_f64() nounwind {			define i1 @test_constant_class_snan_test_ninf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_snan_test_ninf_f64(			; CHECK-LABEL: @test_constant_class_snan_test_ninf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 4)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 4)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 4)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_nnormal_test_nnormal_f64() nounwind {			define i1 @test_constant_class_nnormal_test_nnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_nnormal_test_nnormal_f64(			; CHECK-LABEL: @test_constant_class_nnormal_test_nnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double -1.000000e+00, i32 8)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double -1.0, i32 8)			%val = call i1 @llvm.is.fpclass.f64(double -1.0, i32 8)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_pnormal_test_nnormal_f64() nounwind {			define i1 @test_constant_class_pnormal_test_nnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_pnormal_test_nnormal_f64(			; CHECK-LABEL: @test_constant_class_pnormal_test_nnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 1.000000e+00, i32 8)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 1.0, i32 8)			%val = call i1 @llvm.is.fpclass.f64(double 1.0, i32 8)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_nsubnormal_test_nsubnormal_f64() nounwind {			define i1 @test_constant_class_nsubnormal_test_nsubnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_nsubnormal_test_nsubnormal_f64(			; CHECK-LABEL: @test_constant_class_nsubnormal_test_nsubnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x800FFFFFFFFFFFFF, i32 16)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x800fffffffffffff, i32 16)			%val = call i1 @llvm.is.fpclass.f64(double 0x800fffffffffffff, i32 16)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_psubnormal_test_nsubnormal_f64() nounwind {			define i1 @test_constant_class_psubnormal_test_nsubnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_psubnormal_test_nsubnormal_f64(			; CHECK-LABEL: @test_constant_class_psubnormal_test_nsubnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0xFFFFFFFFFFFFF, i32 16)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x000fffffffffffff, i32 16)			%val = call i1 @llvm.is.fpclass.f64(double 0x000fffffffffffff, i32 16)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_nzero_test_nzero_f64() nounwind {			define i1 @test_constant_class_nzero_test_nzero_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_nzero_test_nzero_f64(			; CHECK-LABEL: @test_constant_class_nzero_test_nzero_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double -0.000000e+00, i32 32)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double -0.0, i32 32)			%val = call i1 @llvm.is.fpclass.f64(double -0.0, i32 32)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_pzero_test_nzero_f64() nounwind {			define i1 @test_constant_class_pzero_test_nzero_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_pzero_test_nzero_f64(			; CHECK-LABEL: @test_constant_class_pzero_test_nzero_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0.000000e+00, i32 32)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0.0, i32 32)			%val = call i1 @llvm.is.fpclass.f64(double 0.0, i32 32)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_pzero_test_pzero_f64() nounwind {			define i1 @test_constant_class_pzero_test_pzero_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_pzero_test_pzero_f64(			; CHECK-LABEL: @test_constant_class_pzero_test_pzero_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0.000000e+00, i32 64)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0.0, i32 64)			%val = call i1 @llvm.is.fpclass.f64(double 0.0, i32 64)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_nzero_test_pzero_f64() nounwind {			define i1 @test_constant_class_nzero_test_pzero_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_nzero_test_pzero_f64(			; CHECK-LABEL: @test_constant_class_nzero_test_pzero_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double -0.000000e+00, i32 64)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double -0.0, i32 64)			%val = call i1 @llvm.is.fpclass.f64(double -0.0, i32 64)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_psubnormal_test_psubnormal_f64() nounwind {			define i1 @test_constant_class_psubnormal_test_psubnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_psubnormal_test_psubnormal_f64(			; CHECK-LABEL: @test_constant_class_psubnormal_test_psubnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0xFFFFFFFFFFFFF, i32 128)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x000fffffffffffff, i32 128)			%val = call i1 @llvm.is.fpclass.f64(double 0x000fffffffffffff, i32 128)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_nsubnormal_test_psubnormal_f64() nounwind {			define i1 @test_constant_class_nsubnormal_test_psubnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_nsubnormal_test_psubnormal_f64(			; CHECK-LABEL: @test_constant_class_nsubnormal_test_psubnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x800FFFFFFFFFFFFF, i32 128)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x800fffffffffffff, i32 128)			%val = call i1 @llvm.is.fpclass.f64(double 0x800fffffffffffff, i32 128)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_pnormal_test_pnormal_f64() nounwind {			define i1 @test_constant_class_pnormal_test_pnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_pnormal_test_pnormal_f64(			; CHECK-LABEL: @test_constant_class_pnormal_test_pnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 1.000000e+00, i32 256)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 1.0, i32 256)			%val = call i1 @llvm.is.fpclass.f64(double 1.0, i32 256)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_nnormal_test_pnormal_f64() nounwind {			define i1 @test_constant_class_nnormal_test_pnormal_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_nnormal_test_pnormal_f64(			; CHECK-LABEL: @test_constant_class_nnormal_test_pnormal_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double -1.000000e+00, i32 256)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double -1.0, i32 256)			%val = call i1 @llvm.is.fpclass.f64(double -1.0, i32 256)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_pinf_test_pinf_f64() nounwind {			define i1 @test_constant_class_pinf_test_pinf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_pinf_test_pinf_f64(			; CHECK-LABEL: @test_constant_class_pinf_test_pinf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000000, i32 512)			; CHECK-NEXT: ret i1 true
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000000, i32 512)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000000, i32 512)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_ninf_test_pinf_f64() nounwind {			define i1 @test_constant_class_ninf_test_pinf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_ninf_test_pinf_f64(			; CHECK-LABEL: @test_constant_class_ninf_test_pinf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0xFFF0000000000000, i32 512)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0xFFF0000000000000, i32 512)			%val = call i1 @llvm.is.fpclass.f64(double 0xFFF0000000000000, i32 512)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_qnan_test_pinf_f64() nounwind {			define i1 @test_constant_class_qnan_test_pinf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_qnan_test_pinf_f64(			; CHECK-LABEL: @test_constant_class_qnan_test_pinf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 512)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 512)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF8000000000000, i32 512)
	ret i1 %val			ret i1 %val
	}			}

	define i1 @test_constant_class_snan_test_pinf_f64() nounwind {			define i1 @test_constant_class_snan_test_pinf_f64() nounwind {
	; CHECK-LABEL: @test_constant_class_snan_test_pinf_f64(			; CHECK-LABEL: @test_constant_class_snan_test_pinf_f64(
	; CHECK-NEXT: [[VAL:%.*]] = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 512)			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: ret i1 [[VAL]]
	;			;
	%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 512)			%val = call i1 @llvm.is.fpclass.f64(double 0x7FF0000000000001, i32 512)
	ret i1 %val			ret i1 %val
	}			}
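
The constant-operand tests above all reduce to one question: which single class does the bit pattern belong to, and is that class bit set in the mask? The following is a minimal standalone sketch of that check in Python, not the LLVM constant folder. The helper names (`classify_f64_bits`, `is_fpclass_f64_bits`, `double_to_bits`) are hypothetical, the class-bit values are taken from the masks used by the test names above, and the quiet-NaN test assumes the usual IEEE-754 binary64 encoding (top mantissa bit set means quiet).

```python
import struct

# Class bits, matching the masks the tests above pair with each class name.
SNAN, QNAN, NINF, NNORMAL, NSUBNORMAL = 1, 2, 4, 8, 16
NZERO, PZERO, PSUBNORMAL, PNORMAL, PINF = 32, 64, 128, 256, 512


def classify_f64_bits(bits: int) -> int:
    """Return the single class bit for a binary64 bit pattern (hypothetical helper)."""
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:
        if mantissa == 0:
            return NINF if sign else PINF
        # Assumption: the top mantissa bit distinguishes quiet from signaling NaNs.
        return QNAN if (mantissa >> 51) else SNAN
    if exponent == 0:
        if mantissa == 0:
            return NZERO if sign else PZERO
        return NSUBNORMAL if sign else PSUBNORMAL
    return NNORMAL if sign else PNORMAL


def is_fpclass_f64_bits(bits: int, mask: int) -> bool:
    """True if the value's class bit is present in the mask."""
    return (classify_f64_bits(bits) & mask) != 0


def double_to_bits(x: float) -> int:
    """Reinterpret a Python float as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]
```

For example, `is_fpclass_f64_bits(0x7FF0000000000001, 1)` corresponds to `test_constant_class_snan_test_snan_f64`, and `is_fpclass_f64_bits(double_to_bits(-0.0), 64)` to `test_constant_class_nzero_test_pzero_f64`.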

	define i1 @test_class_is_snan_nnan_src(float %x) {			define i1 @test_class_is_snan_nnan_src(float %x) {
	; CHECK-LABEL: @test_class_is_snan_nnan_src(			; CHECK-LABEL: @test_class_is_snan_nnan_src(
	; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.is.fpclass.f32(float [[NNAN]], i32 1)
	; CHECK-NEXT: ret i1 [[CLASS]]
	;			;
	%nnan = fadd nnan float %x, 1.0			%nnan = fadd nnan float %x, 1.0
	%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 1)			%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 1)
	ret i1 %class			ret i1 %class
	}			}

	define i1 @test_class_is_qnan_nnan_src(float %x) {			define i1 @test_class_is_qnan_nnan_src(float %x) {
	; CHECK-LABEL: @test_class_is_qnan_nnan_src(			; CHECK-LABEL: @test_class_is_qnan_nnan_src(
	; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.is.fpclass.f32(float [[NNAN]], i32 2)
	; CHECK-NEXT: ret i1 [[CLASS]]
	;			;
	%nnan = fadd nnan float %x, 1.0			%nnan = fadd nnan float %x, 1.0
	%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 2)			%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 2)
	ret i1 %class			ret i1 %class
	}			}

	define i1 @test_class_is_nan_nnan_src(float %x) {			define i1 @test_class_is_nan_nnan_src(float %x) {
	; CHECK-LABEL: @test_class_is_nan_nnan_src(			; CHECK-LABEL: @test_class_is_nan_nnan_src(
	; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00			; CHECK-NEXT: ret i1 false
	; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.is.fpclass.f32(float [[NNAN]], i32 3)
	; CHECK-NEXT: ret i1 [[CLASS]]
	;			;
	%nnan = fadd nnan float %x, 1.0			%nnan = fadd nnan float %x, 1.0
	%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 3)			%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 3)
	ret i1 %class			ret i1 %class
	}			}

	define i1 @test_class_is_nan_other_nnan_src(float %x) {			define i1 @test_class_is_nan_other_nnan_src(float %x) {
	; CHECK-LABEL: @test_class_is_nan_other_nnan_src(			; CHECK-LABEL: @test_class_is_nan_other_nnan_src(
	; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00			; CHECK-NEXT: [[NNAN:%.*]] = fadd nnan float [[X:%.*]], 1.000000e+00
	; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.is.fpclass.f32(float [[NNAN]], i32 267)			; CHECK-NEXT: [[CLASS:%.*]] = call i1 @llvm.is.fpclass.f32(float [[NNAN]], i32 264)
	; CHECK-NEXT: ret i1 [[CLASS]]			; CHECK-NEXT: ret i1 [[CLASS]]
	;			;
	%nnan = fadd nnan float %x, 1.0			%nnan = fadd nnan float %x, 1.0
	%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 267)			%class = call i1 @llvm.is.fpclass.f32(float %nnan, i32 267)
	ret i1 %class			ret i1 %class
	}			}



arsenm: This will be recovered by D139130, depending on which lands first
	declare i1 @llvm.is.fpclass.f32(float, i32 immarg) nounwind readnone			declare i1 @llvm.is.fpclass.f32(float, i32 immarg) nounwind readnone
	declare i1 @llvm.is.fpclass.f64(double, i32 immarg) nounwind readnone			declare i1 @llvm.is.fpclass.f64(double, i32 immarg) nounwind readnone
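
The mask rewrites expected by the non-constant tests above can be summarized in a small decision function. This is a rough Python sketch of the outcomes the CHECK lines show, not the InstCombine implementation: `simplify_is_fpclass` and its parameters are hypothetical names, the order of the checks is an assumption, and the result is just a string describing the replacement.

```python
FULL_MASK = 0x3FF   # all ten class bits (1023)
NAN_MASK = 0x3      # snan | qnan
ZERO_MASK = 0x60    # nzero | pzero (96)


def simplify_is_fpclass(mask: int, src_nnan: bool = False, strictfp: bool = False) -> str:
    """Describe what is.fpclass(x, mask) is expected to become (hypothetical helper)."""
    mask &= FULL_MASK                 # bits above the ten classes are dropped (1025 -> 1)
    if mask == FULL_MASK:
        return "true"                 # every class matches
    if src_nnan:
        mask &= ~NAN_MASK             # the source cannot be NaN (267 -> 264)
    if mask == 0:
        return "false"                # no class can match
    if not strictfp:
        # fcmp may raise an exception on a signaling NaN, so strictfp calls are left alone.
        if mask == NAN_MASK:
            return "fcmp uno x, 0.0"
        if mask == ZERO_MASK:
            return "fcmp oeq x, 0.0"
    return f"is.fpclass(x, {mask})"
```

With these assumptions, `simplify_is_fpclass(267, src_nnan=True)` gives `"is.fpclass(x, 264)"`, matching `test_class_is_nan_other_nnan_src`, and `simplify_is_fpclass(3, strictfp=True)` leaves the call unchanged as in `test_class_isnan_f32_strict`.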