This is an archive of the discontinued LLVM Phabricator instance.


Differential D142907

LangRef: Add "dynamic" option to "denormal-fp-math"
Closed · Public

Authored by arsenm on Jan 30 2023, 9:16 AM.

Details

Reviewers

scanon
spatel
cameron.mcinally
andrew.w.kaylor
tra
jlebar
Anastasia
yaxunl
efriedma
jcranmer-intel
kpn
sepavloff
rampitec
foad
Pierre-vh

Group Reviewers

Restricted Project

Summary

This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.

There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flush-to-zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.

Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.

Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.

I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.

If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
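
The equivalences above can be checked with a small host-side model. This is only a sketch: `DenormInput`, `isClassZero`, `isClassSubnormalOrZero`, `fcmpEqZero`, and `checkSummaryClaims` are hypothetical names invented for illustration, not LLVM APIs, and the model assumes that "preserve-sign" flushes a denormal compare operand to a zero of the same sign.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model for illustration only; none of these names are LLVM APIs.
enum class DenormInput { IEEE, PreserveSign };

// Models llvm.is.fpclass(x, fcZero): a class test that is never flushed.
bool isClassZero(float x) { return std::fpclassify(x) == FP_ZERO; }

// Models llvm.is.fpclass(x, fcSubnormal|fcZero).
bool isClassSubnormalOrZero(float x) {
  int c = std::fpclassify(x);
  return c == FP_ZERO || c == FP_SUBNORMAL;
}

// Models `fcmp oeq x, 0.0`. In preserve-sign mode a denormal operand is
// assumed to be flushed to a zero of the same sign before the compare.
bool fcmpEqZero(float x, DenormInput mode) {
  if (mode == DenormInput::PreserveSign && std::fpclassify(x) == FP_SUBNORMAL)
    x = std::copysign(0.0f, x);
  return x == 0.0f;
}

// Checks the two equivalences from the summary, plus the mismatch that makes
// the fold unsafe when the runtime mode is not known.
void checkSummaryClaims() {
  const float d = std::numeric_limits<float>::denorm_min();
  const float values[] = {0.0f, -0.0f, d, -d, 1.0f};
  for (float x : values) {
    assert(isClassZero(x) == fcmpEqZero(x, DenormInput::IEEE));
    assert(isClassSubnormalOrZero(x) ==
           fcmpEqZero(x, DenormInput::PreserveSign));
  }
  // A denormal is not fcZero, yet it compares equal to zero once flushed.
  assert(!isClassZero(d) && fcmpEqZero(d, DenormInput::PreserveSign));
}
```

Calling `checkSummaryClaims()` exercises both folds. The last assertion is the case the summary worries about: a function folded under "ieee" and then executed with flushing enabled answers differently for a denormal input.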

This could also be of use for strictfp handling, where code may be
changing the denormal mode.

Alternative name could be "unknown".

Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
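
As a rough illustration of the inlining rule described above, the sketch below permits inlining when each component of the callee's mode either matches the caller's or is "dynamic". The types `DenormKind` and `DenormMode` and the functions `componentCompatible` and `mayInline` are hypothetical stand-ins written for this note; the exact rule implemented by the patch may differ.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the types used by the patch.
enum class DenormKind { IEEE, PreserveSign, PositiveZero, Dynamic };

struct DenormMode {
  DenormKind Output; // how denormal results are produced
  DenormKind Input;  // how denormal operands are read
};

// A component is compatible if the callee made no assumption about it
// ("dynamic") or if it assumed exactly what the caller guarantees.
bool componentCompatible(DenormKind caller, DenormKind callee) {
  return callee == DenormKind::Dynamic || callee == caller;
}

// Assumed rule: inlining is allowed only if both components are compatible.
bool mayInline(const DenormMode &caller, const DenormMode &callee) {
  return componentCompatible(caller.Output, callee.Output) &&
         componentCompatible(caller.Input, callee.Input);
}
```

Under this model a "dynamic" callee can be inlined into any caller, while an "ieee" callee is rejected for a "preserve-sign" caller and also for a "dynamic" caller, since the callee was optimized under an assumption the caller cannot guarantee.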

Event Timeline

arsenm created this revision. Jan 30 2023, 9:16 AM

arsenm requested review of this revision. Jan 30 2023, 9:16 AM

Harbormaster completed remote builds in B210794: Diff 493333. Jan 30 2023, 10:50 AM

My $.02, mostly on the style and the mechanics of applying the new attribute. FP semantics aspects are above my pay grade.

clang/lib/CodeGen/CGCall.cpp
2059–2061	IIUC, this changes denorm mode attributes on the functions with dynamic denorm mode that we link in. Will that be a problem if the same function, when linked into different modules, would end up with different attributes? E.g. if a function is externally visible and is intended to be common'ed across multiple modules. Should dynamic denorm mode be restricted to the functions private to the module only? We do typically internalize linked bitcode for CUDA, but I don't think it's something we can always implicitly assume.
clang/test/CodeGenCUDA/Inputs/ocml-sample.cl
12	Cosmetic nit: order functions as f16/f32/f64?
clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu
16	Do we want to verify that the compiled samples have the correct function attributes?
21	Would it be useful to check the attributes the functions have w/o linking the sample bitcode? In these tests one can infer expected attributes from the flags and comments, so I'm fine not having explicit checks for that.
78	Nit: CHECK-LABEL ?
91–92	I assume these refer to linked in functions, not their calls. It may be useful to include match define/call to make it obvious.
95	I'm not sure whether it does what it's intended to. AFAICT, at this point we will be past the call sites, so if it's intended to check the call sites in kernel_*, it will likely always succeed, even if we do litter call sites with unwanted attributes. It's also possible that I have a wrong idea about what the expected IR looks like. If you could post it for reference, that would be helpful.
llvm/test/CodeGen/Generic/denormal-fp-math-cl-opt.ll
4	Edit: `Check that the command line flag annotates the IR with the appropriate attributes.`

arsenm marked 2 inline comments as done. Jan 31 2023, 7:01 AM

arsenm added inline comments.

clang/lib/CodeGen/CGCall.cpp
2059–2061	The whole point of -mlink-builtin-bitcode is to apply the attributes for the current compilation to what's linked in. The linked functions are always internalized. The only case where we might not want to internalize is for weak symbols (it looks like we do internalize those today, but this is something I've thought about changing). I'll add a test with a weak library function. In the weak case the right thing to do is probably to not change from dynamic, simply because this linking process is outside of the user's control.
clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu
16	Maybe, but that's already tested separately. This test is a bit complex as it is (and could maybe use a few more combinations)
78	error: found 'CHECK-LABEL:' with variable definition or use
95	I can drop this, I later added the -implicit-check-not=denormal-fp-math to all the FileChecks

arsenm marked an inline comment as done. Jan 31 2023, 7:21 AM

arsenm added inline comments.

clang/lib/CodeGen/CGCall.cpp
2059–2061	It turns out we apply attributes prior to internalization. As a separate patch, we can either:
1) Skip functions that start as interposable, which has an observable change in the IR as it is.
2) Move the link and internalize before setting attributes. This would be unobservable but would catch it if the internalization behavior ever changed.

Fix losing target-cpu

Fix dropping target-cpu. Also skip interposable functions if we aren't internalizing (this seems to be a theoretical concern, since PropagateAttrs and Internalize are set as a pair)

arsenm added a child revision: D142996: DAG: Fix broken lowering of is.fpclass fcZero with DAZ. Jan 31 2023, 10:33 AM

LGTM for the parts I've commented on.

clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu
78	Interesting. In that case label and attribute checks could be separated into something like this:
CHECK-LABEL: name
CHECK-SAME: [[attribute]]
Up to you.
llvm/test/CodeGen/Generic/denormal-fp-math-cl-opt.ll
4	^^ The comment still needs to be edited.

Harbormaster completed remote builds in B211023: Diff 493653. Jan 31 2023, 12:12 PM

We use "dynamic" for the constrained intrinsics. I'd stay consistent with our terminology and stick with "dynamic" here.

I like the amount of testing. You may have gotten every single combination of cases, but I didn't go far enough to check.

llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll
78	Are we changing the behavior in a way that may cause regressions? It looks like we've changed behavior in the absence of "dynamic".

arsenm marked an inline comment as done. Feb 1 2023, 11:29 AM

arsenm added inline comments.

llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll
78	This case is broken to begin with, calling ieee from daz code. This makes the inlining more conservative / noticeable for debugging

kpn added inline comments. Feb 1 2023, 11:36 AM

llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll
78	Can I talk you into mentioning this in your commit message?

Address comments

Looking at the attribute logic here, there is conceptual room for both a dynamic and an unknown mode (i.e., you get a top and a bottom value), but I don't think there is value in distinguishing between them, so I'm fine with keeping just a dynamic.

I didn't bother to look at the clang changes, and my only comments are some minor ones around documentation:

llvm/docs/LangRef.rst
2241–2258	This isn't your fault, but I noticed when reading the LangRef online that this paragraph has a slightly-different indentation that causes most of this attribute's documentation to gain an extra level of indentation.
2243–2244	I feel like the description of this mode should mention that whether or not denormals are flushed is derived from the dynamic state of the FP environment.

Harbormaster completed remote builds in B211568: Diff 494417. Feb 2 2023, 3:22 PM

Documentation fixes

Harbormaster completed remote builds in B211607: Diff 494465. Feb 2 2023, 5:45 PM

arsenm added a child revision: D143264: InstCombine: Fold is.fpclass(x, fcZero) to fcmp oeq 0. Feb 3 2023, 6:38 AM

arsenm added a child revision: D143279: InstCombine: Handle folding fcmp of 0 into llvm.is.fpclass. Feb 3 2023, 9:10 AM

In general, it seems like the denormal mode should be considered part of the floating point environment (though as far as I know the C standard, at least, doesn't document it as such). If it were considered part of the floating point environment, the LLVM rules would tell us we could assume the default setting, which I'd assume to be IEEE, and it would only be legal to change this mode in strict mode. However, for your use case preserving the behavior of fpclass seems like what users would want, even in fast-math modes. In this sense, this is a lot like the problem we have with preserving isnan() behavior when fast-math is enabled. Our rules allow it, but it's not what most people would want.

I think the new denormal mode is a good addition.

Do you need to do something with the inliner to handle the case where functions with different denormal modes are inlined into one another? We don't seem to handle that case correctly now (https://godbolt.org/z/PEsWaMEq6), but with the dynamic mode we could handle it without blocking inlining completely.

In D142907#4119339, @andrew.w.kaylor wrote:

In general, it seems like the denormal mode should be considered part of the floating point environment (though as far as I know the C standard, at least, doesn't document it as such).

There’s no standardization of denormal flushing. OpenCL defines a flag for it but doesn’t really specify what it really means.

If it were considered part of the floating point environment, the LLVM rules would tell us we could assume the default setting, which I'd assume to be IEEE, and it would only be legal to change this mode in strict mode.

It is. The attribute is informative of the default mode. If we really wanted, a similar attribute could declare the assumed rounding mode for the function. It doesn’t imply you can change it dynamically, just that it should be in that mode before the function executes.

Do you need to do something with the inliner to handle the case where functions with different denormal modes are inlined into one another? We don't seem to handle that case correctly now (https://godbolt.org/z/PEsWaMEq6), but with the dynamic mode we could handle it without blocking inlining completely.

This patch fixed some bugs in the inlining to make it more conservative in undefined-looking cases. The inlining of dynamic functions fully works.

Not entirely sure where the best place to effect this is (I think somewhere in the clang driver code?), but on further reflection, it feels like strict fp-model in clang should set the denormal mode to dynamic.

In D142907#4132318, @jcranmer-intel wrote:

Not entirely sure where the best place to effect this is (I think somewhere in the clang driver code?), but on further reflection, it feels like strict fp-model in clang should set the denormal mode to dynamic.

I was thinking of changing the default in general to dynamic. I was going to at least change the strictfp default in a follow up

What's the plan for tying this to strictfp? Because I don't think it should be tied to cases where we use the constrained intrinsics but the exceptions are ignored and the default rounding is in effect. Those instructions are supposed to behave the same as the non-constrained instructions. So keying off the presence of the strictfp attribute on the function definition, or the (equivalent) presence of constrained intrinsics, would be too simple.

I don't see an obvious connection between denormals and exception behavior, and the rounding mode has the same problem. It would be surprising if changing the rounding mode changed denormal handling or optimization even when the new rounding mode would have identical results. It would also be surprising if changing the rounding mode back to the default round-to-nearest changed how denormals are handled.

Would we get different denormal behavior with a clang flag vs using a #pragma at the top of a source file? That seems surprising as well.

In the abstract I can see lumping in denormal handling with the rest of the FP environment handling. But in the LLVM context I don't see how we can tie use of the constrained intrinsics to denormals.

In D142907#4132543, @kpn wrote:

What's the plan for tying this to strictfp? Because I don't think it should be tied to cases where we use the constrained intrinsics but the exceptions are ignored and the default rounding is in effect. Those instructions are supposed to behave the same as the non-constrained instructions. So keying off the presence of the strictfp attribute on the function definition, or the (equivalent) presence of constrained intrinsics, would be too simple.

The denormal mode is exactly parallel to the rounding mode, we just don't have a mirrored field in the constrained intrinsic metadata operands. If we defaulted to using the dynamic mode if you were to use strictfp, everything would be OK. You just couldn't optimize based on knowledge of the denormal mode. I don't really think it's worth putting in the same optimization effort as the rounding mode.

In D142907#4132430, @arsenm wrote:

I was thinking of changing the default in general to dynamic. I was going to at least change the strictfp default in a follow up

I had the same thought too, but on further reflection, the default fp model implies that the environment is in the default state, which means we can assume the FTZ/DAZ bits are also in a default (IEEE) state.

In D142907#4132543, @kpn wrote:

What's the plan for tying this to strictfp? Because I don't think it should be tied to cases where we use the constrained intrinsics but the exceptions are ignored and the default rounding is in effect. Those instructions are supposed to behave the same as the non-constrained instructions. So keying off the presence of the strictfp attribute on the function definition, or the (equivalent) presence of constrained intrinsics, would be too simple.

The way I see it, strictfp is an assertion that every FP instruction has a dependency on the FP environment, which is largely orthogonal to the denormal-mode attribute asserting that the FTZ/DAZ bits in the FP environment have a particular value. The constrained intrinsics also have the ability to assert some properties of the FP environment (specifically, rounding mode and exception behavior) on individual instructions. By not adding any metadata to constrained intrinsics at the same time, we don't get the ability to set the denormal-mode on a per-instruction basis, but I don't think there's much value to be gained by doing so (given that we already have it at a per-function level).

Would we get different denormal behavior with a clang flag vs using a #pragma at the top of a source file? That seems surprising as well.

One of the consequences of having so many different ways of controlling compiler FP environment assumptions is that there's a crazy amount of interactions to consider. But I think there is ultimately a workable solution for the clang frontend to generate interactions that make sense.

pengfei added inline comments. Feb 16 2023, 10:31 PM

llvm/docs/LangRef.rst
2244	1) Does it mean users must specify `dynamic` when they change FTZ/DAZ in a function?
2) If 1) is true, is there a way to partially set on functions in its call stack, e.g.,
main f0 f1 f10 setFtzDaz(true); f2 f3
Ideally, users may want to tell the compiler that `main`, `f1`, `f10` are `dynamic`, while `f0` is `ieee` and `f2`, `f3` are `positive-zero`, rather than `dynamic` for all.
3) If 2) is true, it looks silly to do it one by one manually. Should the compiler help to deduce this information itself?

arsenm added inline comments. Feb 17 2023, 2:05 PM

llvm/docs/LangRef.rst
2244	This is really a question of how strictfp should interact with the default mode and shouldn't be a different policy from how strictfp functions treat the rounding mode. Arbitrary strictfp functions don't make assumptions based on the default rounding mode, assuming it's not the default. In that sense, denormal-fp-mode doesn't really matter for strictfp functions. They just can't make use of it to optimize. If we had a denormal annotation like the rounding mode, we could make use of it in the same way.
This isn't an area for the backend to deduce; semantic meaning needs to be specific and explicit. I have no interest in making changing the denormal mode simple or easy. Turning on flushing isn't really semantically desirable and is basically obsolete on modern hardware.

In D142907#4132836, @jcranmer-intel wrote:

In D142907#4132430, @arsenm wrote:

I was thinking of changing the default in general to dynamic. I was going to at least change the strictfp default in a follow up

I had the same thought too, but on further reflection, the default fp model implies that the environment is in the default state, which means we can assume the FTZ/DAZ bits are also in a default (IEEE) state.

In D142907#4132543, @kpn wrote:

What's the plan for tying this to strictfp? Because I don't think it should be tied to cases where we use the constrained intrinsics but the exceptions are ignored and the default rounding is in effect. Those instructions are supposed to behave the same as the non-constrained instructions. So keying off the presence of the strictfp attribute on the function definition, or the (equivalent) presence of constrained intrinsics, would be too simple.

The way I see it, strictfp is an assertion that every FP instruction has a dependency on the FP environment, which is largely orthogonal to the denormal-mode attribute asserting that the FTZ/DAZ bits in the FP environment have a particular value. The constrained intrinsics also have the ability to assert some properties of the FP environment (specifically, rounding mode and exception behavior) on individual instructions. By not adding any metadata to constrained intrinsics at the same time, we don't get the ability to set the denormal-mode on a per-instruction basis, but I don't think there's much value to be gained by doing so (given that we already have it at a per-function level).

I think this is not really useful for strictfp. We could make use of this if we were to inline non-strictfp functions into strictfp functions if we had a denormal mode annotation on the constrained intrinsic, which we don't. Without that, strictfp functions can't make assumptions based on the default rounding mode so this doesn't inform anything useful in the presence of a changeable mode

arsenm mentioned this in D144650: [AMDGPU] Split SIModeRegisterDefaults out of AMDGPUBaseInfo. NFC. Feb 23 2023, 8:34 AM

ping

I'm studiously ignoring the Clang and LLVM codegen changes here, but otherwise, I think the direction of this change is generally good.

llvm/lib/Analysis/ConstantFolding.cpp
1380–1382	You should change the doxygen documentation to indicate that this method returns nullptr if the denormal mode is dynamic. Ditto for ConstantFoldFPInstOperands.

Update doxygen comment

Harbormaster completed remote builds in B219482: Diff 505274. Mar 14 2023, 5:00 PM

ping

arsenm added reviewers: Restricted Project, rampitec, foad, Pierre-vh. Apr 7 2023, 3:53 PM

ping

I'm having trouble understanding the changes on the clang side.

If I'm following correctly, the "denormal-fp-math" setting is a promise from the user to the compiler: if the setting is not "dynamic", the user promises that the definition will only execute in the specified denormal mode. This is similar to, for example, the rounding mode pragmas: the user promises a specific rounding mode unless they explicitly request dynamic rounding.

Given that, I don't follow the whole "merging" thing... we should just be setting whatever mode is active. The attribute setting should not depend on whether the function is interposable. If you have an ODR function, all definitions must have a mode compatible with whatever mode will be used at runtime. If you have a non-ODR weak function, optimizations shouldn't propagate that mode from the callee to the caller.

In D142907#4288247, @efriedma wrote:

Given that, I don't follow the whole "merging" thing... we should just be setting whatever mode is active. The attribute setting should not depend on whether the function is interposable. If you have an ODR function, all definitions must have a mode compatible with whatever mode will be used at runtime. If you have a non-ODR weak function, optimizations shouldn't propagate that mode from the callee to the caller.

The point is to have code that works with either mode. The intent is to avoid duplicating code for an extremely marginal difference. Having to duplicate every library function for denormal flushing and ieee handling defeats the purpose. The specialization will be realized after linking / inlining.

If you have a library function that's built with "denormal-fp-math"="dynamic,dynamic", you can link it into code built in any mode, and LTO should be able to propagate that mode from the caller to the callee. That doesn't require clang to do anything special; you can just specify -fdenormal-fp-math=dynamic while building the library, and the user specifies -fdenormal-fp-math=ieee while building their code.

I guess you're worried specifically about ODR inline functions, defined in headers? The user specifies a specific mode because they know their code honors it... but the user might not be aware of the effect on functions defined in library headers. Other libraries in the same binary might use the same header, but specify a different mode. So if the user specifies a denormal mode, we should ignore it for ODR inline functions, because they didn't actually mean to apply the denormal mode to those definitions?

I'm not sure about applying those semantics automatically; I don't think there's any precedent in clang for anything like this. The closest thing I can think of is -fvisibility-inlines-hidden. I'd prefer to RFC it separately from the rest of the patch, and loop in clang frontend owners, since the precedent we set here will apply to other sorts of attributes.

In D142907#4288555, @efriedma wrote:

If you have a library function that's built with "denormal-fp-math"="dynamic,dynamic", you can link it into code built in any mode, and LTO should be able to propagate that mode from the caller to the callee. That doesn't require clang to do anything special; you can just specify -fdenormal-fp-math=dynamic while building the library, and the user specifies -fdenormal-fp-math=ieee while building their code.

That's essentially what this does. I think the part you are missing is the existing special treatment of the builtin device library functions. The default set of attributes for the current translation unit is forcibly set on functions in bitcode libraries linked in and internalized with -mlink-builtin-bitcode. We need to logically merge with the current translation unit's mode, or else we're potentially breaking the linked in function. The main reason I'm doing this in the first place is to move towards a model with less special treatment of these libraries.
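
A minimal sketch of what "logically merge with the current translation unit's mode" could look like: each "dynamic" component of the linked-in function is resolved to the translation unit's setting, and an explicit component is left untouched. The names `DenormKind`, `DenormMode`, and `mergeWithTranslationUnit` are hypothetical and chosen for this illustration; this is not the patch's code.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the types used by the patch.
enum class DenormKind { IEEE, PreserveSign, PositiveZero, Dynamic };

struct DenormMode {
  DenormKind Output; // how denormal results are produced
  DenormKind Input;  // how denormal operands are read
};

// Resolve each "dynamic" component of a linked-in library function to the
// mode of the translation unit it is linked into. Components the library
// fixed explicitly are kept, since overwriting them could break code that
// was already optimized under that assumption.
DenormMode mergeWithTranslationUnit(DenormMode linked, DenormMode unit) {
  DenormMode merged = linked;
  if (linked.Output == DenormKind::Dynamic)
    merged.Output = unit.Output;
  if (linked.Input == DenormKind::Dynamic)
    merged.Input = unit.Input;
  return merged;
}
```

With this model, a library function built as "dynamic,dynamic" takes on "preserve-sign,preserve-sign" when linked into a flushing translation unit, whereas a library function that was built as "ieee,ieee" stays "ieee,ieee".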

I guess you're worried specifically about ODR inline functions, defined in headers? The user specifies a specific mode because they know their code honors it... but the user might not be aware of the effect on functions defined in library headers. Other libraries in the same binary might use the same header, but specify a different mode. So if the user specifies a denormal mode, we should ignore it for ODR inline functions, because they didn't actually mean to apply the denormal mode to those definitions?

No, the clang changes are to handle the headerless bitcode-only device libraries only. The user is supposed to be unaware the builtin libraries exist. They're an implementation detail managed by the clang driver (-mlink-builtin-bitcode is a cc1 only flag) and have a special contract with the compiler.

I'm not sure about applying those semantics automatically; I don't think there's any precedent in clang for anything like this. The closest thing I can think of is -fvisibility-inlines-hidden. I'd prefer to RFC it separately from the rest of the patch, and loop in clang frontend owners, since the precedent we set here will apply to other sorts of attributes.

This isn't new, isn't end user facing, and isn't general purpose. This is the minimum required update to the existing -mlink-builtin-bitcode handling.

Oh, sorry, I missed that the new code specifically runs on functions imported using -mlink-builtin-bitcode. I somehow thought it was running on all functions.

LGTM

This revision is now accepted and ready to land. Apr 25 2023, 10:10 AM

bc37be1855773c1dcf8c6bf577a096a81fd58652

Nuullll added a subscriber: Nuullll. May 10 2023, 1:18 AM

Pierre-vh mentioned this in D152251: [clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking. Jun 6 2023, 2:55 AM

Pierre-vh mentioned this in rG23431b524603: [clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking. Jun 7 2023, 6:51 AM

Revision Contents

Path	Size
clang/lib/CodeGen/CGCall.cpp	106 lines
clang/lib/CodeGen/CodeGenAction.cpp	3 lines
clang/lib/CodeGen/CodeGenModule.h	8 lines
clang/test/CodeGen/denormalfpmode-f32.c	14 lines
clang/test/CodeGen/denormalfpmode.c	2 lines
clang/test/CodeGenCUDA/Inputs/ocml-sample.cl	18 lines
clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu	157 lines
clang/test/Driver/denormal-fp-math.c	6 lines
llvm/docs/LangRef.rst	37 lines
llvm/include/llvm/ADT/FloatingPointMode.h	37 lines
llvm/include/llvm/Analysis/ConstantFolding.h	3 lines
llvm/include/llvm/IR/Attributes.td	8 lines
llvm/include/llvm/IR/Function.h	9 lines
llvm/lib/Analysis/ConstantFolding.cpp	28 lines
llvm/lib/CodeGen/CommandFlags.cpp	17 lines
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp	29 lines
llvm/lib/IR/Attributes.cpp	31 lines
llvm/lib/IR/Function.cpp	25 lines
llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.h	35 lines
llvm/test/CodeGen/Generic/denormal-fp-math-cl-opt.ll	9 lines
llvm/test/CodeGen/X86/sqrt-fastmath.ll	62 lines
llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll	660 lines
llvm/test/Transforms/InstSimplify/canonicalize.ll	94 lines
llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll	107 lines
llvm/unittests/ADT/FloatingPointMode.cpp	88 lines
llvm/utils/TableGen/Attributes.cpp	1 line

Diff 505274

clang/lib/CodeGen/CGCall.cpp

Show First 20 Lines • Show All 1,823 Lines • ▼ Show 20 Lines	static bool HasStrictReturn(const CodeGenModule &Module, QualType RetTy,
// We don't want to be too aggressive with the return checking, unless		// We don't want to be too aggressive with the return checking, unless
// it's explicit in the code opts or we're using an appropriate sanitizer.		// it's explicit in the code opts or we're using an appropriate sanitizer.
// Try to respect what the programmer intended.		// Try to respect what the programmer intended.
return Module.getCodeGenOpts().StrictReturn \|\|		return Module.getCodeGenOpts().StrictReturn \|\|
!Module.MayDropFunctionReturn(Module.getContext(), RetTy) \|\|		!Module.MayDropFunctionReturn(Module.getContext(), RetTy) \|\|
Module.getLangOpts().Sanitize.has(SanitizerKind::Return);		Module.getLangOpts().Sanitize.has(SanitizerKind::Return);
}		}

void CodeGenModule::getDefaultFunctionAttributes(StringRef Name,		/// Add denormal-fp-math and denormal-fp-math-f32 as appropriate for the
bool HasOptnone,		/// requested denormal behavior, accounting for the overriding behavior of the
bool AttrOnCallSite,		/// -f32 case.
		static void addDenormalModeAttrs(llvm::DenormalMode FPDenormalMode,
		llvm::DenormalMode FP32DenormalMode,
		llvm::AttrBuilder &FuncAttrs) {
		if (FPDenormalMode != llvm::DenormalMode::getDefault())
		FuncAttrs.addAttribute("denormal-fp-math", FPDenormalMode.str());

		if (FP32DenormalMode != FPDenormalMode && FP32DenormalMode.isValid())
		FuncAttrs.addAttribute("denormal-fp-math-f32", FP32DenormalMode.str());
		}

		/// Add default attributes to a function, which have merge semantics under
		/// -mlink-builtin-bitcode and should not simply overwrite any existing
		/// attributes in the linked library.
		static void
		addMergableDefaultFunctionAttributes(const CodeGenOptions &CodeGenOpts,
		llvm::AttrBuilder &FuncAttrs) {
		addDenormalModeAttrs(CodeGenOpts.FPDenormalMode, CodeGenOpts.FP32DenormalMode,
		FuncAttrs);
		}

		void CodeGenModule::getTrivialDefaultFunctionAttributes(
		StringRef Name, bool HasOptnone, bool AttrOnCallSite,
llvm::AttrBuilder &FuncAttrs) {		llvm::AttrBuilder &FuncAttrs) {
// OptimizeNoneAttr takes precedence over -Os or -Oz. No warning needed.		// OptimizeNoneAttr takes precedence over -Os or -Oz. No warning needed.
if (!HasOptnone) {		if (!HasOptnone) {
if (CodeGenOpts.OptimizeSize)		if (CodeGenOpts.OptimizeSize)
FuncAttrs.addAttribute(llvm::Attribute::OptimizeForSize);		FuncAttrs.addAttribute(llvm::Attribute::OptimizeForSize);
if (CodeGenOpts.OptimizeSize == 2)		if (CodeGenOpts.OptimizeSize == 2)
FuncAttrs.addAttribute(llvm::Attribute::MinSize);		FuncAttrs.addAttribute(llvm::Attribute::MinSize);
}		}

Show All 25 Lines	if (AttrOnCallSite) {
}		}

if (CodeGenOpts.LessPreciseFPMAD)		if (CodeGenOpts.LessPreciseFPMAD)
FuncAttrs.addAttribute("less-precise-fpmad", "true");		FuncAttrs.addAttribute("less-precise-fpmad", "true");

if (CodeGenOpts.NullPointerIsValid)		if (CodeGenOpts.NullPointerIsValid)
FuncAttrs.addAttribute(llvm::Attribute::NullPointerIsValid);		FuncAttrs.addAttribute(llvm::Attribute::NullPointerIsValid);

if (CodeGenOpts.FPDenormalMode != llvm::DenormalMode::getIEEE())
FuncAttrs.addAttribute("denormal-fp-math",
CodeGenOpts.FPDenormalMode.str());
if (CodeGenOpts.FP32DenormalMode != CodeGenOpts.FPDenormalMode) {
FuncAttrs.addAttribute(
"denormal-fp-math-f32",
CodeGenOpts.FP32DenormalMode.str());
}

if (LangOpts.getDefaultExceptionMode() == LangOptions::FPE_Ignore)		if (LangOpts.getDefaultExceptionMode() == LangOptions::FPE_Ignore)
FuncAttrs.addAttribute("no-trapping-math", "true");		FuncAttrs.addAttribute("no-trapping-math", "true");

// TODO: Are these all needed?		// TODO: Are these all needed?
// unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.		// unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
if (LangOpts.NoHonorInfs)		if (LangOpts.NoHonorInfs)
FuncAttrs.addAttribute("no-infs-fp-math", "true");		FuncAttrs.addAttribute("no-infs-fp-math", "true");
if (LangOpts.NoHonorNaNs)		if (LangOpts.NoHonorNaNs)
[... 86 lines not shown ...]	void CodeGenModule::getTrivialDefaultFunctionAttributes(

for (StringRef Attr : CodeGenOpts.DefaultFunctionAttrs) {		for (StringRef Attr : CodeGenOpts.DefaultFunctionAttrs) {
StringRef Var, Value;		StringRef Var, Value;
std::tie(Var, Value) = Attr.split('=');		std::tie(Var, Value) = Attr.split('=');
FuncAttrs.addAttribute(Var, Value);		FuncAttrs.addAttribute(Var, Value);
}		}
}		}
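
The loop above splits each `DefaultFunctionAttrs` entry at the first `=` into an attribute name and value. As a rough illustration of that splitting rule, here is a minimal sketch that uses `std::string_view` in place of `llvm::StringRef`. The helper name `splitOnce` and the example strings are hypothetical and not part of the patch.

```cpp
#include <string_view>
#include <utility>

// Hypothetical stand-in for llvm::StringRef::split(char): split at the first
// occurrence of Sep. If Sep does not occur, the whole string is returned as
// the first element and the second element is empty.
constexpr std::pair<std::string_view, std::string_view>
splitOnce(std::string_view S, char Sep) {
  const auto Pos = S.find(Sep);
  if (Pos == std::string_view::npos)
    return {S, std::string_view{}};
  return {S.substr(0, Pos), S.substr(Pos + 1)};
}

// Only the first '=' separates name from value; later ones stay in the value.
static_assert(splitOnce("name=a=b", '=').first == "name", "name part");
static_assert(splitOnce("name=a=b", '=').second == "a=b", "value part");
// An entry without '=' yields an empty value.
static_assert(splitOnce("flag", '=').second.empty(), "empty value");
```

An entry without `=` therefore becomes a string attribute with an empty value in this model.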

		void CodeGenModule::getDefaultFunctionAttributes(StringRef Name,
		bool HasOptnone,
		bool AttrOnCallSite,
		llvm::AttrBuilder &FuncAttrs) {
		getTrivialDefaultFunctionAttributes(Name, HasOptnone, AttrOnCallSite,
		FuncAttrs);
		if (!AttrOnCallSite) {
		// If we're just getting the default, get the default values for mergeable
		// attributes.
		addMergableDefaultFunctionAttributes(CodeGenOpts, FuncAttrs);
		}
		}

void CodeGenModule::addDefaultFunctionDefinitionAttributes(llvm::Function &F) {		void CodeGenModule::addDefaultFunctionDefinitionAttributes(llvm::Function &F) {
llvm::AttrBuilder FuncAttrs(F.getContext());		llvm::AttrBuilder FuncAttrs(F.getContext());
getDefaultFunctionAttributes(F.getName(), F.hasOptNone(),		getDefaultFunctionAttributes(F.getName(), F.hasOptNone(),
/* AttrOnCallSite = */ false, FuncAttrs);		/* AttrOnCallSite = */ false, FuncAttrs);
// TODO: call GetCPUAndFeaturesAttributes?		// TODO: call GetCPUAndFeaturesAttributes?
F.addFnAttrs(FuncAttrs);		F.addFnAttrs(FuncAttrs);
}		}

		/// Apply default attributes to \p F, accounting for merge semantics of
		/// attributes that should not overwrite existing attributes.
		void CodeGenModule::mergeDefaultFunctionDefinitionAttributes(
		llvm::Function &F, bool WillInternalize) {
		llvm::AttrBuilder FuncAttrs(F.getContext());
		getTrivialDefaultFunctionAttributes(F.getName(), F.hasOptNone(),
		/*AttrOnCallSite=*/false, FuncAttrs);
		GetCPUAndFeaturesAttributes(GlobalDecl(), FuncAttrs);

		if (!WillInternalize && F.isInterposable()) {
		// Do not promote "dynamic" denormal-fp-math to this translation unit's
		// setting for weak functions that won't be internalized. The user has no
		// real control for how builtin bitcode is linked, so we shouldn't assume
		// later copies will use a consistent mode.
		F.addFnAttrs(FuncAttrs);
		return;
		}

		llvm::AttributeMask AttrsToRemove;

		llvm::DenormalMode DenormModeToMerge = F.getDenormalModeRaw();
		llvm::DenormalMode DenormModeToMergeF32 = F.getDenormalModeF32Raw();
		llvm::DenormalMode Merged =
		CodeGenOpts.FPDenormalMode.mergeCalleeMode(DenormModeToMerge);
		llvm::DenormalMode MergedF32 = CodeGenOpts.FP32DenormalMode;

		if (DenormModeToMergeF32.isValid()) {
		MergedF32 =
		CodeGenOpts.FP32DenormalMode.mergeCalleeMode(DenormModeToMergeF32);
		}

		if (Merged == llvm::DenormalMode::getDefault()) {
		AttrsToRemove.addAttribute("denormal-fp-math");
		} else if (Merged != DenormModeToMerge) {
		// Overwrite existing attribute
		FuncAttrs.addAttribute("denormal-fp-math",
		CodeGenOpts.FPDenormalMode.str());
		}

		if (MergedF32 == llvm::DenormalMode::getDefault()) {
		[Inline comment, not done] tra: IIUC, this changes the denormal mode attributes on the linked-in functions that have a dynamic denormal mode. Will that be a problem if the same function, when linked into different modules, ends up with different attributes? For example, if a function is externally visible and is intended to be commoned across multiple modules. Should dynamic denormal mode be restricted to functions private to the module? We do typically internalize linked bitcode for CUDA, but I don't think it's something we can always implicitly assume.
		[Inline comment, done] arsenm (author): The whole point of -mlink-builtin-bitcode is to apply the attributes for the current compilation to what's linked in. The linked functions are always internalized. The only case where we might not want to internalize is weak symbols (it looks like we do internalize those today, but this is something I've thought about changing). I'll add a test with a weak library function. In the weak case, the right thing to do is probably to not change from dynamic, simply because this linking process is outside of the user's control.
		[Inline comment, done] arsenm (author): It turns out we apply attributes prior to internalization. As a separate patch, we can either: (1) skip functions that start as interposable, which has an observable change in the IR as it is; or (2) move the link and internalize before setting attributes. This would be unobservable, but would catch it if the internalization behavior ever changed.
		AttrsToRemove.addAttribute("denormal-fp-math-f32");
		} else if (MergedF32 != DenormModeToMergeF32) {
		// Overwrite existing attribute
		FuncAttrs.addAttribute("denormal-fp-math-f32",
		CodeGenOpts.FP32DenormalMode.str());
		}

		F.removeFnAttrs(AttrsToRemove);
		addDenormalModeAttrs(Merged, MergedF32, FuncAttrs);
		F.addFnAttrs(FuncAttrs);
		}
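
The merge step in `mergeDefaultFunctionDefinitionAttributes` can be modelled in a few lines. The following is a simplified sketch, not the LLVM implementation: `DenormKind`, `Mode`, and `modeAfterPropagation` are hypothetical stand-ins, and it assumes that `mergeCalleeMode` replaces only the `dynamic` components of the callee's mode with the caller's components. It also models the early return for interposable functions that will not be internalized.

```cpp
// Simplified stand-ins for illustration only; these are not the LLVM types.
enum class DenormKind { IEEE, PreserveSign, PositiveZero, Dynamic };

struct Mode {
  DenormKind Output;
  DenormKind Input;
};

constexpr bool operator==(Mode A, Mode B) {
  return A.Output == B.Output && A.Input == B.Input;
}

// Assumed merge rule: a "dynamic" component of the callee takes the caller's
// setting; any other component of the callee is kept as it is.
constexpr Mode mergeCalleeMode(Mode Caller, Mode Callee) {
  return Mode{
      Callee.Output == DenormKind::Dynamic ? Caller.Output : Callee.Output,
      Callee.Input == DenormKind::Dynamic ? Caller.Input : Callee.Input};
}

// Model of the guard in the patch: an interposable (e.g. weak) function that
// will not be internalized keeps its own mode; everything else is merged.
constexpr Mode modeAfterPropagation(Mode TU, Mode Lib, bool WillInternalize,
                                    bool IsInterposable) {
  if (!WillInternalize && IsInterposable)
    return Lib;
  return mergeCalleeMode(TU, Lib);
}

constexpr Mode kIEEE{DenormKind::IEEE, DenormKind::IEEE};
constexpr Mode kPreserveSign{DenormKind::PreserveSign, DenormKind::PreserveSign};
constexpr Mode kDynamic{DenormKind::Dynamic, DenormKind::Dynamic};

// A dynamic library function adopts the translation unit's mode.
static_assert(mergeCalleeMode(kPreserveSign, kDynamic) == kPreserveSign, "");
// A library function with a fixed mode is left alone.
static_assert(mergeCalleeMode(kPreserveSign, kIEEE) == kIEEE, "");
// A weak function that is not internalized stays dynamic.
static_assert(modeAfterPropagation(kPreserveSign, kDynamic,
                                   /*WillInternalize=*/false,
                                   /*IsInterposable=*/true) == kDynamic, "");
```

Under this model, a library built with only `denormal-fp-math-f32=dynamic` keeps IEEE as its base mode regardless of the translation unit's flags, which agrees with the comment on the "flush all" case in the CUDA test further down.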

void CodeGenModule::addDefaultFunctionDefinitionAttributes(		void CodeGenModule::addDefaultFunctionDefinitionAttributes(
llvm::AttrBuilder &attrs) {		llvm::AttrBuilder &attrs) {
getDefaultFunctionAttributes(/*function name*/ "", /*optnone*/ false,		getDefaultFunctionAttributes(/*function name*/ "", /*optnone*/ false,
/*for call*/ false, attrs);		/*for call*/ false, attrs);
GetCPUAndFeaturesAttributes(GlobalDecl(), attrs);		GetCPUAndFeaturesAttributes(GlobalDecl(), attrs);
}		}

static void addNoBuiltinAttributes(llvm::AttrBuilder &FuncAttrs,		static void addNoBuiltinAttributes(llvm::AttrBuilder &FuncAttrs,
const LangOptions &LangOpts,		const LangOptions &LangOpts,
const NoBuiltinAttr *NBA = nullptr) {		const NoBuiltinAttr *NBA = nullptr) {
[... 3,688 lines not shown ...]

clang/lib/CodeGen/CodeGenAction.cpp

[... 264 lines not shown ...]	public:
bool LinkInModules() {		bool LinkInModules() {
for (auto &LM : LinkModules) {		for (auto &LM : LinkModules) {
if (LM.PropagateAttrs)		if (LM.PropagateAttrs)
for (Function &F : *LM.Module) {		for (Function &F : *LM.Module) {
// Skip intrinsics. Keep consistent with how intrinsics are created		// Skip intrinsics. Keep consistent with how intrinsics are created
// in LLVM IR.		// in LLVM IR.
if (F.isIntrinsic())		if (F.isIntrinsic())
continue;		continue;
Gen->CGM().addDefaultFunctionDefinitionAttributes(F);		Gen->CGM().mergeDefaultFunctionDefinitionAttributes(F,
		LM.Internalize);
}		}

CurLinkModule = LM.Module.get();		CurLinkModule = LM.Module.get();

bool Err;		bool Err;
if (LM.Internalize) {		if (LM.Internalize) {
Err = Linker::linkModules(		Err = Linker::linkModules(
*getModule(), std::move(LM.Module), LM.LinkFlags,		*getModule(), std::move(LM.Module), LM.LinkFlags,
[... 993 lines not shown ...]

clang/lib/CodeGen/CodeGenModule.h

[... 1,266 lines not shown ...]	public:
/// will propagate unsafe-fp-math=false up to every transitive caller of a		/// will propagate unsafe-fp-math=false up to every transitive caller of a
/// function in the bitcode library!		/// function in the bitcode library!
///		///
/// With the exception of fast-math attrs, this will only make the attributes		/// With the exception of fast-math attrs, this will only make the attributes
/// on the function more conservative. But it's unsafe to call this on a		/// on the function more conservative. But it's unsafe to call this on a
/// function which relies on particular fast-math attributes for correctness.		/// function which relies on particular fast-math attributes for correctness.
/// It's up to you to ensure that this is safe.		/// It's up to you to ensure that this is safe.
void addDefaultFunctionDefinitionAttributes(llvm::Function &F);		void addDefaultFunctionDefinitionAttributes(llvm::Function &F);
		void mergeDefaultFunctionDefinitionAttributes(llvm::Function &F,
		bool WillInternalize);

/// Like the overload taking a `Function &`, but intended specifically		/// Like the overload taking a `Function &`, but intended specifically
/// for frontends that want to build on Clang's target-configuration logic.		/// for frontends that want to build on Clang's target-configuration logic.
void addDefaultFunctionDefinitionAttributes(llvm::AttrBuilder &attrs);		void addDefaultFunctionDefinitionAttributes(llvm::AttrBuilder &attrs);

StringRef getMangledName(GlobalDecl GD);		StringRef getMangledName(GlobalDecl GD);
StringRef getBlockMangledName(GlobalDecl GD, const BlockDecl *BD);		StringRef getBlockMangledName(GlobalDecl GD, const BlockDecl *BD);
const GlobalDecl getMangledNameDecl(StringRef);		const GlobalDecl getMangledNameDecl(StringRef);
[... 446 lines not shown ...]	private:
/// definitions whose linkage can change, e.g. implicit function instantions		/// definitions whose linkage can change, e.g. implicit function instantions
/// which may later be explicitly instantiated.		/// which may later be explicitly instantiated.
bool MayBeEmittedEagerly(const ValueDecl *D);		bool MayBeEmittedEagerly(const ValueDecl *D);

/// Check whether we can use a "simpler", more core exceptions personality		/// Check whether we can use a "simpler", more core exceptions personality
/// function.		/// function.
void SimplifyPersonality();		void SimplifyPersonality();

		/// Helper function for getDefaultFunctionAttributes. Builds a set of function
		/// attributes which can be simply added to a function.
		void getTrivialDefaultFunctionAttributes(StringRef Name, bool HasOptnone,
		bool AttrOnCallSite,
		llvm::AttrBuilder &FuncAttrs);

/// Helper function for ConstructAttributeList and		/// Helper function for ConstructAttributeList and
/// addDefaultFunctionDefinitionAttributes. Builds a set of function		/// addDefaultFunctionDefinitionAttributes. Builds a set of function
/// attributes to add to a function with the given properties.		/// attributes to add to a function with the given properties.
void getDefaultFunctionAttributes(StringRef Name, bool HasOptnone,		void getDefaultFunctionAttributes(StringRef Name, bool HasOptnone,
bool AttrOnCallSite,		bool AttrOnCallSite,
llvm::AttrBuilder &FuncAttrs);		llvm::AttrBuilder &FuncAttrs);

llvm::Metadata *CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,		llvm::Metadata *CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,
StringRef Suffix);		StringRef Suffix);
};		};

} // end namespace CodeGen		} // end namespace CodeGen
} // end namespace clang		} // end namespace clang

#endif // LLVM_CLANG_LIB_CODEGEN_CODEGENMODULE_H		#endif // LLVM_CLANG_LIB_CODEGEN_CODEGENMODULE_H

clang/test/CodeGen/denormalfpmode-f32.c

	// RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE			// RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE
				// RUN: %clang_cc1 -S -fdenormal-fp-math=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC,CHECK-F32-NONE

	// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-IEEE			// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-IEEE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-IEEE			// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-IEEE
				// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-DYNAMIC


	// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS			// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS
	// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS			// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS
	// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-PS			// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-PS
				// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-DYNAMIC


	// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ			// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ
				// RUN: %clang_cc1 -S -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-DYNAMIC
	// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ			// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ
				// RUN: %clang_cc1 -S -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC,CHECK-F32-PZ
	// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-PZ			// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-PZ
	// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE			// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE
				// RUN: %clang_cc1 -S -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC,CHECK-F32-NONE


	// CHECK-LABEL: main			// CHECK-LABEL: main

	// CHECK-ATTR: attributes #0 =			// CHECK-ATTR: attributes #0 =
	// CHECK-NONE-NOT:"denormal-fp-math"			// CHECK-NONE-NOT:"denormal-fp-math"
	// CHECK-IEEE: "denormal-fp-math"="ieee,ieee"			// CHECK-IEEE: "denormal-fp-math"="ieee,ieee"
	// CHECK-PS: "denormal-fp-math"="preserve-sign,preserve-sign"			// CHECK-PS: "denormal-fp-math"="preserve-sign,preserve-sign"
	// CHECK-PZ: "denormal-fp-math"="positive-zero,positive-zero"			// CHECK-PZ: "denormal-fp-math"="positive-zero,positive-zero"
				// CHECK-DYNAMIC: "denormal-fp-math"="dynamic,dynamic"

	// CHECK-F32-NONE-NOT:"denormal-fp-math-f32"			// CHECK-F32-NONE-NOT:"denormal-fp-math-f32"
	// CHECK-F32-IEEE: "denormal-fp-math-f32"="ieee,ieee"			// CHECK-F32-IEEE: "denormal-fp-math-f32"="ieee,ieee"
	// CHECK-F32-PS: "denormal-fp-math-f32"="preserve-sign,preserve-sign"			// CHECK-F32-PS: "denormal-fp-math-f32"="preserve-sign,preserve-sign"
	// CHECK-F32-PZ: "denormal-fp-math-f32"="positive-zero,positive-zero"			// CHECK-F32-PZ: "denormal-fp-math-f32"="positive-zero,positive-zero"


				// CHECK-F32-DYNAMIC: "denormal-fp-math-f32"="dynamic,dynamic"

	int main(void) {			int main(void) {
	return 0;			return 0;
	}			}
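
The RUN lines above that pass both flags explicitly are consistent with the rule visible in the removed lines of CGCall.cpp: `denormal-fp-math` is emitted only when the general mode is not IEEE, and `denormal-fp-math-f32` is emitted only when the f32 mode differs from the general mode. The following is a minimal sketch of that rule. `DenormKind`, `EmittedAttrs`, and `emittedAttrs` are hypothetical names, and the model treats each mode as a single value rather than an output/input pair.

```cpp
// Simplified stand-in for illustration only; not the LLVM DenormalMode type.
enum class DenormKind { IEEE, PreserveSign, PositiveZero, Dynamic };

struct EmittedAttrs {
  bool DenormalFPMath;    // is "denormal-fp-math" attached?
  bool DenormalFPMathF32; // is "denormal-fp-math-f32" attached?
};

// Rule from the removed CGCall.cpp lines: omit the general attribute when it
// equals the IEEE default, and omit the f32 attribute when it adds nothing
// beyond the general mode.
constexpr EmittedAttrs emittedAttrs(DenormKind General, DenormKind F32) {
  return EmittedAttrs{General != DenormKind::IEEE, F32 != General};
}

// -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee
//   expects CHECK-PS and CHECK-F32-IEEE: both attributes present.
static_assert(emittedAttrs(DenormKind::PreserveSign, DenormKind::IEEE).DenormalFPMath, "");
static_assert(emittedAttrs(DenormKind::PreserveSign, DenormKind::IEEE).DenormalFPMathF32, "");
// -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=dynamic
//   expects CHECK-DYNAMIC and CHECK-F32-NONE: only the general attribute.
static_assert(emittedAttrs(DenormKind::Dynamic, DenormKind::Dynamic).DenormalFPMath, "");
static_assert(!emittedAttrs(DenormKind::Dynamic, DenormKind::Dynamic).DenormalFPMathF32, "");
```

This explains why `dynamic` for both flags checks `CHECK-F32-NONE`: the f32 attribute would be redundant with the general one.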

clang/test/CodeGen/denormalfpmode.c

	// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-IEEE			// RUN: %clang_cc1 -S -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-IEEE
	// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-PS			// RUN: %clang_cc1 -S -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-PS
	// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-PZ			// RUN: %clang_cc1 -S -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-PZ
				// RUN: %clang_cc1 -S -fdenormal-fp-math=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-DYNAMIC

	// CHECK-LABEL: main			// CHECK-LABEL: main

	// The ieee,ieee is the default, so omit the attribute			// The ieee,ieee is the default, so omit the attribute
	// CHECK-IEEE-NOT:"denormal-fp-math"			// CHECK-IEEE-NOT:"denormal-fp-math"
	// CHECK-PS: attributes #0 = {{.*}}"denormal-fp-math"="preserve-sign,preserve-sign"{{.*}}			// CHECK-PS: attributes #0 = {{.*}}"denormal-fp-math"="preserve-sign,preserve-sign"{{.*}}
	// CHECK-PZ: attributes #0 = {{.*}}"denormal-fp-math"="positive-zero,positive-zero"{{.*}}			// CHECK-PZ: attributes #0 = {{.*}}"denormal-fp-math"="positive-zero,positive-zero"{{.*}}
				// CHECK-DYNAMIC: attributes #0 = {{.*}}"denormal-fp-math"="dynamic,dynamic"{{.*}}

	int main(void) {			int main(void) {
	return 0;			return 0;
	}			}

clang/test/CodeGenCUDA/Inputs/ocml-sample.cl

This file was added.

				#pragma OPENCL EXTENSION cl_khr_fp16 : enable

				half do_f16_stuff(half a, half b, half c) {
				return __builtin_fmaf16(a, b, c) + 4.0h;
				}

				float do_f32_stuff(float a, float b, float c) {
				return __builtin_fmaf(a, b, c) + 4.0f;
				}

				double do_f64_stuff(double a, double b, double c) {
				return __builtin_fma(a, b, c) + 4.0;
				[Inline comment, done] tra: Cosmetic nit: order functions as f16/f32/f64?
				}

				__attribute__((weak))
				float weak_do_f32_stuff(float a, float b, float c) {
				return c * (a / b);
				}

clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu

This file was added.

				// Verify the behavior of the denormal-fp-mode attributes in the way that
				// rocm-device-libs should be built with. The bitcode should be compiled with
				// denormal-fp-math-f32=dynamic, and should be replaced with the denormal mode
				// of the final TU.

				// Build the fake device library in the way rocm-device-libs should be built.
				//
				// RUN: %clang_cc1 -x cl -triple amdgcn-amd-amdhsa -fdenormal-fp-math-f32=dynamic \
				// RUN: -mcode-object-version=none -emit-llvm-bc \
				// RUN: %S/Inputs/ocml-sample.cl -o %t.dynamic.f32.bc
				//
				// RUN: %clang_cc1 -x cl -triple amdgcn-amd-amdhsa -fdenormal-fp-math=dynamic \
				// RUN: -mcode-object-version=none -emit-llvm-bc \
				// RUN: %S/Inputs/ocml-sample.cl -o %t.dynamic.full.bc


				[Inline comment, not done] tra: Do we want to verify that the compiled samples have the correct function attributes?
				[Inline comment, done] arsenm (author): Maybe, but that's already tested separately. This test is a bit complex as it is (and could maybe use a few more combinations).

				// Check the default behavior with no denormal-fp-math arguments.
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 -fcuda-is-device \
				// RUN: -mlink-builtin-bitcode %t.dynamic.f32.bc \
				// RUN: -emit-llvm %s -o - | FileCheck -implicit-check-not=denormal-fp-math %s --check-prefixes=CHECK,INTERNALIZE
				[Inline comment, not done] tra: Would it be useful to check the attributes the functions have without linking the sample bitcode? In these tests one can infer the expected attributes from the flags and comments, so I'm fine with not having explicit checks for that.


				// Check an explicit full ieee request
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 -fcuda-is-device \
				// RUN: -fdenormal-fp-math=ieee \
				// RUN: -mlink-builtin-bitcode %t.dynamic.f32.bc \
				// RUN: -emit-llvm %s -o - | FileCheck -implicit-check-not=denormal-fp-math %s --check-prefixes=CHECK,INTERNALIZE


				// Check explicit f32-only flushing request
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
				// RUN: -fcuda-is-device -fdenormal-fp-math-f32=preserve-sign \
				// RUN: -mlink-builtin-bitcode %t.dynamic.f32.bc -emit-llvm %s -o - \
				// RUN: | FileCheck -implicit-check-not=denormal-fp-math --enable-var-scope %s --check-prefixes=CHECK,INTERNALIZE,IEEEF64-PSZF32


				// Check explicit flush all request. Only the f32 component of the library is
				// dynamic, so the linked functions should use IEEE as the base mode and the new
				// functions preserve-sign.
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
				// RUN: -fcuda-is-device -fdenormal-fp-math=preserve-sign \
				// RUN: -mlink-builtin-bitcode %t.dynamic.f32.bc -emit-llvm %s -o - \
				// RUN: | FileCheck -implicit-check-not=denormal-fp-math --enable-var-scope %s --check-prefixes=CHECK,INTERNALIZE,PSZ


				// Check explicit f32-only, ieee-other flushing request
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
				// RUN: -fcuda-is-device -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=preserve-sign \
				// RUN: -mlink-builtin-bitcode %t.dynamic.f32.bc -emit-llvm %s -o - \
				// RUN: | FileCheck -implicit-check-not=denormal-fp-math --enable-var-scope %s --check-prefixes=CHECK,INTERNALIZE,IEEEF64-PSZF32


				// Check inverse of normal usage. Requesting IEEE f32, with flushed f16/f64
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
				// RUN: -fcuda-is-device -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee \
				// RUN: -mlink-builtin-bitcode %t.dynamic.f32.bc -emit-llvm %s -o - \
				// RUN: | FileCheck -implicit-check-not=denormal-fp-math --enable-var-scope %s --check-prefixes=CHECK,INTERNALIZE,IEEEF32-PSZF64-DYNF32


				// Check backwards from the normal usage where both library components can be
				// overridden.
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
				// RUN: -fcuda-is-device -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee \
				// RUN: -mlink-builtin-bitcode %t.dynamic.full.bc -emit-llvm %s -o - \
				// RUN: | FileCheck -implicit-check-not=denormal-fp-math --enable-var-scope %s --check-prefixes=CHECK,INTERNALIZE,IEEEF32-PSZF64-DYNFULL



				// Check the case where no internalization is performed
				// RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
				// RUN: -fcuda-is-device -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee \
				// RUN: -mlink-bitcode-file %t.dynamic.full.bc -emit-llvm %s -o - \
				// RUN: | FileCheck -implicit-check-not=denormal-fp-math --enable-var-scope %s --check-prefixes=CHECK,NOINTERNALIZE,NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL



				#define __device__ __attribute__((device))
				[Inline comment, done] tra: Nit: CHECK-LABEL?
				[Inline comment, done] arsenm (author): error: found 'CHECK-LABEL:' with variable definition or use
				[Inline comment, not done] tra: Interesting. In that case, the label and attribute checks could be separated into something like this: "CHECK-LABEL: name" followed by "CHECK-SAME: [[attribute]]". Up to you.
				#define __global__ __attribute__((global))

				typedef _Float16 half;

				extern "C" {
				__device__ half do_f16_stuff(half a, half b, half c);
				__device__ float do_f32_stuff(float a, float b, float c);

				// Currently all library functions are internalized. Check a weak function in
				// case we ever choose to not internalize these. In that case, the safest thing
				// to do would likely be to preserve the dynamic denormal-fp-math.
				__attribute__((weak)) __device__ float weak_do_f32_stuff(float a, float b, float c);
				__device__ double do_f64_stuff(double a, double b, double c);

				[Inline comment, not done] tra: I assume these refer to the linked-in functions, not their calls. It may be useful to match on define/call to make that obvious.

				// CHECK: kernel_f16({{.*}}) #[[$KERNELATTR:[0-9]+]]
				__global__ void kernel_f16(float* out, float* a, float* b, float* c) {
				[Inline comment, not done] tra: I'm not sure whether it does what it's intended to. AFAICT, at this point we will be past the call sites, so if it's intended to check the call sites in kernel_*, it will likely always succeed, even if we do litter call sites with unwanted attributes. It's also possible that I have a wrong idea about what the expected IR looks like. If you could post it for reference, that would be helpful.
				[Inline comment, done] arsenm (author): I can drop this. I later added -implicit-check-not=denormal-fp-math to all the FileChecks.
				int id = 0;
				out[id] = do_f16_stuff(a[id], b[id], c[id]);
				}

				// CHECK: kernel_f32({{.*}}) #[[$KERNELATTR]]
				__global__ void kernel_f32(float* out, float* a, float* b, float* c) {
				int id = 0;
				out[id] = do_f32_stuff(a[id], b[id], c[id]);
				out[id] += weak_do_f32_stuff(a[id], b[id], c[id]);
				}

				// CHECK: kernel_f64({{.*}}) #[[$KERNELATTR]]
				__global__ void kernel_f64(double* out, double* a, double* b, double* c) {
				int id = 0;
				out[id] = do_f64_stuff(a[id], b[id], c[id]);
				}
				}

				// INTERNALIZE: define internal half @do_f16_stuff({{.*}}) #[[$FUNCATTR:[0-9]+]]
				// INTERNALIZE: define internal float @do_f32_stuff({{.*}}) #[[$FUNCATTR]]
				// INTERNALIZE: define internal double @do_f64_stuff({{.*}}) #[[$FUNCATTR]]
				// INTERNALIZE: define internal float @weak_do_f32_stuff({{.*}}) #[[$WEAK_FUNCATTR:[0-9]+]]


				// NOINTERNALIZE: define dso_local half @do_f16_stuff({{.*}}) #[[$FUNCATTR:[0-9]+]]
				// NOINTERNALIZE: define dso_local float @do_f32_stuff({{.*}}) #[[$FUNCATTR]]
				// NOINTERNALIZE: define dso_local double @do_f64_stuff({{.*}}) #[[$FUNCATTR]]
				// NOINTERNALIZE: define weak float @weak_do_f32_stuff({{.*}}) #[[$WEAK_FUNCATTR:[0-9]+]]



				// We should not be littering call sites with the attribute
				// Everything should use the default ieee with no explicit attribute

				// FIXME: Should check-not "denormal-fp-math" within the denormal-fp-math-f32
				// lines.

				// Default mode relies on the implicit check-not for the denormal-fp-math.

				// PSZ: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// PSZ: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// PSZ: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" {{.*}} "target-cpu"="gfx803" {{.*}} }

				// FIXME: Should check-not "denormal-fp-math" within the line
				// IEEEF64-PSZF32: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// IEEEF64-PSZF32: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// IEEEF64-PSZF32: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" {{.*}} "target-cpu"="gfx803" {{.*}} }

				// IEEEF32-PSZF64-DYNF32: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// implicit check-not
				// implicit check-not


				// IEEEF32-PSZF64-DYNFULL: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// IEEEF32-PSZF64-DYNFULL: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// IEEEF32-PSZF64-DYNFULL: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} }


				// -mlink-bitcode-file doesn't internalize or propagate attributes.
				// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} }
				// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math"="dynamic,dynamic" {{.*}} }
				// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math"="dynamic,dynamic" {{.*}} }

clang/test/Driver/denormal-fp-math.c

	// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee -v 2>&1 | FileCheck -check-prefix=CHECK-IEEE %s			// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee -v 2>&1 | FileCheck -check-prefix=CHECK-IEEE %s
	// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=preserve-sign -v 2>&1 | FileCheck -check-prefix=CHECK-PS %s			// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=preserve-sign -v 2>&1 | FileCheck -check-prefix=CHECK-PS %s
	// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=positive-zero -v 2>&1 | FileCheck -check-prefix=CHECK-PZ %s			// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=positive-zero -v 2>&1 | FileCheck -check-prefix=CHECK-PZ %s

				// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=dynamic -v 2>&1 \| FileCheck -check-prefix=CHECK-DYNAMIC %s

	// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee -fno-fast-math -v 2>&1 \| FileCheck -check-prefix=CHECK-NO-UNSAFE %s			// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee -fno-fast-math -v 2>&1 \| FileCheck -check-prefix=CHECK-NO-UNSAFE %s
	// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee -fno-unsafe-math-optimizations -v 2>&1 \| FileCheck -check-prefix=CHECK-NO-UNSAFE %s			// RUN: %clang -### -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee -fno-unsafe-math-optimizations -v 2>&1 \| FileCheck -check-prefix=CHECK-NO-UNSAFE %s
	// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=foo -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID0 %s			// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=foo -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID0 %s
	// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee,foo -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID1 %s			// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=ieee,foo -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID1 %s
	// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=foo,ieee -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID2 %s			// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=foo,ieee -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID2 %s
	// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=foo,foo -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID3 %s			// RUN: not %clang -target arm-unknown-linux-gnu -c %s -fdenormal-fp-math=foo,foo -v 2>&1 \| FileCheck -check-prefix=CHECK-INVALID3 %s

	// IEEE is the implied default, and the flag is not passed.			// IEEE is the implied default, and the flag is not passed.
	// CHECK-IEEE-NOT: -fdenormal-fp-math=			// CHECK-IEEE-NOT: -fdenormal-fp-math=
	// CHECK-PS: "-fdenormal-fp-math=preserve-sign,preserve-sign"			// CHECK-PS: "-fdenormal-fp-math=preserve-sign,preserve-sign"
	// CHECK-PZ: "-fdenormal-fp-math=positive-zero,positive-zero"			// CHECK-PZ: "-fdenormal-fp-math=positive-zero,positive-zero"
				// CHECK-DYNAMIC: "-fdenormal-fp-math=dynamic,dynamic"


	// CHECK-NO-UNSAFE-NOT: "-fdenormal-fp-math=ieee"			// CHECK-NO-UNSAFE-NOT: "-fdenormal-fp-math=ieee"
	// CHECK-INVALID0: error: invalid value 'foo' in '-fdenormal-fp-math=foo'			// CHECK-INVALID0: error: invalid value 'foo' in '-fdenormal-fp-math=foo'
	// CHECK-INVALID1: error: invalid value 'ieee,foo' in '-fdenormal-fp-math=ieee,foo'			// CHECK-INVALID1: error: invalid value 'ieee,foo' in '-fdenormal-fp-math=ieee,foo'
	// CHECK-INVALID2: error: invalid value 'foo,ieee' in '-fdenormal-fp-math=foo,ieee'			// CHECK-INVALID2: error: invalid value 'foo,ieee' in '-fdenormal-fp-math=foo,ieee'
	// CHECK-INVALID3: error: invalid value 'foo,foo' in '-fdenormal-fp-math=foo,foo'			// CHECK-INVALID3: error: invalid value 'foo,foo' in '-fdenormal-fp-math=foo,foo'

llvm/docs/LangRef.rst


Show First 20 Lines • Show All 2,220 Lines • ▼ Show 20 Lines	``strictfp``
not introduce any new floating-point instructions that may trap.		not introduce any new floating-point instructions that may trap.

.. _denormal_fp_math:		.. _denormal_fp_math:

``"denormal-fp-math"``		``"denormal-fp-math"``
This indicates the denormal (subnormal) handling that may be		This indicates the denormal (subnormal) handling that may be
assumed for the default floating-point environment. This is a		assumed for the default floating-point environment. This is a
comma separated pair. The elements may be one of ``"ieee"``,		comma separated pair. The elements may be one of ``"ieee"``,
``"preserve-sign"``, or ``"positive-zero"``. The first entry		``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The
indicates the flushing mode for the result of floating point		first entry indicates the flushing mode for the result of floating
operations. The second indicates the handling of denormal inputs		point operations. The second indicates the handling of denormal inputs
to floating point instructions. For compatibility with older		to floating point instructions. For compatibility with older
bitcode, if the second value is omitted, both input and output		bitcode, if the second value is omitted, both input and output
modes will assume the same mode.		modes will assume the same mode.

If this is attribute is not specified, the default is		If this is attribute is not specified, the default is ``"ieee,ieee"``.
``"ieee,ieee"``.

If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,		If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
denormal outputs may be flushed to zero by standard floating-point		denormal outputs may be flushed to zero by standard floating-point
operations. It is not mandated that flushing to zero occurs, but if		operations. It is not mandated that flushing to zero occurs, but if
a denormal output is flushed to zero, it must respect the sign		a denormal output is flushed to zero, it must respect the sign
mode. Not all targets support all modes. While this indicates the		mode. Not all targets support all modes.
expected floating point mode the function will be executed with,
this does not make any attempt to ensure the mode is
consistent. User or platform code is expected to set the floating
point mode appropriately before function entry.

If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a		If the mode is ``"dynamic"``, the behavior is derived from the
		jcranmer-intel: I feel like the description of this mode should mention that whether or not denormals are flushed is derived from the dynamic state of the FP environment.
		pengfei: 1. Does it mean users must specify `dynamic` when they change FTZ/DAZ in a function?
		2. If 1) is true, is there a way to partially set on functions in its call stack, e.g., main f0 f1 f10 setFtzDaz(true); f2 f3. Ideally, users may want to tell compiler `main`, `f1`, `f10` is `dynamic`, while `f0` is `ieee` and `f2`, `f3` is `positive-zero`, rather than `dynamic` for all.
		3. If 2) is true, it looks silly to do it one by one manually. Should compiler help to deduce this information itself?
		arsenm (author): This is really a question of how strictfp should interact with the default mode and shouldn't be a different policy from how strictfp functions treat the rounding mode. Arbitrary strictfp functions don't make assumptions based on the default rounding mode, assuming it's not the default. In that sense, denormal-fp-mode doesn't really matter for strictfp functions. They just can't make use of it to optimize. If we had a denormal annotation like the rounding mode, we could make use of it in the same way.
		This isn't an area for the backend to deduce, semantic meaning needs to be specific and explicit. I have no interest in making changing the denormal mode simple or easy. Turning on flushing isn't really semantically desirable and is basically obsolete on modern hardware.
floating-point operation must treat any input denormal value as		dynamic state of the floating-point environment. Transformations
		which depend on the behavior of denormal values should not be
		performed.

		While this indicates the expected floating point mode the function
		will be executed with, this does not make any attempt to ensure
		the mode is consistent. User or platform code is expected to set
		the floating point mode appropriately before function entry.

		If the input mode is ``"preserve-sign"``, or ``"positive-zero"``,
		a floating-point operation must treat any input denormal value as
zero. In some situations, if an instruction does not respect this		zero. In some situations, if an instruction does not respect this
mode, the input may need to be converted to 0 as if by		mode, the input may need to be converted to 0 as if by
``@llvm.canonicalize`` during lowering for correctness.		``@llvm.canonicalize`` during lowering for correctness.
		jcranmer-intel: This isn't your fault, but I noticed when reading the LangRef online that this paragraph has a slightly-different indentation that causes most of this attribute's documentation to gain an extra level of indentation.

``"denormal-fp-math-f32"``		``"denormal-fp-math-f32"``
Same as ``"denormal-fp-math"``, but only controls the behavior of		Same as ``"denormal-fp-math"``, but only controls the behavior of
the 32-bit float type (or vectors of 32-bit floats). If both are		the 32-bit float type (or vectors of 32-bit floats). If both are
are present, this overrides ``"denormal-fp-math"``. Not all targets		are present, this overrides ``"denormal-fp-math"``. Not all targets
support separately setting the denormal mode per type, and no		support separately setting the denormal mode per type, and no
attempt is made to diagnose unsupported uses. Currently this		attempt is made to diagnose unsupported uses. Currently this
attribute is respected by the AMDGPU and NVPTX backends.		attribute is respected by the AMDGPU and NVPTX backends.
▲ Show 20 Lines • Show All 24,682 Lines • Show Last 20 Lines
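
Editorial sketch (not part of the patch): the pair syntax described in the LangRef text above can be modeled in a few lines of standalone C++. `Kind`, `parseKind`, and `parseAttr` are hypothetical names chosen for illustration; the real parser lives in `FloatingPointMode.h` and works on `StringRef`. The sketch assumes that an omitted or empty second element mirrors the first, as the compatibility note in the text says.

```cpp
#include <cassert>
#include <string_view>
#include <utility>

// Hypothetical stand-in for llvm::DenormalMode::DenormalModeKind.
enum class Kind { Invalid, IEEE, PreserveSign, PositiveZero, Dynamic };

// Map one element of the pair to a kind. An empty element is treated as
// "ieee", matching the documented default.
inline Kind parseKind(std::string_view S) {
  if (S.empty() || S == "ieee") return Kind::IEEE;
  if (S == "preserve-sign") return Kind::PreserveSign;
  if (S == "positive-zero") return Kind::PositiveZero;
  if (S == "dynamic") return Kind::Dynamic;
  return Kind::Invalid;
}

// Returns {output mode, input mode}. If the second element is omitted,
// the input mode is assumed to be the same as the output mode.
inline std::pair<Kind, Kind> parseAttr(std::string_view Attr) {
  const auto Comma = Attr.find(',');
  const std::string_view Out = Attr.substr(0, Comma);
  const std::string_view In = Comma == std::string_view::npos
                                  ? std::string_view{}
                                  : Attr.substr(Comma + 1);
  const Kind O = parseKind(Out);
  const Kind I = In.empty() ? O : parseKind(In);
  return {O, I};
}
```

Under these assumptions, `parseAttr("dynamic")` yields the same pair as `parseAttr("dynamic,dynamic")`, and `parseAttr("preserve-sign,ieee")` flushes outputs while leaving denormal inputs alone.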

llvm/include/llvm/ADT/FloatingPointMode.h

Show First 20 Lines • Show All 74 Lines • ▼ Show 20 Lines	enum DenormalModeKind : int8_t {

/// IEEE-754 denormal numbers preserved.		/// IEEE-754 denormal numbers preserved.
IEEE,		IEEE,

/// The sign of a flushed-to-zero number is preserved in the sign of 0		/// The sign of a flushed-to-zero number is preserved in the sign of 0
PreserveSign,		PreserveSign,

/// Denormals are flushed to positive zero.		/// Denormals are flushed to positive zero.
PositiveZero		PositiveZero,

		/// Denormals have unknown treatment.
		Dynamic
};		};

/// Denormal flushing mode for floating point instruction results in the		/// Denormal flushing mode for floating point instruction results in the
/// default floating point environment.		/// default floating point environment.
DenormalModeKind Output = DenormalModeKind::Invalid;		DenormalModeKind Output = DenormalModeKind::Invalid;

/// Denormal treatment kind for floating point instruction inputs in the		/// Denormal treatment kind for floating point instruction inputs in the
/// default floating-point environment. If this is not DenormalModeKind::IEEE,		/// default floating-point environment. If this is not DenormalModeKind::IEEE,
/// floating-point instructions implicitly treat the input value as 0.		/// floating-point instructions implicitly treat the input value as 0.
DenormalModeKind Input = DenormalModeKind::Invalid;		DenormalModeKind Input = DenormalModeKind::Invalid;

constexpr DenormalMode() = default;		constexpr DenormalMode() = default;
constexpr DenormalMode(DenormalModeKind Out, DenormalModeKind In) :		constexpr DenormalMode(DenormalModeKind Out, DenormalModeKind In) :
Output(Out), Input(In) {}		Output(Out), Input(In) {}


static constexpr DenormalMode getInvalid() {		static constexpr DenormalMode getInvalid() {
return DenormalMode(DenormalModeKind::Invalid, DenormalModeKind::Invalid);		return DenormalMode(DenormalModeKind::Invalid, DenormalModeKind::Invalid);
}		}

		/// Return the assumed default mode for a function without denormal-fp-math.
		static constexpr DenormalMode getDefault() {
		return getIEEE();
		}

static constexpr DenormalMode getIEEE() {		static constexpr DenormalMode getIEEE() {
return DenormalMode(DenormalModeKind::IEEE, DenormalModeKind::IEEE);		return DenormalMode(DenormalModeKind::IEEE, DenormalModeKind::IEEE);
}		}

static constexpr DenormalMode getPreserveSign() {		static constexpr DenormalMode getPreserveSign() {
return DenormalMode(DenormalModeKind::PreserveSign,		return DenormalMode(DenormalModeKind::PreserveSign,
DenormalModeKind::PreserveSign);		DenormalModeKind::PreserveSign);
}		}

static constexpr DenormalMode getPositiveZero() {		static constexpr DenormalMode getPositiveZero() {
return DenormalMode(DenormalModeKind::PositiveZero,		return DenormalMode(DenormalModeKind::PositiveZero,
DenormalModeKind::PositiveZero);		DenormalModeKind::PositiveZero);
}		}

		static constexpr DenormalMode getDynamic() {
		return DenormalMode(DenormalModeKind::Dynamic, DenormalModeKind::Dynamic);
		}

bool operator==(DenormalMode Other) const {		bool operator==(DenormalMode Other) const {
return Output == Other.Output && Input == Other.Input;		return Output == Other.Output && Input == Other.Input;
}		}

bool operator!=(DenormalMode Other) const {		bool operator!=(DenormalMode Other) const {
return !(*this == Other);		return !(*this == Other);
}		}

bool isSimple() const {		bool isSimple() const {
return Input == Output;		return Input == Output;
}		}

bool isValid() const {		bool isValid() const {
return Output != DenormalModeKind::Invalid &&		return Output != DenormalModeKind::Invalid &&
Input != DenormalModeKind::Invalid;		Input != DenormalModeKind::Invalid;
}		}

		/// Get the effective denormal mode if the mode if this caller calls into a
		/// function with \p Callee. This promotes dynamic modes to the mode of the
		/// caller.
		DenormalMode mergeCalleeMode(DenormalMode Callee) const {
		DenormalMode MergedMode = Callee;
		if (Callee.Input == DenormalMode::Dynamic)
		MergedMode.Input = Input;
		if (Callee.Output == DenormalMode::Dynamic)
		MergedMode.Output = Output;
		return MergedMode;
		}

inline void print(raw_ostream &OS) const;		inline void print(raw_ostream &OS) const;

inline std::string str() const {		inline std::string str() const {
std::string storage;		std::string storage;
raw_string_ostream OS(storage);		raw_string_ostream OS(storage);
print(OS);		print(OS);
return OS.str();		return OS.str();
}		}
};		};

inline raw_ostream& operator<<(raw_ostream &OS, DenormalMode Mode) {		inline raw_ostream& operator<<(raw_ostream &OS, DenormalMode Mode) {
Mode.print(OS);		Mode.print(OS);
return OS;		return OS;
}		}

/// Parse the expected names from the denormal-fp-math attribute.		/// Parse the expected names from the denormal-fp-math attribute.
inline DenormalMode::DenormalModeKind		inline DenormalMode::DenormalModeKind
parseDenormalFPAttributeComponent(StringRef Str) {		parseDenormalFPAttributeComponent(StringRef Str) {
// Assume ieee on unspecified attribute.		// Assume ieee on unspecified attribute.
return StringSwitch<DenormalMode::DenormalModeKind>(Str)		return StringSwitch<DenormalMode::DenormalModeKind>(Str)
.Cases("", "ieee", DenormalMode::IEEE)		.Cases("", "ieee", DenormalMode::IEEE)
.Case("preserve-sign", DenormalMode::PreserveSign)		.Case("preserve-sign", DenormalMode::PreserveSign)
.Case("positive-zero", DenormalMode::PositiveZero)		.Case("positive-zero", DenormalMode::PositiveZero)
		.Case("dynamic", DenormalMode::Dynamic)
.Default(DenormalMode::Invalid);		.Default(DenormalMode::Invalid);
}		}

/// Return the name used for the denormal handling mode used by the the		/// Return the name used for the denormal handling mode used by the the
/// expected names from the denormal-fp-math attribute.		/// expected names from the denormal-fp-math attribute.
inline StringRef denormalModeKindName(DenormalMode::DenormalModeKind Mode) {		inline StringRef denormalModeKindName(DenormalMode::DenormalModeKind Mode) {
switch (Mode) {		switch (Mode) {
case DenormalMode::IEEE:		case DenormalMode::IEEE:
return "ieee";		return "ieee";
case DenormalMode::PreserveSign:		case DenormalMode::PreserveSign:
return "preserve-sign";		return "preserve-sign";
case DenormalMode::PositiveZero:		case DenormalMode::PositiveZero:
return "positive-zero";		return "positive-zero";
		case DenormalMode::Dynamic:
		return "dynamic";
default:		default:
return "";		return "";
}		}
}		}

/// Returns the denormal mode to use for inputs and outputs.		/// Returns the denormal mode to use for inputs and outputs.
inline DenormalMode parseDenormalFPAttribute(StringRef Str) {		inline DenormalMode parseDenormalFPAttribute(StringRef Str) {
StringRef OutputStr, InputStr;		StringRef OutputStr, InputStr;
▲ Show 20 Lines • Show All 61 Lines • Show Last 20 Lines
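
Editorial sketch (not part of the patch): the `mergeCalleeMode` helper added above resolves each `Dynamic` component of a callee to the caller's corresponding component and leaves every other component untouched. The standalone model below mirrors that logic; `Kind` and `Mode` are hypothetical stand-ins for `DenormalModeKind` and `DenormalMode`.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::DenormalMode::DenormalModeKind.
enum class Kind { Invalid, IEEE, PreserveSign, PositiveZero, Dynamic };

// Hypothetical stand-in for llvm::DenormalMode.
struct Mode {
  Kind Output = Kind::Invalid;
  Kind Input = Kind::Invalid;

  // Effective mode of Callee when it is called from a function with this
  // mode: dynamic components take the caller's value, others are kept.
  Mode mergeCalleeMode(Mode Callee) const {
    Mode Merged = Callee;
    if (Callee.Input == Kind::Dynamic)
      Merged.Input = Input;
    if (Callee.Output == Kind::Dynamic)
      Merged.Output = Output;
    return Merged;
  }

  bool operator==(const Mode &Other) const {
    return Output == Other.Output && Input == Other.Input;
  }
};
```

For example, a fully dynamic callee merged into a caller with output `PreserveSign` and input `IEEE` takes on exactly the caller's mode, while a callee that is already `IEEE` on both sides stays `IEEE` regardless of the caller.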

llvm/include/llvm/Analysis/ConstantFolding.h

Show First 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	Constant *ConstantFoldFPInstOperands(unsigned Opcode, Constant *LHS,
Constant *RHS, const DataLayout &DL,		Constant *RHS, const DataLayout &DL,
const Instruction *I);		const Instruction *I);

/// Attempt to flush float point constant according to denormal mode set in the		/// Attempt to flush float point constant according to denormal mode set in the
/// instruction's parent function attributes. If so, return a zero with the		/// instruction's parent function attributes. If so, return a zero with the
/// correct sign, otherwise return the original constant. Inputs and outputs to		/// correct sign, otherwise return the original constant. Inputs and outputs to
/// floating point instructions can have their mode set separately, so the		/// floating point instructions can have their mode set separately, so the
/// direction is also needed.		/// direction is also needed.
		///
		/// If the calling function's "denormal-fp-math" input mode is "dynamic" for the
		/// floating-point type, returns nullptr for denormal inputs.
Constant *FlushFPConstant(Constant *Operand, const Instruction *I,		Constant *FlushFPConstant(Constant *Operand, const Instruction *I,
bool IsOutput);		bool IsOutput);

/// Attempt to constant fold a select instruction with the specified		/// Attempt to constant fold a select instruction with the specified
/// operands. The constant result is returned if successful; if not, null is		/// operands. The constant result is returned if successful; if not, null is
/// returned.		/// returned.
Constant *ConstantFoldSelectInstruction(Constant *Cond, Constant *V1,		Constant *ConstantFoldSelectInstruction(Constant *Cond, Constant *V1,
Constant *V2);		Constant *V2);
▲ Show 20 Lines • Show All 87 Lines • Show Last 20 Lines

llvm/include/llvm/IR/Attributes.td

	Show All 35 Lines
	class IntAttr<string S, list<AttrProperty> P> : Attr<S, P>;			class IntAttr<string S, list<AttrProperty> P> : Attr<S, P>;

	/// Type attribute.			/// Type attribute.
	class TypeAttr<string S, list<AttrProperty> P> : Attr<S, P>;			class TypeAttr<string S, list<AttrProperty> P> : Attr<S, P>;

	/// StringBool attribute.			/// StringBool attribute.
	class StrBoolAttr<string S> : Attr<S, []>;			class StrBoolAttr<string S> : Attr<S, []>;

				/// Arbitrary string attribute.
				class ComplexStrAttr<string S, list<AttrProperty> P> : Attr<S, P>;

	/// Target-independent enum attributes.			/// Target-independent enum attributes.

	/// Alignment of parameter (5 bits) stored as log2 of alignment with +1 bias.			/// Alignment of parameter (5 bits) stored as log2 of alignment with +1 bias.
	/// 0 means unaligned (different from align(1)).			/// 0 means unaligned (different from align(1)).
	def Alignment : IntAttr<"align", [ParamAttr, RetAttr]>;			def Alignment : IntAttr<"align", [ParamAttr, RetAttr]>;

	/// Parameter of a function that tells us the alignment of an allocation, as in			/// Parameter of a function that tells us the alignment of an allocation, as in
	/// aligned_alloc and aligned ::operator::new.			/// aligned_alloc and aligned ::operator::new.
	▲ Show 20 Lines • Show All 264 Lines • ▼ Show 20 Lines
	def ApproxFuncFPMath : StrBoolAttr<"approx-func-fp-math">;			def ApproxFuncFPMath : StrBoolAttr<"approx-func-fp-math">;
	def NoSignedZerosFPMath : StrBoolAttr<"no-signed-zeros-fp-math">;			def NoSignedZerosFPMath : StrBoolAttr<"no-signed-zeros-fp-math">;
	def UnsafeFPMath : StrBoolAttr<"unsafe-fp-math">;			def UnsafeFPMath : StrBoolAttr<"unsafe-fp-math">;
	def NoJumpTables : StrBoolAttr<"no-jump-tables">;			def NoJumpTables : StrBoolAttr<"no-jump-tables">;
	def NoInlineLineTables : StrBoolAttr<"no-inline-line-tables">;			def NoInlineLineTables : StrBoolAttr<"no-inline-line-tables">;
	def ProfileSampleAccurate : StrBoolAttr<"profile-sample-accurate">;			def ProfileSampleAccurate : StrBoolAttr<"profile-sample-accurate">;
	def UseSampleProfile : StrBoolAttr<"use-sample-profile">;			def UseSampleProfile : StrBoolAttr<"use-sample-profile">;

				def DenormalFPMath : ComplexStrAttr<"denormal-fp-math", [FnAttr]>;
				def DenormalFPMathF32 : ComplexStrAttr<"denormal-fp-math-f32", [FnAttr]>;

	class CompatRule<string F> {			class CompatRule<string F> {
	// The name of the function called to check the attribute of the caller and			// The name of the function called to check the attribute of the caller and
	// callee and decide whether inlining should be allowed. The function's			// callee and decide whether inlining should be allowed. The function's
	// signature must match "bool(const Function&, const Function &)", where the			// signature must match "bool(const Function&, const Function &)", where the
	// first parameter is the reference to the caller and the second parameter is			// first parameter is the reference to the caller and the second parameter is
	// the reference to the callee. It must return false if the attributes of the			// the reference to the callee. It must return false if the attributes of the
	// caller and callee are incompatible, and true otherwise.			// caller and callee are incompatible, and true otherwise.
	string CompatFunc = F;			string CompatFunc = F;
	}			}

	def : CompatRule<"isEqual<SanitizeAddressAttr>">;			def : CompatRule<"isEqual<SanitizeAddressAttr>">;
	def : CompatRule<"isEqual<SanitizeThreadAttr>">;			def : CompatRule<"isEqual<SanitizeThreadAttr>">;
	def : CompatRule<"isEqual<SanitizeMemoryAttr>">;			def : CompatRule<"isEqual<SanitizeMemoryAttr>">;
	def : CompatRule<"isEqual<SanitizeHWAddressAttr>">;			def : CompatRule<"isEqual<SanitizeHWAddressAttr>">;
	def : CompatRule<"isEqual<SanitizeMemTagAttr>">;			def : CompatRule<"isEqual<SanitizeMemTagAttr>">;
	def : CompatRule<"isEqual<SafeStackAttr>">;			def : CompatRule<"isEqual<SafeStackAttr>">;
	def : CompatRule<"isEqual<ShadowCallStackAttr>">;			def : CompatRule<"isEqual<ShadowCallStackAttr>">;
	def : CompatRule<"isEqual<UseSampleProfileAttr>">;			def : CompatRule<"isEqual<UseSampleProfileAttr>">;
	def : CompatRule<"isEqual<NoProfileAttr>">;			def : CompatRule<"isEqual<NoProfileAttr>">;
				def : CompatRule<"checkDenormMode">;


	class MergeRule<string F> {			class MergeRule<string F> {
	// The name of the function called to merge the attributes of the caller and			// The name of the function called to merge the attributes of the caller and
	// callee. The function's signature must match			// callee. The function's signature must match
	// "void(Function&, const Function &)", where the first parameter is the			// "void(Function&, const Function &)", where the first parameter is the
	// reference to the caller and the second parameter is the reference to the			// reference to the caller and the second parameter is the reference to the
	// callee.			// callee.
	string MergeFunc = F;			string MergeFunc = F;
	Show All 18 Lines

llvm/include/llvm/IR/Function.h

Show First 20 Lines • Show All 648 Lines • ▼ Show 20 Lines	public:
bool hasOptSize() const {		bool hasOptSize() const {
return hasFnAttribute(Attribute::OptimizeForSize) || hasMinSize();		return hasFnAttribute(Attribute::OptimizeForSize) || hasMinSize();
}		}

/// Returns the denormal handling type for the default rounding mode of the		/// Returns the denormal handling type for the default rounding mode of the
/// function.		/// function.
DenormalMode getDenormalMode(const fltSemantics &FPType) const;		DenormalMode getDenormalMode(const fltSemantics &FPType) const;

		/// Return the representational value of "denormal-fp-math". Code interested
		/// in the semantics of the function should use getDenormalMode instead.
		DenormalMode getDenormalModeRaw() const;

		/// Return the representational value of "denormal-fp-math-f32". Code
		/// interested in the semantics of the function should use getDenormalMode
		/// instead.
		DenormalMode getDenormalModeF32Raw() const;

/// copyAttributesFrom - copy all additional attributes (those not needed to		/// copyAttributesFrom - copy all additional attributes (those not needed to
/// create a Function) from the Function Src to this one.		/// create a Function) from the Function Src to this one.
void copyAttributesFrom(const Function *Src);		void copyAttributesFrom(const Function *Src);

/// deleteBody - This method deletes the body of the function, and converts		/// deleteBody - This method deletes the body of the function, and converts
/// the linkage to external.		/// the linkage to external.
///		///
void deleteBody() {		void deleteBody() {
▲ Show 20 Lines • Show All 286 Lines • Show Last 20 Lines

llvm/lib/Analysis/ConstantFolding.cpp

Show First 20 Lines • Show All 1,320 Lines • ▼ Show 20 Lines	if (auto *CE0 = dyn_cast<ConstantExpr>(Ops0)) {
// operands and try again.		// operands and try again.
Predicate = ICmpInst::getSwappedPredicate(Predicate);		Predicate = ICmpInst::getSwappedPredicate(Predicate);
return ConstantFoldCompareInstOperands(Predicate, Ops1, Ops0, DL, TLI);		return ConstantFoldCompareInstOperands(Predicate, Ops1, Ops0, DL, TLI);
}		}

// Flush any denormal constant float input according to denormal handling		// Flush any denormal constant float input according to denormal handling
// mode.		// mode.
Ops0 = FlushFPConstant(Ops0, I, /* IsOutput */ false);		Ops0 = FlushFPConstant(Ops0, I, /* IsOutput */ false);
		if (!Ops0)
		return nullptr;
Ops1 = FlushFPConstant(Ops1, I, /* IsOutput */ false);		Ops1 = FlushFPConstant(Ops1, I, /* IsOutput */ false);
		if (!Ops1)
		return nullptr;

return ConstantExpr::getCompare(Predicate, Ops0, Ops1);		return ConstantExpr::getCompare(Predicate, Ops0, Ops1);
}		}

Constant *llvm::ConstantFoldUnaryOpOperand(unsigned Opcode, Constant *Op,		Constant *llvm::ConstantFoldUnaryOpOperand(unsigned Opcode, Constant *Op,
const DataLayout &DL) {		const DataLayout &DL) {
assert(Instruction::isUnaryOp(Opcode));		assert(Instruction::isUnaryOp(Opcode));

Show All 18 Lines	Constant *llvm::FlushFPConstant(Constant *Operand, const Instruction *I,
if (!I || !I->getParent() || !I->getFunction())		if (!I || !I->getParent() || !I->getFunction())
return Operand;		return Operand;

ConstantFP *CFP = dyn_cast<ConstantFP>(Operand);		ConstantFP *CFP = dyn_cast<ConstantFP>(Operand);
if (!CFP)		if (!CFP)
return Operand;		return Operand;

const APFloat &APF = CFP->getValueAPF();		const APFloat &APF = CFP->getValueAPF();
		// TODO: Should this canonicalize nans?
		if (!APF.isDenormal())
		return Operand;

Type *Ty = CFP->getType();		Type *Ty = CFP->getType();
DenormalMode DenormMode =		DenormalMode DenormMode =
I->getFunction()->getDenormalMode(Ty->getFltSemantics());		I->getFunction()->getDenormalMode(Ty->getFltSemantics());
DenormalMode::DenormalModeKind Mode =		DenormalMode::DenormalModeKind Mode =
IsOutput ? DenormMode.Output : DenormMode.Input;		IsOutput ? DenormMode.Output : DenormMode.Input;
switch (Mode) {		switch (Mode) {
default:		default:
llvm_unreachable("unknown denormal mode");		llvm_unreachable("unknown denormal mode");
return Operand;		case DenormalMode::Dynamic:
		return nullptr;
case DenormalMode::IEEE:		case DenormalMode::IEEE:
		jcranmer-intel: You should change the doxygen documentation to indicate that this method returns nullptr if the denormal mode is dynamic. Ditto for ConstantFoldFPInstOperands.
return Operand;		return Operand;
case DenormalMode::PreserveSign:		case DenormalMode::PreserveSign:
if (APF.isDenormal()) {		if (APF.isDenormal()) {
return ConstantFP::get(		return ConstantFP::get(
Ty->getContext(),		Ty->getContext(),
APFloat::getZero(Ty->getFltSemantics(), APF.isNegative()));		APFloat::getZero(Ty->getFltSemantics(), APF.isNegative()));
}		}
return Operand;		return Operand;
case DenormalMode::PositiveZero:		case DenormalMode::PositiveZero:
if (APF.isDenormal()) {		if (APF.isDenormal()) {
return ConstantFP::get(Ty->getContext(),		return ConstantFP::get(Ty->getContext(),
APFloat::getZero(Ty->getFltSemantics(), false));		APFloat::getZero(Ty->getFltSemantics(), false));
}		}
return Operand;		return Operand;
}		}
return Operand;		return Operand;
}		}

Constant *llvm::ConstantFoldFPInstOperands(unsigned Opcode, Constant *LHS,		Constant *llvm::ConstantFoldFPInstOperands(unsigned Opcode, Constant *LHS,
Constant *RHS, const DataLayout &DL,		Constant *RHS, const DataLayout &DL,
const Instruction *I) {		const Instruction *I) {
if (Instruction::isBinaryOp(Opcode)) {		if (Instruction::isBinaryOp(Opcode)) {
// Flush denormal inputs if needed.		// Flush denormal inputs if needed.
Constant *Op0 = FlushFPConstant(LHS, I, /* IsOutput */ false);		Constant *Op0 = FlushFPConstant(LHS, I, /* IsOutput */ false);
		if (!Op0)
		return nullptr;
Constant *Op1 = FlushFPConstant(RHS, I, /* IsOutput */ false);		Constant *Op1 = FlushFPConstant(RHS, I, /* IsOutput */ false);
		if (!Op1)
		return nullptr;

// Calculate constant result.		// Calculate constant result.
Constant *C = ConstantFoldBinaryOpOperands(Opcode, Op0, Op1, DL);		Constant *C = ConstantFoldBinaryOpOperands(Opcode, Op0, Op1, DL);
if (!C)		if (!C)
return nullptr;		return nullptr;

// Flush denormal output if needed.		// Flush denormal output if needed.
return FlushFPConstant(C, I, /* IsOutput */ true);		return FlushFPConstant(C, I, /* IsOutput */ true);
▲ Show 20 Lines • Show All 557 Lines • ▼ Show 20 Lines	static Constant *constantFoldCanonicalize(const Type *Ty, const CallBase *CI,
// Denorms and nans may have special encodings, but it should be OK to fold a		// Denorms and nans may have special encodings, but it should be OK to fold a
// totally average number.		// totally average number.
if (Src.isNormal() || Src.isInfinity())		if (Src.isNormal() || Src.isInfinity())
return ConstantFP::get(CI->getContext(), Src);		return ConstantFP::get(CI->getContext(), Src);

if (Src.isDenormal() && CI->getParent() && CI->getFunction()) {		if (Src.isDenormal() && CI->getParent() && CI->getFunction()) {
DenormalMode DenormMode =		DenormalMode DenormMode =
CI->getFunction()->getDenormalMode(Src.getSemantics());		CI->getFunction()->getDenormalMode(Src.getSemantics());

		// TODO: Should allow folding for pure IEEE.
if (DenormMode == DenormalMode::getIEEE())		if (DenormMode == DenormalMode::getIEEE())
return nullptr;		return nullptr;

		if (DenormMode == DenormalMode::getDynamic())
		return nullptr;

		// If we know if either input or output is flushed, we can fold.
		if ((DenormMode.Input == DenormalMode::Dynamic &&
		DenormMode.Output == DenormalMode::IEEE) ||
		(DenormMode.Input == DenormalMode::IEEE &&
		DenormMode.Output == DenormalMode::Dynamic))
		return nullptr;

bool IsPositive =		bool IsPositive =
(!Src.isNegative() || DenormMode.Input == DenormalMode::PositiveZero ||		(!Src.isNegative() || DenormMode.Input == DenormalMode::PositiveZero ||
(DenormMode.Output == DenormalMode::PositiveZero &&		(DenormMode.Output == DenormalMode::PositiveZero &&
DenormMode.Input == DenormalMode::IEEE));		DenormMode.Input == DenormalMode::IEEE));

return ConstantFP::get(CI->getContext(),		return ConstantFP::get(CI->getContext(),
APFloat::getZero(Src.getSemantics(), !IsPositive));		APFloat::getZero(Src.getSemantics(), !IsPositive));
}		}

return nullptr;		return nullptr;
}		}
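
The fold decision in the new `constantFoldCanonicalize` can be restated outside LLVM as a minimal sketch. `Kind`, `Mode`, `Fold`, and `foldDenormalCanonicalize` are hypothetical stand-ins invented for illustration, not LLVM types; the branch structure is meant to mirror the new-side code above for a source value that is already known to be denormal.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::DenormalMode; not the real LLVM types.
enum class Kind { IEEE, PreserveSign, PositiveZero, Dynamic };
struct Mode {
  Kind Input;
  Kind Output;
};

// Outcome of trying to fold canonicalize(denormal constant).
enum class Fold { None, PositiveZero, NegativeZero };

// Mirrors the branch order of the patched code for a denormal source value.
inline Fold foldDenormalCanonicalize(bool SrcIsNegative, Mode M) {
  // Pure IEEE is not folded (the patch keeps a TODO for this case).
  if (M.Input == Kind::IEEE && M.Output == Kind::IEEE)
    return Fold::None;
  // Fully dynamic: the runtime mode is unknown, so nothing can be folded.
  if (M.Input == Kind::Dynamic && M.Output == Kind::Dynamic)
    return Fold::None;
  // One side dynamic and the other IEEE: no side is known to flush.
  if ((M.Input == Kind::Dynamic && M.Output == Kind::IEEE) ||
      (M.Input == Kind::IEEE && M.Output == Kind::Dynamic))
    return Fold::None;

  // Some side is known to flush; decide the sign of the resulting zero.
  bool IsPositive = !SrcIsNegative || M.Input == Kind::PositiveZero ||
                    (M.Output == Kind::PositiveZero && M.Input == Kind::IEEE);
  return IsPositive ? Fold::PositiveZero : Fold::NegativeZero;
}

// Usage example: a negative denormal under a few mode combinations.
inline void canonicalizeFoldExample() {
  assert(foldDenormalCanonicalize(true, {Kind::PreserveSign, Kind::PreserveSign}) ==
         Fold::NegativeZero);
  assert(foldDenormalCanonicalize(true, {Kind::PositiveZero, Kind::PreserveSign}) ==
         Fold::PositiveZero);
  assert(foldDenormalCanonicalize(true, {Kind::Dynamic, Kind::Dynamic}) == Fold::None);
}
```

Note that a mode with one dynamic component and one flushing component still folds in this sketch, since the flushing side alone is enough to know the result is a zero.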

static Constant *ConstantFoldScalarCall1(StringRef Name,		static Constant *ConstantFoldScalarCall1(StringRef Name,

llvm/lib/CodeGen/CommandFlags.cpp

Show First 20 Lines • Show All 235 Lines • ▼ Show 20 Lines	#define CGBINDOPT(NAME) \

static cl::opt<bool> EnableNoTrappingFPMath(		static cl::opt<bool> EnableNoTrappingFPMath(
"enable-no-trapping-fp-math",		"enable-no-trapping-fp-math",
cl::desc("Enable setting the FP exceptions build "		cl::desc("Enable setting the FP exceptions build "
"attribute not to use exceptions"),		"attribute not to use exceptions"),
cl::init(false));		cl::init(false));
CGBINDOPT(EnableNoTrappingFPMath);		CGBINDOPT(EnableNoTrappingFPMath);

static const auto DenormFlagEnumOptions =		static const auto DenormFlagEnumOptions = cl::values(
cl::values(clEnumValN(DenormalMode::IEEE, "ieee",		clEnumValN(DenormalMode::IEEE, "ieee", "IEEE 754 denormal numbers"),
"IEEE 754 denormal numbers"),
clEnumValN(DenormalMode::PreserveSign, "preserve-sign",		clEnumValN(DenormalMode::PreserveSign, "preserve-sign",
"the sign of a flushed-to-zero number is preserved "		"the sign of a flushed-to-zero number is preserved "
"in the sign of 0"),		"in the sign of 0"),
clEnumValN(DenormalMode::PositiveZero, "positive-zero",		clEnumValN(DenormalMode::PositiveZero, "positive-zero",
"denormals are flushed to positive zero"));		"denormals are flushed to positive zero"),
		clEnumValN(DenormalMode::Dynamic, "dynamic",
		"denormals have unknown treatment"));

// FIXME: Doesn't have way to specify separate input and output modes.		// FIXME: Doesn't have way to specify separate input and output modes.
static cl::opt<DenormalMode::DenormalModeKind> DenormalFPMath(		static cl::opt<DenormalMode::DenormalModeKind> DenormalFPMath(
"denormal-fp-math",		"denormal-fp-math",
cl::desc("Select which denormal numbers the code is permitted to require"),		cl::desc("Select which denormal numbers the code is permitted to require"),
cl::init(DenormalMode::IEEE),		cl::init(DenormalMode::IEEE),
DenormFlagEnumOptions);		DenormFlagEnumOptions);
CGBINDOPT(DenormalFPMath);		CGBINDOPT(DenormalFPMath);

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp


	Show First 20 Lines • Show All 6,739 Lines • ▼ Show 20 Lines
	}			}

	SDValue TargetLowering::getSqrtInputTest(SDValue Op, SelectionDAG &DAG,			SDValue TargetLowering::getSqrtInputTest(SDValue Op, SelectionDAG &DAG,
	const DenormalMode &Mode) const {			const DenormalMode &Mode) const {
	SDLoc DL(Op);			SDLoc DL(Op);
	EVT VT = Op.getValueType();			EVT VT = Op.getValueType();
	EVT CCVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);			EVT CCVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
	SDValue FPZero = DAG.getConstantFP(0.0, DL, VT);			SDValue FPZero = DAG.getConstantFP(0.0, DL, VT);
	// Testing it with denormal inputs to avoid wrong estimate.
	if (Mode.Input == DenormalMode::IEEE) {
	// This is specifically a check for the handling of denormal inputs,
	// not the result.

				// This is specifically a check for the handling of denormal inputs, not the
				// result.
				if (Mode.Input == DenormalMode::PreserveSign ||
				Mode.Input == DenormalMode::PositiveZero) {
				// Test = X == 0.0
				return DAG.getSetCC(DL, CCVT, Op, FPZero, ISD::SETEQ);
				}

				// Testing it with denormal inputs to avoid wrong estimate.
				//
	// Test = fabs(X) < SmallestNormal			// Test = fabs(X) < SmallestNormal
	const fltSemantics &FltSem = DAG.EVTToAPFloatSemantics(VT);			const fltSemantics &FltSem = DAG.EVTToAPFloatSemantics(VT);
	APFloat SmallestNorm = APFloat::getSmallestNormalized(FltSem);			APFloat SmallestNorm = APFloat::getSmallestNormalized(FltSem);
	SDValue NormC = DAG.getConstantFP(SmallestNorm, DL, VT);			SDValue NormC = DAG.getConstantFP(SmallestNorm, DL, VT);
	SDValue Fabs = DAG.getNode(ISD::FABS, DL, VT, Op);			SDValue Fabs = DAG.getNode(ISD::FABS, DL, VT, Op);
	return DAG.getSetCC(DL, CCVT, Fabs, NormC, ISD::SETLT);			return DAG.getSetCC(DL, CCVT, Fabs, NormC, ISD::SETLT);
	}			}
	// Test = X == 0.0
	return DAG.getSetCC(DL, CCVT, Op, FPZero, ISD::SETEQ);
	}
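
As a scalar illustration of the new-side `getSqrtInputTest` logic, here is a minimal sketch. `Kind` and `sqrtInputTest` are hypothetical names, and `float` stands in for the DAG value type. The point it shows is that only modes known to flush inputs take the cheap `X == 0.0` test, so "dynamic" falls through to the conservative denormal check together with "ieee".

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for DenormalMode::DenormalModeKind.
enum class Kind { IEEE, PreserveSign, PositiveZero, Dynamic };

// Scalar analogue of the condition built by the patched getSqrtInputTest.
inline bool sqrtInputTest(float X, Kind InputMode) {
  // Inputs are known to be flushed, so only an exact zero needs handling.
  if (InputMode == Kind::PreserveSign || InputMode == Kind::PositiveZero)
    return X == 0.0f;
  // IEEE or dynamic: denormal inputs may reach the estimate, so test
  // fabs(X) < SmallestNormal.
  return std::fabs(X) < std::numeric_limits<float>::min();
}

// Usage example: a denormal input is caught in both non-flushing modes.
inline void sqrtInputTestExample() {
  const float Denorm = std::numeric_limits<float>::denorm_min();
  assert(sqrtInputTest(Denorm, Kind::IEEE));
  assert(sqrtInputTest(Denorm, Kind::Dynamic));
  assert(!sqrtInputTest(1.0f, Kind::Dynamic));
}
```

Before the patch the denormal check was taken only for `Mode.Input == DenormalMode::IEEE`, which would have sent a dynamic input mode down the `X == 0.0` path.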

	SDValue TargetLowering::getNegatedExpression(SDValue Op, SelectionDAG &DAG,			SDValue TargetLowering::getNegatedExpression(SDValue Op, SelectionDAG &DAG,
	bool LegalOps, bool OptForSize,			bool LegalOps, bool OptForSize,
	NegatibleCost &Cost,			NegatibleCost &Cost,
	unsigned Depth) const {			unsigned Depth) const {
	// fneg is removable even if it has multiple uses.			// fneg is removable even if it has multiple uses.
	if (Op.getOpcode() == ISD::FNEG) {			if (Op.getOpcode() == ISD::FNEG) {
	Cost = NegatibleCost::Cheaper;			Cost = NegatibleCost::Cheaper;

llvm/lib/IR/Attributes.cpp

	Show First 20 Lines • Show All 1,992 Lines • ▼ Show 20 Lines
	AttributeMask AttributeFuncs::getUBImplyingAttributes() {			AttributeMask AttributeFuncs::getUBImplyingAttributes() {
	AttributeMask AM;			AttributeMask AM;
	AM.addAttribute(Attribute::NoUndef);			AM.addAttribute(Attribute::NoUndef);
	AM.addAttribute(Attribute::Dereferenceable);			AM.addAttribute(Attribute::Dereferenceable);
	AM.addAttribute(Attribute::DereferenceableOrNull);			AM.addAttribute(Attribute::DereferenceableOrNull);
	return AM;			return AM;
	}			}

				/// Callees with dynamic denormal modes are compatible with any caller mode.
				static bool denormModeCompatible(DenormalMode CallerMode,
				DenormalMode CalleeMode) {
				if (CallerMode == CalleeMode || CalleeMode == DenormalMode::getDynamic())
				return true;

				// If they don't exactly match, it's OK if the mismatched component is
				// dynamic.
				if (CalleeMode.Input == CallerMode.Input &&
				CalleeMode.Output == DenormalMode::Dynamic)
				return true;

				if (CalleeMode.Output == CallerMode.Output &&
				CalleeMode.Input == DenormalMode::Dynamic)
				return true;
				return false;
				}

				static bool checkDenormMode(const Function &Caller, const Function &Callee) {
				DenormalMode CallerMode = Caller.getDenormalModeRaw();
				DenormalMode CalleeMode = Callee.getDenormalModeRaw();

				if (denormModeCompatible(CallerMode, CalleeMode)) {
				DenormalMode CallerModeF32 = Caller.getDenormalModeF32Raw();
				DenormalMode CalleeModeF32 = Callee.getDenormalModeF32Raw();
				return denormModeCompatible(CallerModeF32, CalleeModeF32);
				}

				return false;
				}
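
The compatibility rule above can be exercised standalone with a minimal sketch. `Kind` and `Mode` are hypothetical stand-ins for `DenormalMode`, and the example assumes the inline test's function names follow an `<output>_<input>` order, which this part of the diff does not show. The real `checkDenormMode` applies the same predicate a second time to the f32-specific modes.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::DenormalMode; not the real LLVM types.
enum class Kind { IEEE, PreserveSign, PositiveZero, Dynamic };
struct Mode {
  Kind Output;
  Kind Input;
  bool operator==(const Mode &O) const {
    return Output == O.Output && Input == O.Input;
  }
};

// A callee may be inlined if its mode matches the caller's, or if every
// mismatched component of the callee is dynamic.
inline bool denormModeCompatible(Mode Caller, Mode Callee) {
  const Mode AllDynamic{Kind::Dynamic, Kind::Dynamic};
  if (Caller == Callee || Callee == AllDynamic)
    return true;
  if (Callee.Input == Caller.Input && Callee.Output == Kind::Dynamic)
    return true;
  if (Callee.Output == Caller.Output && Callee.Input == Kind::Dynamic)
    return true;
  return false;
}

// Usage example, modeled on cases from inline-denormal-fp-math.ll.
inline void denormModeCompatibleExample() {
  const Mode IeeeIeee{Kind::IEEE, Kind::IEEE};
  const Mode PszPsz{Kind::PreserveSign, Kind::PreserveSign};
  // Fully dynamic callee inlines into any caller.
  assert(denormModeCompatible(PszPsz, {Kind::Dynamic, Kind::Dynamic}));
  // Dynamic output with matching IEEE input inlines into an IEEE caller.
  assert(denormModeCompatible(IeeeIeee, {Kind::Dynamic, Kind::IEEE}));
  // A fixed callee component that differs from the caller blocks inlining.
  assert(!denormModeCompatible(IeeeIeee, {Kind::Dynamic, Kind::PreserveSign}));
  assert(!denormModeCompatible(PszPsz, IeeeIeee));
}
```

The check is deliberately asymmetric: a dynamic caller does not make a fixed-mode callee compatible, because the callee's code may already have been optimized under its fixed mode.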

	template<typename AttrClass>			template<typename AttrClass>
	static bool isEqual(const Function &Caller, const Function &Callee) {			static bool isEqual(const Function &Caller, const Function &Callee) {
	return Caller.getFnAttribute(AttrClass::getKind()) ==			return Caller.getFnAttribute(AttrClass::getKind()) ==
	Callee.getFnAttribute(AttrClass::getKind());			Callee.getFnAttribute(AttrClass::getKind());
	}			}

	/// Compute the logical AND of the attributes of the caller and the			/// Compute the logical AND of the attributes of the caller and the
	/// callee.			/// callee.

llvm/lib/IR/Function.cpp

	Show First 20 Lines • Show All 696 Lines • ▼ Show 20 Lines
	void Function::addDereferenceableOrNullParamAttr(unsigned ArgNo,			void Function::addDereferenceableOrNullParamAttr(unsigned ArgNo,
	uint64_t Bytes) {			uint64_t Bytes) {
	AttributeSets = AttributeSets.addDereferenceableOrNullParamAttr(getContext(),			AttributeSets = AttributeSets.addDereferenceableOrNullParamAttr(getContext(),
	ArgNo, Bytes);			ArgNo, Bytes);
	}			}

	DenormalMode Function::getDenormalMode(const fltSemantics &FPType) const {			DenormalMode Function::getDenormalMode(const fltSemantics &FPType) const {
	if (&FPType == &APFloat::IEEEsingle()) {			if (&FPType == &APFloat::IEEEsingle()) {
	Attribute Attr = getFnAttribute("denormal-fp-math-f32");			DenormalMode Mode = getDenormalModeF32Raw();
	StringRef Val = Attr.getValueAsString();
	if (!Val.empty())
	return parseDenormalFPAttribute(Val);

	// If the f32 variant of the attribute isn't specified, try to use the			// If the f32 variant of the attribute isn't specified, try to use the
	// generic one.			// generic one.
				if (Mode.isValid())
				return Mode;
	}			}

				return getDenormalModeRaw();
				}

				DenormalMode Function::getDenormalModeRaw() const {
	Attribute Attr = getFnAttribute("denormal-fp-math");			Attribute Attr = getFnAttribute("denormal-fp-math");
	return parseDenormalFPAttribute(Attr.getValueAsString());			StringRef Val = Attr.getValueAsString();
				return parseDenormalFPAttribute(Val);
				}

				DenormalMode Function::getDenormalModeF32Raw() const {
				Attribute Attr = getFnAttribute("denormal-fp-math-f32");
				if (Attr.isValid()) {
				StringRef Val = Attr.getValueAsString();
				return parseDenormalFPAttribute(Val);
				}

				return DenormalMode::getInvalid();
	}			}
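
The lookup order implemented by `getDenormalMode`, `getDenormalModeRaw`, and `getDenormalModeF32Raw` can be sketched as follows. `AttrMap` and `effectiveDenormalAttr` are hypothetical names, and a plain string map stands in for the function's attribute set. The sketch returns the raw attribute string and leaves parsing aside.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a function's string attributes.
using AttrMap = std::map<std::string, std::string>;

// For f32, prefer "denormal-fp-math-f32" when present; otherwise, and for
// every other type, use the generic "denormal-fp-math" attribute.
// Returns an empty string when no applicable attribute is set.
inline std::string effectiveDenormalAttr(const AttrMap &Attrs, bool IsF32) {
  if (IsF32) {
    auto F32It = Attrs.find("denormal-fp-math-f32");
    if (F32It != Attrs.end())
      return F32It->second;
  }
  auto It = Attrs.find("denormal-fp-math");
  return It == Attrs.end() ? std::string() : It->second;
}

// Usage example with the attribute values checked in
// denormal-fp-math-cl-opt.ll.
inline void effectiveDenormalAttrExample() {
  AttrMap Attrs;
  Attrs["denormal-fp-math"] = "dynamic,dynamic";
  Attrs["denormal-fp-math-f32"] = "preserve-sign,preserve-sign";
  assert(effectiveDenormalAttr(Attrs, true) == "preserve-sign,preserve-sign");
  assert(effectiveDenormalAttr(Attrs, false) == "dynamic,dynamic");

  AttrMap GenericOnly;
  GenericOnly["denormal-fp-math"] = "ieee,ieee";
  // Without the f32 variant, f32 falls back to the generic attribute.
  assert(effectiveDenormalAttr(GenericOnly, true) == "ieee,ieee");
}
```

Keeping the two raw accessors separate is what lets `checkDenormMode` compare the generic and f32 modes independently instead of comparing already-merged results.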

	const std::string &Function::getGC() const {			const std::string &Function::getGC() const {
	assert(hasGC() && "Function has no collector");			assert(hasGC() && "Function has no collector");
	return getContext().getGC(*this);			return getContext().getGC(*this);
	}			}

	void Function::setGC(std::string Str) {			void Function::setGC(std::string Str) {

llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.h

Show All 28 Lines	struct SIModeRegisterDefaults {
/// If this is set, neither input or output denormals are flushed for most f32		/// If this is set, neither input or output denormals are flushed for most f32
/// instructions.		/// instructions.
DenormalMode FP32Denormals;		DenormalMode FP32Denormals;

/// If this is set, neither input or output denormals are flushed for both f64		/// If this is set, neither input or output denormals are flushed for both f64
/// and f16/v2f16 instructions.		/// and f16/v2f16 instructions.
DenormalMode FP64FP16Denormals;		DenormalMode FP64FP16Denormals;

SIModeRegisterDefaults()		SIModeRegisterDefaults() :
: IEEE(true), DX10Clamp(true), FP32Denormals(DenormalMode::getIEEE()),		IEEE(true),
		DX10Clamp(true),
		FP32Denormals(DenormalMode::getIEEE()),
FP64FP16Denormals(DenormalMode::getIEEE()) {}		FP64FP16Denormals(DenormalMode::getIEEE()) {}

SIModeRegisterDefaults(const Function &F);		SIModeRegisterDefaults(const Function &F);

static SIModeRegisterDefaults getDefaultForCallingConv(CallingConv::ID CC) {		static SIModeRegisterDefaults getDefaultForCallingConv(CallingConv::ID CC) {
SIModeRegisterDefaults Mode;		SIModeRegisterDefaults Mode;
Mode.IEEE = !AMDGPU::isShader(CC);		Mode.IEEE = !AMDGPU::isShader(CC);
return Mode;		return Mode;
}		}
Show All 31 Lines	if (FP64FP16Denormals == DenormalMode::getPreserveSign())
return FP_DENORM_FLUSH_IN_FLUSH_OUT;		return FP_DENORM_FLUSH_IN_FLUSH_OUT;
if (FP64FP16Denormals.Output == DenormalMode::PreserveSign)		if (FP64FP16Denormals.Output == DenormalMode::PreserveSign)
return FP_DENORM_FLUSH_OUT;		return FP_DENORM_FLUSH_OUT;
if (FP64FP16Denormals.Input == DenormalMode::PreserveSign)		if (FP64FP16Denormals.Input == DenormalMode::PreserveSign)
return FP_DENORM_FLUSH_IN;		return FP_DENORM_FLUSH_IN;
return FP_DENORM_FLUSH_NONE;		return FP_DENORM_FLUSH_NONE;
}		}

/// Returns true if a flag is compatible if it's enabled in the callee, but
/// disabled in the caller.
static bool oneWayCompatible(bool CallerMode, bool CalleeMode) {
return CallerMode == CalleeMode || (!CallerMode && CalleeMode);
}

// FIXME: Inlining should be OK for dx10-clamp, since the caller's mode should		// FIXME: Inlining should be OK for dx10-clamp, since the caller's mode should
// be able to override.		// be able to override.
bool isInlineCompatible(SIModeRegisterDefaults CalleeMode) const {		bool isInlineCompatible(SIModeRegisterDefaults CalleeMode) const {
if (DX10Clamp != CalleeMode.DX10Clamp)		return DX10Clamp == CalleeMode.DX10Clamp && IEEE == CalleeMode.IEEE;
return false;
if (IEEE != CalleeMode.IEEE)
return false;

// Allow inlining denormals enabled into denormals flushed functions.
return oneWayCompatible(FP64FP16Denormals.Input !=
DenormalMode::PreserveSign,
CalleeMode.FP64FP16Denormals.Input !=
DenormalMode::PreserveSign) &&
oneWayCompatible(FP64FP16Denormals.Output !=
DenormalMode::PreserveSign,
CalleeMode.FP64FP16Denormals.Output !=
DenormalMode::PreserveSign) &&
oneWayCompatible(FP32Denormals.Input != DenormalMode::PreserveSign,
CalleeMode.FP32Denormals.Input !=
DenormalMode::PreserveSign) &&
oneWayCompatible(FP32Denormals.Output != DenormalMode::PreserveSign,
CalleeMode.FP32Denormals.Output !=
DenormalMode::PreserveSign);
}		}
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_LIB_TARGET_AMDGPU_SIMODEREGISTERDEFAULTS_H		#endif // LLVM_LIB_TARGET_AMDGPU_SIMODEREGISTERDEFAULTS_H

llvm/test/CodeGen/Generic/denormal-fp-math-cl-opt.ll

This file was added.

				; RUN: llc -denormal-fp-math=dynamic --denormal-fp-math-f32=preserve-sign -stop-after=finalize-isel < %s | FileCheck %s

				; Check that the command line flag annotates the IR with the
				; appropriate attributes.
				tra: Edit: `Check that the command line flag annotates the IR with the appropriate attributes.`
				tra: ^^ The comment still needs to be edited.

				; CHECK: attributes #0 = { "denormal-fp-math"="dynamic,dynamic" "denormal-fp-math-f32"="preserve-sign,preserve-sign" }
				define float @foo(float %var) {
				ret float %var
				}

llvm/test/CodeGen/X86/sqrt-fastmath.ll

Show First 20 Lines • Show All 177 Lines • ▼ Show 20 Lines
; AVX-LABEL: sqrt_v4f32_check_denorms:		; AVX-LABEL: sqrt_v4f32_check_denorms:
; AVX: # %bb.0:		; AVX: # %bb.0:
; AVX-NEXT: vsqrtps %xmm0, %xmm0		; AVX-NEXT: vsqrtps %xmm0, %xmm0
; AVX-NEXT: retq		; AVX-NEXT: retq
%call = tail call <4 x float> @llvm.sqrt.v4f32(<4 x float> %x) #2		%call = tail call <4 x float> @llvm.sqrt.v4f32(<4 x float> %x) #2
ret <4 x float> %call		ret <4 x float> %call
}		}

define <4 x float> @sqrt_v4f32_check_denorms_ninf(<4 x float> %x) #3 {		define <4 x float> @sqrt_v4f32_check_denorms_ieee_ninf(<4 x float> %x) #3 {
; SSE-LABEL: sqrt_v4f32_check_denorms_ninf:		; SSE-LABEL: sqrt_v4f32_check_denorms_ieee_ninf:
; SSE: # %bb.0:		; SSE: # %bb.0:
; SSE-NEXT: rsqrtps %xmm0, %xmm1		; SSE-NEXT: rsqrtps %xmm0, %xmm1
; SSE-NEXT: movaps %xmm0, %xmm2		; SSE-NEXT: movaps %xmm0, %xmm2
; SSE-NEXT: mulps %xmm1, %xmm2		; SSE-NEXT: mulps %xmm1, %xmm2
; SSE-NEXT: movaps {{.*#+}} xmm3 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]		; SSE-NEXT: movaps {{.*#+}} xmm3 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
; SSE-NEXT: mulps %xmm2, %xmm3		; SSE-NEXT: mulps %xmm2, %xmm3
; SSE-NEXT: mulps %xmm1, %xmm2		; SSE-NEXT: mulps %xmm1, %xmm2
; SSE-NEXT: addps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2		; SSE-NEXT: addps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2
; SSE-NEXT: mulps %xmm3, %xmm2		; SSE-NEXT: mulps %xmm3, %xmm2
; SSE-NEXT: andps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0		; SSE-NEXT: andps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
; SSE-NEXT: movaps {{.*#+}} xmm1 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]		; SSE-NEXT: movaps {{.*#+}} xmm1 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]
; SSE-NEXT: cmpleps %xmm0, %xmm1		; SSE-NEXT: cmpleps %xmm0, %xmm1
; SSE-NEXT: andps %xmm2, %xmm1		; SSE-NEXT: andps %xmm2, %xmm1
; SSE-NEXT: movaps %xmm1, %xmm0		; SSE-NEXT: movaps %xmm1, %xmm0
; SSE-NEXT: retq		; SSE-NEXT: retq
;		;
; AVX1-LABEL: sqrt_v4f32_check_denorms_ninf:		; AVX1-LABEL: sqrt_v4f32_check_denorms_ieee_ninf:
; AVX1: # %bb.0:		; AVX1: # %bb.0:
; AVX1-NEXT: vrsqrtps %xmm0, %xmm1		; AVX1-NEXT: vrsqrtps %xmm0, %xmm1
; AVX1-NEXT: vmulps %xmm1, %xmm0, %xmm2		; AVX1-NEXT: vmulps %xmm1, %xmm0, %xmm2
; AVX1-NEXT: vmulps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2, %xmm3		; AVX1-NEXT: vmulps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2, %xmm3
; AVX1-NEXT: vmulps %xmm1, %xmm2, %xmm1		; AVX1-NEXT: vmulps %xmm1, %xmm2, %xmm1
; AVX1-NEXT: vaddps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1		; AVX1-NEXT: vaddps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1
; AVX1-NEXT: vmulps %xmm1, %xmm3, %xmm1		; AVX1-NEXT: vmulps %xmm1, %xmm3, %xmm1
; AVX1-NEXT: vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0		; AVX1-NEXT: vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
; AVX1-NEXT: vmovaps {{.*#+}} xmm2 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]		; AVX1-NEXT: vmovaps {{.*#+}} xmm2 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]
; AVX1-NEXT: vcmpleps %xmm0, %xmm2, %xmm0		; AVX1-NEXT: vcmpleps %xmm0, %xmm2, %xmm0
; AVX1-NEXT: vandps %xmm1, %xmm0, %xmm0		; AVX1-NEXT: vandps %xmm1, %xmm0, %xmm0
; AVX1-NEXT: retq		; AVX1-NEXT: retq
;		;
; AVX512-LABEL: sqrt_v4f32_check_denorms_ninf:		; AVX512-LABEL: sqrt_v4f32_check_denorms_ieee_ninf:
		; AVX512: # %bb.0:
		; AVX512-NEXT: vrsqrtps %xmm0, %xmm1
		; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm2
		; AVX512-NEXT: vbroadcastss {{.*#+}} xmm3 = [-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0]
		; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3
		; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
		; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1
		; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1
		; AVX512-NEXT: vbroadcastss {{.*#+}} xmm2 = [NaN,NaN,NaN,NaN]
		; AVX512-NEXT: vandps %xmm2, %xmm0, %xmm0
		; AVX512-NEXT: vbroadcastss {{.*#+}} xmm2 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]
		; AVX512-NEXT: vcmpleps %xmm0, %xmm2, %xmm0
		; AVX512-NEXT: vandps %xmm1, %xmm0, %xmm0
		; AVX512-NEXT: retq
		%call = tail call ninf afn <4 x float> @llvm.sqrt.v4f32(<4 x float> %x) #2
		ret <4 x float> %call
		}

		define <4 x float> @sqrt_v4f32_check_denorms_dynamic_ninf(<4 x float> %x) #6 {
		; SSE-LABEL: sqrt_v4f32_check_denorms_dynamic_ninf:
		; SSE: # %bb.0:
		; SSE-NEXT: rsqrtps %xmm0, %xmm1
		; SSE-NEXT: movaps %xmm0, %xmm2
		; SSE-NEXT: mulps %xmm1, %xmm2
		; SSE-NEXT: movaps {{.*#+}} xmm3 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
		; SSE-NEXT: mulps %xmm2, %xmm3
		; SSE-NEXT: mulps %xmm1, %xmm2
		; SSE-NEXT: addps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2
		; SSE-NEXT: mulps %xmm3, %xmm2
		; SSE-NEXT: andps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
		; SSE-NEXT: movaps {{.*#+}} xmm1 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]
		; SSE-NEXT: cmpleps %xmm0, %xmm1
		; SSE-NEXT: andps %xmm2, %xmm1
		; SSE-NEXT: movaps %xmm1, %xmm0
		; SSE-NEXT: retq
		;
		; AVX1-LABEL: sqrt_v4f32_check_denorms_dynamic_ninf:
		; AVX1: # %bb.0:
		; AVX1-NEXT: vrsqrtps %xmm0, %xmm1
		; AVX1-NEXT: vmulps %xmm1, %xmm0, %xmm2
		; AVX1-NEXT: vmulps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm2, %xmm3
		; AVX1-NEXT: vmulps %xmm1, %xmm2, %xmm1
		; AVX1-NEXT: vaddps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1
		; AVX1-NEXT: vmulps %xmm1, %xmm3, %xmm1
		; AVX1-NEXT: vandps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
		; AVX1-NEXT: vmovaps {{.*#+}} xmm2 = [1.17549435E-38,1.17549435E-38,1.17549435E-38,1.17549435E-38]
		; AVX1-NEXT: vcmpleps %xmm0, %xmm2, %xmm0
		; AVX1-NEXT: vandps %xmm1, %xmm0, %xmm0
		; AVX1-NEXT: retq
		;
		; AVX512-LABEL: sqrt_v4f32_check_denorms_dynamic_ninf:
; AVX512: # %bb.0:		; AVX512: # %bb.0:
; AVX512-NEXT: vrsqrtps %xmm0, %xmm1		; AVX512-NEXT: vrsqrtps %xmm0, %xmm1
; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm2		; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm2
; AVX512-NEXT: vbroadcastss {{.*#+}} xmm3 = [-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0]		; AVX512-NEXT: vbroadcastss {{.*#+}} xmm3 = [-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0]
; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3		; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3
; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]		; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1		; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1
; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1		; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1
▲ Show 20 Lines • Show All 739 Lines • ▼ Show 20 Lines	; AVX-NEXT: retq
%rsqrt = fdiv fast double 42.0, %sqrt		%rsqrt = fdiv fast double 42.0, %sqrt
store double %rsqrt, ptr %p, align 8		store double %rsqrt, ptr %p, align 8
ret double %sqrt_fast		ret double %sqrt_fast
}		}

attributes #0 = { "unsafe-fp-math"="true" "reciprocal-estimates"="!sqrtf,!vec-sqrtf,!divf,!vec-divf" }		attributes #0 = { "unsafe-fp-math"="true" "reciprocal-estimates"="!sqrtf,!vec-sqrtf,!divf,!vec-divf" }
attributes #1 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" }		attributes #1 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" }
attributes #2 = { nounwind readnone }		attributes #2 = { nounwind readnone }
attributes #3 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" "denormal-fp-math"="ieee" }		attributes #3 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" "denormal-fp-math"="preserve-sign,ieee" }
attributes #4 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" "denormal-fp-math"="ieee,preserve-sign" }		attributes #4 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" "denormal-fp-math"="ieee,preserve-sign" }
attributes #5 = { "unsafe-fp-math"="true" "reciprocal-estimates"="all:0" }		attributes #5 = { "unsafe-fp-math"="true" "reciprocal-estimates"="all:0" }
		attributes #6 = { "unsafe-fp-math"="true" "reciprocal-estimates"="sqrt,vec-sqrt" "denormal-fp-math"="preserve-sign,dynamic" }

llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll

	Show All 31 Lines

	define i32 @func_ieee_psz() #3 {			define i32 @func_ieee_psz() #3 {
	; CHECK-LABEL: @func_ieee_psz(			; CHECK-LABEL: @func_ieee_psz(
	; CHECK-NEXT: ret i32 4			; CHECK-NEXT: ret i32 4
	;			;
	ret i32 4			ret i32 4
	}			}

				define i32 @func_dynamic_dynamic() #4 {
				; CHECK-LABEL: @func_dynamic_dynamic(
				; CHECK-NEXT: ret i32 5
				;
				ret i32 5
				}

				define i32 @func_dynamic_ieee() #5 {
				; CHECK-LABEL: @func_dynamic_ieee(
				; CHECK-NEXT: ret i32 6
				;
				ret i32 6
				}

				define i32 @func_ieee_dynamic() #6 {
				; CHECK-LABEL: @func_ieee_dynamic(
				; CHECK-NEXT: ret i32 7
				;
				ret i32 7
				}

				define i32 @func_psz_dynamic() #7 {
				; CHECK-LABEL: @func_psz_dynamic(
				; CHECK-NEXT: ret i32 8
				;
				ret i32 8
				}

				define i32 @func_dynamic_psz() #8 {
				; CHECK-LABEL: @func_dynamic_psz(
				; CHECK-NEXT: ret i32 9
				;
				ret i32 9
				}

	define i32 @call_default_from_psz_psz() #1 {			define i32 @call_default_from_psz_psz() #1 {
	; CHECK-LABEL: @call_default_from_psz_psz(			; CHECK-LABEL: @call_default_from_psz_psz(
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_default()
				; CHECK-NEXT: ret i32 [[CALL]]
				kpn: Are we changing the behavior in a way that may cause regressions? It looks like we've changed behavior in the absence of "dynamic".
				arsenm (author): This case is broken to begin with, calling ieee from daz code. This makes the inlining more conservative / noticeable for debugging.
				kpn: Can I talk you into mentioning this in your commit message?
	;			;
	%call = call i32 @func_default()			%call = call i32 @func_default()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_ieee_ieee_from_ieee_ieee() #0 {			define i32 @call_ieee_ieee_from_ieee_ieee() #0 {
	; CHECK-LABEL: @call_ieee_ieee_from_ieee_ieee(			; CHECK-LABEL: @call_ieee_ieee_from_ieee_ieee(
	; CHECK-NEXT: ret i32 1			; CHECK-NEXT: ret i32 1
	;			;
	%call = call i32 @func_ieee_ieee()			%call = call i32 @func_ieee_ieee()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_ieee_ieee_from_psz_psz() #1 {			define i32 @call_ieee_ieee_from_psz_psz() #1 {
	; CHECK-LABEL: @call_ieee_ieee_from_psz_psz(			; CHECK-LABEL: @call_ieee_ieee_from_psz_psz(
	; CHECK-NEXT: ret i32 1			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_ieee_ieee()			%call = call i32 @func_ieee_ieee()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_psz_psz_from_ieee_ieee() #0 {			define i32 @call_psz_psz_from_ieee_ieee() #0 {
	; CHECK-LABEL: @call_psz_psz_from_ieee_ieee(			; CHECK-LABEL: @call_psz_psz_from_ieee_ieee(
	; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
	; CHECK-NEXT: ret i32 [[CALL]]			; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_psz_psz()			%call = call i32 @func_psz_psz()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_psz_psz_from_psz_psz() #1 {			define i32 @call_psz_psz_from_psz_psz() #1 {
	; CHECK-LABEL: @call_psz_psz_from_psz_psz(			; CHECK-LABEL: @call_psz_psz_from_psz_psz(
	; CHECK-NEXT: ret i32 2			; CHECK-NEXT: ret i32 2
	;			;
	%call = call i32 @func_psz_psz()			%call = call i32 @func_psz_psz()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_psz_ieee_from_psz_psz() #1 {			define i32 @call_psz_ieee_from_psz_psz() #1 {
	; CHECK-LABEL: @call_psz_ieee_from_psz_psz(			; CHECK-LABEL: @call_psz_ieee_from_psz_psz(
	; CHECK-NEXT: ret i32 3			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_psz_ieee()			%call = call i32 @func_psz_ieee()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_ieee_psz_from_psz_psz() #1 {			define i32 @call_ieee_psz_from_psz_psz() #1 {
	; CHECK-LABEL: @call_ieee_psz_from_psz_psz(			; CHECK-LABEL: @call_ieee_psz_from_psz_psz(
	; CHECK-NEXT: ret i32 4			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_ieee_psz()			%call = call i32 @func_ieee_psz()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_psz_ieee_from_ieee_ieee() #0 {			define i32 @call_psz_ieee_from_ieee_ieee() #0 {
	; CHECK-LABEL: @call_psz_ieee_from_ieee_ieee(			; CHECK-LABEL: @call_psz_ieee_from_ieee_ieee(
	; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
	Show All 9 Lines
	; CHECK-NEXT: ret i32 [[CALL]]			; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_ieee_psz()			%call = call i32 @func_ieee_psz()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_ieee_ieee_from_psz_ieee() #2 {			define i32 @call_ieee_ieee_from_psz_ieee() #2 {
	; CHECK-LABEL: @call_ieee_ieee_from_psz_ieee(			; CHECK-LABEL: @call_ieee_ieee_from_psz_ieee(
	; CHECK-NEXT: ret i32 1			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_ieee_ieee()			%call = call i32 @func_ieee_ieee()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_psz_ieee_from_psz_ieee() #2 {			define i32 @call_psz_ieee_from_psz_ieee() #2 {
	; CHECK-LABEL: @call_psz_ieee_from_psz_ieee(			; CHECK-LABEL: @call_psz_ieee_from_psz_ieee(
	; CHECK-NEXT: ret i32 3			; CHECK-NEXT: ret i32 3
	Show All 17 Lines
	; CHECK-NEXT: ret i32 [[CALL]]			; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_psz_psz()			%call = call i32 @func_psz_psz()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_ieee_ieee_from_ieee_psz() #3 {			define i32 @call_ieee_ieee_from_ieee_psz() #3 {
	; CHECK-LABEL: @call_ieee_ieee_from_ieee_psz(			; CHECK-LABEL: @call_ieee_ieee_from_ieee_psz(
	; CHECK-NEXT: ret i32 1			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_ieee_ieee()			%call = call i32 @func_ieee_ieee()
	ret i32 %call			ret i32 %call
	}			}

	define i32 @call_psz_ieee_from_ieee_psz() #3 {			define i32 @call_psz_ieee_from_ieee_psz() #3 {
	; CHECK-LABEL: @call_psz_ieee_from_ieee_psz(			; CHECK-LABEL: @call_psz_ieee_from_ieee_psz(
	; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
	Show All 15 Lines
	; CHECK-LABEL: @call_psz_psz_from_ieee_psz(			; CHECK-LABEL: @call_psz_psz_from_ieee_psz(
	; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()			; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
	; CHECK-NEXT: ret i32 [[CALL]]			; CHECK-NEXT: ret i32 [[CALL]]
	;			;
	%call = call i32 @func_psz_psz()			%call = call i32 @func_psz_psz()
	ret i32 %call			ret i32 %call
	}			}

				define i32 @call_dynamic_dynamic_from_ieee_ieee() #0 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_ieee_ieee(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_ieee_ieee() #0 {
				; CHECK-LABEL: @call_dynamic_ieee_from_ieee_ieee(
				; CHECK-NEXT: ret i32 6
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_ieee_ieee() #0 {
				; CHECK-LABEL: @call_ieee_dynamic_from_ieee_ieee(
				; CHECK-NEXT: ret i32 7
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_ieee_ieee() #0 {
				; CHECK-LABEL: @call_dynamic_psz_from_ieee_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_ieee_ieee() #0 {
				; CHECK-LABEL: @call_psz_dynamic_from_ieee_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_psz_psz() #1 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_psz_psz(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_psz_psz() #1 {
				; CHECK-LABEL: @call_ieee_dynamic_from_psz_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_psz_psz() #1 {
				; CHECK-LABEL: @call_dynamic_ieee_from_psz_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_psz_psz() #1 {
				; CHECK-LABEL: @call_psz_dynamic_from_psz_psz(
				; CHECK-NEXT: ret i32 8
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_psz_psz() #1 {
				; CHECK-LABEL: @call_dynamic_psz_from_psz_psz(
				; CHECK-NEXT: ret i32 9
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_psz_ieee() #2 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_psz_ieee(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_psz_ieee() #2 {
				; CHECK-LABEL: @call_dynamic_ieee_from_psz_ieee(
				; CHECK-NEXT: ret i32 6
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_psz_ieee() #2 {
				; CHECK-LABEL: @call_ieee_dynamic_from_psz_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_psz_ieee() #2 {
				; CHECK-LABEL: @call_dynamic_psz_from_psz_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_psz_ieee() #2 {
				; CHECK-LABEL: @call_psz_dynamic_from_psz_ieee(
				; CHECK-NEXT: ret i32 8
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_ieee_psz() #3 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_ieee_psz(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_ieee_psz() #3 {
				; CHECK-LABEL: @call_ieee_dynamic_from_ieee_psz(
				; CHECK-NEXT: ret i32 7
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_ieee_psz() #3 {
				; CHECK-LABEL: @call_dynamic_ieee_from_ieee_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_ieee_psz() #3 {
				; CHECK-LABEL: @call_dynamic_psz_from_ieee_psz(
				; CHECK-NEXT: ret i32 9
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_ieee_psz() #3 {
				; CHECK-LABEL: @call_psz_dynamic_from_ieee_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_dynamic_dynamic(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_ieee_ieee_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_ieee_ieee_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_ieee_dynamic_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_dynamic_ieee_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_psz_dynamic_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_dynamic_psz_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_psz_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_psz_psz_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_psz()
				ret i32 %call
				}

				define i32 @call_psz_ieee_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_psz_ieee_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_psz_from_dynamic_dynamic() #4 {
				; CHECK-LABEL: @call_ieee_psz_from_dynamic_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_psz()
				ret i32 %call
				}

				define i32 @call_ieee_ieee_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_ieee_ieee_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_ieee_dynamic_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_dynamic_ieee_from_dynamic_ieee(
				; CHECK-NEXT: ret i32 6
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_psz_dynamic_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_dynamic_psz_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_psz_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_psz_psz_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_psz()
				ret i32 %call
				}

				define i32 @call_psz_ieee_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_psz_ieee_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_psz_from_dynamic_ieee() #5 {
				; CHECK-LABEL: @call_ieee_psz_from_dynamic_ieee(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_psz()
				ret i32 %call
				}

				define i32 @call_ieee_ieee_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_ieee_ieee_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_ieee_dynamic_from_ieee_dynamic(
				; CHECK-NEXT: ret i32 7
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_dynamic_ieee_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_ieee_dynamic(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_dynamic_psz_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_psz_dynamic_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_psz_psz_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_psz_psz_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_psz()
				ret i32 %call
				}

				define i32 @call_psz_ieee_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_psz_ieee_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_psz_from_ieee_dynamic() #6 {
				; CHECK-LABEL: @call_ieee_psz_from_ieee_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_psz()
				ret i32 %call
				}

				define i32 @call_ieee_ieee_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_ieee_ieee_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_ieee_dynamic_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_dynamic_ieee_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_dynamic_psz_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_psz_dynamic_from_psz_dynamic(
				; CHECK-NEXT: ret i32 8
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_psz_dynamic(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_psz_ieee_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_psz_ieee_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_psz_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_ieee_psz_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_psz()
				ret i32 %call
				}

				define i32 @call_psz_psz_from_psz_dynamic() #7 {
				; CHECK-LABEL: @call_psz_psz_from_psz_dynamic(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_psz()
				ret i32 %call
				}

				define i32 @call_ieee_ieee_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_ieee_ieee_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_psz_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_ieee_psz_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_psz()
				ret i32 %call
				}

				define i32 @call_psz_ieee_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_psz_ieee_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_ieee()
				ret i32 %call
				}

				define i32 @call_dynamic_dynamic_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_dynamic_psz(
				; CHECK-NEXT: ret i32 5
				;
				%call = call i32 @func_dynamic_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_psz_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_dynamic_psz_from_dynamic_psz(
				; CHECK-NEXT: ret i32 9
				;
				%call = call i32 @func_dynamic_psz()
				ret i32 %call
				}

				define i32 @call_psz_dynamic_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_psz_dynamic_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_dynamic()
				ret i32 %call
				}

				define i32 @call_dynamic_ieee_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_dynamic_ieee_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_dynamic_ieee()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_dynamic_ieee()
				ret i32 %call
				}

				define i32 @call_ieee_dynamic_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_ieee_dynamic_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_ieee_dynamic()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_ieee_dynamic()
				ret i32 %call
				}

				define i32 @call_psz_psz_from_dynamic_psz() #8 {
				; CHECK-LABEL: @call_psz_psz_from_dynamic_psz(
				; CHECK-NEXT: [[CALL:%.*]] = call i32 @func_psz_psz()
				; CHECK-NEXT: ret i32 [[CALL]]
				;
				%call = call i32 @func_psz_psz()
				ret i32 %call
				}

				; --------------------------------------------------------------------
				; denormal-fp-math-f32
				; --------------------------------------------------------------------

				define i32 @func_dynamic_dynamic_f32() #9 {
				; CHECK-LABEL: @func_dynamic_dynamic_f32(
				; CHECK-NEXT: ret i32 10
				;
				ret i32 10
				}

				define i32 @func_psz_psz_f32() #10 {
				; CHECK-LABEL: @func_psz_psz_f32(
				; CHECK-NEXT: ret i32 11
				;
				ret i32 11
				}

				define i32 @call_dynamic_dynamic_from_psz_psz_f32() #10 {
				; CHECK-LABEL: @call_dynamic_dynamic_from_psz_psz_f32(
				; CHECK-NEXT: ret i32 10
				;
				%result = call i32 @func_dynamic_dynamic_f32()
				ret i32 %result
				}

				define i32 @call_psz_psz_from_psz_psz_f32() #10 {
				; CHECK-LABEL: @call_psz_psz_from_psz_psz_f32(
				; CHECK-NEXT: ret i32 10
				;
				%result = call i32 @func_dynamic_dynamic_f32()
				ret i32 %result
				}

				define i32 @call_psz_psz_from_ieee_ieee_f32() #11 {
				; CHECK-LABEL: @call_psz_psz_from_ieee_ieee_f32(
				; CHECK-NEXT: [[RESULT:%.*]] = call i32 @func_psz_psz_f32()
				; CHECK-NEXT: ret i32 [[RESULT]]
				;
				%result = call i32 @func_psz_psz_f32()
				ret i32 %result
				}

	attributes #0 = { "denormal-fp-math"="ieee,ieee" }			attributes #0 = { "denormal-fp-math"="ieee,ieee" }
	attributes #1 = { "denormal-fp-math"="preserve-sign,preserve-sign" }			attributes #1 = { "denormal-fp-math"="preserve-sign,preserve-sign" }
	attributes #2 = { "denormal-fp-math"="preserve-sign,ieee" }			attributes #2 = { "denormal-fp-math"="preserve-sign,ieee" }
	attributes #3 = { "denormal-fp-math"="ieee,preserve-sign" }			attributes #3 = { "denormal-fp-math"="ieee,preserve-sign" }
				attributes #4 = { "denormal-fp-math"="dynamic,dynamic" }
				attributes #5 = { "denormal-fp-math"="dynamic,ieee" }
				attributes #6 = { "denormal-fp-math"="ieee,dynamic" }
				attributes #7 = { "denormal-fp-math"="preserve-sign,dynamic" }
				attributes #8 = { "denormal-fp-math"="dynamic,preserve-sign" }
				attributes #9 = { "denormal-fp-math-f32"="dynamic,dynamic" }
				attributes #10 = { "denormal-fp-math-f32"="preserve-sign,preserve-sign" }
				attributes #11 = { "denormal-fp-math-f32"="ieee,ieee" "denormal-fp-math"="preserve-sign,preserve-sign" }
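
The CHECK lines in this test suggest a simple compatibility rule for inlining: a call is folded into its caller when each component of the callee's mode either matches the caller's or is "dynamic". The following is a minimal self-contained sketch of that rule, not the code from the patch. `Kind`, `Mode`, `componentCompatible` and `canInline` are hypothetical names, and the sketch assumes each `func_X_Y` callee carries `"denormal-fp-math"="X,Y"` (output first, then input).

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::DenormalMode; not the real class.
enum class Kind { IEEE, PreserveSign, PositiveZero, Dynamic };

struct Mode {
  Kind Output; // first component of the attribute string
  Kind Input;  // second component of the attribute string
};

// A component is compatible when the callee agrees with the caller, or when
// the callee is "dynamic" and so makes no assumption about the mode.
constexpr bool componentCompatible(Kind Caller, Kind Callee) {
  return Callee == Caller || Callee == Kind::Dynamic;
}

// Assumed rule, inferred from the CHECK lines: both components must be
// compatible for the callee to be inlined.
constexpr bool canInline(Mode Caller, Mode Callee) {
  return componentCompatible(Caller.Output, Callee.Output) &&
         componentCompatible(Caller.Input, Callee.Input);
}

// A few caller/callee pairs taken from the test functions above.
inline void checkInlineExamples() {
  constexpr Mode PszPsz{Kind::PreserveSign, Kind::PreserveSign};
  constexpr Mode PszIeee{Kind::PreserveSign, Kind::IEEE};
  constexpr Mode PszDyn{Kind::PreserveSign, Kind::Dynamic};
  constexpr Mode IeeeDyn{Kind::IEEE, Kind::Dynamic};
  constexpr Mode IeeeIeee{Kind::IEEE, Kind::IEEE};
  constexpr Mode DynDyn{Kind::Dynamic, Kind::Dynamic};

  assert(canInline(PszPsz, DynDyn));    // call_dynamic_dynamic_from_psz_psz
  assert(!canInline(PszPsz, IeeeDyn));  // call_ieee_dynamic_from_psz_psz
  assert(canInline(PszIeee, PszDyn));   // call_psz_dynamic_from_psz_ieee
  assert(!canInline(DynDyn, IeeeIeee)); // call_ieee_ieee_from_dynamic_dynamic
}
```

Note the asymmetry: a "dynamic" callee can be inlined into a caller with a fixed mode, but a fixed-mode callee is kept out of line when the caller is "dynamic".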

llvm/test/Transforms/InstSimplify/canonicalize.ll

	define float @canonicalize_neg_denorm_positive_zero_input() "denormal-fp-math"="ieee,positive-zero" {			define float @canonicalize_neg_denorm_positive_zero_input() "denormal-fp-math"="ieee,positive-zero" {
	; CHECK-LABEL: @canonicalize_neg_denorm_positive_zero_input(			; CHECK-LABEL: @canonicalize_neg_denorm_positive_zero_input(
	; CHECK-NEXT: ret float 0.000000e+00			; CHECK-NEXT: ret float 0.000000e+00
	;			;
	%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))			%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))
	ret float %ret			ret float %ret
	}			}

				define float @canonicalize_pos_denorm_dynamic_dynamic() "denormal-fp-math"="dynamic,dynamic" {
				; CHECK-LABEL: @canonicalize_pos_denorm_dynamic_dynamic(
				; CHECK-NEXT: [[RET:%.*]] = call float @llvm.canonicalize.f32(float 0x380FFFFFC0000000)
				; CHECK-NEXT: ret float [[RET]]
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 8388607 to float))
				ret float %ret
				}

				define float @canonicalize_neg_denorm_dynamic_dynamic() "denormal-fp-math"="dynamic,dynamic" {
				; CHECK-LABEL: @canonicalize_neg_denorm_dynamic_dynamic(
				; CHECK-NEXT: [[RET:%.*]] = call float @llvm.canonicalize.f32(float 0xB80FFFFFC0000000)
				; CHECK-NEXT: ret float [[RET]]
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))
				ret float %ret
				}

				; Dynamic output - cannot flush
				define float @canonicalize_pos_denorm_dynamic_output() "denormal-fp-math"="dynamic,ieee" {
				; CHECK-LABEL: @canonicalize_pos_denorm_dynamic_output(
				; CHECK-NEXT: [[RET:%.*]] = call float @llvm.canonicalize.f32(float 0x380FFFFFC0000000)
				; CHECK-NEXT: ret float [[RET]]
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 8388607 to float))
				ret float %ret
				}

				; Dynamic output - cannot flush
				define float @canonicalize_neg_denorm_dynamic_output() "denormal-fp-math"="dynamic,ieee" {
				; CHECK-LABEL: @canonicalize_neg_denorm_dynamic_output(
				; CHECK-NEXT: [[RET:%.*]] = call float @llvm.canonicalize.f32(float 0xB80FFFFFC0000000)
				; CHECK-NEXT: ret float [[RET]]
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))
				ret float %ret
				}

				; Dynamic input - cannot flush
				define float @canonicalize_pos_denorm_dynamic_input() "denormal-fp-math"="ieee,dynamic" {
				; CHECK-LABEL: @canonicalize_pos_denorm_dynamic_input(
				; CHECK-NEXT: [[RET:%.*]] = call float @llvm.canonicalize.f32(float 0x380FFFFFC0000000)
				; CHECK-NEXT: ret float [[RET]]
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 8388607 to float))
				ret float %ret
				}

				; Dynamic input - cannot flush
				define float @canonicalize_neg_denorm_dynamic_input() "denormal-fp-math"="ieee,dynamic" {
				; CHECK-LABEL: @canonicalize_neg_denorm_dynamic_input(
				; CHECK-NEXT: [[RET:%.*]] = call float @llvm.canonicalize.f32(float 0xB80FFFFFC0000000)
				; CHECK-NEXT: ret float [[RET]]
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))
				ret float %ret
				}

				; Input is flushed, can fold
				define float @canonicalize_pos_denorm_dynamic_output_preserve_sign_input() "denormal-fp-math"="dynamic,preserve-sign" {
				; CHECK-LABEL: @canonicalize_pos_denorm_dynamic_output_preserve_sign_input(
				; CHECK-NEXT: ret float 0.000000e+00
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 8388607 to float))
				ret float %ret
				}

				; Input is flushed, can fold
				define float @canonicalize_neg_denorm_dynamic_output_preserve_sign_input() "denormal-fp-math"="dynamic,preserve-sign" {
				; CHECK-LABEL: @canonicalize_neg_denorm_dynamic_output_preserve_sign_input(
				; CHECK-NEXT: ret float -0.000000e+00
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))
				ret float %ret
				}

				; Output is known flushed, can fold
				define float @canonicalize_pos_preserve_sign_output_denorm_dynamic_input() "denormal-fp-math"="preserve-sign,dynamic" {
				; CHECK-LABEL: @canonicalize_pos_preserve_sign_output_denorm_dynamic_input(
				; CHECK-NEXT: ret float 0.000000e+00
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 8388607 to float))
				ret float %ret
				}

				; Output is known flushed, can fold
				define float @canonicalize_neg_denorm_preserve_sign_output_dynamic_input() "denormal-fp-math"="preserve-sign,dynamic" {
				; CHECK-LABEL: @canonicalize_neg_denorm_preserve_sign_output_dynamic_input(
				; CHECK-NEXT: ret float -0.000000e+00
				;
				%ret = call float @llvm.canonicalize.f32(float bitcast (i32 -2139095041 to float))
				ret float %ret
				}
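
The comments on the canonicalize tests above ("cannot flush", "Input is flushed, can fold", "Output is known flushed, can fold") can be read as a small decision procedure. The sketch below models only what these CHECK lines show for a denormal constant operand; it is not the ConstantFolding.cpp implementation. `Kind`, `Folded`, `flushes`, `zeroFor` and `foldCanonicalizeDenormal` are hypothetical names, and combinations not covered by the tests shown here are assumptions.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins; not the LLVM types.
enum class Kind { IEEE, PreserveSign, PositiveZero, Dynamic };
enum class Folded { SameDenormal, PlusZero, MinusZero };

// True for the modes that are statically known to flush denormals.
constexpr bool flushes(Kind K) {
  return K == Kind::PreserveSign || K == Kind::PositiveZero;
}

// Zero produced by a flushing mode: only preserve-sign keeps a negative sign.
constexpr Folded zeroFor(Kind K, bool IsNegative) {
  return (K == Kind::PreserveSign && IsNegative) ? Folded::MinusZero
                                                 : Folded::PlusZero;
}

// Result of folding llvm.canonicalize on a denormal constant, or nullopt when
// the call has to be left in place.
inline std::optional<Folded> foldCanonicalizeDenormal(Kind Output, Kind Input,
                                                      bool IsNegative) {
  if (flushes(Input))  // "Input is flushed, can fold"
    return zeroFor(Input, IsNegative);
  if (flushes(Output)) // "Output is known flushed, can fold"
    return zeroFor(Output, IsNegative);
  if (Input == Kind::Dynamic || Output == Kind::Dynamic)
    return std::nullopt; // "cannot flush": the mode is unknown at compile time
  return Folded::SameDenormal; // ieee,ieee keeps the denormal value
}
```

For example, `foldCanonicalizeDenormal(Kind::Dynamic, Kind::PreserveSign, true)` yields `Folded::MinusZero`, matching `canonicalize_neg_denorm_dynamic_output_preserve_sign_input`.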

	define float @canonicalize_inf() {			define float @canonicalize_inf() {
	; CHECK-LABEL: @canonicalize_inf(			; CHECK-LABEL: @canonicalize_inf(
	; CHECK-NEXT: ret float 0x7FF0000000000000			; CHECK-NEXT: ret float 0x7FF0000000000000
	;			;
	%ret = call float @llvm.canonicalize.f32(float 0x7FF0000000000000)			%ret = call float @llvm.canonicalize.f32(float 0x7FF0000000000000)
	ret float %ret			ret float %ret
	}			}


llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll

	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: ret i1 false			; CHECK-NEXT: ret i1 false
	;			;
	entry:			entry:
	%cmp = fcmp uno double 0x0008000000000000, 0x1ff1000000000000			%cmp = fcmp uno double 0x0008000000000000, 0x1ff1000000000000
	ret i1 %cmp			ret i1 %cmp
	}			}

				; ============================================================================ ;
				; dynamic mode tests
				; ============================================================================ ;

				define float @test_float_fadd_dynamic_ieee() #9 {
				; CHECK-LABEL: @test_float_fadd_dynamic_ieee(
				; CHECK-NEXT: [[RESULT:%.*]] = fadd float 0xB810000000000000, 0x3800000000000000
				; CHECK-NEXT: ret float [[RESULT]]
				;
				%result = fadd float 0xB810000000000000, 0x3800000000000000
				ret float %result
				}

				define float @test_float_fadd_ieee_dynamic() #10 {
				; CHECK-LABEL: @test_float_fadd_ieee_dynamic(
				; CHECK-NEXT: [[RESULT:%.*]] = fadd float 0xB810000000000000, 0x3800000000000000
				; CHECK-NEXT: ret float [[RESULT]]
				;
				%result = fadd float 0xB810000000000000, 0x3800000000000000
				ret float %result
				}

				define float @test_float_fadd_dynamic_dynamic() #11 {
				; CHECK-LABEL: @test_float_fadd_dynamic_dynamic(
				; CHECK-NEXT: [[RESULT:%.*]] = fadd float 0xB810000000000000, 0x3800000000000000
				; CHECK-NEXT: ret float [[RESULT]]
				;
				%result = fadd float 0xB810000000000000, 0x3800000000000000
				ret float %result
				}

				; Check that the fold fails for each operand order
				define float @test_float_fadd_dynamic_dynamic_commute() #11 {
				; CHECK-LABEL: @test_float_fadd_dynamic_dynamic_commute(
				; CHECK-NEXT: [[RESULT:%.*]] = fadd float 0x3800000000000000, 0xB810000000000000
				; CHECK-NEXT: ret float [[RESULT]]
				;
				%result = fadd float 0x3800000000000000, 0xB810000000000000
				ret float %result
				}

				define i1 @fcmp_double_dynamic_ieee() #9 {
				; CHECK-LABEL: @fcmp_double_dynamic_ieee(
				; CHECK-NEXT: ret i1 true
				;
				%cmp = fcmp une double 0x0008000000000000, 0x0
				ret i1 %cmp
				}

				define i1 @fcmp_double_ieee_dynamic() #10 {
				; CHECK-LABEL: @fcmp_double_ieee_dynamic(
				; CHECK-NEXT: [[CMP:%.*]] = fcmp une double 0x8000000000000, 0.000000e+00
				; CHECK-NEXT: ret i1 [[CMP]]
				;
				%cmp = fcmp une double 0x0008000000000000, 0x0
				ret i1 %cmp
				}

				define i1 @fcmp_double_dynamic_dynamic() #11 {
				; CHECK-LABEL: @fcmp_double_dynamic_dynamic(
				; CHECK-NEXT: [[CMP:%.*]] = fcmp une double 0x8000000000000, 0.000000e+00
				; CHECK-NEXT: ret i1 [[CMP]]
				;
				%cmp = fcmp une double 0x0008000000000000, 0x0
				ret i1 %cmp
				}

				define i1 @fcmp_double_dynamic_dynamic_commute() #11 {
				; CHECK-LABEL: @fcmp_double_dynamic_dynamic_commute(
				; CHECK-NEXT: [[CMP:%.*]] = fcmp une double 0.000000e+00, 0x8000000000000
				; CHECK-NEXT: ret i1 [[CMP]]
				;
				%cmp = fcmp une double 0x0, 0x0008000000000000
				ret i1 %cmp
				}

				; Output doesn't matter.
				define i1 @fcmp_double_dynamic_psz() #12 {
				; CHECK-LABEL: @fcmp_double_dynamic_psz(
				; CHECK-NEXT: ret i1 false
				;
				%cmp = fcmp une double 0x0008000000000000, 0x0
				ret i1 %cmp
				}

				; Non-denormal values should fold
				define float @test_float_fadd_dynamic_dynamic_normals() #11 {
				; CHECK-LABEL: @test_float_fadd_dynamic_dynamic_normals(
				; CHECK-NEXT: ret float 3.000000e+00
				;
				%result = fadd float 1.0, 2.0
				ret float %result
				}

				; Non-denormal values should fold
				define i1 @fcmp_double_dynamic_dynamic_normals() #11 {
				; CHECK-LABEL: @fcmp_double_dynamic_dynamic_normals(
				; CHECK-NEXT: ret i1 true
				;
				%cmp = fcmp une double 1.0, 2.0
				ret i1 %cmp
				}

	attributes #0 = { nounwind "denormal-fp-math"="ieee,ieee" }			attributes #0 = { nounwind "denormal-fp-math"="ieee,ieee" }
	attributes #1 = { nounwind "denormal-fp-math"="positive-zero,ieee" }			attributes #1 = { nounwind "denormal-fp-math"="positive-zero,ieee" }
	attributes #2 = { nounwind "denormal-fp-math"="preserve-sign,ieee" }			attributes #2 = { nounwind "denormal-fp-math"="preserve-sign,ieee" }
	attributes #3 = { nounwind "denormal-fp-math"="ieee,positive-zero" }			attributes #3 = { nounwind "denormal-fp-math"="ieee,positive-zero" }
	attributes #4 = { nounwind "denormal-fp-math"="ieee,preserve-sign" }			attributes #4 = { nounwind "denormal-fp-math"="ieee,preserve-sign" }
	attributes #5 = { nounwind "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="positive-zero,ieee" }			attributes #5 = { nounwind "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="positive-zero,ieee" }
	attributes #6 = { nounwind "denormal-fp-math"="positive-zero,positive-zero" }			attributes #6 = { nounwind "denormal-fp-math"="positive-zero,positive-zero" }
	attributes #7 = { nounwind "denormal-fp-math"="preserve-sign,preserve-sign" }			attributes #7 = { nounwind "denormal-fp-math"="preserve-sign,preserve-sign" }
	attributes #8 = { nounwind "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="positive-zero,positive-zero" }			attributes #8 = { nounwind "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="positive-zero,positive-zero" }
				attributes #9 = { nounwind "denormal-fp-math"="dynamic,ieee" }
				attributes #10 = { nounwind "denormal-fp-math"="ieee,dynamic" }
				attributes #11 = { nounwind "denormal-fp-math"="dynamic,dynamic" }
				attributes #12 = { nounwind "denormal-fp-math"="dynamic,preserve-sign" }
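
The fcmp tests above indicate that only the input mode matters for a comparison ("Output doesn't matter"), and that a "dynamic" input mode blocks the fold only when an operand is actually denormal. A minimal sketch of that behaviour for `fcmp une` on doubles follows. It is an illustration under those assumptions, not the patch's code; `Kind`, `applyInputMode` and `foldFCmpUNE` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical stand-in for the input half of a denormal mode.
enum class Kind { IEEE, PreserveSign, PositiveZero, Dynamic };

// Value an operand takes under the given input mode, or nullopt when a
// denormal operand meets a mode that is unknown at compile time.
inline std::optional<double> applyInputMode(double V, Kind Input) {
  if (std::fpclassify(V) != FP_SUBNORMAL)
    return V; // non-denormal values are unaffected by the mode
  switch (Input) {
  case Kind::IEEE:
    return V;
  case Kind::PreserveSign:
    return std::copysign(0.0, V);
  case Kind::PositiveZero:
    return 0.0;
  case Kind::Dynamic:
    return std::nullopt;
  }
  return std::nullopt;
}

// Fold "fcmp une LHS, RHS" if both operands have a known value.
inline std::optional<bool> foldFCmpUNE(double LHS, double RHS, Kind Input) {
  const std::optional<double> L = applyInputMode(LHS, Input);
  const std::optional<double> R = applyInputMode(RHS, Input);
  if (!L || !R)
    return std::nullopt;
  return !(*L == *R); // une: true if unordered or not equal
}
```

With a denormal left operand and zero on the right, this gives `true` for an IEEE input mode, `false` for preserve-sign, and no fold for dynamic, in line with `fcmp_double_dynamic_ieee`, `fcmp_double_dynamic_psz` and `fcmp_double_ieee_dynamic`.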

llvm/unittests/ADT/FloatingPointMode.cpp


TEST(FloatingPointModeTest, ParseDenormalFPAttributeComponent) {		TEST(FloatingPointModeTest, ParseDenormalFPAttributeComponent) {
EXPECT_EQ(DenormalMode::IEEE, parseDenormalFPAttributeComponent("ieee"));		EXPECT_EQ(DenormalMode::IEEE, parseDenormalFPAttributeComponent("ieee"));
EXPECT_EQ(DenormalMode::IEEE, parseDenormalFPAttributeComponent(""));		EXPECT_EQ(DenormalMode::IEEE, parseDenormalFPAttributeComponent(""));
EXPECT_EQ(DenormalMode::PreserveSign,		EXPECT_EQ(DenormalMode::PreserveSign,
parseDenormalFPAttributeComponent("preserve-sign"));		parseDenormalFPAttributeComponent("preserve-sign"));
EXPECT_EQ(DenormalMode::PositiveZero,		EXPECT_EQ(DenormalMode::PositiveZero,
parseDenormalFPAttributeComponent("positive-zero"));		parseDenormalFPAttributeComponent("positive-zero"));
		EXPECT_EQ(DenormalMode::Dynamic,
		parseDenormalFPAttributeComponent("dynamic"));
EXPECT_EQ(DenormalMode::Invalid, parseDenormalFPAttributeComponent("foo"));		EXPECT_EQ(DenormalMode::Invalid, parseDenormalFPAttributeComponent("foo"));
}		}

TEST(FloatingPointModeTest, DenormalAttributeName) {		TEST(FloatingPointModeTest, DenormalAttributeName) {
EXPECT_EQ("ieee", denormalModeKindName(DenormalMode::IEEE));		EXPECT_EQ("ieee", denormalModeKindName(DenormalMode::IEEE));
EXPECT_EQ("preserve-sign", denormalModeKindName(DenormalMode::PreserveSign));		EXPECT_EQ("preserve-sign", denormalModeKindName(DenormalMode::PreserveSign));
EXPECT_EQ("positive-zero", denormalModeKindName(DenormalMode::PositiveZero));		EXPECT_EQ("positive-zero", denormalModeKindName(DenormalMode::PositiveZero));
		EXPECT_EQ("dynamic", denormalModeKindName(DenormalMode::Dynamic));
EXPECT_EQ("", denormalModeKindName(DenormalMode::Invalid));		EXPECT_EQ("", denormalModeKindName(DenormalMode::Invalid));
}		}

TEST(FloatingPointModeTest, ParseDenormalFPAttribute) {		TEST(FloatingPointModeTest, ParseDenormalFPAttribute) {
EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE),		EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE),
parseDenormalFPAttribute("ieee"));		parseDenormalFPAttribute("ieee"));
EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE),		EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE),
parseDenormalFPAttribute("ieee,ieee"));		parseDenormalFPAttribute("ieee,ieee"));
// ... 11 unchanged lines not shown ...
EXPECT_EQ(DenormalMode(DenormalMode::PreserveSign, DenormalMode::PreserveSign),		EXPECT_EQ(DenormalMode(DenormalMode::PreserveSign, DenormalMode::PreserveSign),
parseDenormalFPAttribute("preserve-sign,preserve-sign"));		parseDenormalFPAttribute("preserve-sign,preserve-sign"));

EXPECT_EQ(DenormalMode(DenormalMode::PositiveZero, DenormalMode::PositiveZero),		EXPECT_EQ(DenormalMode(DenormalMode::PositiveZero, DenormalMode::PositiveZero),
parseDenormalFPAttribute("positive-zero"));		parseDenormalFPAttribute("positive-zero"));
   EXPECT_EQ(DenormalMode(DenormalMode::PositiveZero, DenormalMode::PositiveZero),
             parseDenormalFPAttribute("positive-zero,positive-zero"));

+  EXPECT_EQ(DenormalMode(DenormalMode::Dynamic, DenormalMode::Dynamic),
+            parseDenormalFPAttribute("dynamic"));
+  EXPECT_EQ(DenormalMode(DenormalMode::Dynamic, DenormalMode::Dynamic),
+            parseDenormalFPAttribute("dynamic,dynamic"));

   EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::PositiveZero),
             parseDenormalFPAttribute("ieee,positive-zero"));
   EXPECT_EQ(DenormalMode(DenormalMode::PositiveZero, DenormalMode::IEEE),
             parseDenormalFPAttribute("positive-zero,ieee"));

   EXPECT_EQ(DenormalMode(DenormalMode::PreserveSign, DenormalMode::IEEE),
             parseDenormalFPAttribute("preserve-sign,ieee"));
   EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::PreserveSign),
             parseDenormalFPAttribute("ieee,preserve-sign"));

+  EXPECT_EQ(DenormalMode(DenormalMode::Dynamic, DenormalMode::PreserveSign),
+            parseDenormalFPAttribute("dynamic,preserve-sign"));
+  EXPECT_EQ(DenormalMode(DenormalMode::PreserveSign, DenormalMode::Dynamic),
+            parseDenormalFPAttribute("preserve-sign,dynamic"));

   EXPECT_EQ(DenormalMode(DenormalMode::Invalid, DenormalMode::Invalid),
             parseDenormalFPAttribute("foo"));
   EXPECT_EQ(DenormalMode(DenormalMode::Invalid, DenormalMode::Invalid),
             parseDenormalFPAttribute("foo,foo"));
   EXPECT_EQ(DenormalMode(DenormalMode::Invalid, DenormalMode::Invalid),
             parseDenormalFPAttribute("foo,bar"));
 }
[21 unchanged lines hidden; context: TEST(FloatingPointModeTest, RenderDenormalFPAttribute) {]

   EXPECT_EQ(
       "preserve-sign,ieee",
       DenormalMode(DenormalMode::PreserveSign, DenormalMode::IEEE).str());

   EXPECT_EQ(
       "preserve-sign,positive-zero",
       DenormalMode(DenormalMode::PreserveSign, DenormalMode::PositiveZero).str());

+  EXPECT_EQ("dynamic,dynamic",
+            DenormalMode(DenormalMode::Dynamic, DenormalMode::Dynamic).str());
+  EXPECT_EQ("ieee,dynamic",
+            DenormalMode(DenormalMode::IEEE, DenormalMode::Dynamic).str());
+  EXPECT_EQ("dynamic,ieee",
+            DenormalMode(DenormalMode::Dynamic, DenormalMode::IEEE).str());
 }

 TEST(FloatingPointModeTest, DenormalModeIsSimple) {
   EXPECT_TRUE(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE).isSimple());
   EXPECT_FALSE(DenormalMode(DenormalMode::IEEE,
                             DenormalMode::Invalid).isSimple());
   EXPECT_FALSE(DenormalMode(DenormalMode::PreserveSign,
                             DenormalMode::PositiveZero).isSimple());
+  EXPECT_FALSE(DenormalMode(DenormalMode::PreserveSign, DenormalMode::Dynamic)
+                   .isSimple());
+  EXPECT_FALSE(DenormalMode(DenormalMode::Dynamic, DenormalMode::PreserveSign)
+                   .isSimple());
 }

 TEST(FloatingPointModeTest, DenormalModeIsValid) {
   EXPECT_TRUE(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE).isValid());
   EXPECT_FALSE(DenormalMode(DenormalMode::IEEE, DenormalMode::Invalid).isValid());
   EXPECT_FALSE(DenormalMode(DenormalMode::Invalid, DenormalMode::IEEE).isValid());
   EXPECT_FALSE(DenormalMode(DenormalMode::Invalid,
                             DenormalMode::Invalid).isValid());
 }

 TEST(FloatingPointModeTest, DenormalModeConstructor) {
   EXPECT_EQ(DenormalMode(DenormalMode::Invalid, DenormalMode::Invalid),
             DenormalMode::getInvalid());
   EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::IEEE),
             DenormalMode::getIEEE());
+  EXPECT_EQ(DenormalMode::getIEEE(), DenormalMode::getDefault());
+  EXPECT_EQ(DenormalMode(DenormalMode::Dynamic, DenormalMode::Dynamic),
+            DenormalMode::getDynamic());
   EXPECT_EQ(DenormalMode(DenormalMode::PreserveSign, DenormalMode::PreserveSign),
             DenormalMode::getPreserveSign());
   EXPECT_EQ(DenormalMode(DenormalMode::PositiveZero, DenormalMode::PositiveZero),
             DenormalMode::getPositiveZero());
 }

+TEST(FloatingPointModeTest, DenormalModeMerge) {
+  EXPECT_EQ(
+      DenormalMode::getInvalid(),
+      DenormalMode::getInvalid().mergeCalleeMode(DenormalMode::getInvalid()));
+  EXPECT_EQ(DenormalMode::getIEEE(), DenormalMode::getInvalid().mergeCalleeMode(
+                                         DenormalMode::getIEEE()));
+  EXPECT_EQ(DenormalMode::getInvalid(), DenormalMode::getIEEE().mergeCalleeMode(
+                                            DenormalMode::getInvalid()));
+
+  EXPECT_EQ(DenormalMode::getIEEE(), DenormalMode::getIEEE().mergeCalleeMode(
+                                         DenormalMode::getDynamic()));
+  EXPECT_EQ(DenormalMode::getPreserveSign(),
+            DenormalMode::getPreserveSign().mergeCalleeMode(
+                DenormalMode::getDynamic()));
+  EXPECT_EQ(DenormalMode::getPositiveZero(),
+            DenormalMode::getPositiveZero().mergeCalleeMode(
+                DenormalMode::getDynamic()));
+  EXPECT_EQ(
+      DenormalMode::getDynamic(),
+      DenormalMode::getDynamic().mergeCalleeMode(DenormalMode::getDynamic()));
+
+  EXPECT_EQ(DenormalMode(DenormalMode::IEEE, DenormalMode::PreserveSign),
+            DenormalMode(DenormalMode::IEEE, DenormalMode::PreserveSign)
+                .mergeCalleeMode(
+                    DenormalMode(DenormalMode::IEEE, DenormalMode::Dynamic)));
+
+  EXPECT_EQ(DenormalMode(DenormalMode::PreserveSign, DenormalMode::IEEE),
+            DenormalMode(DenormalMode::PreserveSign, DenormalMode::IEEE)
+                .mergeCalleeMode(
+                    DenormalMode(DenormalMode::Dynamic, DenormalMode::IEEE)));
+
+  EXPECT_EQ(
+      DenormalMode(DenormalMode::PositiveZero, DenormalMode::PreserveSign),
+      DenormalMode(DenormalMode::PositiveZero, DenormalMode::PreserveSign)
+          .mergeCalleeMode(
+              DenormalMode(DenormalMode::Dynamic, DenormalMode::Dynamic)));
+
+  EXPECT_EQ(
+      DenormalMode(DenormalMode::PositiveZero, DenormalMode::PreserveSign),
+      DenormalMode(DenormalMode::PositiveZero, DenormalMode::PreserveSign)
+          .mergeCalleeMode(
+              DenormalMode(DenormalMode::PositiveZero, DenormalMode::Dynamic)));
+
+  EXPECT_EQ(
+      DenormalMode(DenormalMode::PositiveZero, DenormalMode::PreserveSign),
+      DenormalMode(DenormalMode::PositiveZero, DenormalMode::PreserveSign)
+          .mergeCalleeMode(
+              DenormalMode(DenormalMode::Dynamic, DenormalMode::PreserveSign)));
+
+  // Test some invalid / undefined behavior cases
+  EXPECT_EQ(
+      DenormalMode::getPreserveSign(),
+      DenormalMode::getIEEE().mergeCalleeMode(DenormalMode::getPreserveSign()));
+  EXPECT_EQ(
+      DenormalMode::getPreserveSign(),
+      DenormalMode::getIEEE().mergeCalleeMode(DenormalMode::getPreserveSign()));
+  EXPECT_EQ(
+      DenormalMode::getIEEE(),
+      DenormalMode::getPreserveSign().mergeCalleeMode(DenormalMode::getIEEE()));
+  EXPECT_EQ(
+      DenormalMode::getIEEE(),
+      DenormalMode::getPreserveSign().mergeCalleeMode(DenormalMode::getIEEE()));
+}
 }
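
The DenormalModeMerge expectations above are consistent with a simple per-component rule: wherever the callee's mode is Dynamic, the merged mode takes the caller's value, and every other callee component is kept unchanged (including the cases the test labels invalid / undefined behavior). The following is a minimal standalone sketch of that rule, not the code from llvm/include/llvm/ADT/FloatingPointMode.h, which may differ. The names `Kind`, `Mode`, `uniform`, and `checkMergeExamples` are hypothetical stand-ins, and the output-then-input field order is assumed from the attribute's "output,input" string form.

```cpp
#include <cassert>

// Hypothetical stand-in for DenormalMode's mode kinds; not the LLVM type.
enum class Kind { Invalid, IEEE, PreserveSign, PositiveZero, Dynamic };

// Hypothetical stand-in for DenormalMode: one kind for outputs, one for inputs.
struct Mode {
  Kind Output;
  Kind Input;

  bool operator==(const Mode &RHS) const {
    return Output == RHS.Output && Input == RHS.Input;
  }

  // Rule inferred from the test expectations: a Dynamic callee component is
  // replaced by the caller's (this object's) component; anything else in the
  // callee is kept as-is.
  Mode mergeCalleeMode(Mode Callee) const {
    Mode Merged = Callee;
    if (Callee.Output == Kind::Dynamic)
      Merged.Output = Output;
    if (Callee.Input == Kind::Dynamic)
      Merged.Input = Input;
    return Merged;
  }
};

// Builds a mode whose output and input kinds are the same.
inline Mode uniform(Kind K) { return Mode{K, K}; }

// Replays a few of the expectations from the unit test against the sketch.
inline void checkMergeExamples() {
  // A fully dynamic callee adopts the caller's mode.
  assert(uniform(Kind::IEEE).mergeCalleeMode(uniform(Kind::Dynamic)) ==
         uniform(Kind::IEEE));
  // Only the dynamic component is filled in from the caller.
  Mode Caller{Kind::PositiveZero, Kind::PreserveSign};
  Mode Callee{Kind::Dynamic, Kind::PreserveSign};
  assert(Caller.mergeCalleeMode(Callee) == Caller);
  // A non-dynamic callee wins over a mismatched caller.
  assert(uniform(Kind::IEEE).mergeCalleeMode(uniform(Kind::PreserveSign)) ==
         uniform(Kind::PreserveSign));
}
```

Under this reading, the result of merging is the callee's mode with its Dynamic components resolved against the caller, so a caller that is itself Dynamic leaves a Dynamic callee unchanged.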

llvm/utils/TableGen/Attributes.cpp

[49 unchanged lines hidden; context: for (StringRef KindName : KindNames) {]
       }
     }
     OS << "#undef " << MacroName << "\n\n";
   };

   // Emit attribute enums in the same order llvm::Attribute::operator< expects.
   Emit({"EnumAttr", "TypeAttr", "IntAttr"}, "ATTRIBUTE_ENUM");
   Emit({"StrBoolAttr"}, "ATTRIBUTE_STRBOOL");
+  Emit({"ComplexStrAttr"}, "ATTRIBUTE_COMPLEXSTR");

   OS << "#undef ATTRIBUTE_ALL\n";
   OS << "#endif\n\n";

   OS << "#ifdef GET_ATTR_ENUM\n";
   OS << "#undef GET_ATTR_ENUM\n";
   unsigned Value = 1; // Leave zero for AttrKind::None.
   for (StringRef KindName : {"EnumAttr", "TypeAttr", "IntAttr"}) {
[74 unchanged lines hidden]

Diff 505274