This is an archive of the discontinued LLVM Phabricator instance.


Differential D101759

[PowerPC] Scalar IBM MASS library conversion pass
Closed, Public

Authored by masoud.ataei on May 3 2021, 7:46 AM.

Details

Reviewers

etiotto
pjeeva01
renenkel
bmahjour
qiucf
shchenz
spatel
efriedma

Group Reviewers

Restricted Project

Summary

This patch introduces an option to enable conversions from math function calls
to MASS library calls. To resolve the calls generated by these conversions, one
needs to link the libxlopt.a library.

This patch is tested on PowerPC Linux and AIX.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

masoud.ataei created this revision. May 3 2021, 7:46 AM

Herald added subscribers: steven.zhang, shchenz, kbarton and 3 others. May 3 2021, 7:46 AM

masoud.ataei requested review of this revision. May 3 2021, 7:46 AM

Herald added a subscriber: llvm-commits. May 3 2021, 7:46 AM

Harbormaster completed remote builds in B102283: Diff 342382. May 3 2021, 8:56 AM

steven.zhang added a reviewer: Restricted Project. May 17 2021, 10:24 PM

bmahjour added inline comments. May 18 2021, 4:28 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1367	why are these being handled here instead of `PPCGenScalarMASSEntries.cpp`?

bmahjour added inline comments. May 18 2021, 4:28 PM

llvm/include/llvm/Analysis/ScalarFuncs.def
18	shouldn't these map from llvm.* intrinsics to mass entry points as well?

masoud.ataei added inline comments. May 19 2021, 1:07 PM

llvm/include/llvm/Analysis/ScalarFuncs.def
18	LLVM intrinsics are handled in `PPCISelLowering.cpp`.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1367	We are not handling llvm intrinsics in `PPCGenScalarMASSEntries.cpp` because we don't want to block any type of existing optimizations (like pow(x,0.5) --> sqrt(x)) and future optimizations (like https://reviews.llvm.org/D94543 ?).

bmahjour added inline comments. May 25 2021, 12:42 PM

llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
71	There should be a todo comment to handle non-finite entries using fewer fast-math flags.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1367	I see, could you please put a comment in the code to explain that? Alternatively you can put the comment at the top of `llvm/include/llvm/Analysis/ScalarFuncs.def`.
1367	Instead of `TM.Options.UnsafeFPMath` we should test for the individual fast-math flags that are required for safety. Checking for "unsafe-fp-math" has a few drawbacks. To make clang enable that flag, specifying `-funsafe-math-optimizations` is necessary but not sufficient; you also have to specify `-fno-math-errno`. Clang sets the "unsafe-fp-math" flag when all four of `-fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros` are specified, regardless of other flags. For example, this command does the conversion to the _finite calls despite the user's request to honor NaNs: `clang t.c -c -O3 -fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros -fhonor-nans`. Even if the clang inconsistencies/issues are resolved, it would still be better to check for the individual flags, for finer control and for consistency with other front-ends.
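The drawback described in the comment above can be illustrated with a small standalone model. This is only a sketch: `FPState`, `parseFPFlags` and `setsUnsafeFPMath` are hypothetical names invented for illustration, not clang code, and the default for `MathErrno` is an assumption (the real default depends on the toolchain). The condition itself mirrors the one shown in the `Clang.cpp` hunk of this revision.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of the driver's floating-point state.
// It is NOT clang's implementation; only the final condition is taken
// from the RenderFloatingPointOptions hunk shown in this revision.
struct FPState {
  bool HonorNaNs = true;
  bool MathErrno = true;  // assumption: real default is per-toolchain
  bool AssociativeMath = false;
  bool ReciprocalMath = false;
  bool SignedZeros = true;
  bool TrappingMath = false;
};

// Step through the flags in order, as the driver does for individual options.
FPState parseFPFlags(const std::vector<std::string> &Args) {
  FPState S;
  for (const std::string &A : Args) {
    if (A == "-fhonor-nans") S.HonorNaNs = true;
    else if (A == "-fno-honor-nans") S.HonorNaNs = false;
    else if (A == "-fmath-errno") S.MathErrno = true;
    else if (A == "-fno-math-errno") S.MathErrno = false;
    else if (A == "-fassociative-math") S.AssociativeMath = true;
    else if (A == "-fno-associative-math") S.AssociativeMath = false;
    else if (A == "-freciprocal-math") S.ReciprocalMath = true;
    else if (A == "-fno-reciprocal-math") S.ReciprocalMath = false;
    else if (A == "-fsigned-zeros") S.SignedZeros = true;
    else if (A == "-fno-signed-zeros") S.SignedZeros = false;
  }
  return S;
}

// Same shape as the driver condition guarding "-menable-unsafe-fp-math".
// Note that HonorNaNs does not participate in it at all.
bool setsUnsafeFPMath(const FPState &S) {
  return !S.MathErrno && S.AssociativeMath && S.ReciprocalMath &&
         !S.SignedZeros && !S.TrappingMath;
}
```

With the flag list from the comment, `setsUnsafeFPMath` returns true even though `HonorNaNs` stays true, which is why a lowering keyed only on "unsafe-fp-math" can ignore `-fhonor-nans`.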
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
2	why not just use the default `CHECK` prefix? `CHECK-ALL` and `CHECK-LWR` don't distinguish anything based on this run command.
20	CHECK-DFLT is not in the list of prefixes defined.
llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll
148	Remove this line, `#1` is unused.

Sorry it took me so long to update this patch -- I think I addressed all reviews till now.

masoud.ataei marked 8 inline comments as done. Jun 29 2021, 1:28 PM

Harbormaster completed remote builds in B111607: Diff 355347. Jun 29 2021, 2:48 PM

bmahjour added inline comments. Jul 7 2021, 2:04 PM

llvm/include/llvm/Analysis/ScalarFuncs.def
12	[nit] ISelLowing -> PPCISelLowering
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
10	Since LLVM math intrinsic lowerings are done in ISelLowering, this comment should not say "and LLVM math intrinsics".
14	llvm.cos.f32 is an intrinsic and not handled by this transformation.
73	remove this line
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1370	Why do you still check for `TM.Options.UnsafeFPMath`? If you do it out of concern for `-fno-math-errno`, then it's not needed. Note that these llvm intrinsics already mention that their semantics are identical to their libm counterparts but "without trapping or setting errno".
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
148	See above comment and remove "unsafe-fp-math".
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass.ll
302 ↗	(On Diff #355347)	See above comment and remove "unsafe-fp-math".

Removed the dependency on unsafe-fp-math and added a clang option to
control the afn flag.

Herald added a project: Restricted Project. Jul 14 2021, 2:07 PM

Herald added subscribers: cfe-commits, ormris, dang.

masoud.ataei updated this revision to Diff 358756. Jul 14 2021, 2:38 PM

jsji added reviewers: qiucf, shchenz. Jul 14 2021, 2:47 PM

Harbormaster completed remote builds in B114101: Diff 358756. Jul 14 2021, 5:08 PM

bmahjour added inline comments. Jul 15 2021, 10:37 AM

clang/include/clang/Driver/Options.td
1726–1727	I think we should separate out the clang driver interface into its own patch and ask for feedback on the mailing list. One key question would be, should this option assume no-errno and no-trapping-math or not (given that there is no IR representation for them). There should also be LIT tests dedicated to this to verify the clang interface. I only see llc interface being tested in this patch.
llvm/include/llvm/Target/TargetOptions.h
179	We already have the `PPCGenScalarMASSEntries` bit, why do we need another one? Perhaps we can remove `PPCGenScalarMASSEntries`, but we should not have to turn on two options to get one transformation enabled.
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
73	...but errno and trapping-math would be an issue for non-finite entries as well. Again, I think this function should just check for nnan/ninf/afn flags. We need to find out (with the help of the wider community) how to deal with the concerns surrounding errno and traps separately. One way to do that would be to broaden the definition of the `afn` flag to include no-errno and no-trapping semantics. Another way might be to make clang FE set the `afn` bit only if `-fno-math-errno` and `-fno-trapping-math` options are enabled (less desirable). A third way might be to add corresponding function attributes to the IR for `-fno-math-errno` and `-fno-trapping-math`. Once these issues are sorted out, we can add the appropriate constraints to the `isCandidateSafeToLower` function.
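A check based only on per-call flags, as suggested in the comment above, might look roughly like the following standalone sketch. `CallFlags` and `pickMassEntry` are hypothetical names, not LLVM API, and the exact mapping (afn alone selects `__xl_<name>`, afn together with nnan and ninf selects `__xl_<name>_finite`) is an assumption pieced together from this thread rather than the committed logic. Errno and trapping concerns are deliberately left out, as the comment proposes.

```cpp
#include <string>

// Hypothetical stand-in for the fast-math flags attached to one call.
struct CallFlags {
  bool NoNaNs;      // nnan
  bool NoInfs;      // ninf
  bool ApproxFunc;  // afn
};

// Returns the MASS entry to call, or an empty string when the call
// should be left alone. The decision depends only on the flags of this
// call, never on a module-wide or target-wide option.
std::string pickMassEntry(const std::string &Base, const CallFlags &F) {
  if (!F.ApproxFunc)
    return "";  // no afn: keep the original libm call
  if (F.NoNaNs && F.NoInfs)
    return "__xl_" + Base + "_finite";  // assumed "finite" entry point
  return "__xl_" + Base;  // afn only: regular MASS entry point
}
```

Because the decision is made per call, two calls in the same function (or from different translation units merged by LTO) can be lowered differently.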
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1370	if someone compiles with -Ofast without any extra options, would `TM.Options.ApproxFuncFPMath` be true here?

Removed clang changes from this PR.
Removed extra option for MASS pass.
Now MASS pass is active with -O3 and approx-func option.

Adding another PR for clang changes on approx-func option.

Harbormaster completed remote builds in B114549: Diff 359385. Jul 16 2021, 11:03 AM

masoud.ataei marked 9 inline comments as done. Jul 16 2021, 11:10 AM

masoud.ataei added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1370	In the clang changes, I had `Options.ApproxFuncFPMath = LangOpts.ApproxFunc;` in `clang/lib/CodeGen/BackendUtil.cpp`. That line was responsible for updating this TM option based on the clang approximate-func option, and the clang approximate-func option is set with -Ofast. So the answer to your question is yes.

masoud.ataei mentioned this in D106191: [clang] Option control afn flag. Jul 16 2021, 2:14 PM

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

In D101759#2967250, @lebedev.ri wrote:

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

I am handling LLVM intrinsic math functions in PPCISelLowering.cpp, so I need to check for TM.Options.ApproxFuncFPMath. This is the only place that I think I need it.
Currently, I am updating TM.Options.ApproxFuncFPMath in llvm/lib/CodeGen/CommandFlags.cpp using the global option. Please let me know if there is a better way to update TM.Options.ApproxFuncFPMath based on the local fast-math flag.

In D101759#2967331, @masoud.ataei wrote:

In D101759#2967250, @lebedev.ri wrote:

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

I am handling LLVM intrinsic math functions in PPCISelLowering.cpp, so I need to check for TM.Options.ApproxFuncFPMath. This is the only place that I think I need it.

How is this going to work e.g. in LTO when not all TU's are compiled with fast-math flags?

I'm not familiar with those llc flags, but i'm quite sure that e.g. DAGCombiner
is transitioned away from using them, so i'm wary of adding new ones.

Currently, I am updating TM.Options.ApproxFuncFPMath in llvm/lib/CodeGen/CommandFlags.cpp using the global option. Please let me know if there is a better way to update TM.Options.ApproxFuncFPMath based on the local fast-math flag.

Removing the dependency on the global option to convert math functions to MASS.

Herald added subscribers: dexonsmith, jdoerfert. Aug 26 2021, 2:22 PM

Harbormaster completed remote builds in B121408: Diff 368980. Aug 26 2021, 3:21 PM

I'm not familiar with this library, and I haven't looked at current state of how we enable/map optional libs in a while...
We definitely want to avoid adding another target option/debug flag, and if we can avoid relying on a function parameter too, that would be even better.
I.e., the "afn" fast-math-flag (possibly in combination with some other IR- or node-level flags) seems like it should be enough to allow this transform/lowering.
Scanning the earlier review comments, there was some concern about the semantics wrt errno. If we need to adjust the "afn" definition, it's probably fine. There haven't been many uses of that flag AFAIK.

errno handling for math library functions is a mess. Currently, we don't model it properly; we just mark the calls "readnone" and hope for the best. If you don't want to fix that, just check for readnone for now.

I don't think we want to be querying function attributes or options here; afn plus enabling MASS should be enough. The function attributes are the old mechanism; we just haven't completely migrated some parts of SelectionDAG yet.

llvm/include/llvm/Analysis/ScalarFuncs.def
20	Do "__acosf_finite" etc. actually exist on AIX? I thought they only existed on glibc, and the glibc functions are all deprecated. I think I'd prefer to track this information in TargetLibraryInfo, like we do for the vector functions, so we can more easily generalize this mechanism in the future.

In D101759#2971567, @efriedma wrote:

errno handling for math library functions is a mess. Currently, we don't model it properly; we just mark the calls "readnone" and hope for the best. If you don't want to fix that, just check for readnone for now.

I think using readnone would work fine. It seems that clang marks math functions with that attribute when -fno-math-errno is in effect. To get the non-finite MASS lowerings at -O3 one would have to compile with both -fapprox-func and -fno-math-errno, which seems reasonable to me.

I don't think we want to be querying function attributes or options here; afn plus enabling MASS should be enough. The function attributes are the old mechanism; we just haven't completely migrated some parts of SelectionDAG yet.

I agree. I think the problem is that this patch is trying to decide on a global lowering strategy for llvm.* math intrinsics in llvm/lib/Target/PowerPC/PPCISelLowering.cpp, but such global decision making does not go well with the finer granularity of fast-math flags. My understanding is that the reason we need to handle intrinsic math functions later is because of strength-reduction transformations like pow(x,0.5) --> sqrt(x) that currently operate on intrinsic calls only. If we could apply those operations on things like __xl_pow_finite and produce calls to __xl_sqrt_finite, then we wouldn't have this problem. Another possibility might be to have two versions of PPCGenScalarMASSEntries: one that handles non-intrinsics and runs earlier, and another one that handles intrinsics after transformations like pow(x,0.5) --> sqrt(x) are done.

I agree. I think the problem is that this patch is trying to decide on a global lowering strategy for llvm.* math intrinsics in llvm/lib/Target/PowerPC/PPCISelLowering.cpp but such global decision making does not go well with finer granularity of fast-math flags.

Hmm. Instead of using setLibcallName() and letting the legalizer generate the calls, it should be possible to use custom lowering to generate the appropriate calls, at the cost of writing a little more code.

My understanding is that the reason we need to handle intrinsic math functions later is because of strength-reduction transformations like pow(x,0.5) --> sqrt(x) that currently operate on intrinsic calls only.

instcombine should be primarily responsible for this sort of optimization. See LibCallSimplifier::optimizePow. I guess a few transforms (D51630 etc.) landed in DAGCombine; probably we could move them earlier.
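The strength reduction discussed here can be sketched numerically in standalone C++. This is an illustration only, not the code in `LibCallSimplifier::optimizePow` or DAGCombine; `powReduced` is a hypothetical helper, and the 0.25 and 0.75 cases are an assumption based on the `pow-025-075-*` test names in this revision.

```cpp
#include <cmath>

// Hypothetical helper: evaluate pow(X, Exp) with the exponent-specific
// rewrites that a fast-math optimizer may apply. The rewrites are only
// valid under relaxed semantics: for example pow(-0.0, 0.5) is +0.0 while
// sqrt(-0.0) is -0.0, and pow(-inf, 0.5) is +inf while sqrt(-inf) is NaN.
double powReduced(double X, double Exp) {
  if (Exp == 0.5)
    return std::sqrt(X);             // pow(x, 0.5)  -> sqrt(x)
  if (Exp == 0.25)
    return std::sqrt(std::sqrt(X));  // pow(x, 0.25) -> sqrt(sqrt(x))
  if (Exp == 0.75) {
    double S = std::sqrt(X);         // pow(x, 0.75) -> sqrt(x) * sqrt(sqrt(x))
    return S * std::sqrt(S);
  }
  return std::pow(X, Exp);           // no rewrite: generic pow call
}
```

If a rewrite like this runs after `pow` has already been replaced by a MASS entry point, the optimizer no longer recognizes the call, which is the ordering problem raised in the thread.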

masoud.ataei mentioned this in D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline. Sep 22 2021, 1:27 PM

As suggested before, I removed the dependency on the global option to convert math functions to MASS for all intrinsic and non-intrinsic functions.
The main change here with respect to the last proposal is in PPCISelLowering.cpp, in how LLVM intrinsic math functions are handled.

Sorry for taking so long to update the patch.

masoud.ataei added inline comments. Jan 7 2022, 10:54 AM

llvm/include/llvm/Analysis/ScalarFuncs.def
20	Some machines still have the old glibc, so I kept them for compatibility.
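For readers unfamiliar with `.def` files, the following standalone sketch shows one way a table like `ScalarFuncs.def` can be expanded into a lookup structure. The four entries and the `TLI_DEFINE_SCALAR_MASS_FUNC` macro shape come from the file in this revision; `SCALAR_MASS_FUNCS`, `MassMapping`, `MassTable` and `lookupMassEntry` are hypothetical names, and the list is inlined as a macro here only because a self-contained example cannot `#include` the real file.

```cpp
#include <cstring>

// Inlined stand-in for the .def file contents (hypothetical list macro).
// Both the standard name and the old glibc "__*_finite" alias map to the
// same MASS entry, as in the revision.
#define SCALAR_MASS_FUNCS(X)                                                   \
  X("acosf", "__xl_acosf")                                                     \
  X("__acosf_finite", "__xl_acosf")                                            \
  X("acos", "__xl_acos")                                                       \
  X("__acos_finite", "__xl_acos")

struct MassMapping {
  const char *Scalar;     // name of the libm-style function
  const char *MassEntry;  // name of the MASS library entry point
};

// Expand every entry into a brace-initialized table row.
#define TLI_DEFINE_SCALAR_MASS_FUNC(SCAL, MASSENTRY) {SCAL, MASSENTRY},
static const MassMapping MassTable[] = {
    SCALAR_MASS_FUNCS(TLI_DEFINE_SCALAR_MASS_FUNC)};
#undef TLI_DEFINE_SCALAR_MASS_FUNC

// Linear lookup; returns nullptr when the function has no MASS mapping.
const char *lookupMassEntry(const char *Name) {
  for (const MassMapping &M : MassTable)
    if (std::strcmp(M.Scalar, Name) == 0)
      return M.MassEntry;
  return nullptr;
}
```

Keeping the mapping in one table means the compatibility aliases can be added or dropped without touching the pass logic.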

Harbormaster completed remote builds in B142123: Diff 398194. Jan 7 2022, 11:56 AM

ormris removed a subscriber: ormris. Jan 18 2022, 10:08 AM

This update fixes the type of the arguments passed to the converted math function in PPCISelLowering.cpp.

masoud.ataei marked an inline comment as done. Jan 24 2022, 7:00 AM

Harbormaster completed remote builds in B145229: Diff 402508. Jan 24 2022, 1:34 PM

dexonsmith removed a subscriber: dexonsmith. Jan 24 2022, 6:48 PM

bmahjour added inline comments. Jan 27 2022, 2:04 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
381	what about tan, acos, and the others?
llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll
149	All the calls have `afn`....why do we need this attribute?
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
149	do we need this attribute? Can we remove it or have separate tests for functions with attributes?
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
2	We don't really need a separate aix file. Can we just add a run line with the aix triple to `llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll`?
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll
797	shouldn't the tests starting from here move to a different file? This test file is called ...mass-fast.ll so one would expect it only contains tests with fast-math flag on.
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
247	How come pow -> sqrt conversion didn't happen here?
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
22 ↗	(On Diff #402508)	so pow->sqrt translation never happens for non-intrinsic `pow`. Is that expected? If so, are we planning to recognize these patterns inside PPCGenScalarMASSEntries in the future and do the translation as part of that transform?

masoud.ataei marked 7 inline comments as done. Jan 28 2022, 10:25 AM

masoud.ataei added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
381	This is the list of math functions for which LLVM creates intrinsic calls. There is no LLVM intrinsic for tan, acos and the other math functions that exist in MASS but are not in this list.
llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll
149	Removed
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
149	Removed
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
2	Done
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll
797	Done
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
247	Honestly, I am not sure why the conversion is not happening in this case. But without this patch we will get `powf` call (the conversion is not happening again). So this is a separate issue that someone needs to look at independent of this patch.
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
22 ↗	(On Diff #402508)	Correct, the pow->sqrt translation is not happening for non-intrinsic cases. That is the case independent of this patch. I guess the reason is that DAGCombiner only applies this optimization to llvm intrinsics. This is an issue that we need to handle either in DAGCombiner (same as the intrinsic one) or in the MASS pass. I feel DAGCombiner is the better option, and I think this is also a separate issue.

Fix test cases.

Changing function name: lowerLibCall() -> lowerLibCallType()

Ready for another round of review.

Harbormaster completed remote builds in B146335: Diff 404091. Jan 28 2022, 12:24 PM

Apart from some minor inline comments this revision addresses all my outstanding comments. LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
17397	[nit] a better name would be `lowerLibCallBasedOnType`
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
247	Could you please make a note of this as a todo comment in each test that is affected?
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
22 ↗	(On Diff #402508)	Ok, I understand now. We'll have to come back to this later at some point.

This revision is now accepted and ready to land. Feb 1 2022, 11:21 AM

masoud.ataei mentioned this in rG256d2533322c: [PowerPC] Scalar IBM MASS library conversion pass. Feb 2 2022, 7:54 AM

masoud.ataei closed this revision. Feb 2 2022, 8:35 AM

masoud.ataei mentioned this in D121016: [PowerPC] Fix the none tail call in scalar MASS conversion. Mar 4 2022, 11:44 AM

masoud.ataei mentioned this in rG30f30e1c12fa: [PowerPC] Fix the none tail call in scalar MASS conversion. Mar 8 2022, 9:02 AM

Revision Contents

Path                                                                   Size
clang/include/clang/Driver/Options.td                                  4 lines
clang/lib/CodeGen/BackendUtil.cpp                                      1 line
clang/lib/CodeGen/CGCall.cpp                                           2 lines
clang/lib/Driver/ToolChains/Clang.cpp                                  6 lines
llvm/include/llvm/Analysis/ScalarFuncs.def                             141 lines
llvm/include/llvm/CodeGen/CommandFlags.h                               2 lines
llvm/include/llvm/Target/TargetOptions.h                               13 lines
llvm/lib/CodeGen/CommandFlags.cpp                                      9 lines
llvm/lib/Target/PowerPC/CMakeLists.txt                                 1 line
llvm/lib/Target/PowerPC/PPC.h                                          4 lines
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp                    144 lines
llvm/lib/Target/PowerPC/PPCISelLowering.cpp                            33 lines
llvm/lib/Target/PowerPC/PPCTargetMachine.cpp                           15 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll                 148 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll                148 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll                 133 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll              146 lines
llvm/test/CodeGen/PowerPC/lower-scalar-mass-afn.ll                     573 lines
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll                    1145 lines
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-afn.ll     223 lines
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll    302 lines
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-nofast.ll  455 lines

Diff 358745

clang/include/clang/Driver/Options.td

@@ def fno_unsafe_math_optimizations : Flag<["-"], "fno-unsafe-math-optimizations">,
   Group<f_Group>;
 def fassociative_math : Flag<["-"], "fassociative-math">, Group<f_Group>;
 def fno_associative_math : Flag<["-"], "fno-associative-math">, Group<f_Group>;
 defm reciprocal_math : BoolFOption<"reciprocal-math",
   LangOpts<"AllowRecip">, DefaultFalse,
   PosFlag<SetTrue, [CC1Option], "Allow division operations to be reassociated",
           [menable_unsafe_fp_math.KeyPath]>,
   NegFlag<SetFalse>>;
-def fapprox_func : Flag<["-"], "fapprox-func">, Group<f_Group>, Flags<[CC1Option, NoDriverOption]>,
-  MarshallingInfoFlag<LangOpts<"ApproxFunc">>, ImpliedByAnyOf<[menable_unsafe_fp_math.KeyPath]>;
+defm approx_func : BoolFOption<"approx-func", LangOpts<"ApproxFunc">, DefaultFalse,
+  PosFlag<SetTrue, [CC1Option], "", [menable_unsafe_fp_math.KeyPath]>, NegFlag<SetFalse>>;
 defm finite_math_only : BoolFOption<"finite-math-only",
   LangOpts<"FiniteMathOnly">, DefaultFalse,
   PosFlag<SetTrue, [CC1Option], "", [cl_finite_math_only.KeyPath, ffast_math.KeyPath]>,
   NegFlag<SetFalse>>;
 defm signed_zeros : BoolFOption<"signed-zeros",
   LangOpts<"NoSignedZero">, DefaultFalse,
   NegFlag<SetTrue, [CC1Option], "Allow optimizations that ignore the sign of floating point zeros",
           [cl_no_signed_zeros.KeyPath, menable_unsafe_fp_math.KeyPath]>,

clang/lib/CodeGen/BackendUtil.cpp

@@ if (LangOpts.hasDWARFExceptions())
     Options.ExceptionModel = llvm::ExceptionHandling::DwarfCFI;
   if (LangOpts.hasWasmExceptions())
     Options.ExceptionModel = llvm::ExceptionHandling::Wasm;

   Options.NoInfsFPMath = LangOpts.NoHonorInfs;
   Options.NoNaNsFPMath = LangOpts.NoHonorNaNs;
   Options.NoZerosInBSS = CodeGenOpts.NoZeroInitializedInBSS;
   Options.UnsafeFPMath = LangOpts.UnsafeFPMath;
+  Options.ApproxFuncFPMath = LangOpts.ApproxFunc;

   Options.BBSections =
       llvm::StringSwitch<llvm::BasicBlockSection>(CodeGenOpts.BBSections)
           .Case("all", llvm::BasicBlockSection::All)
           .Case("labels", llvm::BasicBlockSection::Labels)
           .StartsWith("list=", llvm::BasicBlockSection::List)
           .Case("none", llvm::BasicBlockSection::None)
           .Default(llvm::BasicBlockSection::None);

clang/lib/CodeGen/CGCall.cpp

@@ if (!CodeGenOpts.StrictFloatCastOverflow)
     FuncAttrs.addAttribute("strict-float-cast-overflow", "false");

   // TODO: Are these all needed?
   // unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
   if (LangOpts.NoHonorInfs)
     FuncAttrs.addAttribute("no-infs-fp-math", "true");
   if (LangOpts.NoHonorNaNs)
     FuncAttrs.addAttribute("no-nans-fp-math", "true");
+  if (LangOpts.ApproxFunc)
+    FuncAttrs.addAttribute("approx-func-fp-math", "true");
   if (LangOpts.UnsafeFPMath)
     FuncAttrs.addAttribute("unsafe-fp-math", "true");
   if (CodeGenOpts.SoftFloat)
     FuncAttrs.addAttribute("use-soft-float", "true");
   FuncAttrs.addAttribute("stack-protector-buffer-size",
                          llvm::utostr(CodeGenOpts.SSPBufferSize));
   if (LangOpts.NoSignedZero)
     FuncAttrs.addAttribute("no-signed-zeros-fp-math", "true");

clang/lib/Driver/ToolChains/Clang.cpp

@@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
                                        const JobAction &JA) {
   // Handle various floating point optimization flags, mapping them to the
   // appropriate LLVM code generation flags. This is complicated by several
   // "umbrella" flags, so we do this by stepping through the flags incrementally
   // adjusting what we think is enabled/disabled, then at the end setting the
   // LLVM flags based on the final state.
   bool HonorINFs = true;
   bool HonorNaNs = true;
+  bool ApproxFunc = false;
   // -fmath-errno is the default on some platforms, e.g. BSD-derived OSes.
   bool MathErrno = TC.IsMathErrnoDefault();
   bool AssociativeMath = false;
   bool ReciprocalMath = false;
   bool SignedZeros = true;
   bool TrappingMath = false; // Implemented via -ffp-exception-behavior
   bool TrappingMathPresent = false; // Is trapping-math in args, and not
                                     // overriden by ffp-exception-behavior?
@@ for (const Arg *A : Args) {
     // If this isn't an FP option skip the claim below
     default: continue;

     // Options controlling individual features
     case options::OPT_fhonor_infinities: HonorINFs = true; break;
     case options::OPT_fno_honor_infinities: HonorINFs = false; break;
     case options::OPT_fhonor_nans: HonorNaNs = true; break;
     case options::OPT_fno_honor_nans: HonorNaNs = false; break;
+    case options::OPT_fapprox_func: ApproxFunc = true; break;
+    case options::OPT_fno_approx_func: ApproxFunc = false; break;
     case options::OPT_fmath_errno: MathErrno = true; break;
     case options::OPT_fno_math_errno: MathErrno = false; break;
     case options::OPT_fassociative_math: AssociativeMath = true; break;
     case options::OPT_fno_associative_math: AssociativeMath = false; break;
     case options::OPT_freciprocal_math: ReciprocalMath = true; break;
     case options::OPT_fno_reciprocal_math: ReciprocalMath = false; break;
     case options::OPT_fsigned_zeros: SignedZeros = true; break;
     case options::OPT_fno_signed_zeros: SignedZeros = false; break;
@@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
   }

   if (!HonorINFs)
     CmdArgs.push_back("-menable-no-infs");

   if (!HonorNaNs)
     CmdArgs.push_back("-menable-no-nans");

+  if (ApproxFunc)
+    CmdArgs.push_back("-fapprox-func");
+
   if (MathErrno)
     CmdArgs.push_back("-fmath-errno");

   if (!MathErrno && AssociativeMath && ReciprocalMath && !SignedZeros &&
       !TrappingMath)
     CmdArgs.push_back("-menable-unsafe-fp-math");

   if (!SignedZeros)

llvm/include/llvm/Analysis/ScalarFuncs.def

This file was added.

				//===-- ScalarFuncs.def - Library information ----------- C++ -----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				// This .def file creates a mapping from standard IEEE math functions to
				// their corresponding entries in the IBM MASS (scalar) library.

				#if defined(TLI_DEFINE_SCALAR_MASS_FUNCS)
				bmahjour: [nit] ISelLowing -> PPCISelLowering
				#define TLI_DEFINE_SCALAR_MASS_FUNC(SCAL, MASSENTRY) {SCAL, MASSENTRY},
				#endif

				TLI_DEFINE_SCALAR_MASS_FUNC("acosf", "__xl_acosf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acosf_finite", "__xl_acosf")
				TLI_DEFINE_SCALAR_MASS_FUNC("acos", "__xl_acos")
				bmahjour: Shouldn't these map from llvm.* intrinsics to MASS entry points as well?
				masoud.ataei (Author): LLVM intrinsics are handled in `PPCISelLowering.cpp`.
				TLI_DEFINE_SCALAR_MASS_FUNC("__acos_finite", "__xl_acos")

				efriedma: Do "__acosf_finite" etc. actually exist on AIX? I thought they only existed on glibc, and the glibc functions are all deprecated. I think I'd prefer to track this information in TargetLibraryInfo, like we do for the vector functions, so we can more easily generalize this mechanism in the future.
				masoud.ataei (Author): Some machines still have the old glibc, so I kept them for compatibility.
				TLI_DEFINE_SCALAR_MASS_FUNC("acoshf", "__xl_acoshf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acoshf_finite", "__xl_acoshf")
				TLI_DEFINE_SCALAR_MASS_FUNC("acosh", "__xl_acosh")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acosh_finite", "__xl_acosh")

				TLI_DEFINE_SCALAR_MASS_FUNC("asinf", "__xl_asinf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asinf_finite", "__xl_asinf")
				TLI_DEFINE_SCALAR_MASS_FUNC("asin", "__xl_asin")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asin_finite", "__xl_asin")

				TLI_DEFINE_SCALAR_MASS_FUNC("asinhf", "__xl_asinhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asinhf_finite", "__xl_asinhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("asinh", "__xl_asinh")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asinh_finite", "__xl_asinh")

				TLI_DEFINE_SCALAR_MASS_FUNC("atanf", "__xl_atanf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atanf_finite", "__xl_atanf")
				TLI_DEFINE_SCALAR_MASS_FUNC("atan", "__xl_atan")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atan_finite", "__xl_atan")

				TLI_DEFINE_SCALAR_MASS_FUNC("atan2f", "__xl_atan2f")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atan2f_finite", "__xl_atan2f")
				TLI_DEFINE_SCALAR_MASS_FUNC("atan2", "__xl_atan2")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atan2_finite", "__xl_atan2")

				TLI_DEFINE_SCALAR_MASS_FUNC("atanhf", "__xl_atanhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atanhf_finite", "__xl_atanhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("atanh", "__xl_atanh")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atanh_finite", "__xl_atanh")

				TLI_DEFINE_SCALAR_MASS_FUNC("cbrtf", "__xl_cbrtf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cbrtf_finite", "__xl_cbrtf")
				TLI_DEFINE_SCALAR_MASS_FUNC("cbrt", "__xl_cbrt")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cbrt_finite", "__xl_cbrt")

				TLI_DEFINE_SCALAR_MASS_FUNC("cosf", "__xl_cosf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cosf_finite", "__xl_cosf")
				TLI_DEFINE_SCALAR_MASS_FUNC("cos", "__xl_cos")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cos_finite", "__xl_cos")

				TLI_DEFINE_SCALAR_MASS_FUNC("coshf", "__xl_coshf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__coshf_finite", "__xl_coshf")
				TLI_DEFINE_SCALAR_MASS_FUNC("cosh", "__xl_cosh")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cosh_finite", "__xl_cosh")

				TLI_DEFINE_SCALAR_MASS_FUNC("erff", "__xl_erff")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erff_finite", "__xl_erff")
				TLI_DEFINE_SCALAR_MASS_FUNC("erf", "__xl_erf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erf_finite", "__xl_erf")

				TLI_DEFINE_SCALAR_MASS_FUNC("erfcf", "__xl_erfcf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erfcf_finite", "__xl_erfcf")
				TLI_DEFINE_SCALAR_MASS_FUNC("erfc", "__xl_erfc")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erfc_finite", "__xl_erfc")

				TLI_DEFINE_SCALAR_MASS_FUNC("expf", "__xl_expf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__expf_finite", "__xl_expf")
				TLI_DEFINE_SCALAR_MASS_FUNC("exp", "__xl_exp")
				TLI_DEFINE_SCALAR_MASS_FUNC("__exp_finite", "__xl_exp")

				TLI_DEFINE_SCALAR_MASS_FUNC("expm1f", "__xl_expm1f")
				TLI_DEFINE_SCALAR_MASS_FUNC("__expm1f_finite", "__xl_expm1f")
				TLI_DEFINE_SCALAR_MASS_FUNC("expm1", "__xl_expm1")
				TLI_DEFINE_SCALAR_MASS_FUNC("__expm1_finite", "__xl_expm1")

				TLI_DEFINE_SCALAR_MASS_FUNC("hypotf", "__xl_hypotf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__hypotf_finite", "__xl_hypotf")
				TLI_DEFINE_SCALAR_MASS_FUNC("hypot", "__xl_hypot")
				TLI_DEFINE_SCALAR_MASS_FUNC("__hypot_finite", "__xl_hypot")

				TLI_DEFINE_SCALAR_MASS_FUNC("lgammaf", "__xl_lgammaf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__lgammaf_finite", "__xl_lgammaf")
				TLI_DEFINE_SCALAR_MASS_FUNC("lgamma", "__xl_lgamma")
				TLI_DEFINE_SCALAR_MASS_FUNC("__lgamma_finite", "__xl_lgamma")

				TLI_DEFINE_SCALAR_MASS_FUNC("logf", "__xl_logf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__logf_finite", "__xl_logf")
				TLI_DEFINE_SCALAR_MASS_FUNC("log", "__xl_log")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log_finite", "__xl_log")

				TLI_DEFINE_SCALAR_MASS_FUNC("log10f", "__xl_log10f")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log10f_finite", "__xl_log10f")
				TLI_DEFINE_SCALAR_MASS_FUNC("log10", "__xl_log10")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log10_finite", "__xl_log10")

				TLI_DEFINE_SCALAR_MASS_FUNC("log1pf", "__xl_log1pf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log1pf_finite", "__xl_log1pf")
				TLI_DEFINE_SCALAR_MASS_FUNC("log1p", "__xl_log1p")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log1p_finite", "__xl_log1p")

				TLI_DEFINE_SCALAR_MASS_FUNC("powf", "__xl_powf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__powf_finite", "__xl_powf")
				TLI_DEFINE_SCALAR_MASS_FUNC("pow", "__xl_pow")
				TLI_DEFINE_SCALAR_MASS_FUNC("__pow_finite", "__xl_pow")

				TLI_DEFINE_SCALAR_MASS_FUNC("rsqrt", "__xl_rsqrt")

				TLI_DEFINE_SCALAR_MASS_FUNC("sinf", "__xl_sinf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sinf_finite", "__xl_sinf")
				TLI_DEFINE_SCALAR_MASS_FUNC("sin", "__xl_sin")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sin_finite", "__xl_sin")

				TLI_DEFINE_SCALAR_MASS_FUNC("sinhf", "__xl_sinhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sinhf_finite", "__xl_sinhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("sinh", "__xl_sinh")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sinh_finite", "__xl_sinh")

				TLI_DEFINE_SCALAR_MASS_FUNC("sqrt", "__xl_sqrt")

				TLI_DEFINE_SCALAR_MASS_FUNC("tanf", "__xl_tanf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tanf_finite", "__xl_tanf")
				TLI_DEFINE_SCALAR_MASS_FUNC("tan", "__xl_tan")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tan_finite", "__xl_tan")

				TLI_DEFINE_SCALAR_MASS_FUNC("tanhf", "__xl_tanhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tanhf_finite", "__xl_tanhf")
				TLI_DEFINE_SCALAR_MASS_FUNC("tanh", "__xl_tanh")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tanh_finite", "__xl_tanh")

				#undef TLI_DEFINE_SCALAR_MASS_FUNCS
				#undef TLI_DEFINE_SCALAR_MASS_FUNC
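The name-to-entry table above relies on the X-macro pattern: the consumer defines how one entry expands, then pulls in the list. The following is only a minimal self-contained sketch of that idea. It uses a hypothetical list macro (`SCALAR_MASS_FUNC_LIST`) in place of `#include "llvm/Analysis/ScalarFuncs.def"`, reproduces just three entries, and the helper names `scalarMassFuncs` and `lookupMassEntry` are illustrative, not part of the patch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the contents of ScalarFuncs.def; only three
// entries from the list in the patch are reproduced here.
#define SCALAR_MASS_FUNC_LIST(X)                                               \
  X("acosf", "__xl_acosf")                                                     \
  X("__acosf_finite", "__xl_acosf")                                            \
  X("acos", "__xl_acos")

// Expands one entry to a brace-initialised {key, value} pair, mirroring the
// shape of TLI_DEFINE_SCALAR_MASS_FUNC in the patch.
#define AS_PAIR(SCAL, MASSENTRY) {SCAL, MASSENTRY},

// Builds the lookup table once from the expanded list.
const std::map<std::string, std::string> &scalarMassFuncs() {
  static const std::map<std::string, std::string> Table = {
      SCALAR_MASS_FUNC_LIST(AS_PAIR)};
  return Table;
}

// Returns the MASS entry for Name, or an empty string when there is none.
std::string lookupMassEntry(const std::string &Name) {
  auto It = scalarMassFuncs().find(Name);
  return It == scalarMassFuncs().end() ? std::string() : It->second;
}
```

Because both `acosf` and `__acosf_finite` map to the same value, the glibc `_finite` aliases discussed in the comments above cost one extra table row each and no extra logic.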

llvm/include/llvm/CodeGen/CommandFlags.h

	bool getEnableUnsafeFPMath();			bool getEnableUnsafeFPMath();

	bool getEnableNoInfsFPMath();			bool getEnableNoInfsFPMath();

	bool getEnableNoNaNsFPMath();			bool getEnableNoNaNsFPMath();

	bool getEnableNoSignedZerosFPMath();			bool getEnableNoSignedZerosFPMath();

				bool getEnableApproxFuncFPMath();

	bool getEnableNoTrappingFPMath();			bool getEnableNoTrappingFPMath();

	DenormalMode::DenormalModeKind getDenormalFPMath();			DenormalMode::DenormalModeKind getDenormalFPMath();
	DenormalMode::DenormalModeKind getDenormalFP32Math();			DenormalMode::DenormalModeKind getDenormalFP32Math();

	bool getEnableHonorSignDependentRoundingFPMath();			bool getEnableHonorSignDependentRoundingFPMath();

	llvm::FloatABI::ABIType getFloatABIForCalls();			llvm::FloatABI::ABIType getFloatABIForCalls();

llvm/include/llvm/Target/TargetOptions.h

[...]	enum class GlobalISelAbortMode {
DisableWithDiag // Disable the abort but emit a diagnostic on failure.		DisableWithDiag // Disable the abort but emit a diagnostic on failure.
};		};

class TargetOptions {		class TargetOptions {
public:		public:
TargetOptions()		TargetOptions()
: UnsafeFPMath(false), NoInfsFPMath(false), NoNaNsFPMath(false),		: UnsafeFPMath(false), NoInfsFPMath(false), NoNaNsFPMath(false),
NoTrappingFPMath(true), NoSignedZerosFPMath(false),		NoTrappingFPMath(true), NoSignedZerosFPMath(false),
EnableAIXExtendedAltivecABI(false),		ApproxFuncFPMath(false), EnableAIXExtendedAltivecABI(false),
HonorSignDependentRoundingFPMathOption(false), NoZerosInBSS(false),		HonorSignDependentRoundingFPMathOption(false), NoZerosInBSS(false),
GuaranteedTailCallOpt(false), StackSymbolOrdering(true),		GuaranteedTailCallOpt(false), StackSymbolOrdering(true),
EnableFastISel(false), EnableGlobalISel(false), UseInitArray(false),		EnableFastISel(false), EnableGlobalISel(false), UseInitArray(false),
DisableIntegratedAS(false), RelaxELFRelocations(false),		DisableIntegratedAS(false), RelaxELFRelocations(false),
FunctionSections(false), DataSections(false),		FunctionSections(false), DataSections(false),
IgnoreXCOFFVisibility(false), XCOFFTracebackTable(true),		IgnoreXCOFFVisibility(false), XCOFFTracebackTable(true),
UniqueSectionNames(true), UniqueBasicBlockSectionNames(false),		UniqueSectionNames(true), UniqueBasicBlockSectionNames(false),
TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),		TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),
EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),		EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),
EmitStackSizeSection(false), EnableMachineOutliner(false),		EmitStackSizeSection(false), EnableMachineOutliner(false),
EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),		EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
EmitAddrsig(false), EmitCallSiteInfo(false),		EmitAddrsig(false), EmitCallSiteInfo(false),
SupportsDebugEntryValues(false), EnableDebugEntryValues(false),		SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
PseudoProbeForProfiling(false), ValueTrackingVariableLocations(false),		PseudoProbeForProfiling(false), ValueTrackingVariableLocations(false),
ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),		ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),
DebugStrictDwarf(false),		DebugStrictDwarf(false), PPCGenScalarMASSEntries(false),
FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}		FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}

/// DisableFramePointerElim - This returns true if frame pointer elimination		/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.		/// optimization should be disabled for the given machine function.
bool DisableFramePointerElim(const MachineFunction &MF) const;		bool DisableFramePointerElim(const MachineFunction &MF) const;

/// If greater than 0, override the default value of		/// If greater than 0, override the default value of
/// MCAsmInfo::BinutilsVersion.		/// MCAsmInfo::BinutilsVersion.
[...]	public:
unsigned NoTrappingFPMath : 1;		unsigned NoTrappingFPMath : 1;

/// NoSignedZerosFPMath - This flag is enabled when the		/// NoSignedZerosFPMath - This flag is enabled when the
/// -enable-no-signed-zeros-fp-math is specified on the command line. This		/// -enable-no-signed-zeros-fp-math is specified on the command line. This
/// specifies that optimizations are allowed to treat the sign of a zero		/// specifies that optimizations are allowed to treat the sign of a zero
/// argument or result as insignificant.		/// argument or result as insignificant.
unsigned NoSignedZerosFPMath : 1;		unsigned NoSignedZerosFPMath : 1;

		/// ApproxFuncFPMath - This flag is enabled when the
		/// -enable-approx-func-fp-math is specified on the command line. This
		/// specifies that optimizations are allowed to substitute math functions
		/// with approximate calculations
		unsigned ApproxFuncFPMath : 1;
		bmahjour: We already have the `PPCGenScalarMASSEntries` bit, why do we need another one? Perhaps we can remove `PPCGenScalarMASSEntries`, but we should not have to turn on two options to get one transformation enabled.

/// EnableAIXExtendedAltivecABI - This flag returns true when -vec-extabi is		/// EnableAIXExtendedAltivecABI - This flag returns true when -vec-extabi is
/// specified. The code generator is then able to use both volatile and		/// specified. The code generator is then able to use both volatile and
/// nonvolitle vector regisers. When false, the code generator only uses		/// nonvolitle vector regisers. When false, the code generator only uses
/// volatile vector registers which is the default setting on AIX.		/// volatile vector registers which is the default setting on AIX.
unsigned EnableAIXExtendedAltivecABI : 1;		unsigned EnableAIXExtendedAltivecABI : 1;

/// HonorSignDependentRoundingFPMath - This returns true when the		/// HonorSignDependentRoundingFPMath - This returns true when the
/// -enable-sign-dependent-rounding-fp-math is specified. If this returns		/// -enable-sign-dependent-rounding-fp-math is specified. If this returns
[...]	public:

/// Emit XRay Function Index section		/// Emit XRay Function Index section
unsigned XRayOmitFunctionIndex : 1;		unsigned XRayOmitFunctionIndex : 1;

/// When set to true, don't use DWARF extensions in later DWARF versions.		/// When set to true, don't use DWARF extensions in later DWARF versions.
/// By default, it is set to false.		/// By default, it is set to false.
unsigned DebugStrictDwarf : 1;		unsigned DebugStrictDwarf : 1;

		/// Enables scalar MASS conversions
		unsigned PPCGenScalarMASSEntries : 1;

/// Name of the stack usage file (i.e., .su file) if user passes		/// Name of the stack usage file (i.e., .su file) if user passes
/// -fstack-usage. If empty, it can be implied that -fstack-usage is not		/// -fstack-usage. If empty, it can be implied that -fstack-usage is not
/// passed on the command line.		/// passed on the command line.
std::string StackUsageOutput;		std::string StackUsageOutput;

/// FloatABIType - This setting is set by -float-abi=xxx option is specfied		/// FloatABIType - This setting is set by -float-abi=xxx option is specfied
/// on the command line. This setting may either be Default, Soft, or Hard.		/// on the command line. This setting may either be Default, Soft, or Hard.
/// Default selects the target's default behavior. Soft selects the ABI for		/// Default selects the target's default behavior. Soft selects the ABI for

llvm/lib/CodeGen/CommandFlags.cpp

CGOPT_EXP(CodeModel::Model, CodeModel)		CGOPT_EXP(CodeModel::Model, CodeModel)
CGOPT(ExceptionHandling, ExceptionModel)		CGOPT(ExceptionHandling, ExceptionModel)
CGOPT_EXP(CodeGenFileType, FileType)		CGOPT_EXP(CodeGenFileType, FileType)
CGOPT(FramePointerKind, FramePointerUsage)		CGOPT(FramePointerKind, FramePointerUsage)
CGOPT(bool, EnableUnsafeFPMath)		CGOPT(bool, EnableUnsafeFPMath)
CGOPT(bool, EnableNoInfsFPMath)		CGOPT(bool, EnableNoInfsFPMath)
CGOPT(bool, EnableNoNaNsFPMath)		CGOPT(bool, EnableNoNaNsFPMath)
CGOPT(bool, EnableNoSignedZerosFPMath)		CGOPT(bool, EnableNoSignedZerosFPMath)
		CGOPT(bool, EnableApproxFuncFPMath)
CGOPT(bool, EnableNoTrappingFPMath)		CGOPT(bool, EnableNoTrappingFPMath)
CGOPT(bool, EnableAIXExtendedAltivecABI)		CGOPT(bool, EnableAIXExtendedAltivecABI)
CGOPT(DenormalMode::DenormalModeKind, DenormalFPMath)		CGOPT(DenormalMode::DenormalModeKind, DenormalFPMath)
CGOPT(DenormalMode::DenormalModeKind, DenormalFP32Math)		CGOPT(DenormalMode::DenormalModeKind, DenormalFP32Math)
CGOPT(bool, EnableHonorSignDependentRoundingFPMath)		CGOPT(bool, EnableHonorSignDependentRoundingFPMath)
CGOPT(FloatABI::ABIType, FloatABIForCalls)		CGOPT(FloatABI::ABIType, FloatABIForCalls)
CGOPT(FPOpFusion::FPOpFusionMode, FuseFPOps)		CGOPT(FPOpFusion::FPOpFusionMode, FuseFPOps)
CGOPT(bool, DontPlaceZerosInBSS)		CGOPT(bool, DontPlaceZerosInBSS)
[...]	#define CGBINDOPT(NAME) \

static cl::opt<bool> EnableNoSignedZerosFPMath(		static cl::opt<bool> EnableNoSignedZerosFPMath(
"enable-no-signed-zeros-fp-math",		"enable-no-signed-zeros-fp-math",
cl::desc("Enable FP math optimizations that assume "		cl::desc("Enable FP math optimizations that assume "
"the sign of 0 is insignificant"),		"the sign of 0 is insignificant"),
cl::init(false));		cl::init(false));
CGBINDOPT(EnableNoSignedZerosFPMath);		CGBINDOPT(EnableNoSignedZerosFPMath);

		static cl::opt<bool> EnableApproxFuncFPMath(
		"enable-approx-func-fp-math",
		cl::desc("Enable FP math optimizations that assume approx func"),
		cl::init(false));
		CGBINDOPT(EnableApproxFuncFPMath);

static cl::opt<bool> EnableNoTrappingFPMath(		static cl::opt<bool> EnableNoTrappingFPMath(
"enable-no-trapping-fp-math",		"enable-no-trapping-fp-math",
cl::desc("Enable setting the FP exceptions build "		cl::desc("Enable setting the FP exceptions build "
"attribute not to use exceptions"),		"attribute not to use exceptions"),
cl::init(false));		cl::init(false));
CGBINDOPT(EnableNoTrappingFPMath);		CGBINDOPT(EnableNoTrappingFPMath);

static const auto DenormFlagEnumOptions =		static const auto DenormFlagEnumOptions =
[...]
TargetOptions		TargetOptions
codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {		codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {
TargetOptions Options;		TargetOptions Options;
Options.AllowFPOpFusion = getFuseFPOps();		Options.AllowFPOpFusion = getFuseFPOps();
Options.UnsafeFPMath = getEnableUnsafeFPMath();		Options.UnsafeFPMath = getEnableUnsafeFPMath();
Options.NoInfsFPMath = getEnableNoInfsFPMath();		Options.NoInfsFPMath = getEnableNoInfsFPMath();
Options.NoNaNsFPMath = getEnableNoNaNsFPMath();		Options.NoNaNsFPMath = getEnableNoNaNsFPMath();
Options.NoSignedZerosFPMath = getEnableNoSignedZerosFPMath();		Options.NoSignedZerosFPMath = getEnableNoSignedZerosFPMath();
		Options.ApproxFuncFPMath = getEnableApproxFuncFPMath();
Options.NoTrappingFPMath = getEnableNoTrappingFPMath();		Options.NoTrappingFPMath = getEnableNoTrappingFPMath();

DenormalMode::DenormalModeKind DenormKind = getDenormalFPMath();		DenormalMode::DenormalModeKind DenormKind = getDenormalFPMath();

// FIXME: Should have separate input and output flags		// FIXME: Should have separate input and output flags
Options.setFPDenormalMode(DenormalMode(DenormKind, DenormKind));		Options.setFPDenormalMode(DenormalMode(DenormKind, DenormKind));

Options.HonorSignDependentRoundingFPMathOption =		Options.HonorSignDependentRoundingFPMathOption =
[...]	NewAttrs.addAttribute("disable-tail-calls",
toStringRef(getDisableTailCalls()));		toStringRef(getDisableTailCalls()));
if (getStackRealign())		if (getStackRealign())
NewAttrs.addAttribute("stackrealign");		NewAttrs.addAttribute("stackrealign");

HANDLE_BOOL_ATTR(EnableUnsafeFPMathView, "unsafe-fp-math");		HANDLE_BOOL_ATTR(EnableUnsafeFPMathView, "unsafe-fp-math");
HANDLE_BOOL_ATTR(EnableNoInfsFPMathView, "no-infs-fp-math");		HANDLE_BOOL_ATTR(EnableNoInfsFPMathView, "no-infs-fp-math");
HANDLE_BOOL_ATTR(EnableNoNaNsFPMathView, "no-nans-fp-math");		HANDLE_BOOL_ATTR(EnableNoNaNsFPMathView, "no-nans-fp-math");
HANDLE_BOOL_ATTR(EnableNoSignedZerosFPMathView, "no-signed-zeros-fp-math");		HANDLE_BOOL_ATTR(EnableNoSignedZerosFPMathView, "no-signed-zeros-fp-math");
		HANDLE_BOOL_ATTR(EnableApproxFuncFPMathView, "approx-func-fp-math");

if (DenormalFPMathView->getNumOccurrences() > 0 &&		if (DenormalFPMathView->getNumOccurrences() > 0 &&
!F.hasFnAttribute("denormal-fp-math")) {		!F.hasFnAttribute("denormal-fp-math")) {
DenormalMode::DenormalModeKind DenormKind = getDenormalFPMath();		DenormalMode::DenormalModeKind DenormKind = getDenormalFPMath();

// FIXME: Command line flag should expose separate input/output modes.		// FIXME: Command line flag should expose separate input/output modes.
NewAttrs.addAttribute("denormal-fp-math",		NewAttrs.addAttribute("denormal-fp-math",
DenormalMode(DenormKind, DenormKind).str());		DenormalMode(DenormKind, DenormKind).str());

llvm/lib/Target/PowerPC/CMakeLists.txt

[...]	add_llvm_target(PowerPCCodeGen
PPCTLSDynamicCall.cpp		PPCTLSDynamicCall.cpp
PPCVSXCopy.cpp		PPCVSXCopy.cpp
PPCReduceCRLogicals.cpp		PPCReduceCRLogicals.cpp
PPCVSXFMAMutate.cpp		PPCVSXFMAMutate.cpp
PPCVSXSwapRemoval.cpp		PPCVSXSwapRemoval.cpp
PPCExpandISEL.cpp		PPCExpandISEL.cpp
PPCPreEmitPeephole.cpp		PPCPreEmitPeephole.cpp
PPCLowerMASSVEntries.cpp		PPCLowerMASSVEntries.cpp
		PPCGenScalarMASSEntries.cpp
GISel/PPCCallLowering.cpp		GISel/PPCCallLowering.cpp
GISel/PPCRegisterBankInfo.cpp		GISel/PPCRegisterBankInfo.cpp
GISel/PPCLegalizerInfo.cpp		GISel/PPCLegalizerInfo.cpp

LINK_COMPONENTS		LINK_COMPONENTS
Analysis		Analysis
AsmPrinter		AsmPrinter
BinaryFormat		BinaryFormat

llvm/lib/Target/PowerPC/PPC.h

[...]	#endif
void initializePPCMIPeepholePass(PassRegistry&);		void initializePPCMIPeepholePass(PassRegistry&);

extern char &PPCVSXFMAMutateID;		extern char &PPCVSXFMAMutateID;

ModulePass *createPPCLowerMASSVEntriesPass();		ModulePass *createPPCLowerMASSVEntriesPass();
void initializePPCLowerMASSVEntriesPass(PassRegistry &);		void initializePPCLowerMASSVEntriesPass(PassRegistry &);
extern char &PPCLowerMASSVEntriesID;		extern char &PPCLowerMASSVEntriesID;

		ModulePass *createPPCGenScalarMASSEntriesPass();
		void initializePPCGenScalarMASSEntriesPass(PassRegistry &);
		extern char &PPCGenScalarMASSEntriesID;

InstructionSelector *		InstructionSelector *
createPPCInstructionSelector(const PPCTargetMachine &, const PPCSubtarget &,		createPPCInstructionSelector(const PPCTargetMachine &, const PPCSubtarget &,
const PPCRegisterBankInfo &);		const PPCRegisterBankInfo &);
namespace PPCII {		namespace PPCII {

/// Target Operand Flag enum.		/// Target Operand Flag enum.
enum TOF {		enum TOF {
//===------------------------------------------------------------------===//		//===------------------------------------------------------------------===//

llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp

This file was added.

				//===-- PPCGenScalarMASSEntries.cpp ---------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This transformation converts standard math functions and LLVM math intrinsics
				// into their corresponding MASS (scalar) entries for PowerPC targets.
				bmahjour: Since LLVM math intrinsic lowerings are done in ISelLowering, this comment should not say "and LLVM math intrinsics".
				// Following are examples of such conversion:
				// tanh ---> __xl_tanh_finite
				// llvm.cos.f32 --> __xl_cosf_finite
				// Such lowering is legal under the fast-math option.
				bmahjour: llvm.cos.f32 is an intrinsic and not handled by this transformation.
				//
				//===----------------------------------------------------------------------===//

				#include "PPC.h"
				#include "PPCSubtarget.h"
				#include "PPCTargetMachine.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/CodeGen/TargetPassConfig.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/Module.h"

				#define DEBUG_TYPE "ppc-gen-scalar-mass"

				using namespace llvm;

				namespace {

				class PPCGenScalarMASSEntries : public ModulePass {
				public:
				static char ID;

				PPCGenScalarMASSEntries() : ModulePass(ID) {
				ScalarMASSFuncs = {
				#define TLI_DEFINE_SCALAR_MASS_FUNCS
				#include "llvm/Analysis/ScalarFuncs.def"
				};
				}

				bool runOnModule(Module &M) override;

				StringRef getPassName() const override {
				return "PPC Generate Scalar MASS Entries";
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<TargetTransformInfoWrapperPass>();
				}

				private:
				std::map<StringRef, StringRef> ScalarMASSFuncs;
				bool isCandidateSafeToLower(const CallInst &CI) const;
				bool isFiniteCallSafe(const CallInst &CI) const;
				bool createScalarMASSCall(StringRef MASSEntry, CallInst &CI,
				Function &Func) const;
				};

				} // namespace

				// Returns true if 'afn' flag exists on the call instruction with the math
				// function
				bool PPCGenScalarMASSEntries::isCandidateSafeToLower(const CallInst &CI) const {
				return CI.hasApproxFunc();
				}

				// Returns true if 'fast' flag exists on the call instruction with the math
				// function
				bool PPCGenScalarMASSEntries::isFiniteCallSafe(const CallInst &CI) const {
				bmahjour: There should be a todo comment to handle non-finite entries using fewer fast-math flags.
				// checking fast flag instead of nnan/ninf/afn because errno and
				// trapping-math don't have IR representation.
				bmahjour: remove this line
				bmahjour: ...but errno and trapping-math would be an issue for non-finite entries as well. Again, I think this function should just check for nnan/ninf/afn flags. We need to find out (with the help of the wider community) how to deal with the concerns surrounding errno and traps separately. One way to do that would be to broaden the definition of the `afn` flag to include no-errno and no-trapping semantics. Another way might be to make clang FE set the `afn` bit only if `-fno-math-errno` and `-fno-trapping-math` options are enabled (less desirable). A third way might be to add corresponding function attributes to the IR for `-fno-math-errno` and `-fno-trapping-math`. Once these issues are sorted out, we can add the appropriate constraints to the `isCandidateSafeToLower` function.
				return CI.isFast();
				}

				/// Lowers scalar math function or math intrinsic \p Func to its PowerPC
				/// target-specific entry in the scalar MASS library.
				/// e.g.: tanh --> __xl_tanh_finite or __xl_tanh
				/// llvm.cos.f32 --> __xl_cosf_finite or __xl_cosf
				/// Both function prototype and its callsite is updated during lowering.
				bool PPCGenScalarMASSEntries::createScalarMASSCall(StringRef MASSEntry,
				CallInst &CI,
				Function &Func) const {
				if (CI.use_empty())
				return false;

				Module *M = Func.getParent();
				assert(M && "Expecting a valid Module");

				std::string MASSEntryStr = MASSEntry.str();
				if (isFiniteCallSafe(CI))
				MASSEntryStr += "_finite";

				FunctionCallee FCache = M->getOrInsertFunction(
				MASSEntryStr, Func.getFunctionType(), Func.getAttributes());

				CI.setCalledFunction(FCache);

				return true;
				}

				bool PPCGenScalarMASSEntries::runOnModule(Module &M) {
				bool Changed = false;

				auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
				if (!TPC || skipModule(M))
				return false;

				for (Function &Func : M) {
				if (!Func.isDeclaration())
				continue;

				auto Iter = ScalarMASSFuncs.find(Func.getName());
				if (Iter == ScalarMASSFuncs.end())
				continue;

				// The call to createScalarMASSCall() invalidates the iterator over users
				// upon replacing the users. Precomputing the current list of users allows
				// us to replace all the call sites.
				SmallVector<User *, 4> TheUsers;
				for (auto *User : Func.users())
				TheUsers.push_back(User);

				for (auto *User : TheUsers)
				if (auto *CI = dyn_cast_or_null<CallInst>(User)) {
				if (isCandidateSafeToLower(*CI))
				Changed |= createScalarMASSCall(Iter->second, *CI, Func);
				}
				}

				return Changed;
				}

				char PPCGenScalarMASSEntries::ID = 0;

				char &llvm::PPCGenScalarMASSEntriesID = PPCGenScalarMASSEntries::ID;

				INITIALIZE_PASS(PPCGenScalarMASSEntries, DEBUG_TYPE,
				"Generate Scalar MASS entries", false, false)

				ModulePass *llvm::createPPCGenScalarMASSEntriesPass() {
				return new PPCGenScalarMASSEntries();
				}
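The review comments on `isFiniteCallSafe` suggest keying the `_finite` suffix on the `nnan`/`ninf`/`afn` flags rather than on the full `fast` flag. As a rough illustration of that suggestion only (not the code in this patch, which calls `CI.isFast()`), the selection logic could look like the sketch below. `CallFlags`, `isCandidate`, `isFiniteEntrySafe`, and `pickMassEntry` are hypothetical names, and the errno and trapping-math concerns raised in the review are deliberately left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the per-call fast-math flags named in the review.
struct CallFlags {
  bool NoNaNs;     // nnan
  bool NoInfs;     // ninf
  bool ApproxFunc; // afn
};

// A call is a candidate only when 'afn' is present, matching what
// isCandidateSafeToLower checks in the patch.
bool isCandidate(const CallFlags &F) { return F.ApproxFunc; }

// Reviewer-suggested alternative (an assumption here): require
// nnan + ninf + afn instead of the full 'fast' flag.
bool isFiniteEntrySafe(const CallFlags &F) {
  return F.NoNaNs && F.NoInfs && F.ApproxFunc;
}

// Returns the entry name to call, or an empty string when the call is left
// untouched. The "_finite" suffix follows createScalarMASSCall in the patch.
std::string pickMassEntry(const std::string &MassEntry, const CallFlags &F) {
  if (!isCandidate(F))
    return std::string();
  return isFiniteEntrySafe(F) ? MassEntry + "_finite" : MassEntry;
}
```

With this shape, a call carrying only `afn` would still be lowered, but to the non-finite entry, which is the case the requested TODO comment is about.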

llvm/lib/Target/PowerPC/PPCISelLowering.cpp


[...]	PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FCOS , MVT::f32, Expand);		setOperationAction(ISD::FCOS , MVT::f32, Expand);
setOperationAction(ISD::FSINCOS, MVT::f32, Expand);		setOperationAction(ISD::FSINCOS, MVT::f32, Expand);
setOperationAction(ISD::FREM , MVT::f32, Expand);		setOperationAction(ISD::FREM , MVT::f32, Expand);
setOperationAction(ISD::FPOW , MVT::f32, Expand);		setOperationAction(ISD::FPOW , MVT::f32, Expand);
if (Subtarget.hasSPE()) {		if (Subtarget.hasSPE()) {
setOperationAction(ISD::FMA , MVT::f64, Expand);		setOperationAction(ISD::FMA , MVT::f64, Expand);
setOperationAction(ISD::FMA , MVT::f32, Expand);		setOperationAction(ISD::FMA , MVT::f32, Expand);
} else {		} else {
setOperationAction(ISD::FMA , MVT::f64, Legal);		setOperationAction(ISD::FMA , MVT::f64, Legal);
		bmahjour: What about tan, acos, and the others?
		masoud.ataei (Author): This is the list of math functions for which LLVM creates intrinsic calls. There is no LLVM intrinsic for tan, acos, and the other math functions that exist in MASS but are not in this list.
setOperationAction(ISD::FMA , MVT::f32, Legal);		setOperationAction(ISD::FMA , MVT::f32, Legal);
}		}

if (Subtarget.hasSPE())		if (Subtarget.hasSPE())
setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);		setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);

setOperationAction(ISD::FLT_ROUNDS_, MVT::i32, Custom);		setOperationAction(ISD::FLT_ROUNDS_, MVT::i32, Custom);

[...]
setLibcallName(RTLIB::ROUND_F128, "roundf128");		setLibcallName(RTLIB::ROUND_F128, "roundf128");
setLibcallName(RTLIB::LROUND_F128, "lroundf128");		setLibcallName(RTLIB::LROUND_F128, "lroundf128");
setLibcallName(RTLIB::LLROUND_F128, "llroundf128");		setLibcallName(RTLIB::LLROUND_F128, "llroundf128");
setLibcallName(RTLIB::RINT_F128, "rintf128");		setLibcallName(RTLIB::RINT_F128, "rintf128");
setLibcallName(RTLIB::LRINT_F128, "lrintf128");		setLibcallName(RTLIB::LRINT_F128, "lrintf128");
setLibcallName(RTLIB::LLRINT_F128, "llrintf128");		setLibcallName(RTLIB::LLRINT_F128, "llrintf128");
setLibcallName(RTLIB::NEARBYINT_F128, "nearbyintf128");		setLibcallName(RTLIB::NEARBYINT_F128, "nearbyintf128");
setLibcallName(RTLIB::FMA_F128, "fmaf128");		setLibcallName(RTLIB::FMA_F128, "fmaf128");

		bmahjour: Why are these being handled here instead of in `PPCGenScalarMASSEntries.cpp`?
		masoud.ataei: We are not handling LLVM intrinsics in `PPCGenScalarMASSEntries.cpp` because we don't want to block existing optimizations (like pow(x,0.5) --> sqrt(x)) or future ones (like https://reviews.llvm.org/D94543 ?).
		bmahjour: I see. Could you please put a comment in the code to explain that? Alternatively, you can put the comment at the top of `llvm/include/llvm/Analysis/ScalarFuncs.def`.
		bmahjour: Instead of `TM.Options.UnsafeFPMath` we should test for the individual fast-math flags that are required for safety. Checking for "unsafe-fp-math" has a few drawbacks:
		- To make clang enable that flag, it is necessary but not sufficient to specify `-funsafe-math-optimizations`; you'd have to specify `-fno-math-errno` as well.
		- Clang sets the "unsafe-fp-math" flag when all four of `-fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros` are specified, regardless of other flags. For example, this command does the conversion to the _finite calls despite the user's request to honor NaNs: `clang t.c -c -O3 -fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros -fhonor-nans`
		Even if the clang inconsistencies are resolved, it would still be better to check for the individual flags, for finer control and for consistency with other front ends.
		// MASS transformation for LLVM intrinsics with replicating fast-math flag
		// to be consistent to PPCGenScalarMASSEntries pass
		if (TM.Options.PPCGenScalarMASSEntries && TM.Options.ApproxFuncFPMath) {
		bmahjour: Why do you still check for `TM.Options.UnsafeFPMath`? If you do it out of concern for `-fno-math-errno`, then it's not needed. Note that the documentation for these LLVM intrinsics already says that their semantics are identical to their libm counterparts but "without trapping or setting errno".
		bmahjour: If someone compiles with -Ofast without any extra options, would `TM.Options.ApproxFuncFPMath` be true here?
		masoud.ataei: In the clang changes, I had `Options.ApproxFuncFPMath = LangOpts.ApproxFunc;` in `clang/lib/CodeGen/BackendUtil.cpp`. That line updates this TM option from the clang approximate-func option, and -Ofast sets the clang approximate-func option. So the answer to your question is yes.
		if (TM.Options.NoInfsFPMath && TM.Options.NoNaNsFPMath &&
		TM.Options.NoSignedZerosFPMath) {
		setLibcallName(RTLIB::COS_F64, "__xl_cos_finite");
		setLibcallName(RTLIB::COS_F32, "__xl_cosf_finite");
		setLibcallName(RTLIB::EXP_F64, "__xl_exp_finite");
		setLibcallName(RTLIB::EXP_F32, "__xl_expf_finite");
		setLibcallName(RTLIB::LOG_F64, "__xl_log_finite");
		setLibcallName(RTLIB::LOG_F32, "__xl_logf_finite");
		setLibcallName(RTLIB::LOG10_F64, "__xl_log10_finite");
		setLibcallName(RTLIB::LOG10_F32, "__xl_log10f_finite");
		setLibcallName(RTLIB::POW_F64, "__xl_pow_finite");
		setLibcallName(RTLIB::POW_F32, "__xl_powf_finite");
		setLibcallName(RTLIB::SIN_F64, "__xl_sin_finite");
		setLibcallName(RTLIB::SIN_F32, "__xl_sinf_finite");
		} else {
		setLibcallName(RTLIB::COS_F64, "__xl_cos");
		setLibcallName(RTLIB::COS_F32, "__xl_cosf");
		setLibcallName(RTLIB::EXP_F64, "__xl_exp");
		setLibcallName(RTLIB::EXP_F32, "__xl_expf");
		setLibcallName(RTLIB::LOG_F64, "__xl_log");
		setLibcallName(RTLIB::LOG_F32, "__xl_logf");
		setLibcallName(RTLIB::LOG10_F64, "__xl_log10");
		setLibcallName(RTLIB::LOG10_F32, "__xl_log10f");
		setLibcallName(RTLIB::POW_F64, "__xl_pow");
		setLibcallName(RTLIB::POW_F32, "__xl_powf");
		setLibcallName(RTLIB::SIN_F64, "__xl_sin");
		setLibcallName(RTLIB::SIN_F32, "__xl_sinf");
		}
		}
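
The name selection in the hunk above can be modelled outside LLVM. The following is a minimal sketch, not the patch's code: `FPMathOptions` and `massLibcallName` are hypothetical names, and the struct only mirrors the `TM.Options` fields that the hunk reads. It assumes every affected routine follows the `__xl_<name>` / `__xl_<name>_finite` pattern seen in the `setLibcallName` calls.

```cpp
#include <string>

// Hypothetical stand-in for the TargetOptions fields read by the hunk.
struct FPMathOptions {
  bool GenScalarMASSEntries = false; // models TM.Options.PPCGenScalarMASSEntries
  bool ApproxFunc = false;           // models TM.Options.ApproxFuncFPMath
  bool NoInfs = false;               // models TM.Options.NoInfsFPMath
  bool NoNaNs = false;               // models TM.Options.NoNaNsFPMath
  bool NoSignedZeros = false;        // models TM.Options.NoSignedZerosFPMath
};

// Returns the libcall name that would be used for `base` ("cos", "powf", ...).
// Without both the MASS option and approximate functions, the name is unchanged.
std::string massLibcallName(const std::string &base, const FPMathOptions &o) {
  if (!(o.GenScalarMASSEntries && o.ApproxFunc))
    return base;
  std::string name = "__xl_" + base;
  // The _finite entries are chosen only when infinities, NaNs, and signed
  // zeros may all be ignored.
  if (o.NoInfs && o.NoNaNs && o.NoSignedZeros)
    name += "_finite";
  return name;
}
```

In this model, dropping any one of the three no-infs/no-NaNs/no-signed-zeros options falls back to the plain `__xl_` entry, which matches the `else` branch of the hunk.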

// With 32 condition bits, we don't need to sink (and duplicate) compares		// With 32 condition bits, we don't need to sink (and duplicate) compares
// aggressively in CodeGenPrep.		// aggressively in CodeGenPrep.
if (Subtarget.useCRBits()) {		if (Subtarget.useCRBits()) {
setHasMultipleConditionRegisters();		setHasMultipleConditionRegisters();
setJumpIsExpensive();		setJumpIsExpensive();
}		}

setMinFunctionAlignment(Align(4));		setMinFunctionAlignment(Align(4));
[15,980 lines not shown]	case PPC::AM_DQForm: {
// register and the displacement will be the immediate unless it		// register and the displacement will be the immediate unless it
// isn't sufficiently aligned.		// isn't sufficiently aligned.
if (Flags & PPC::MOF_RPlusSImm16) {		if (Flags & PPC::MOF_RPlusSImm16) {
SDValue Op0 = N.getOperand(0);		SDValue Op0 = N.getOperand(0);
SDValue Op1 = N.getOperand(1);		SDValue Op1 = N.getOperand(1);
int16_t Imm = cast<ConstantSDNode>(Op1)->getAPIntValue().getZExtValue();		int16_t Imm = cast<ConstantSDNode>(Op1)->getAPIntValue().getZExtValue();
if (!Align || isAligned(*Align, Imm)) {		if (!Align || isAligned(*Align, Imm)) {
Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());		Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());
Base = Op0;		Base = Op0;
		bmahjour: [nit] A better name would be `lowerLibCallBasedOnType`.
if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) {		if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) {
Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());		Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());
fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());		fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());
}		}
break;		break;
}		}
}		}
// This is a register plus the @lo relocation. The base is the register		// This is a register plus the @lo relocation. The base is the register
[73 lines not shown]

llvm/lib/Target/PowerPC/PPCTargetMachine.cpp

[91 lines not shown]
EnableMachineCombinerPass("ppc-machine-combiner",		EnableMachineCombinerPass("ppc-machine-combiner",
cl::desc("Enable the machine combiner pass"),		cl::desc("Enable the machine combiner pass"),
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

static cl::opt<bool>		static cl::opt<bool>
ReduceCRLogical("ppc-reduce-cr-logicals",		ReduceCRLogical("ppc-reduce-cr-logicals",
cl::desc("Expand eligible cr-logical binary ops to branches"),		cl::desc("Expand eligible cr-logical binary ops to branches"),
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

		static cl::opt<bool> EnablePPCGenScalarMASSEntries(
		"enable-ppc-gen-scalar-mass", cl::init(false),
		cl::desc("Enable lowering math functions to their corresponding MASS "
		"(scalar) entries"),
		cl::Hidden);

extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializePowerPCTarget() {		extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializePowerPCTarget() {
// Register the targets		// Register the targets
RegisterTargetMachine<PPCTargetMachine> A(getThePPC32Target());		RegisterTargetMachine<PPCTargetMachine> A(getThePPC32Target());
RegisterTargetMachine<PPCTargetMachine> B(getThePPC32LETarget());		RegisterTargetMachine<PPCTargetMachine> B(getThePPC32LETarget());
RegisterTargetMachine<PPCTargetMachine> C(getThePPC64Target());		RegisterTargetMachine<PPCTargetMachine> C(getThePPC64Target());
RegisterTargetMachine<PPCTargetMachine> D(getThePPC64LETarget());		RegisterTargetMachine<PPCTargetMachine> D(getThePPC64LETarget());

PassRegistry &PR = *PassRegistry::getPassRegistry();		PassRegistry &PR = *PassRegistry::getPassRegistry();
[10 lines not shown]	#endif
initializePPCBSelPass(PR);		initializePPCBSelPass(PR);
initializePPCBranchCoalescingPass(PR);		initializePPCBranchCoalescingPass(PR);
initializePPCBoolRetToIntPass(PR);		initializePPCBoolRetToIntPass(PR);
initializePPCExpandISELPass(PR);		initializePPCExpandISELPass(PR);
initializePPCPreEmitPeepholePass(PR);		initializePPCPreEmitPeepholePass(PR);
initializePPCTLSDynamicCallPass(PR);		initializePPCTLSDynamicCallPass(PR);
initializePPCMIPeepholePass(PR);		initializePPCMIPeepholePass(PR);
initializePPCLowerMASSVEntriesPass(PR);		initializePPCLowerMASSVEntriesPass(PR);
		initializePPCGenScalarMASSEntriesPass(PR);
initializeGlobalISel(PR);		initializeGlobalISel(PR);
}		}

static bool isLittleEndianTriple(const Triple &T) {		static bool isLittleEndianTriple(const Triple &T) {
return T.getArch() == Triple::ppc64le || T.getArch() == Triple::ppcle;		return T.getArch() == Triple::ppc64le || T.getArch() == Triple::ppcle;
}		}

/// Return the datalayout string of a subtarget.		/// Return the datalayout string of a subtarget.
[289 lines not shown]
void PPCPassConfig::addIRPasses() {		void PPCPassConfig::addIRPasses() {
if (TM->getOptLevel() != CodeGenOpt::None)		if (TM->getOptLevel() != CodeGenOpt::None)
addPass(createPPCBoolRetToIntPass());		addPass(createPPCBoolRetToIntPass());
addPass(createAtomicExpandPass());		addPass(createAtomicExpandPass());

// Lower generic MASSV routines to PowerPC subtarget-specific entries.		// Lower generic MASSV routines to PowerPC subtarget-specific entries.
addPass(createPPCLowerMASSVEntriesPass());		addPass(createPPCLowerMASSVEntriesPass());

		// Generate PowerPC target-specific entries for scalar math functions
		// that are available in IBM MASS (scalar) library.
		if (TM->getOptLevel() != CodeGenOpt::None && EnablePPCGenScalarMASSEntries) {
		TM->Options.PPCGenScalarMASSEntries = EnablePPCGenScalarMASSEntries;
		addPass(createPPCGenScalarMASSEntriesPass());
		}
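
The gating above can be illustrated with a small standalone model. This is a sketch under stated assumptions: `OptLevel`, `irPasses`, and the string labels are hypothetical and stand in for `CodeGenOpt`, `PPCPassConfig::addIRPasses`, and the real pass objects. It covers only the portion of `addIRPasses` shown in this hunk.

```cpp
#include <string>
#include <vector>

// Hypothetical mirror of the optimization levels compared in the hunk.
enum class OptLevel { None, Less, Default, Aggressive };

// Lists, in order, the passes the shown part of addIRPasses would schedule.
// The labels are illustrative strings, not registered LLVM pass names.
std::vector<std::string> irPasses(OptLevel level, bool enableScalarMASS) {
  std::vector<std::string> passes;
  if (level != OptLevel::None)
    passes.push_back("ppc-bool-ret-to-int");
  passes.push_back("atomic-expand");
  passes.push_back("ppc-lower-massv-entries");
  // The scalar MASS pass needs both optimization and the hidden option.
  if (level != OptLevel::None && enableScalarMASS)
    passes.push_back("ppc-gen-scalar-mass-entries");
  return passes;
}
```

In this model, passing the option at `OptLevel::None` schedules nothing extra, so an unoptimized build with `-enable-ppc-gen-scalar-mass` would keep calling the ordinary math routines.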

// If explicitly requested, add explicit data prefetch intrinsics.		// If explicitly requested, add explicit data prefetch intrinsics.
if (EnablePrefetch.getNumOccurrences() > 0)		if (EnablePrefetch.getNumOccurrences() > 0)
addPass(createLoopDataPrefetchPass());		addPass(createLoopDataPrefetchPass());

if (TM->getOptLevel() >= CodeGenOpt::Default && EnableGEPOpt) {		if (TM->getOptLevel() >= CodeGenOpt::Default && EnableGEPOpt) {
// Call SeparateConstOffsetFromGEP pass to extract constants within indices		// Call SeparateConstOffsetFromGEP pass to extract constants within indices
// and lower a GEP with multiple indices to either arithmetic operations or		// and lower a GEP with multiple indices to either arithmetic operations or
// multiple GEPs with single index.		// multiple GEPs with single index.
[148 lines not shown]

llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				declare float @llvm.cos.f32(float)
				declare float @llvm.exp.f32(float)
				declare float @llvm.log10.f32(float)
				declare float @llvm.log.f32(float)
				declare float @llvm.pow.f32(float, float)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.cos.f64(double)
				declare double @llvm.exp.f64(double)
				declare double @llvm.log.f64(double)
				declare double @llvm.log10.f64(double)
				declare double @llvm.pow.f64(double, double)
				declare double @llvm.sin.f64(double)

				; With afn flag specified per-function
				define float @cosf_f32(float %a) #1 {
				; CHECK-LABEL: cosf_f32
				; CHECK: __xl_cosf
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.cos.f32(float %a)
				ret float %0
				}

				; With afn flag specified per-function
				define float @expf_f32(float %a) #1 {
				; CHECK-LABEL: expf_f32
				; CHECK: __xl_expf
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.exp.f32(float %a)
				ret float %0
				}

				; With afn flag specified per-function
				define float @log10f_f32(float %a) #1 {
				; CHECK-LABEL: log10f_f32
				; CHECK: __xl_log10f
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.log10.f32(float %a)
				ret float %0
				}

				; With afn flag specified per-function
				define float @logf_f32(float %a) #1 {
				; CHECK-LABEL: logf_f32
				; CHECK: __xl_logf
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.log.f32(float %a)
				ret float %0
				}

				; With afn flag specified per-function
				define float @powf_f32(float %a, float %b) #1 {
				; CHECK-LABEL: powf_f32
				; CHECK: __xl_powf
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				; With afn flag specified per-function
				define float @rintf_f32(float %a) #1 {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: bl __xl_rintf
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.rint.f32(float %a)
				ret float %0
				}

				; With afn flag specified per-function
				define float @sinf_f32(float %a) #1 {
				; CHECK-LABEL: sinf_f32
				; CHECK: __xl_sinf
				; CHECK: blr
				entry:
				%0 = tail call afn float @llvm.sin.f32(float %a)
				ret float %0
				}

				; With afn flag specified per-function
				define double @cos_f64(double %a) #1 {
				; CHECK-LABEL: cos_f64
				; CHECK: __xl_cos
				; CHECK: blr
				entry:
				%0 = tail call afn double @llvm.cos.f64(double %a)
				ret double %0
				}

				; With afn flag specified per-function
				define double @exp_f64(double %a) #1 {
				; CHECK-LABEL: exp_f64
				; CHECK: __xl_exp
				; CHECK: blr
				entry:
				%0 = tail call afn double @llvm.exp.f64(double %a)
				ret double %0
				}

				; With afn flag specified per-function
				define double @log_f64(double %a) #1 {
				; CHECK-LABEL: log_f64
				; CHECK: __xl_log
				; CHECK: blr
				entry:
				%0 = tail call afn double @llvm.log.f64(double %a)
				ret double %0
				}

				; With afn flag specified per-function
				define double @log10_f64(double %a) #1 {
				; CHECK-LABEL: log10_f64
				; CHECK: __xl_log10
				; CHECK: blr
				entry:
				%0 = tail call afn double @llvm.log10.f64(double %a)
				ret double %0
				}

				; With afn flag specified per-function
				define double @pow_f64(double %a, double %b) #1 {
				; CHECK-LABEL: pow_f64
				; CHECK: __xl_pow
				; CHECK: blr
				entry:
				%0 = tail call afn double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				; With afn flag specified per-function
				define double @sin_f64(double %a) #1 {
				; CHECK-LABEL: sin_f64
				; CHECK: __xl_sin
				; CHECK: blr
				entry:
				%0 = tail call afn double @llvm.sin.f64(double %a)
				ret double %0
				}

				attributes #1 = { "approx-func-fp-math"="true" }
				bmahjour: All the calls have `afn`. Why do we need this attribute?
				masoud.ataei: Removed.

llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				declare float @llvm.cos.f32(float)
				declare float @llvm.exp.f32(float)
				declare float @llvm.log10.f32(float)
				declare float @llvm.log.f32(float)
				declare float @llvm.pow.f32(float, float)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.cos.f64(double)
				declare double @llvm.exp.f64(double)
				declare double @llvm.log.f64(double)
				declare double @llvm.log10.f64(double)
				declare double @llvm.pow.f64(double, double)
				declare double @llvm.sin.f64(double)

				; With fast math flag specified per-function
				define float @cosf_f32(float %a) #1 {
				; CHECK-LABEL: cosf_f32
				; CHECK: __xl_cosf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.cos.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @expf_f32(float %a) #1 {
				; CHECK-LABEL: expf_f32
				; CHECK: __xl_expf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.exp.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @log10f_f32(float %a) #1 {
				; CHECK-LABEL: log10f_f32
				; CHECK: __xl_log10f_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.log10.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @logf_f32(float %a) #1 {
				; CHECK-LABEL: logf_f32
				; CHECK: __xl_logf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.log.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @powf_f32(float %a, float %b) #1 {
				; CHECK-LABEL: powf_f32
				; CHECK: __xl_powf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @rintf_f32(float %a) #1 {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: bl __xl_rintf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.rint.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @sinf_f32(float %a) #1 {
				; CHECK-LABEL: sinf_f32
				; CHECK: __xl_sinf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.sin.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define double @cos_f64(double %a) #1 {
				; CHECK-LABEL: cos_f64
				; CHECK: __xl_cos_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.cos.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @exp_f64(double %a) #1 {
				; CHECK-LABEL: exp_f64
				; CHECK: __xl_exp_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.exp.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @log_f64(double %a) #1 {
				; CHECK-LABEL: log_f64
				; CHECK: __xl_log_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.log.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @log10_f64(double %a) #1 {
				; CHECK-LABEL: log10_f64
				; CHECK: __xl_log10_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.log10.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @pow_f64(double %a, double %b) #1 {
				; CHECK-LABEL: pow_f64
				; CHECK: __xl_pow_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @sin_f64(double %a) #1 {
				; CHECK-LABEL: sin_f64
				; CHECK: __xl_sin_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.sin.f64(double %a)
				ret double %0
				}

				attributes #1 = { "no-infs-fp-math"="true" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "approx-func-fp-math"="true" }
				bmahjour: See the above comment and remove "unsafe-fp-math".
				bmahjour: Do we need this attribute? Can we remove it or have separate tests for functions with attributes?
				masoud.ataei: Removed.

llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				bmahjour: Why not just use the default `CHECK` prefix? `CHECK-ALL` and `CHECK-LWR` don't distinguish anything based on this run command.
				bmahjour: We don't really need a separate AIX file. Can we just add a run line with the AIX triple to `llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll`?
				masoud.ataei: Done.
				declare float @llvm.cos.f32(float)
				declare double @llvm.cos.f64(double)
				declare float @llvm.exp.f32(float)
				declare double @llvm.exp.f64(double)
				declare float @llvm.log10.f32(float)
				declare double @llvm.log10.f64(double)
				declare float @llvm.log.f32(float)
				declare double @llvm.log.f64(double)
				declare float @llvm.pow.f32(float, float)
				declare double @llvm.pow.f64(double, double)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.sin.f64(double)

				define float @cosf_f32(float %a) {
				; CHECK-LABEL: cosf_f32
				; CHECK-NOT: bl __xl_cosf_finite
				; CHECK: blr
				bmahjour: `CHECK-DFLT` is not in the list of defined prefixes.
				entry:
				%0 = tail call float @llvm.cos.f32(float %a)
				ret float %0
				}

				define double @cos_f64(double %a) {
				; CHECK-LABEL: cos_f64
				; CHECK-NOT: bl __xl_cos_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.cos.f64(double %a)
				ret double %0
				}

				define float @expf_f32(float %a) {
				; CHECK-LABEL: expf_f32
				; CHECK-NOT: bl __xl_expf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.exp.f32(float %a)
				ret float %0
				}

				define double @exp_f64(double %a) {
				; CHECK-LABEL: exp_f64
				; CHECK-NOT: bl __xl_exp_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.exp.f64(double %a)
				ret double %0
				}

				define float @log10f_f32(float %a) {
				; CHECK-LABEL: log10f_f32
				; CHECK-NOT: bl __xl_log10f_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.log10.f32(float %a)
				ret float %0
				}

				define double @log_f64(double %a) {
				; CHECK-LABEL: log_f64
				; CHECK-NOT: bl __xl_log_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.log.f64(double %a)
				ret double %0
				}

				define float @logf_f32(float %a) {
				; CHECK-LABEL: logf_f32
				; CHECK-NOT: bl __xl_logf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.log.f32(float %a)
				ret float %0
				}

				define double @log10_f64(double %a) {
				; CHECK-LABEL: log10_f64
				; CHECK-NOT: bl __xl_log10_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.log10.f64(double %a)
				ret double %0
				}

				define float @powf_f32(float %a, float %b) {
				; CHECK-LABEL: powf_f32
				; CHECK-NOT: bl __xl_powf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				define double @pow_f64(double %a, double %b) {
				; CHECK-LABEL: pow_f64
				; CHECK-NOT: bl __xl_pow_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				define float @rintf_f32(float %a) {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: bl __xl_rintf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.rint.f32(float %a)
				ret float %0
				}

				define float @sinf_f32(float %a) {
				; CHECK-LABEL: sinf_f32
				; CHECK-NOT: bl __xl_sinf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.sin.f32(float %a)
				ret float %0
				}

				define double @sin_f64(double %a) {
				; CHECK-LABEL: sin_f64
				; CHECK-NOT: bl __xl_sin_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.sin.f64(double %a)
				ret double %0
				}

llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s

				declare float @llvm.cos.f32(float)
				declare float @llvm.exp.f32(float)
				declare float @llvm.log10.f32(float)
				declare float @llvm.log.f32(float)
				declare float @llvm.pow.f32(float, float)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.cos.f64(double)
				declare double @llvm.exp.f64(double)
				declare double @llvm.log.f64(double)
				declare double @llvm.log10.f64(double)
				declare double @llvm.pow.f64(double, double)
				declare double @llvm.sin.f64(double)


				; With no fast math flag specified per-function
				define float @cosf_f32_nofast(float %a) {
				; CHECK-LABEL: cosf_f32_nofast
				; CHECK-NOT: bl __xl_cosf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.cos.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @expf_f32_nofast(float %a) {
				; CHECK-LABEL: expf_f32_nofast
				; CHECK-NOT: bl __xl_expf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.exp.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @log10f_f32_nofast(float %a) {
				; CHECK-LABEL: log10f_f32_nofast
				; CHECK-NOT: bl __xl_log10f_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.log10.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @logf_f32_nofast(float %a) {
				; CHECK-LABEL: logf_f32_nofast
				; CHECK-NOT: bl __xl_logf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.log.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @powf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: powf_f32_nofast
				; CHECK-NOT: bl __xl_powf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @rintf_f32_nofast(float %a) {
				; CHECK-LABEL: rintf_f32_nofast
				; CHECK-NOT: bl __xl_rintf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.rint.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @sinf_f32_nofast(float %a) {
				; CHECK-LABEL: sinf_f32_nofast
				; CHECK-NOT: bl __xl_sinf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.sin.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define double @cos_f64_nofast(double %a) {
				; CHECK-LABEL: cos_f64_nofast
				; CHECK-NOT: bl __xl_cos_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.cos.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @exp_f64_nofast(double %a) {
				; CHECK-LABEL: exp_f64_nofast
				; CHECK-NOT: bl __xl_exp_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.exp.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @log_f64_nofast(double %a) {
				; CHECK-LABEL: log_f64_nofast
				; CHECK-NOT: bl __xl_log_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.log.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @log10_f64_nofast(double %a) {
				; CHECK-LABEL: log10_f64_nofast
				; CHECK-NOT: bl __xl_log10_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.log10.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @pow_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: pow_f64_nofast
				; CHECK-NOT: bl __xl_pow_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @sin_f64_nofast(double %a) {
				; CHECK-LABEL: sin_f64_nofast
				; CHECK-NOT: bl __xl_sin_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.sin.f64(double %a)
				ret double %0
				}
				bmahjour: Remove this line; `#1` is unused.

llvm/test/CodeGen/PowerPC/lower-scalar-mass-afn.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				declare float @acosf (float);
				declare float @acoshf (float);
				declare float @asinf (float);
				declare float @asinhf (float);
				declare float @atan2f (float);
				declare float @atanf (float);
				declare float @atanhf (float);
				declare float @cbrtf (float);
				declare float @copysignf (float, float);
				declare float @cosf (float);
				declare float @coshf (float);
				declare float @erfcf (float);
				declare float @erff (float);
				declare float @expf (float);
				declare float @expm1f (float);
				declare float @hypotf (float, float);
				declare float @lgammaf (float);
				declare float @log10f (float);
				declare float @log1pf (float);
				declare float @logf (float);
				declare float @powf (float, float);
				declare float @rintf (float);
				declare float @sinf (float);
				declare float @sinhf (float);
				declare float @tanf (float);
				declare float @tanhf (float);
				declare double @acos (double);
				declare double @acosh (double);
				declare double @anint (double);
				declare double @asin (double);
				declare double @asinh (double);
				declare double @atan (double);
				declare double @atan2 (double);
				declare double @atanh (double);
				declare double @cbrt (double);
				declare double @copysign (double, double);
				declare double @cos (double);
				declare double @cosh (double);
				declare double @cosisin (double);
				declare double @dnint (double);
				declare double @erf (double);
				declare double @erfc (double);
				declare double @exp (double);
				declare double @expm1 (double);
				declare double @hypot (double, double);
				declare double @lgamma (double);
				declare double @log (double);
				declare double @log10 (double);
				declare double @log1p (double);
				declare double @pow (double, double);
				declare double @rsqrt (double);
				declare double @sin (double);
				declare double @sincos (double);
				declare double @sinh (double);
				declare double @sqrt (double);
				declare double @tan (double);
				declare double @tanh (double);

				define float @acosf_f32(float %a) {
				; CHECK-LABEL: acosf_f32
				; CHECK: __xl_acosf
				; CHECK: blr
				entry:
				%call = tail call afn float @acosf(float %a)
				ret float %call
				}

				define float @acoshf_f32(float %a) {
				; CHECK-LABEL: acoshf_f32
				; CHECK: __xl_acoshf
				; CHECK: blr
				entry:
				%call = tail call afn float @acoshf(float %a)
				ret float %call
				}

				define float @asinf_f32(float %a) {
				; CHECK-LABEL: asinf_f32
				; CHECK: __xl_asinf
				; CHECK: blr
				entry:
				%call = tail call afn float @asinf(float %a)
				ret float %call
				}

				define float @asinhf_f32(float %a) {
				; CHECK-LABEL: asinhf_f32
				; CHECK: __xl_asinhf
				; CHECK: blr
				entry:
				%call = tail call afn float @asinhf(float %a)
				ret float %call
				}

				define float @atan2f_f32(float %a) {
				; CHECK-LABEL: atan2f_f32
				; CHECK: __xl_atan2f
				; CHECK: blr
				entry:
				%call = tail call afn float @atan2f(float %a)
				ret float %call
				}

				define float @atanf_f32(float %a) {
				; CHECK-LABEL: atanf_f32
				; CHECK: __xl_atanf
				; CHECK: blr
				entry:
				%call = tail call afn float @atanf(float %a)
				ret float %call
				}

				define float @atanhf_f32(float %a) {
				; CHECK-LABEL: atanhf_f32
				; CHECK: __xl_atanhf
				; CHECK: blr
				entry:
				%call = tail call afn float @atanhf(float %a)
				ret float %call
				}

				define float @cbrtf_f32(float %a) {
				; CHECK-LABEL: cbrtf_f32
				; CHECK: __xl_cbrtf
				; CHECK: blr
				entry:
				%call = tail call afn float @cbrtf(float %a)
				ret float %call
				}

				define float @copysignf_f32(float %a, float %b) {
				; CHECK-LABEL: copysignf_f32
				; CHECK: copysignf
				; CHECK: blr
				entry:
				%call = tail call afn float @copysignf(float %a, float %b)
				ret float %call
				}

				define float @cosf_f32(float %a) {
				; CHECK-LABEL: cosf_f32
				; CHECK: __xl_cosf
				; CHECK: blr
				entry:
				%call = tail call afn float @cosf(float %a)
				ret float %call
				}

				define float @coshf_f32(float %a) {
				; CHECK-LABEL: coshf_f32
				; CHECK: __xl_coshf
				; CHECK: blr
				entry:
				%call = tail call afn float @coshf(float %a)
				ret float %call
				}

				define float @erfcf_f32(float %a) {
				; CHECK-LABEL: erfcf_f32
				; CHECK: __xl_erfcf
				; CHECK: blr
				entry:
				%call = tail call afn float @erfcf(float %a)
				ret float %call
				}

				define float @erff_f32(float %a) {
				; CHECK-LABEL: erff_f32
				; CHECK: __xl_erff
				; CHECK: blr
				entry:
				%call = tail call afn float @erff(float %a)
				ret float %call
				}

				define float @expf_f32(float %a) {
				; CHECK-LABEL: expf_f32
				; CHECK: __xl_expf
				; CHECK: blr
				entry:
				%call = tail call afn float @expf(float %a)
				ret float %call
				}

				define float @expm1f_f32(float %a) {
				; CHECK-LABEL: expm1f_f32
				; CHECK: __xl_expm1f
				; CHECK: blr
				entry:
				%call = tail call afn float @expm1f(float %a)
				ret float %call
				}

				define float @hypotf_f32(float %a, float %b) {
				; CHECK-LABEL: hypotf_f32
				; CHECK: __xl_hypotf
				; CHECK: blr
				entry:
				%call = tail call afn float @hypotf(float %a, float %b)
				ret float %call
				}

				define float @lgammaf_f32(float %a) {
				; CHECK-LABEL: lgammaf_f32
				; CHECK: __xl_lgammaf
				; CHECK: blr
				entry:
				%call = tail call afn float @lgammaf(float %a)
				ret float %call
				}

				define float @log10f_f32(float %a) {
				; CHECK-LABEL: log10f_f32
				; CHECK: __xl_log10f
				; CHECK: blr
				entry:
				%call = tail call afn float @log10f(float %a)
				ret float %call
				}

				define float @log1pf_f32(float %a) {
				; CHECK-LABEL: log1pf_f32
				; CHECK: __xl_log1pf
				; CHECK: blr
				entry:
				%call = tail call afn float @log1pf(float %a)
				ret float %call
				}

				define float @logf_f32(float %a) {
				; CHECK-LABEL: logf_f32
				; CHECK: __xl_logf
				; CHECK: blr
				entry:
				%call = tail call afn float @logf(float %a)
				ret float %call
				}

				define float @powf_f32(float %a, float %b) {
				; CHECK-LABEL: powf_f32
				; CHECK: __xl_powf
				; CHECK: blr
				entry:
				%call = tail call afn float @powf(float %a, float %b)
				ret float %call
				}

				define float @rintf_f32(float %a) {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: __xl_rintf
				; CHECK: blr
				entry:
				%call = tail call afn float @rintf(float %a)
				ret float %call
				}

				define float @sinf_f32(float %a) {
				; CHECK-LABEL: sinf_f32
				; CHECK: __xl_sinf
				; CHECK: blr
				entry:
				%call = tail call afn float @sinf(float %a)
				ret float %call
				}

				define float @sinhf_f32(float %a) {
				; CHECK-LABEL: sinhf_f32
				; CHECK: __xl_sinhf
				; CHECK: blr
				entry:
				%call = tail call afn float @sinhf(float %a)
				ret float %call
				}

				define float @tanf_f32(float %a) {
				; CHECK-LABEL: tanf_f32
				; CHECK: __xl_tanf
				; CHECK: blr
				entry:
				%call = tail call afn float @tanf(float %a)
				ret float %call
				}

				define float @tanhf_f32(float %a) {
				; CHECK-LABEL: tanhf_f32
				; CHECK: __xl_tanhf
				; CHECK: blr
				entry:
				%call = tail call afn float @tanhf(float %a)
				ret float %call
				}

				define double @acos_f64(double %a) {
				; CHECK-LABEL: acos_f64
				; CHECK: __xl_acos
				; CHECK: blr
				entry:
				%call = tail call afn double @acos(double %a)
				ret double %call
				}

				define double @acosh_f64(double %a) {
				; CHECK-LABEL: acosh_f64
				; CHECK: __xl_acosh
				; CHECK: blr
				entry:
				%call = tail call afn double @acosh(double %a)
				ret double %call
				}

				define double @anint_f64(double %a) {
				; CHECK-LABEL: anint_f64
				; CHECK-NOT: __xl_anint
				; CHECK: blr
				entry:
				%call = tail call afn double @anint(double %a)
				ret double %call
				}

				define double @asin_f64(double %a) {
				; CHECK-LABEL: asin_f64
				; CHECK: __xl_asin
				; CHECK: blr
				entry:
				%call = tail call afn double @asin(double %a)
				ret double %call
				}

				define double @asinh_f64(double %a) {
				; CHECK-LABEL: asinh_f64
				; CHECK: __xl_asinh
				; CHECK: blr
				entry:
				%call = tail call afn double @asinh(double %a)
				ret double %call
				}

				define double @atan_f64(double %a) {
				; CHECK-LABEL: atan_f64
				; CHECK: __xl_atan
				; CHECK: blr
				entry:
				%call = tail call afn double @atan(double %a)
				ret double %call
				}

				define double @atan2_f64(double %a) {
				; CHECK-LABEL: atan2_f64
				; CHECK: __xl_atan2
				; CHECK: blr
				entry:
				%call = tail call afn double @atan2(double %a)
				ret double %call
				}

				define double @atanh_f64(double %a) {
				; CHECK-LABEL: atanh_f64
				; CHECK: __xl_atanh
				; CHECK: blr
				entry:
				%call = tail call afn double @atanh(double %a)
				ret double %call
				}

				define double @cbrt_f64(double %a) {
				; CHECK-LABEL: cbrt_f64
				; CHECK: __xl_cbrt
				; CHECK: blr
				entry:
				%call = tail call afn double @cbrt(double %a)
				ret double %call
				}

				define double @copysign_f64(double %a, double %b) {
				; CHECK-LABEL: copysign_f64
				; CHECK: copysign
				; CHECK: blr
				entry:
				%call = tail call afn double @copysign(double %a, double %b)
				ret double %call
				}

				define double @cos_f64(double %a) {
				; CHECK-LABEL: cos_f64
				; CHECK: __xl_cos
				; CHECK: blr
				entry:
				%call = tail call afn double @cos(double %a)
				ret double %call
				}

				define double @cosh_f64(double %a) {
				; CHECK-LABEL: cosh_f64
				; CHECK: __xl_cosh
				; CHECK: blr
				entry:
				%call = tail call afn double @cosh(double %a)
				ret double %call
				}

				define double @cosisin_f64(double %a) {
				; CHECK-LABEL: cosisin_f64
				; CHECK-NOT: __xl_cosisin
				; CHECK: blr
				entry:
				%call = tail call afn double @cosisin(double %a)
				ret double %call
				}

				define double @dnint_f64(double %a) {
				; CHECK-LABEL: dnint_f64
				; CHECK-NOT: __xl_dnint
				; CHECK: blr
				entry:
				%call = tail call afn double @dnint(double %a)
				ret double %call
				}

				define double @erf_f64(double %a) {
				; CHECK-LABEL: erf_f64
				; CHECK: __xl_erf
				; CHECK: blr
				entry:
				%call = tail call afn double @erf(double %a)
				ret double %call
				}

				define double @erfc_f64(double %a) {
				; CHECK-LABEL: erfc_f64
				; CHECK: __xl_erfc
				; CHECK: blr
				entry:
				%call = tail call afn double @erfc(double %a)
				ret double %call
				}

				define double @exp_f64(double %a) {
				; CHECK-LABEL: exp_f64
				; CHECK: __xl_exp
				; CHECK: blr
				entry:
				%call = tail call afn double @exp(double %a)
				ret double %call
				}

				define double @expm1_f64(double %a) {
				; CHECK-LABEL: expm1_f64
				; CHECK: __xl_expm1
				; CHECK: blr
				entry:
				%call = tail call afn double @expm1(double %a)
				ret double %call
				}

				define double @hypot_f64(double %a, double %b) {
				; CHECK-LABEL: hypot_f64
				; CHECK: __xl_hypot
				; CHECK: blr
				entry:
				%call = tail call afn double @hypot(double %a, double %b)
				ret double %call
				}

				define double @lgamma_f64(double %a) {
				; CHECK-LABEL: lgamma_f64
				; CHECK: __xl_lgamma
				; CHECK: blr
				entry:
				%call = tail call afn double @lgamma(double %a)
				ret double %call
				}

				define double @log_f64(double %a) {
				; CHECK-LABEL: log_f64
				; CHECK: __xl_log
				; CHECK: blr
				entry:
				%call = tail call afn double @log(double %a)
				ret double %call
				}

				define double @log10_f64(double %a) {
				; CHECK-LABEL: log10_f64
				; CHECK: __xl_log10
				; CHECK: blr
				entry:
				%call = tail call afn double @log10(double %a)
				ret double %call
				}

				define double @log1p_f64(double %a) {
				; CHECK-LABEL: log1p_f64
				; CHECK: __xl_log1p
				; CHECK: blr
				entry:
				%call = tail call afn double @log1p(double %a)
				ret double %call
				}

				define double @pow_f64(double %a, double %b) {
				; CHECK-LABEL: pow_f64
				; CHECK: __xl_pow
				; CHECK: blr
				entry:
				%call = tail call afn double @pow(double %a, double %b)
				ret double %call
				}

				define double @rsqrt_f64(double %a) {
				; CHECK-LABEL: rsqrt_f64
				; CHECK: __xl_rsqrt
				; CHECK: blr
				entry:
				%call = tail call afn double @rsqrt(double %a)
				ret double %call
				}

				define double @sin_f64(double %a) {
				; CHECK-LABEL: sin_f64
				; CHECK: __xl_sin
				; CHECK: blr
				entry:
				%call = tail call afn double @sin(double %a)
				ret double %call
				}

				define double @sincos_f64(double %a) {
				; CHECK-LABEL: sincos_f64
				; CHECK-NOT: __xl_sincos
				; CHECK: blr
				entry:
				%call = tail call afn double @sincos(double %a)
				ret double %call
				}

				define double @sinh_f64(double %a) {
				; CHECK-LABEL: sinh_f64
				; CHECK: __xl_sinh
				; CHECK: blr
				entry:
				%call = tail call afn double @sinh(double %a)
				ret double %call
				}

				define double @sqrt_f64(double %a) {
				; CHECK-LABEL: sqrt_f64
				; CHECK: __xl_sqrt
				; CHECK: blr
				entry:
				%call = tail call afn double @sqrt(double %a)
				ret double %call
				}

				define double @tan_f64(double %a) {
				; CHECK-LABEL: tan_f64
				; CHECK: __xl_tan
				; CHECK: blr
				entry:
				%call = tail call afn double @tan(double %a)
				ret double %call
				}

				define double @tanh_f64(double %a) {
				; CHECK-LABEL: tanh_f64
				; CHECK: __xl_tanh
				; CHECK: blr
				entry:
				%call = tail call afn double @tanh(double %a)
				ret double %call
				}
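The CHECK lines in these tests follow a naming pattern: a call flagged `afn` is expected to lower to `__xl_<name>`, and a call flagged `fast` (as in lower-scalar-mass-fast.ll) to `__xl_<name>_finite`. The sketch below is a small Python model of that pattern only. It is not the pass's implementation: `expected_symbol` and `NOT_CONVERTED` are hypothetical names, and the set of unconverted functions is read off the CHECK-NOT and plain-name CHECK lines in these tests. It also assumes a call with no fast-math flag keeps its original name, which is consistent with the CHECK-NOT lines but not asserted directly by them.

```python
# Illustrative model of the symbol names the tests check for.
# Not the pass's implementation; all names here are hypothetical.

# Functions the tests do not expect to be rewritten to an __xl_ entry
# (CHECK-NOT lines, or a CHECK on the plain name for copysign/copysignf).
NOT_CONVERTED = {
    "copysignf", "copysign", "rintf", "anint", "cosisin", "dnint", "sincos",
}


def expected_symbol(name: str, fmf: str = "") -> str:
    """Return the callee name the tests expect for a call to `name`.

    fmf is the fast-math flag on the call: "afn", "fast", or "" for none.
    """
    if name in NOT_CONVERTED or fmf not in ("afn", "fast"):
        # Assumption: calls without afn/fast keep their original name.
        return name
    suffix = "_finite" if fmf == "fast" else ""
    return f"__xl_{name}{suffix}"


if __name__ == "__main__":
    for fn, flag in [("acosf", "afn"), ("acosf", "fast"), ("acosf", ""), ("rintf", "fast")]:
        print(f"{fn:8s} {flag or '(none)':7s} -> {expected_symbol(fn, flag)}")
```

A model like this can serve as a quick cross-check when reading the test files: each `define` block should contain a CHECK line that matches the name the function returns for that block's flag.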

llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
				; RUN: llc -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				declare float @acosf (float);
				declare float @acoshf (float);
				declare float @asinf (float);
				declare float @asinhf (float);
				declare float @atan2f (float);
				declare float @atanf (float);
				declare float @atanhf (float);
				declare float @cbrtf (float);
				declare float @copysignf (float, float);
				declare float @cosf (float);
				declare float @coshf (float);
				declare float @erfcf (float);
				declare float @erff (float);
				declare float @expf (float);
				declare float @expm1f (float);
				declare float @hypotf (float, float);
				declare float @lgammaf (float);
				declare float @log10f (float);
				declare float @log1pf (float);
				declare float @logf (float);
				declare float @powf (float, float);
				declare float @rintf (float);
				declare float @sinf (float);
				declare float @sinhf (float);
				declare float @tanf (float);
				declare float @tanhf (float);
				declare double @acos (double);
				declare double @acosh (double);
				declare double @anint (double);
				declare double @asin (double);
				declare double @asinh (double);
				declare double @atan (double);
				declare double @atan2 (double);
				declare double @atanh (double);
				declare double @cbrt (double);
				declare double @copysign (double, double);
				declare double @cos (double);
				declare double @cosh (double);
				declare double @cosisin (double);
				declare double @dnint (double);
				declare double @erf (double);
				declare double @erfc (double);
				declare double @exp (double);
				declare double @expm1 (double);
				declare double @hypot (double, double);
				declare double @lgamma (double);
				declare double @log (double);
				declare double @log10 (double);
				declare double @log1p (double);
				declare double @pow (double, double);
				declare double @rsqrt (double);
				declare double @sin (double);
				declare double @sincos (double);
				declare double @sinh (double);
				declare double @sqrt (double);
				declare double @tan (double);
				declare double @tanh (double);

				define float @acosf_f32(float %a) {
				; CHECK-LABEL: acosf_f32
				; CHECK: __xl_acosf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @acosf(float %a)
				ret float %call
				}

				define float @acoshf_f32(float %a) {
				; CHECK-LABEL: acoshf_f32
				; CHECK: __xl_acoshf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @acoshf(float %a)
				ret float %call
				}

				define float @asinf_f32(float %a) {
				; CHECK-LABEL: asinf_f32
				; CHECK: __xl_asinf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @asinf(float %a)
				ret float %call
				}

				define float @asinhf_f32(float %a) {
				; CHECK-LABEL: asinhf_f32
				; CHECK: __xl_asinhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @asinhf(float %a)
				ret float %call
				}

				define float @atan2f_f32(float %a) {
				; CHECK-LABEL: atan2f_f32
				; CHECK: __xl_atan2f_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @atan2f(float %a)
				ret float %call
				}

				define float @atanf_f32(float %a) {
				; CHECK-LABEL: atanf_f32
				; CHECK: __xl_atanf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @atanf(float %a)
				ret float %call
				}

				define float @atanhf_f32(float %a) {
				; CHECK-LABEL: atanhf_f32
				; CHECK: __xl_atanhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @atanhf(float %a)
				ret float %call
				}

				define float @cbrtf_f32(float %a) {
				; CHECK-LABEL: cbrtf_f32
				; CHECK: __xl_cbrtf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @cbrtf(float %a)
				ret float %call
				}

				define float @copysignf_f32(float %a, float %b) {
				; CHECK-LABEL: copysignf_f32
				; CHECK: copysignf
				; CHECK: blr
				entry:
				%call = tail call fast float @copysignf(float %a, float %b)
				ret float %call
				}

				define float @cosf_f32(float %a) {
				; CHECK-LABEL: cosf_f32
				; CHECK: __xl_cosf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @cosf(float %a)
				ret float %call
				}

				define float @coshf_f32(float %a) {
				; CHECK-LABEL: coshf_f32
				; CHECK: __xl_coshf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @coshf(float %a)
				ret float %call
				}

				define float @erfcf_f32(float %a) {
				; CHECK-LABEL: erfcf_f32
				; CHECK: __xl_erfcf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @erfcf(float %a)
				ret float %call
				}

				define float @erff_f32(float %a) {
				; CHECK-LABEL: erff_f32
				; CHECK: __xl_erff_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @erff(float %a)
				ret float %call
				}

				define float @expf_f32(float %a) {
				; CHECK-LABEL: expf_f32
				; CHECK: __xl_expf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @expf(float %a)
				ret float %call
				}

				define float @expm1f_f32(float %a) {
				; CHECK-LABEL: expm1f_f32
				; CHECK: __xl_expm1f_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @expm1f(float %a)
				ret float %call
				}

				define float @hypotf_f32(float %a, float %b) {
				; CHECK-LABEL: hypotf_f32
				; CHECK: __xl_hypotf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @hypotf(float %a, float %b)
				ret float %call
				}

				define float @lgammaf_f32(float %a) {
				; CHECK-LABEL: lgammaf_f32
				; CHECK: __xl_lgammaf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @lgammaf(float %a)
				ret float %call
				}

				define float @log10f_f32(float %a) {
				; CHECK-LABEL: log10f_f32
				; CHECK: __xl_log10f_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @log10f(float %a)
				ret float %call
				}

				define float @log1pf_f32(float %a) {
				; CHECK-LABEL: log1pf_f32
				; CHECK: __xl_log1pf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @log1pf(float %a)
				ret float %call
				}

				define float @logf_f32(float %a) {
				; CHECK-LABEL: logf_f32
				; CHECK: __xl_logf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @logf(float %a)
				ret float %call
				}

				define float @powf_f32(float %a, float %b) {
				; CHECK-LABEL: powf_f32
				; CHECK: __xl_powf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @powf(float %a, float %b)
				ret float %call
				}

				define float @rintf_f32(float %a) {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: __xl_rintf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @rintf(float %a)
				ret float %call
				}

				define float @sinf_f32(float %a) {
				; CHECK-LABEL: sinf_f32
				; CHECK: __xl_sinf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @sinf(float %a)
				ret float %call
				}

				define float @sinhf_f32(float %a) {
				; CHECK-LABEL: sinhf_f32
				; CHECK: __xl_sinhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @sinhf(float %a)
				ret float %call
				}

				define float @tanf_f32(float %a) {
				; CHECK-LABEL: tanf_f32
				; CHECK: __xl_tanf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @tanf(float %a)
				ret float %call
				}

				define float @tanhf_f32(float %a) {
				; CHECK-LABEL: tanhf_f32
				; CHECK: __xl_tanhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @tanhf(float %a)
				ret float %call
				}

				define double @acos_f64(double %a) {
				; CHECK-LABEL: acos_f64
				; CHECK: __xl_acos_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @acos(double %a)
				ret double %call
				}

				define double @acosh_f64(double %a) {
				; CHECK-LABEL: acosh_f64
				; CHECK: __xl_acosh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @acosh(double %a)
				ret double %call
				}

				define double @anint_f64(double %a) {
				; CHECK-LABEL: anint_f64
				; CHECK-NOT: __xl_anint_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @anint(double %a)
				ret double %call
				}

				define double @asin_f64(double %a) {
				; CHECK-LABEL: asin_f64
				; CHECK: __xl_asin_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @asin(double %a)
				ret double %call
				}

				define double @asinh_f64(double %a) {
				; CHECK-LABEL: asinh_f64
				; CHECK: __xl_asinh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @asinh(double %a)
				ret double %call
				}

				define double @atan_f64(double %a) {
				; CHECK-LABEL: atan_f64
				; CHECK: __xl_atan_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @atan(double %a)
				ret double %call
				}

				define double @atan2_f64(double %a) {
				; CHECK-LABEL: atan2_f64
				; CHECK: __xl_atan2_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @atan2(double %a)
				ret double %call
				}

				define double @atanh_f64(double %a) {
				; CHECK-LABEL: atanh_f64
				; CHECK: __xl_atanh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @atanh(double %a)
				ret double %call
				}

				define double @cbrt_f64(double %a) {
				; CHECK-LABEL: cbrt_f64
				; CHECK: __xl_cbrt_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cbrt(double %a)
				ret double %call
				}

				define double @copysign_f64(double %a, double %b) {
				; CHECK-LABEL: copysign_f64
				; CHECK: copysign
				; CHECK: blr
				entry:
				%call = tail call fast double @copysign(double %a, double %b)
				ret double %call
				}

				define double @cos_f64(double %a) {
				; CHECK-LABEL: cos_f64
				; CHECK: __xl_cos_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cos(double %a)
				ret double %call
				}

				define double @cosh_f64(double %a) {
				; CHECK-LABEL: cosh_f64
				; CHECK: __xl_cosh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cosh(double %a)
				ret double %call
				}

				define double @cosisin_f64(double %a) {
				; CHECK-LABEL: cosisin_f64
				; CHECK-NOT: __xl_cosisin_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cosisin(double %a)
				ret double %call
				}

				define double @dnint_f64(double %a) {
				; CHECK-LABEL: dnint_f64
				; CHECK-NOT: __xl_dnint_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @dnint(double %a)
				ret double %call
				}

				define double @erf_f64(double %a) {
				; CHECK-LABEL: erf_f64
				; CHECK: __xl_erf_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @erf(double %a)
				ret double %call
				}

				define double @erfc_f64(double %a) {
				; CHECK-LABEL: erfc_f64
				; CHECK: __xl_erfc_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @erfc(double %a)
				ret double %call
				}

				define double @exp_f64(double %a) {
				; CHECK-LABEL: exp_f64
				; CHECK: __xl_exp_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @exp(double %a)
				ret double %call
				}

				define double @expm1_f64(double %a) {
				; CHECK-LABEL: expm1_f64
				; CHECK: __xl_expm1_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @expm1(double %a)
				ret double %call
				}

				define double @hypot_f64(double %a, double %b) {
				; CHECK-LABEL: hypot_f64
				; CHECK: __xl_hypot_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @hypot(double %a, double %b)
				ret double %call
				}

				define double @lgamma_f64(double %a) {
				; CHECK-LABEL: lgamma_f64
				; CHECK: __xl_lgamma_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @lgamma(double %a)
				ret double %call
				}

				define double @log_f64(double %a) {
				; CHECK-LABEL: log_f64
				; CHECK: __xl_log_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @log(double %a)
				ret double %call
				}

				define double @log10_f64(double %a) {
				; CHECK-LABEL: log10_f64
				; CHECK: __xl_log10_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @log10(double %a)
				ret double %call
				}

				define double @log1p_f64(double %a) {
				; CHECK-LABEL: log1p_f64
				; CHECK: __xl_log1p_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @log1p(double %a)
				ret double %call
				}

				define double @pow_f64(double %a, double %b) {
				; CHECK-LABEL: pow_f64
				; CHECK: __xl_pow_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @pow(double %a, double %b)
				ret double %call
				}

				define double @rsqrt_f64(double %a) {
				; CHECK-LABEL: rsqrt_f64
				; CHECK: __xl_rsqrt_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @rsqrt(double %a)
				ret double %call
				}

				define double @sin_f64(double %a) {
				; CHECK-LABEL: sin_f64
				; CHECK: __xl_sin_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sin(double %a)
				ret double %call
				}

				define double @sincos_f64(double %a) {
				; CHECK-LABEL: sincos_f64
				; CHECK-NOT: __xl_sincos_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sincos(double %a)
				ret double %call
				}

				define double @sinh_f64(double %a) {
				; CHECK-LABEL: sinh_f64
				; CHECK: __xl_sinh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sinh(double %a)
				ret double %call
				}

				define double @sqrt_f64(double %a) {
				; CHECK-LABEL: sqrt_f64
				; CHECK: __xl_sqrt_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sqrt(double %a)
				ret double %call
				}

				define double @tan_f64(double %a) {
				; CHECK-LABEL: tan_f64
				; CHECK: __xl_tan_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @tan(double %a)
				ret double %call
				}

				define double @tanh_f64(double %a) {
				; CHECK-LABEL: tanh_f64
				; CHECK: __xl_tanh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @tanh(double %a)
				ret double %call
				}


				; Without fast flag on the call instruction
				define float @acosf_f32_nofast(float %a) {
				; CHECK-LABEL: acosf_f32_nofast
				; CHECK-NOT: __xl_acosf_finite
				; CHECK: blr
				entry:
				%call = tail call float @acosf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @acoshf_f32_nofast(float %a) {
				; CHECK-LABEL: acoshf_f32_nofast
				; CHECK-NOT: __xl_acoshf_finite
				; CHECK: blr
				entry:
				%call = tail call float @acoshf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @asinf_f32_nofast(float %a) {
				; CHECK-LABEL: asinf_f32_nofast
				; CHECK-NOT: __xl_asinf_finite
				; CHECK: blr
				entry:
				%call = tail call float @asinf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @asinhf_f32_nofast(float %a) {
				; CHECK-LABEL: asinhf_f32_nofast
				; CHECK-NOT: __xl_asinhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @asinhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @atan2f_f32_nofast(float %a) {
				; CHECK-LABEL: atan2f_f32_nofast
				; CHECK-NOT: __xl_atan2f_finite
				; CHECK: blr
				entry:
				%call = tail call float @atan2f(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @atanf_f32_nofast(float %a) {
				; CHECK-LABEL: atanf_f32_nofast
				; CHECK-NOT: __xl_atanf_finite
				; CHECK: blr
				entry:
				%call = tail call float @atanf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @atanhf_f32_nofast(float %a) {
				; CHECK-LABEL: atanhf_f32_nofast
				; CHECK-NOT: __xl_atanhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @atanhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @cbrtf_f32_nofast(float %a) {
				; CHECK-LABEL: cbrtf_f32_nofast
				; CHECK-NOT: __xl_cbrtf_finite
				; CHECK: blr
				entry:
				%call = tail call float @cbrtf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @copysignf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: copysignf_f32_nofast
				; CHECK-NOT: __xl_copysignf_finite
				; CHECK: blr
				entry:
				%call = tail call float @copysignf(float %a, float %b)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @cosf_f32_nofast(float %a) {
				; CHECK-LABEL: cosf_f32_nofast
				; CHECK-NOT: __xl_cosf_finite
				; CHECK: blr
				entry:
				%call = tail call float @cosf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @coshf_f32_nofast(float %a) {
				; CHECK-LABEL: coshf_f32_nofast
				; CHECK-NOT: __xl_coshf_finite
				; CHECK: blr
				entry:
				%call = tail call float @coshf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @erfcf_f32_nofast(float %a) {
				; CHECK-LABEL: erfcf_f32_nofast
				; CHECK-NOT: __xl_erfcf_finite
				; CHECK: blr
				entry:
				%call = tail call float @erfcf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @erff_f32_nofast(float %a) {
				; CHECK-LABEL: erff_f32_nofast
				; CHECK-NOT: __xl_erff_finite
				; CHECK: blr
				entry:
				%call = tail call float @erff(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @expf_f32_nofast(float %a) {
				; CHECK-LABEL: expf_f32_nofast
				; CHECK-NOT: __xl_expf_finite
				; CHECK: blr
				entry:
				%call = tail call float @expf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @expm1f_f32_nofast(float %a) {
				; CHECK-LABEL: expm1f_f32_nofast
				; CHECK-NOT: __xl_expm1f_finite
				; CHECK: blr
				entry:
				%call = tail call float @expm1f(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @hypotf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: hypotf_f32_nofast
				; CHECK-NOT: __xl_hypotf_finite
				; CHECK: blr
				entry:
				%call = tail call float @hypotf(float %a, float %b)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @lgammaf_f32_nofast(float %a) {
				; CHECK-LABEL: lgammaf_f32_nofast
				; CHECK-NOT: __xl_lgammaf_finite
				; CHECK: blr
				entry:
				%call = tail call float @lgammaf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @log10f_f32_nofast(float %a) {
				; CHECK-LABEL: log10f_f32_nofast
				; CHECK-NOT: __xl_log10f_finite
				; CHECK: blr
				entry:
				%call = tail call float @log10f(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @log1pf_f32_nofast(float %a) {
				; CHECK-LABEL: log1pf_f32_nofast
				; CHECK-NOT: __xl_log1pf_finite
				; CHECK: blr
				entry:
				%call = tail call float @log1pf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @logf_f32_nofast(float %a) {
				; CHECK-LABEL: logf_f32_nofast
				; CHECK-NOT: __xl_logf_finite
				; CHECK: blr
				entry:
				%call = tail call float @logf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @powf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: powf_f32_nofast
				; CHECK-NOT: __xl_powf_finite
				; CHECK: blr
				entry:
				%call = tail call float @powf(float %a, float %b)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @rintf_f32_nofast(float %a) {
				; CHECK-LABEL: rintf_f32_nofast
				; CHECK-NOT: __xl_rintf_finite
				; CHECK: blr
				entry:
				%call = tail call float @rintf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @sinf_f32_nofast(float %a) {
				bmahjour: Shouldn't the tests starting from here move to a different file? This test file is called ...mass-fast.ll, so one would expect it to contain only tests with the fast-math flag on.
				masoud.ataei (author): Done.
				; CHECK-LABEL: sinf_f32_nofast
				; CHECK-NOT: __xl_sinf_finite
				; CHECK: blr
				entry:
				%call = tail call float @sinf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @sinhf_f32_nofast(float %a) {
				; CHECK-LABEL: sinhf_f32_nofast
				; CHECK-NOT: __xl_sinhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @sinhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @tanf_f32_nofast(float %a) {
				; CHECK-LABEL: tanf_f32_nofast
				; CHECK-NOT: __xl_tanf_finite
				; CHECK: blr
				entry:
				%call = tail call float @tanf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @tanhf_f32_nofast(float %a) {
				; CHECK-LABEL: tanhf_f32_nofast
				; CHECK-NOT: __xl_tanhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @tanhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define double @acos_f64_nofast(double %a) {
				; CHECK-LABEL: acos_f64_nofast
				; CHECK-NOT: __xl_acos_finite
				; CHECK: blr
				entry:
				%call = tail call double @acos(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @acosh_f64_nofast(double %a) {
				; CHECK-LABEL: acosh_f64_nofast
				; CHECK-NOT: __xl_acosh_finite
				; CHECK: blr
				entry:
				%call = tail call double @acosh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @anint_f64_nofast(double %a) {
				; CHECK-LABEL: anint_f64_nofast
				; CHECK-NOT: __xl_anint_finite
				; CHECK: blr
				entry:
				%call = tail call double @anint(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @asin_f64_nofast(double %a) {
				; CHECK-LABEL: asin_f64_nofast
				; CHECK-NOT: __xl_asin_finite
				; CHECK: blr
				entry:
				%call = tail call double @asin(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @asinh_f64_nofast(double %a) {
				; CHECK-LABEL: asinh_f64_nofast
				; CHECK-NOT: __xl_asinh_finite
				; CHECK: blr
				entry:
				%call = tail call double @asinh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @atan_f64_nofast(double %a) {
				; CHECK-LABEL: atan_f64_nofast
				; CHECK-NOT: __xl_atan_finite
				; CHECK: blr
				entry:
				%call = tail call double @atan(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @atan2_f64_nofast(double %a) {
				; CHECK-LABEL: atan2_f64_nofast
				; CHECK-NOT: __xl_atan2_finite
				; CHECK: blr
				entry:
				%call = tail call double @atan2(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @atanh_f64_nofast(double %a) {
				; CHECK-LABEL: atanh_f64_nofast
				; CHECK-NOT: __xl_atanh_finite
				; CHECK: blr
				entry:
				%call = tail call double @atanh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cbrt_f64_nofast(double %a) {
				; CHECK-LABEL: cbrt_f64_nofast
				; CHECK-NOT: __xl_cbrt_finite
				; CHECK: blr
				entry:
				%call = tail call double @cbrt(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @copysign_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: copysign_f64_nofast
				; CHECK-NOT: __xl_copysign_finite
				; CHECK: blr
				entry:
				%call = tail call double @copysign(double %a, double %b)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cos_f64_nofast(double %a) {
				; CHECK-LABEL: cos_f64_nofast
				; CHECK-NOT: __xl_cos_finite
				; CHECK: blr
				entry:
				%call = tail call double @cos(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cosh_f64_nofast(double %a) {
				; CHECK-LABEL: cosh_f64_nofast
				; CHECK-NOT: __xl_cosh_finite
				; CHECK: blr
				entry:
				%call = tail call double @cosh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cosisin_f64_nofast(double %a) {
				; CHECK-LABEL: cosisin_f64_nofast
				; CHECK-NOT: __xl_cosisin_finite
				; CHECK: blr
				entry:
				%call = tail call double @cosisin(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @dnint_f64_nofast(double %a) {
				; CHECK-LABEL: dnint_f64_nofast
				; CHECK-NOT: __xl_dnint_finite
				; CHECK: blr
				entry:
				%call = tail call double @dnint(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @erf_f64_nofast(double %a) {
				; CHECK-LABEL: erf_f64_nofast
				; CHECK-NOT: __xl_erf_finite
				; CHECK: blr
				entry:
				%call = tail call double @erf(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @erfc_f64_nofast(double %a) {
				; CHECK-LABEL: erfc_f64_nofast
				; CHECK-NOT: __xl_erfc_finite
				; CHECK: blr
				entry:
				%call = tail call double @erfc(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @exp_f64_nofast(double %a) {
				; CHECK-LABEL: exp_f64_nofast
				; CHECK-NOT: __xl_exp_finite
				; CHECK: blr
				entry:
				%call = tail call double @exp(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @expm1_f64_nofast(double %a) {
				; CHECK-LABEL: expm1_f64_nofast
				; CHECK-NOT: __xl_expm1_finite
				; CHECK: blr
				entry:
				%call = tail call double @expm1(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @hypot_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: hypot_f64_nofast
				; CHECK-NOT: __xl_hypot_finite
				; CHECK: blr
				entry:
				%call = tail call double @hypot(double %a, double %b)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @lgamma_f64_nofast(double %a) {
				; CHECK-LABEL: lgamma_f64_nofast
				; CHECK-NOT: __xl_lgamma_finite
				; CHECK: blr
				entry:
				%call = tail call double @lgamma(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @log_f64_nofast(double %a) {
				; CHECK-LABEL: log_f64_nofast
				; CHECK-NOT: __xl_log_finite
				; CHECK: blr
				entry:
				%call = tail call double @log(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @log10_f64_nofast(double %a) {
				; CHECK-LABEL: log10_f64_nofast
				; CHECK-NOT: __xl_log10_finite
				; CHECK: blr
				entry:
				%call = tail call double @log10(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @log1p_f64_nofast(double %a) {
				; CHECK-LABEL: log1p_f64_nofast
				; CHECK-NOT: __xl_log1p_finite
				; CHECK: blr
				entry:
				%call = tail call double @log1p(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @pow_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: pow_f64_nofast
				; CHECK-NOT: __xl_pow_finite
				; CHECK: blr
				entry:
				%call = tail call double @pow(double %a, double %b)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @rsqrt_f64_nofast(double %a) {
				; CHECK-LABEL: rsqrt_f64_nofast
				; CHECK-NOT: __xl_rsqrt_finite
				; CHECK: blr
				entry:
				%call = tail call double @rsqrt(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sin_f64_nofast(double %a) {
				; CHECK-LABEL: sin_f64_nofast
				; CHECK-NOT: __xl_sin_finite
				; CHECK: blr
				entry:
				%call = tail call double @sin(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sincos_f64_nofast(double %a) {
				; CHECK-LABEL: sincos_f64_nofast
				; CHECK-NOT: __xl_sincos_finite
				; CHECK: blr
				entry:
				%call = tail call double @sincos(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sinh_f64_nofast(double %a) {
				; CHECK-LABEL: sinh_f64_nofast
				; CHECK-NOT: __xl_sinh_finite
				; CHECK: blr
				entry:
				%call = tail call double @sinh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sqrt_f64_nofast(double %a) {
				; CHECK-LABEL: sqrt_f64_nofast
				; CHECK-NOT: __xl_sqrt_finite
				; CHECK: blr
				entry:
				%call = tail call double @sqrt(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @tan_f64_nofast(double %a) {
				; CHECK-LABEL: tan_f64_nofast
				; CHECK-NOT: __xl_tan_finite
				; CHECK: blr
				entry:
				%call = tail call double @tan(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @tanh_f64_nofast(double %a) {
				; CHECK-LABEL: tanh_f64_nofast
				; CHECK-NOT: __xl_tanh_finite
				; CHECK: blr
				entry:
				%call = tail call double @tanh(double %a)
				ret double %call
				}

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-afn.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck --check-prefix=CHECK-LNX %s
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck --check-prefix=CHECK-AIX %s

				declare float @llvm.pow.f32 (float, float);
				declare double @llvm.pow.f64 (double, double);

				; afn flag powf with 0.25
				define float @llvmintr_powf_f32_afn025(float %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_powf_f32_afn025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI0_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_afn025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C0(2) # %const.0
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call afn float @llvm.pow.f32(float %a, float 2.500000e-01)
				ret float %call
				}

				; afn flag pow with 0.25
				define double @llvmintr_pow_f64_afn025(double %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_pow_f64_afn025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI1_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI1_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_afn025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C1(2) # %const.0
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call afn double @llvm.pow.f64(double %a, double 2.500000e-01)
				ret double %call
				}

				; afn flag powf with 0.75
				define float @llvmintr_powf_f32_afn075(float %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_powf_f32_afn075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI2_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_afn075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C2(2) # %const.0
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call afn float @llvm.pow.f32(float %a, float 7.500000e-01)
				ret float %call
				}

				; afn flag pow with 0.75
				define double @llvmintr_pow_f64_afn075(double %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_pow_f64_afn075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI3_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI3_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_afn075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C3(2) # %const.0
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call afn double @llvm.pow.f64(double %a, double 7.500000e-01)
				ret double %call
				}

				; afn flag powf with 0.50
				define float @llvmintr_powf_f32_afn050(float %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_powf_f32_afn050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI4_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_afn050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C4(2) # %const.0
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call afn float @llvm.pow.f32(float %a, float 5.000000e-01)
				ret float %call
				}

				; afn flag pow with 0.50
				define double @llvmintr_pow_f64_afn050(double %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_pow_f64_afn050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI5_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_afn050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C5(2) # %const.0
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call afn double @llvm.pow.f64(double %a, double 5.000000e-01)
				ret double %call
				}
				attributes #1 = { "approx-func-fp-math"="true" }

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck --check-prefix=CHECK-LNX %s
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -enable-approx-func-fp-math -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck --check-prefix=CHECK-AIX %s

				declare float @llvm.pow.f32 (float, float);
				declare double @llvm.pow.f64 (double, double);

				; fast-math powf with 0.25
				define float @llvmintr_powf_f32_fast025(float %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_powf_f32_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xsrsqrtesp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_0@toc@ha
				; CHECK-LNX-NEXT: lfs 3, .LCPI0_0@toc@l(3)
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_1@toc@ha
				; CHECK-LNX-NEXT: lfs 4, .LCPI0_1@toc@l(3)
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_2@toc@ha
				; CHECK-LNX-NEXT: fmr 5, 3
				; CHECK-LNX-NEXT: xsmulsp 2, 1, 0
				; CHECK-LNX-NEXT: xsabsdp 1, 1
				; CHECK-LNX-NEXT: xsmaddasp 5, 2, 0
				; CHECK-LNX-NEXT: xsmulsp 0, 2, 4
				; CHECK-LNX-NEXT: lfs 2, .LCPI0_2@toc@l(3)
				; CHECK-LNX-NEXT: xssubsp 1, 1, 2
				; CHECK-LNX-NEXT: xsmulsp 0, 0, 5
				; CHECK-LNX-NEXT: xxlxor 5, 5, 5
				; CHECK-LNX-NEXT: fsel 0, 1, 0, 5
				; CHECK-LNX-NEXT: xsrsqrtesp 1, 0
				; CHECK-LNX-NEXT: xsmulsp 6, 0, 1
				; CHECK-LNX-NEXT: xsabsdp 0, 0
				; CHECK-LNX-NEXT: xsmaddasp 3, 6, 1
				; CHECK-LNX-NEXT: xsmulsp 1, 6, 4
				; CHECK-LNX-NEXT: xssubsp 0, 0, 2
				; CHECK-LNX-NEXT: xsmulsp 1, 1, 3
				; CHECK-LNX-NEXT: fsel 1, 0, 1, 5
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C0(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @llvm.pow.f32(float %a, float 2.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.25
				define double @llvmintr_pow_f64_fast025(double %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_pow_f64_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI1_0@toc@ha
				; CHECK-LNX-NEXT: addis 4, 2, .LCPI1_1@toc@ha
				; CHECK-LNX-NEXT: lfs 0, .LCPI1_0@toc@l(3)
				; CHECK-LNX-NEXT: lfs 2, .LCPI1_1@toc@l(4)
				; CHECK-LNX-NEXT: bc 12, 2, .LBB1_3
				; CHECK-LNX-NEXT: # %bb.1: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: fmr 4, 0
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 4, 1, 3
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 2
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 4
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 4, 2, .LBB1_4
				; CHECK-LNX-NEXT: .LBB1_2:
				; CHECK-LNX-NEXT: xssqrtdp 1, 1
				; CHECK-LNX-NEXT: blr
				; CHECK-LNX-NEXT: .LBB1_3:
				; CHECK-LNX-NEXT: xssqrtdp 1, 1
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 12, 2, .LBB1_2
				; CHECK-LNX-NEXT: .LBB1_4: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 0, 1, 3
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 2
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C1(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @llvm.pow.f64(double %a, double 2.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.75
				define float @llvmintr_powf_f32_fast075(float %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_powf_f32_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xsrsqrtesp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_0@toc@ha
				; CHECK-LNX-NEXT: lfs 3, .LCPI2_0@toc@l(3)
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_1@toc@ha
				; CHECK-LNX-NEXT: lfs 4, .LCPI2_1@toc@l(3)
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_2@toc@ha
				; CHECK-LNX-NEXT: fmr 5, 3
				; CHECK-LNX-NEXT: xsmulsp 2, 1, 0
				; CHECK-LNX-NEXT: xsabsdp 1, 1
				; CHECK-LNX-NEXT: xsmaddasp 5, 2, 0
				; CHECK-LNX-NEXT: xsmulsp 0, 2, 4
				; CHECK-LNX-NEXT: lfs 2, .LCPI2_2@toc@l(3)
				; CHECK-LNX-NEXT: xssubsp 1, 1, 2
				; CHECK-LNX-NEXT: xsmulsp 0, 0, 5
				; CHECK-LNX-NEXT: xxlxor 5, 5, 5
				; CHECK-LNX-NEXT: fsel 0, 1, 0, 5
				; CHECK-LNX-NEXT: xsrsqrtesp 1, 0
				; CHECK-LNX-NEXT: xsmulsp 6, 0, 1
				; CHECK-LNX-NEXT: xsmaddasp 3, 6, 1
				; CHECK-LNX-NEXT: xsmulsp 1, 6, 4
				; CHECK-LNX-NEXT: xsabsdp 4, 0
				; CHECK-LNX-NEXT: xsmulsp 1, 1, 3
				; CHECK-LNX-NEXT: xssubsp 2, 4, 2
				; CHECK-LNX-NEXT: fsel 1, 2, 1, 5
				; CHECK-LNX-NEXT: xsmulsp 1, 0, 1
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C2(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @llvm.pow.f32(float %a, float 7.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.75
				define double @llvmintr_pow_f64_fast075(double %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_pow_f64_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI3_0@toc@ha
				; CHECK-LNX-NEXT: addis 4, 2, .LCPI3_1@toc@ha
				; CHECK-LNX-NEXT: lfs 0, .LCPI3_0@toc@l(3)
				; CHECK-LNX-NEXT: lfs 2, .LCPI3_1@toc@l(4)
				; CHECK-LNX-NEXT: bc 12, 2, .LBB3_3
				; CHECK-LNX-NEXT: # %bb.1: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: fmr 4, 0
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 4, 1, 3
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 2
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 4
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 4, 2, .LBB3_4
				; CHECK-LNX-NEXT: .LBB3_2:
				; CHECK-LNX-NEXT: xssqrtdp 0, 1
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 0
				; CHECK-LNX-NEXT: blr
				; CHECK-LNX-NEXT: .LBB3_3:
				; CHECK-LNX-NEXT: xssqrtdp 1, 1
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 12, 2, .LBB3_2
				; CHECK-LNX-NEXT: .LBB3_4: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 0, 4, 3
				; CHECK-LNX-NEXT: xsmuldp 2, 4, 2
				; CHECK-LNX-NEXT: xsmuldp 0, 2, 0
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C3(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @llvm.pow.f64(double %a, double 7.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.50
				define float @llvmintr_powf_f32_fast050(float %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_powf_f32_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI4_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				bmahjour (Done): How come the pow -> sqrt conversion didn't happen here?
				masoud.ataei (Author, Done): Honestly, I am not sure why the conversion is not happening in this case. But without this patch we get a `powf` call (so the conversion does not happen there either). So this is a separate issue that someone needs to look at independently of this patch.
				bmahjour (Not Done): Could you please make a note of this as a TODO comment in each test that is affected?
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C4(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @llvm.pow.f32(float %a, float 5.000000e-01)
				ret float %call
				}

				; fast-math pow with 0.50
				define double @llvmintr_pow_f64_fast050(double %a) #1 {
				; CHECK-LNX-LABEL: llvmintr_pow_f64_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI5_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C5(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @llvm.pow.f64(double %a, double 5.000000e-01)
				ret double %call
				}
				attributes #1 = { "no-infs-fp-math"="true" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "approx-func-fp-math"="true" }

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-nofast.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck --check-prefix=CHECK-LNX %s
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck --check-prefix=CHECK-AIX %s

				declare float @powf (float, float);
				declare double @pow (double, double);
				declare float @__powf_finite (float, float);
				declare double @__pow_finite (double, double);

				; fast-math powf with 0.25
				define float @powf_f32_fast025(float %a) {
				;
				; CHECK-LNX-LABEL: powf_f32_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI0_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: powf_f32_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C0(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @powf(float %a, float 2.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.25
				define double @pow_f64_fast025(double %a) {
				;
				; CHECK-LNX-LABEL: pow_f64_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI1_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI1_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: pow_f64_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C1(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @pow(double %a, double 2.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.75
				define float @powf_f32_fast075(float %a) {
				;
				; CHECK-LNX-LABEL: powf_f32_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI2_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: powf_f32_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C2(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @powf(float %a, float 7.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.75
				define double @pow_f64_fast075(double %a) {
				;
				; CHECK-LNX-LABEL: pow_f64_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI3_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI3_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: pow_f64_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C3(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @pow(double %a, double 7.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.50
				define float @powf_f32_fast050(float %a) {
				;
				; CHECK-LNX-LABEL: powf_f32_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI4_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: powf_f32_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C4(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @powf(float %a, float 5.000000e-01)
				ret float %call
				}

				; fast-math pow with 0.50
				define double @pow_f64_fast050(double %a) {
				;
				; CHECK-LNX-LABEL: pow_f64_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI5_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: pow_f64_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C5(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @pow(double %a, double 5.000000e-01)
				ret double %call
				}

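				;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
				; The tests below call the __pow*_finite entry points directly; with
				; fast-math flags they are expected to lower to the same __xl_pow*_finite
				; MASS entries as the plain pow/powf calls above.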

				; fast-math __powf_finite with 0.25
				define float @__powf_finite_f32_fast025(float %a) {
				;
				; CHECK-LNX-LABEL: __powf_finite_f32_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI6_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI6_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __powf_finite_f32_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C6(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @__powf_finite(float %a, float 2.500000e-01)
				ret float %call
				}

				; fast-math __pow_finite with 0.25
				define double @__pow_finite_f64_fast025(double %a) {
				;
				; CHECK-LNX-LABEL: __pow_finite_f64_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI7_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI7_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __pow_finite_f64_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C7(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @__pow_finite(double %a, double 2.500000e-01)
				ret double %call
				}

				; fast-math __powf_finite with 0.75
				define float @__powf_finite_f32_fast075(float %a) {
				;
				; CHECK-LNX-LABEL: __powf_finite_f32_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI8_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI8_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __powf_finite_f32_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C8(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @__powf_finite(float %a, float 7.500000e-01)
				ret float %call
				}

				; fast-math __pow_finite with 0.75
				define double @__pow_finite_f64_fast075(double %a) {
				;
				; CHECK-LNX-LABEL: __pow_finite_f64_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI9_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI9_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __pow_finite_f64_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C9(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @__pow_finite(double %a, double 7.500000e-01)
				ret double %call
				}

				; fast-math __powf_finite with 0.50
				define float @__powf_finite_f32_fast050(float %a) {
				;
				; CHECK-LNX-LABEL: __powf_finite_f32_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI10_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI10_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __powf_finite_f32_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C10(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @__powf_finite(float %a, float 5.000000e-01)
				ret float %call
				}

				; fast-math __pow_finite with 0.50
				define double @__pow_finite_f64_fast050(double %a) {
				;
				; CHECK-LNX-LABEL: __pow_finite_f64_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI11_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI11_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __pow_finite_f64_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C11(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @__pow_finite(double %a, double 5.000000e-01)
				ret double %call
				}

Revision Contents

Diff 358745

clang/include/clang/Driver/Options.td

clang/lib/CodeGen/BackendUtil.cpp

clang/lib/CodeGen/CGCall.cpp

clang/lib/Driver/ToolChains/Clang.cpp

llvm/include/llvm/Analysis/ScalarFuncs.def

llvm/include/llvm/CodeGen/CommandFlags.h

llvm/include/llvm/Target/TargetOptions.h

llvm/lib/CodeGen/CommandFlags.cpp

llvm/lib/Target/PowerPC/CMakeLists.txt

llvm/lib/Target/PowerPC/PPC.h

llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

llvm/lib/Target/PowerPC/PPCTargetMachine.cpp

llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll

llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll

llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll

llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll

llvm/test/CodeGen/PowerPC/lower-scalar-mass-afn.ll

llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-afn.ll

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-nofast.ll
