This is an archive of the discontinued LLVM Phabricator instance.

Differential D101759

[PowerPC] Scalar IBM MASS library conversion pass
Closed, Public

Authored by masoud.ataei on May 3 2021, 7:46 AM.

Details

Reviewers

etiotto
pjeeva01
renenkel
bmahjour
qiucf
shchenz
spatel
efriedma

Group Reviewers

Restricted Project

Summary

This patch introduces an option to enable conversions from math function calls
to MASS library calls. To resolve the calls generated by these conversions, one
needs to link against the libxlopt.a library.

This patch is tested on PowerPC Linux and AIX.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	200 ms	x64 debian > LLVM.Examples/OrcV2Examples::lljit-with-remote-debugging.test

Event Timeline

masoud.ataei created this revision. May 3 2021, 7:46 AM

Herald added subscribers: steven.zhang, shchenz, kbarton and 3 others. May 3 2021, 7:46 AM

masoud.ataei requested review of this revision. May 3 2021, 7:46 AM

Herald added a subscriber: llvm-commits. May 3 2021, 7:46 AM

Harbormaster completed remote builds in B102283: Diff 342382. May 3 2021, 8:56 AM

steven.zhang added a reviewer: Restricted Project. May 17 2021, 10:24 PM

bmahjour added inline comments. May 18 2021, 4:28 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1361	Why are these being handled here instead of `PPCGenScalarMASSEntries.cpp`?

bmahjour added inline comments. May 18 2021, 4:28 PM

llvm/include/llvm/Analysis/ScalarFuncs.def
17	Shouldn't these map from llvm.* intrinsics to MASS entry points as well?

masoud.ataei added inline comments. May 19 2021, 1:07 PM

llvm/include/llvm/Analysis/ScalarFuncs.def
17	LLVM intrinsics are handled in `PPCISelLowering.cpp`.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1361	We are not handling LLVM intrinsics in `PPCGenScalarMASSEntries.cpp` because we don't want to block any existing optimizations (like pow(x,0.5) --> sqrt(x)) or future optimizations (like https://reviews.llvm.org/D94543 ?).

bmahjour added inline comments. May 25 2021, 12:42 PM

llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
70	There should be a todo comment to handle non-finite entries using fewer fast-math flags.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1361	I see, could you please put a comment in the code to explain that? Alternatively you can put the comment at the top of `llvm/include/llvm/Analysis/ScalarFuncs.def`.
1361	Instead of `TM.Options.UnsafeFPMath` we should test for the individual fast-math flags that are required for safety. Checking for "unsafe-fp-math" has a few drawbacks: To make clang enable that flag it is necessary but not enough to specify `-funsafe-math-optimizations`! You'd have to specify `-fno-math-errno` as well. Clang sets the "unsafe-fp-math" flag when all four of `-fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros` are specified, regardless of other flags... For example this command does the conversion to the _finite calls despite the user request to honor NaNs. `clang t.c -c -O3 -fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros -fhonor-nans` Even if the clang inconsistencies/issues are resolved, it would still be better to check for the individual flags for finer control and for consistency with other front-ends.
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
1	why not just use the default `CHECK` prefix? `CHECK-ALL` and `CHECK-LWR` don't distinguish anything based on this run command.
19	CHECK-DFLT is not in the list of prefixes defined.
llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll
147	Remove this line, `#1` is unused.
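The flag interaction described in the comment on line 1361 can be written down as a small predicate. This is only a sketch of the behavior as the reviewer describes it, not clang's actual implementation; `DriverFlags`, `setsUnsafeFPMath`, and `safeForFiniteEntries` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical model of the driver flags named in the review comment.
struct DriverFlags {
  bool NoMathErrno = false;     // -fno-math-errno
  bool AssociativeMath = false; // -fassociative-math
  bool ReciprocalMath = false;  // -freciprocal-math
  bool NoSignedZeros = false;   // -fno-signed-zeros
  bool HonorNaNs = true;        // -fhonor-nans
};

// As described in the comment: "unsafe-fp-math" is set when all four flags
// are present, regardless of other flags such as -fhonor-nans.
inline bool setsUnsafeFPMath(const DriverFlags &F) {
  return F.NoMathErrno && F.AssociativeMath && F.ReciprocalMath &&
         F.NoSignedZeros;
}

// A stricter gate (an assumption, for contrast) that also requires the user
// to have given up NaNs before the _finite entries are considered.
inline bool safeForFiniteEntries(const DriverFlags &F) {
  return setsUnsafeFPMath(F) && !F.HonorNaNs;
}
```

With the four flags set and `HonorNaNs` left on, `setsUnsafeFPMath` returns true while `safeForFiniteEntries` returns false, which is the mismatch the reviewer points out.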

Sorry it took me so long to update this patch -- I think I addressed all reviews till now.

masoud.ataei marked 8 inline comments as done. Jun 29 2021, 1:28 PM

Harbormaster completed remote builds in B111607: Diff 355347. Jun 29 2021, 2:48 PM

bmahjour added inline comments. Jul 7 2021, 2:04 PM

llvm/include/llvm/Analysis/ScalarFuncs.def
12	[nit] ISelLowing -> PPCISelLowering
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
10	Since LLVM math intrinsic lowerings are done in ISelLowering, this comment should not say "and LLVM math intrinsics".
14	llvm.cos.f32 is an intrinsic and not handled by this transformation.
73	remove this line
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1378	Why do you still check for `TM.Options.UnsafeFPMath`? If you do it out of concern for `-fno-math-errno`, then it's not needed. Note that these LLVM intrinsics already mention that their semantics are identical to their libm counterparts but "without trapping or setting errno".
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
148	See above comment and remove "unsafe-fp-math".
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass.ll
303	See above comment and remove "unsafe-fp-math".

Removed the dependency on unsafe-fp-math and added a clang option to
control the afn flag.

Herald added a project: Restricted Project. Jul 14 2021, 2:07 PM

Herald added subscribers: cfe-commits, ormris, dang.

masoud.ataei updated this revision to Diff 358756. Jul 14 2021, 2:38 PM

jsji added reviewers: qiucf, shchenz. Jul 14 2021, 2:47 PM

Harbormaster completed remote builds in B114101: Diff 358756. Jul 14 2021, 5:08 PM

bmahjour added inline comments. Jul 15 2021, 10:37 AM

clang/include/clang/Driver/Options.td
1726 ↗	(On Diff #358756)	I think we should separate out the clang driver interface into its own patch and ask for feedback on the mailing list. One key question would be, should this option assume no-errno and no-trapping-math or not (given that there is no IR representation for them). There should also be LIT tests dedicated to this to verify the clang interface. I only see llc interface being tested in this patch.
llvm/include/llvm/Target/TargetOptions.h
189	We already have the `PPCGenScalarMASSEntries` bit, why do we need another one? Perhaps we can remove `PPCGenScalarMASSEntries`, but we should not have to turn on two options to get one transformation enabled.
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
73	...but errno and trapping-math would be an issue for non-finite entries as well. Again, I think this function should just check for nnan/ninf/afn flags. We need to find out (with the help of the wider community) how to deal with the concerns surrounding errno and traps separately. One way to do that would be to broaden the definition of the `afn` flag to include no-errno and no-trapping semantics. Another way might be to make clang FE set the `afn` bit only if `-fno-math-errno` and `-fno-trapping-math` options are enabled (less desirable). A third way might be to add corresponding function attributes to the IR for `-fno-math-errno` and `-fno-trapping-math`. Once these issues are sorted out, we can add the appropriate constraints to the `isCandidateSafeToLower` function.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1378	if someone compiles with -Ofast without any extra options, would `TM.Options.ApproxFuncFPMath` be true here?
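The suggestion to decide purely on nnan/ninf/afn can be sketched as a per-call selection function. This is a minimal illustration under stated assumptions, not the code from the patch: `CallFlags` and `selectMASSEntry` are hypothetical names, and the spelling `__xl_<name>` for the non-finite entries is an assumption, since only the `__xl_<name>_finite` spelling appears in ScalarFuncs.def.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the per-call fast-math flags discussed above.
struct CallFlags {
  bool NoNaNs = false;     // nnan
  bool NoInfs = false;     // ninf
  bool ApproxFunc = false; // afn
};

// Returns the callee name to use for a scalar math call named `Name`.
inline std::string selectMASSEntry(const std::string &Name,
                                   const CallFlags &F) {
  if (!F.ApproxFunc)
    return Name; // no afn: keep the original libm call
  if (F.NoNaNs && F.NoInfs)
    return "__xl_" + Name + "_finite"; // nnan + ninf + afn
  return "__xl_" + Name; // afn only (assumed spelling of non-finite entry)
}
```

Because the decision is a function of the flags on one call, two calls to the same function in one module can be lowered differently. The errno and trapping concerns raised in the comment are deliberately not modeled here.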

Removed the clang changes from this PR.
Removed the extra option for the MASS pass.
The MASS pass is now active with -O3 and the approx-func option.

Adding another PR for the clang changes on the approx-func option.

Harbormaster completed remote builds in B114549: Diff 359385. Jul 16 2021, 11:03 AM

masoud.ataei marked 9 inline comments as done. Jul 16 2021, 11:10 AM

masoud.ataei added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1378	In the clang changes, I had `Options.ApproxFuncFPMath = LangOpts.ApproxFunc;` in `clang/lib/CodeGen/BackendUtil.cpp`. That was responsible for updating this TM option based on the clang approximate-func option, and the clang approximate-func option is set by -Ofast. So the answer to your question is yes.

masoud.ataei mentioned this in D106191: [clang] Option control afn flag. Jul 16 2021, 2:14 PM

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

In D101759#2967250, @lebedev.ri wrote:

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

I am handling LLVM intrinsic math functions in PPCISelLowering.cpp, so I need to check for TM.Options.ApproxFuncFPMath. This is the only place that I think I need it.
Currently, I am updating TM.Options.ApproxFuncFPMath in llvm/lib/CodeGen/CommandFlags.cpp using the global option. Please let me know if there is a better way to update TM.Options.ApproxFuncFPMath based on the local fast-math flag.

In D101759#2967331, @masoud.ataei wrote:

In D101759#2967250, @lebedev.ri wrote:

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

I am handling LLVM intrinsic math functions in PPCISelLowering.cpp, so I need to check for TM.Options.ApproxFuncFPMath. This is the only place that I think I need it.

How is this going to work e.g. in LTO when not all TU's are compiled with fast-math flags?

I'm not familiar with those llc flags, but i'm quite sure that e.g. DAGCombiner
is transitioned away from using them, so i'm wary of adding new ones.

Currently, I am updating TM.Options.ApproxFuncFPMath in llvm/lib/CodeGen/CommandFlags.cpp using the global option. Please let me know if there is a better way to update TM.Options.ApproxFuncFPMath based on the local fast-math flag.
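The LTO concern can be made concrete with a toy model of a merged module whose calls come from translation units built with different flags. This is a sketch only; `MathCall`, `countLoweredGlobal`, and `countLoweredPerCall` are hypothetical names, and the file names are made up for the example.

```cpp
#include <cassert>
#include <vector>

// One math call in a merged (LTO) module; `Afn` models the per-call flag.
struct MathCall {
  const char *FromTU; // originating translation unit (illustrative only)
  bool Afn;           // whether this call carries the afn flag
};

// Global-option model: a single switch decides for every call in the module.
inline int countLoweredGlobal(const std::vector<MathCall> &Calls,
                              bool GlobalApproxFunc) {
  return GlobalApproxFunc ? static_cast<int>(Calls.size()) : 0;
}

// Per-call model: a call is lowered only if it carries the flag itself.
inline int countLoweredPerCall(const std::vector<MathCall> &Calls) {
  int N = 0;
  for (const MathCall &C : Calls)
    if (C.Afn)
      ++N;
  return N;
}
```

For a module holding one call from a fast-math TU and one from a strict TU, the global model lowers either both or neither, while the per-call model lowers exactly the one that asked for it.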

Removed the dependency on the global option for converting math functions to MASS.

Herald added subscribers: dexonsmith, jdoerfert. Aug 26 2021, 2:22 PM

Harbormaster completed remote builds in B121408: Diff 368980. Aug 26 2021, 3:21 PM

I'm not familiar with this library, and I haven't looked at the current state of how we enable/map optional libs in a while...
We definitely want to avoid adding another target option/debug flag, and if we can avoid relying on a function parameter too, that would be even better.
I.e., the "afn" fast-math-flag (possibly in combination with some other IR- or node-level flags) seems like it should be enough to allow this transform/lowering.
Scanning the earlier review comments, there was some concern about the semantics wrt errno. If we need to adjust the "afn" definition, it's probably fine. There haven't been many uses of that flag AFAIK.

errno handling for math library functions is a mess. Currently, we don't model it properly; we just mark the calls "readnone" and hope for the best. If you don't want to fix that, just check for readnone for now.

I don't think we want to be querying function attributes or options here; afn plus enabling MASS should be enough. The function attributes are the old mechanism; we just haven't completely migrated some parts of SelectionDAG yet.

llvm/include/llvm/Analysis/ScalarFuncs.def
20	Do "__acosf_finite" etc. actually exist on AIX? I thought they only existed on glibc, and the glibc functions are all deprecated. I think I'd prefer to track this information in TargetLibraryInfo, like we do for the vector functions, so we can more easily generalize this mechanism in the future.

In D101759#2971567, @efriedma wrote:

errno handling for math library functions is a mess. Currently, we don't model it properly; we just mark the calls "readnone" and hope for the best. If you don't want to fix that, just check for readnone for now.

I think using readnone would work fine. It seems that clang marks math functions with that attribute when -fno-math-errno is in effect. To get the non-finite MASS lowerings at -O3 one would have to compile with both -fapprox-func and -fno-math-errno, which seems reasonable to me.

I don't think we want to be querying function attributes or options here; afn plus enabling MASS should be enough. The function attributes are the old mechanism; we just haven't completely migrated some parts of SelectionDAG yet.

I agree. I think the problem is that this patch is trying to decide on a global lowering strategy for llvm.* math intrinsics in llvm/lib/Target/PowerPC/PPCISelLowering.cpp but such global decision making does not go well with finer granularity of fast-math flags. My understanding is that the reason we need to handle intrinsic math functions later is because of strength-reduction transformations like pow(x,0.5) --> sqrt(x) that currently operate on intrinsic calls only. If we could apply those operations on things like __xl_pow_finite and produce calls to __xl_sqrt_finite then we wouldn't have this problem. Another possibility might be to have two versions of PPCGenScalarMASSEntries: one that handles non-intrinsics and runs earlier, and another that handles intrinsics after transformations like pow(x,0.5) --> sqrt(x) are done.

I agree. I think the problem is that this patch is trying to decide on a global lowering strategy for llvm.* math intrinsics in llvm/lib/Target/PowerPC/PPCISelLowering.cpp but such global decision making does not go well with finer granularity of fast-math flags.

Hmm. Instead of using setLibcallName() and letting the legalizer generate the calls, it should be possible to use custom lowering to generate the appropriate calls, at the cost of writing a little more code.

My understanding is that the reason we need to handle intrinsic math functions later is because of strength-reduction transformations like pow(x,0.5) --> sqrt(x) that currently operate on intrinsic calls only.

instcombine should be primarily responsible for this sort of optimization. See LibCallSimplifier::optimizePow. I guess a few transforms (D51630 etc.) landed in DAGCombine; probably we could move them earlier.
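The strength reductions under discussion can be checked numerically with plain `<cmath>`. The sketch below is not LLVM code; the 0.25 and 0.75 forms are an assumption inferred from the pow-025-075 test file names, and the helper names are hypothetical. It only covers finite, positive inputs, which is the case fast-math flags are meant to license.

```cpp
#include <cassert>
#include <cmath>

// pow(x, 0.5) rewritten as sqrt(x).
inline double powHalf(double X) { return std::sqrt(X); }

// pow(x, 0.25) rewritten as sqrt(sqrt(x)).
inline double powQuarter(double X) { return std::sqrt(std::sqrt(X)); }

// pow(x, 0.75) rewritten as sqrt(x) * sqrt(sqrt(x)).
inline double powThreeQuarters(double X) {
  double S = std::sqrt(X);
  return S * std::sqrt(S);
}

// Relative comparison helper; B must be nonzero.
inline bool closeTo(double A, double B) {
  return std::fabs(A - B) <= 1e-12 * std::fabs(B);
}
```

If `pow` has already been renamed to a MASS entry point before a simplification like this runs, a pattern that matches only the intrinsic or the libm name no longer fires, which is the ordering problem the thread is working through.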

masoud.ataei mentioned this in D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline. Sep 22 2021, 1:27 PM

As suggested before, I removed the dependency on the global option for converting math functions to MASS, for both intrinsic and non-intrinsic functions.
The main change here with respect to the last proposal is in the PPCISelLowering.cpp file, in how LLVM intrinsic math functions are handled.

Sorry for taking so long to update the patch.

masoud.ataei added inline comments. Jan 7 2022, 10:54 AM

llvm/include/llvm/Analysis/ScalarFuncs.def
20	Some machines still have the old glibc, so I kept them for compatibility.

Harbormaster completed remote builds in B142123: Diff 398194. Jan 7 2022, 11:56 AM

ormris removed a subscriber: ormris. Jan 18 2022, 10:08 AM

This update fixes the types of the arguments passed to the converted math function in PPCISelLowering.cpp.

masoud.ataei marked an inline comment as done. Jan 24 2022, 7:00 AM

Harbormaster completed remote builds in B145229: Diff 402508. Jan 24 2022, 1:34 PM

dexonsmith removed a subscriber: dexonsmith. Jan 24 2022, 6:48 PM

bmahjour added inline comments. Jan 27 2022, 2:04 PM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
375	what about tan, acos, and the others?
llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll
148 ↗	(On Diff #402508)	All the calls have `afn`....why do we need this attribute?
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
149	do we need this attribute? Can we remove it or have separate tests for functions with attributes?
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
2	We don't really need a separate aix file. Can we just add a run line with the aix triple to `llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll`?
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll
797	shouldn't the tests starting from here move to a different file? This test file is called ...mass-fast.ll so one would expect it only contains tests with fast-math flag on.
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
246 ↗	(On Diff #402508)	How come pow -> sqrt conversion didn't happen here?
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
22 ↗	(On Diff #402508)	so pow->sqrt translation never happens for non-intrinsic `pow`. Is that expected? If so, are we planning to recognize these patterns inside PPCGenScalarMASSEntries in the future and do the translation as part of that transform?

masoud.ataei marked 7 inline comments as done. Jan 28 2022, 10:25 AM

masoud.ataei added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
375	This is the list of math functions for which LLVM creates intrinsic calls. There are no LLVM intrinsics for tan, acos, and the other math functions that exist in MASS but are not in this list.
llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll
148 ↗	(On Diff #402508)	Removed
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
149	Removed
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
2	Done
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll
797	Done
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
246 ↗	(On Diff #402508)	Honestly, I am not sure why the conversion is not happening in this case. But without this patch we get a `powf` call (again, the conversion does not happen). So this is a separate issue that someone needs to look at independently of this patch.
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
22 ↗	(On Diff #402508)	Correct, the pow->sqrt translation does not happen for non-intrinsic cases. That is the case independent of this patch. I guess the reason is that DAGCombiner only applies this optimization to LLVM intrinsics. This is an issue that we need to handle either in DAGCombiner (same as the intrinsic one) or in the MASS pass. I feel DAGCombiner is the better option, and I think this is also a separate issue.

Fix test cases.

Changing function name: lowerLibCall() -> lowerLibCallType()

Ready for another round of review.

Harbormaster completed remote builds in B146335: Diff 404091. Jan 28 2022, 12:24 PM

Apart from some minor inline comments this revision addresses all my outstanding comments. LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
17217	[nit] a better name would be `lowerLibCallBasedOnType`
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
246 ↗	(On Diff #402508)	Could you please make a note of this as a todo comment in each test that is affected?
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
22 ↗	(On Diff #402508)	Ok, I understand now. We'll have to come back to this later at some point.

This revision is now accepted and ready to land. Feb 1 2022, 11:21 AM

masoud.ataei mentioned this in rG256d2533322c: [PowerPC] Scalar IBM MASS library conversion pass. Feb 2 2022, 7:54 AM

masoud.ataei closed this revision. Feb 2 2022, 8:35 AM

masoud.ataei mentioned this in D121016: [PowerPC] Fix the none tail call in scalar MASS conversion. Mar 4 2022, 11:44 AM

masoud.ataei mentioned this in rG30f30e1c12fa: [PowerPC] Fix the none tail call in scalar MASS conversion. Mar 8 2022, 9:02 AM

Revision Contents

Path	Size
llvm/include/llvm/Analysis/ScalarFuncs.def	148 lines
llvm/include/llvm/Target/TargetOptions.h	4 lines
llvm/lib/Target/PowerPC/CMakeLists.txt	1 line
llvm/lib/Target/PowerPC/PPC.h	4 lines
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp	137 lines
llvm/lib/Target/PowerPC/PPCISelLowering.cpp	14 lines
llvm/lib/Target/PowerPC/PPCTargetMachine.cpp	15 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll	147 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll	146 lines
llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll	147 lines
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll	1145 lines
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass.ll	318 lines
llvm/test/CodeGen/PowerPC/pow-025-075-scalar-mass.ll	455 lines

Diff 342382

llvm/include/llvm/Analysis/ScalarFuncs.def

This file was added.

				//===-- ScalarFuncs.def - Library information ----------- C++ -----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				// This .def file creates mapping from standard IEEE math functions and LLVM
				// math intrinsics to their corresponding entries in the IBM MASS (scalar)
				// library.

				#if defined(TLI_DEFINE_SCALAR_MASS_FUNCS)
				#define TLI_DEFINE_SCALAR_MASS_FUNC(SCAL, MASSENTRY) {SCAL, MASSENTRY},
				#endif

				TLI_DEFINE_SCALAR_MASS_FUNC("acosf", "__xl_acosf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acosf_finite", "__xl_acosf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("acos", "__xl_acos_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acos_finite", "__xl_acos_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("acoshf", "__xl_acoshf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acoshf_finite", "__xl_acoshf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("acosh", "__xl_acosh_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__acosh_finite", "__xl_acosh_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("asinf", "__xl_asinf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asinf_finite", "__xl_asinf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("asin", "__xl_asin_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asin_finite", "__xl_asin_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("asinhf", "__xl_asinhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asinhf_finite", "__xl_asinhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("asinh", "__xl_asinh_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__asinh_finite", "__xl_asinh_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("atanf", "__xl_atanf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atanf_finite", "__xl_atanf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("atan", "__xl_atan_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atan_finite", "__xl_atan_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("atan2f", "__xl_atan2f_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atan2f_finite", "__xl_atan2f_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("atan2", "__xl_atan2_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atan2_finite", "__xl_atan2_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("atanhf", "__xl_atanhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atanhf_finite", "__xl_atanhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("atanh", "__xl_atanh_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__atanh_finite", "__xl_atanh_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("cbrtf", "__xl_cbrtf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cbrtf_finite", "__xl_cbrtf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("cbrt", "__xl_cbrt_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cbrt_finite", "__xl_cbrt_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("cosf", "__xl_cosf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cosf_finite", "__xl_cosf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("cos", "__xl_cos_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cos_finite", "__xl_cos_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("coshf", "__xl_coshf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__coshf_finite", "__xl_coshf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("cosh", "__xl_cosh_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cosh_finite", "__xl_cosh_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("cosisin", "__xl_cosisin_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__cosisin_finite", "__xl_cosisin_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("erff", "__xl_erff_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erff_finite", "__xl_erff_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("erf", "__xl_erf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erf_finite", "__xl_erf_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("erfcf", "__xl_erfcf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erfcf_finite", "__xl_erfcf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("erfc", "__xl_erfc_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__erfc_finite", "__xl_erfc_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("expf", "__xl_expf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__expf_finite", "__xl_expf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("exp", "__xl_exp_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__exp_finite", "__xl_exp_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("expm1f", "__xl_expm1f_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__expm1f_finite", "__xl_expm1f_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("expm1", "__xl_expm1_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__expm1_finite", "__xl_expm1_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("hypotf", "__xl_hypotf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__hypotf_finite", "__xl_hypotf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("hypot", "__xl_hypot_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__hypot_finite", "__xl_hypot_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("lgammaf", "__xl_lgammaf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__lgammaf_finite", "__xl_lgammaf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("lgamma", "__xl_lgamma_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__lgamma_finite", "__xl_lgamma_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("logf", "__xl_logf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__logf_finite", "__xl_logf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("log", "__xl_log_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log_finite", "__xl_log_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("log10f", "__xl_log10f_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log10f_finite", "__xl_log10f_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("log10", "__xl_log10_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log10_finite", "__xl_log10_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("log1pf", "__xl_log1pf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log1pf_finite", "__xl_log1pf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("log1p", "__xl_log1p_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__log1p_finite", "__xl_log1p_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("powf", "__xl_powf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__powf_finite", "__xl_powf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("pow", "__xl_pow_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__pow_finite", "__xl_pow_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("rsqrt", "__xl_rsqrt_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("sincos", "__xl_sincos_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sincos_finite", "__xl_sincos_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("sinf", "__xl_sinf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sinf_finite", "__xl_sinf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("sin", "__xl_sin_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sin_finite", "__xl_sin_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("sinhf", "__xl_sinhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sinhf_finite", "__xl_sinhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("sinh", "__xl_sinh_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__sinh_finite", "__xl_sinh_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("sqrt", "__xl_sqrt_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("tanf", "__xl_tanf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tanf_finite", "__xl_tanf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("tan", "__xl_tan_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tan_finite", "__xl_tan_finite")

				TLI_DEFINE_SCALAR_MASS_FUNC("tanhf", "__xl_tanhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tanhf_finite", "__xl_tanhf_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("tanh", "__xl_tanh_finite")
				TLI_DEFINE_SCALAR_MASS_FUNC("__tanh_finite", "__xl_tanh_finite")

				#undef TLI_DEFINE_SCALAR_MASS_FUNCS
				#undef TLI_DEFINE_SCALAR_MASS_FUNC
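A .def file of this shape is typically consumed by defining the row macro and expanding the entries into a table. The sketch below shows that pattern in a self-contained form; it is not the code from PPCGenScalarMASSEntries.cpp. `SCALAR_MASS_ENTRIES`, `AS_TABLE_ROW`, `ScalarMASSMapping`, and `lookupMASSEntry` are hypothetical names, and the four rows are copied from the listing in place of a real `#include` of the file.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the real file: a handful of entries copied from the listing.
// In the tree this would be an #include of ScalarFuncs.def instead.
#define SCALAR_MASS_ENTRIES(X)                                                 \
  X("acosf", "__xl_acosf_finite")                                              \
  X("__acosf_finite", "__xl_acosf_finite")                                     \
  X("pow", "__xl_pow_finite")                                                  \
  X("sqrt", "__xl_sqrt_finite")

struct ScalarMASSMapping {
  const char *ScalarName; // name of the libm-style function
  const char *MASSEntry;  // name of the MASS entry it maps to
};

// Same expansion shape as TLI_DEFINE_SCALAR_MASS_FUNC: one braced row each.
#define AS_TABLE_ROW(SCAL, MASSENTRY) {SCAL, MASSENTRY},
static const ScalarMASSMapping ScalarMASSTable[] = {
    SCALAR_MASS_ENTRIES(AS_TABLE_ROW)};
#undef AS_TABLE_ROW

// Linear lookup; returns nullptr when the function has no MASS counterpart.
inline const char *lookupMASSEntry(const char *Name) {
  for (const ScalarMASSMapping &M : ScalarMASSTable)
    if (std::strcmp(M.ScalarName, Name) == 0)
      return M.MASSEntry;
  return nullptr;
}
```

A linear scan keeps the example short; a real pass would more likely build a hash map keyed on the function name once and reuse it.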

llvm/include/llvm/Target/TargetOptions.h

Show First 20 Lines • Show All 135 Lines • ▼ Show 20 Lines	TargetOptions()
TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),		TrapUnreachable(false), NoTrapAfterNoreturn(false), TLSSize(0),
EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),		EmulatedTLS(false), ExplicitEmulatedTLS(false), EnableIPRA(false),
EmitStackSizeSection(false), EnableMachineOutliner(false),		EmitStackSizeSection(false), EnableMachineOutliner(false),
EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),		EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
EmitAddrsig(false), EmitCallSiteInfo(false),		EmitAddrsig(false), EmitCallSiteInfo(false),
SupportsDebugEntryValues(false), EnableDebugEntryValues(false),		SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
PseudoProbeForProfiling(false), ValueTrackingVariableLocations(false),		PseudoProbeForProfiling(false), ValueTrackingVariableLocations(false),
ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),		ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),
		PPCGenScalarMASSEntries(false),
FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}		FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}

/// DisableFramePointerElim - This returns true if frame pointer elimination		/// DisableFramePointerElim - This returns true if frame pointer elimination
/// optimization should be disabled for the given machine function.		/// optimization should be disabled for the given machine function.
bool DisableFramePointerElim(const MachineFunction &MF) const;		bool DisableFramePointerElim(const MachineFunction &MF) const;

/// If greater than 0, override the default value of		/// If greater than 0, override the default value of
/// MCAsmInfo::BinutilsVersion.		/// MCAsmInfo::BinutilsVersion.
Show All 28 Lines	public:
/// specifies that optimizations are allowed to treat the sign of a zero		/// specifies that optimizations are allowed to treat the sign of a zero
/// argument or result as insignificant.		/// argument or result as insignificant.
unsigned NoSignedZerosFPMath : 1;		unsigned NoSignedZerosFPMath : 1;

/// EnableAIXExtendedAltivecABI - This flag returns true when -vec-extabi is		/// EnableAIXExtendedAltivecABI - This flag returns true when -vec-extabi is
/// specified. The code generator is then able to use both volatile and		/// specified. The code generator is then able to use both volatile and
/// nonvolitle vector regisers. When false, the code generator only uses		/// nonvolitle vector regisers. When false, the code generator only uses
/// volatile vector registers which is the default setting on AIX.		/// volatile vector registers which is the default setting on AIX.
unsigned EnableAIXExtendedAltivecABI : 1;		unsigned EnableAIXExtendedAltivecABI : 1;
		bmahjourUnsubmitted Done Reply Inline Actions We already have the `PPCGenScalarMASSEntries` bit, why do we need another one? Perhaps we can remove `PPCGenScalarMASSEntries`, but we should not have to turn on two options to get one transformation enabled. bmahjour: We already have the `PPCGenScalarMASSEntries` bit, why do we need another one? Perhaps we can…

/// HonorSignDependentRoundingFPMath - This returns true when the		/// HonorSignDependentRoundingFPMath - This returns true when the
/// -enable-sign-dependent-rounding-fp-math is specified. If this returns		/// -enable-sign-dependent-rounding-fp-math is specified. If this returns
/// false (the default), the code generator is allowed to assume that the		/// false (the default), the code generator is allowed to assume that the
/// rounding behavior is the default (round-to-zero for all floating point		/// rounding behavior is the default (round-to-zero for all floating point
/// to integer conversions, and round-to-nearest for all other arithmetic		/// to integer conversions, and round-to-nearest for all other arithmetic
/// truncations). If this is enabled (set to true), the code generator must		/// truncations). If this is enabled (set to true), the code generator must
/// assume that the rounding mode may dynamically change.		/// assume that the rounding mode may dynamically change.
[129 lines not shown]	public:
unsigned ValueTrackingVariableLocations : 1;		unsigned ValueTrackingVariableLocations : 1;

/// Emit DWARF debug frame section.		/// Emit DWARF debug frame section.
unsigned ForceDwarfFrameSection : 1;		unsigned ForceDwarfFrameSection : 1;

/// Emit XRay Function Index section		/// Emit XRay Function Index section
unsigned XRayOmitFunctionIndex : 1;		unsigned XRayOmitFunctionIndex : 1;

		/// Enables scalar MASS conversions
		unsigned PPCGenScalarMASSEntries : 1;

/// Stack protector guard offset to use.		/// Stack protector guard offset to use.
int StackProtectorGuardOffset = INT_MAX;		int StackProtectorGuardOffset = INT_MAX;

/// Stack protector guard mode to use, e.g. tls, global.		/// Stack protector guard mode to use, e.g. tls, global.
StackProtectorGuards StackProtectorGuard =		StackProtectorGuards StackProtectorGuard =
StackProtectorGuards::None;		StackProtectorGuards::None;

/// Stack protector guard reg to use, e.g. usually fs or gs in X86.		/// Stack protector guard reg to use, e.g. usually fs or gs in X86.
[75 lines not shown]

llvm/lib/Target/PowerPC/CMakeLists.txt

[48 lines not shown]	add_llvm_target(PowerPCCodeGen
PPCTLSDynamicCall.cpp		PPCTLSDynamicCall.cpp
PPCVSXCopy.cpp		PPCVSXCopy.cpp
PPCReduceCRLogicals.cpp		PPCReduceCRLogicals.cpp
PPCVSXFMAMutate.cpp		PPCVSXFMAMutate.cpp
PPCVSXSwapRemoval.cpp		PPCVSXSwapRemoval.cpp
PPCExpandISEL.cpp		PPCExpandISEL.cpp
PPCPreEmitPeephole.cpp		PPCPreEmitPeephole.cpp
PPCLowerMASSVEntries.cpp		PPCLowerMASSVEntries.cpp
		PPCGenScalarMASSEntries.cpp
GISel/PPCCallLowering.cpp		GISel/PPCCallLowering.cpp
GISel/PPCRegisterBankInfo.cpp		GISel/PPCRegisterBankInfo.cpp
GISel/PPCLegalizerInfo.cpp		GISel/PPCLegalizerInfo.cpp

LINK_COMPONENTS		LINK_COMPONENTS
Analysis		Analysis
AsmPrinter		AsmPrinter
BinaryFormat		BinaryFormat
[20 lines not shown]

llvm/lib/Target/PowerPC/PPC.h

[76 lines not shown]	#endif
void initializePPCMIPeepholePass(PassRegistry&);		void initializePPCMIPeepholePass(PassRegistry&);

extern char &PPCVSXFMAMutateID;		extern char &PPCVSXFMAMutateID;

ModulePass *createPPCLowerMASSVEntriesPass();		ModulePass *createPPCLowerMASSVEntriesPass();
void initializePPCLowerMASSVEntriesPass(PassRegistry &);		void initializePPCLowerMASSVEntriesPass(PassRegistry &);
extern char &PPCLowerMASSVEntriesID;		extern char &PPCLowerMASSVEntriesID;

		ModulePass *createPPCGenScalarMASSEntriesPass();
		void initializePPCGenScalarMASSEntriesPass(PassRegistry &);
		extern char &PPCGenScalarMASSEntriesID;

InstructionSelector *		InstructionSelector *
createPPCInstructionSelector(const PPCTargetMachine &, const PPCSubtarget &,		createPPCInstructionSelector(const PPCTargetMachine &, const PPCSubtarget &,
const PPCRegisterBankInfo &);		const PPCRegisterBankInfo &);
namespace PPCII {		namespace PPCII {

/// Target Operand Flag enum.		/// Target Operand Flag enum.
enum TOF {		enum TOF {
//===------------------------------------------------------------------===//		//===------------------------------------------------------------------===//
[81 lines not shown]

llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp

This file was added.

				//===-- PPCGenScalarMASSEntries.cpp ---------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This transformation converts standard math functions and LLVM math intrinsics
				// into their corresponding MASS (scalar) entries for PowerPC targets.
				bmahjour (Done): Since LLVM math intrinsic lowerings are done in ISelLowering, this comment should not say "and LLVM math intrinsics".
				// Following are examples of such conversion:
				// tanh ---> __xl_tanh_finite
				// llvm.cos.f32 --> __xl_cosf_finite
				// Such lowering is legal under the fast-math option.
				bmahjour (Done): llvm.cos.f32 is an intrinsic and not handled by this transformation.
				//
				// While building multiple Compilation Units (CUs) with/without LTO, one
				// can build a CU with the enable-unsafe-fp-math option and the rest of
				// the CUs without it. In the link step, even if enable-unsafe-fp-math
				// option is specified, the compiler must guarantee that the math functions
				// in the CUs (that were built without the enable-unsafe-fp-math) option
				// must not be lowered by this transformation.
				//===----------------------------------------------------------------------===//

				#include "PPC.h"
				#include "PPCSubtarget.h"
				#include "PPCTargetMachine.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/CodeGen/TargetPassConfig.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/Module.h"

				#define DEBUG_TYPE "ppc-gen-scalar-mass"

				using namespace llvm;

				namespace {

				class PPCGenScalarMASSEntries : public ModulePass {
				public:
				static char ID;

				PPCGenScalarMASSEntries() : ModulePass(ID) {
				ScalarMASSFuncs = {
				#define TLI_DEFINE_SCALAR_MASS_FUNCS
				#include "llvm/Analysis/ScalarFuncs.def"
				};
				}

				bool runOnModule(Module &M) override;

				StringRef getPassName() const override {
				return "PPC Generate Scalar MASS Entries";
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<TargetTransformInfoWrapperPass>();
				}

				private:
				std::map<StringRef, StringRef> ScalarMASSFuncs;
				bool isCandidateSafeToLower(const CallInst &CI) const;
				bool createScalarMASSCall(StringRef MASSEntry, CallInst &CI,
				Function &Func) const;
				};

				} // namespace

				/// Returns true if 'fast' flag exists on the call instruction with the math
				/// function
				bool PPCGenScalarMASSEntries::isCandidateSafeToLower(const CallInst &CI) const {
				bmahjour (Done): There should be a todo comment to handle non-finite entries using fewer fast-math flags.
				return CI.isFast();
				}

				bmahjour (Done): remove this line
				bmahjour (Done): ...but errno and trapping-math would be an issue for non-finite entries as well. Again, I think this function should just check for nnan/ninf/afn flags. We need to find out (with the help of the wider community) how to deal with the concerns surrounding errno and traps separately. One way to do that would be to broaden the definition of the `afn` flag to include no-errno and no-trapping semantics. Another way might be to make clang FE set the `afn` bit only if `-fno-math-errno` and `-fno-trapping-math` options are enabled (less desirable). A third way might be to add corresponding function attributes to the IR for `-fno-math-errno` and `-fno-trapping-math`. Once these issues are sorted out, we can add the appropriate constraints to the `isCandidateSafeToLower` function.
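The difference between requiring the full `fast` set and requiring only nnan/ninf/afn, as proposed in the comment above, can be sketched in standalone C++. This is only a model: `FMFModel`, `isSafeRequiringFast` and `isSafeRequiringFiniteApprox` are hypothetical names that do not exist in LLVM. In the pass itself the corresponding queries would presumably be `CI.hasNoNaNs()`, `CI.hasNoInfs()` and `CI.hasApproxFunc()`, and the errno and trapping concerns raised above are deliberately left out.

```cpp
#include <cassert>

// Hypothetical model of the per-call fast-math flags; not an LLVM type.
struct FMFModel {
  bool NoNaNs = false;          // nnan
  bool NoInfs = false;          // ninf
  bool NoSignedZeros = false;   // nsz
  bool AllowReciprocal = false; // arcp
  bool AllowContract = false;   // contract
  bool ApproxFunc = false;      // afn
  bool AllowReassoc = false;    // reassoc

  // Mirrors the meaning of 'fast': every individual flag is set.
  bool isFast() const {
    return NoNaNs && NoInfs && NoSignedZeros && AllowReciprocal &&
           AllowContract && ApproxFunc && AllowReassoc;
  }
};

// Behaviour of the patch as posted: require the full 'fast' set.
inline bool isSafeRequiringFast(const FMFModel &F) { return F.isFast(); }

// Behaviour suggested in the review (an assumption about the final form):
// only nnan, ninf and afn are required for the *_finite entries.
inline bool isSafeRequiringFiniteApprox(const FMFModel &F) {
  return F.NoNaNs && F.NoInfs && F.ApproxFunc;
}
```

Under this model a call carrying only `nnan ninf afn` is accepted by the narrower predicate but rejected by the `fast`-only one, which is the gap the comment points at.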
				/// Lowers scalar math function or math intrinsic \p Func to its PowerPC
				/// target-specific entry in the scalar MASS library.
				/// e.g.: tanh --> __xl_tanh_finite
				/// llvm.cos.f32 --> __xl_cosf_finite
				/// Both the function prototype and its call site are updated during lowering.
				bool PPCGenScalarMASSEntries::createScalarMASSCall(StringRef MASSEntry,
				CallInst &CI,
				Function &Func) const {
				if (CI.use_empty())
				return false;

				Module *M = Func.getParent();
				assert(M && "Expecting a valid Module");

				FunctionCallee FCache = M->getOrInsertFunction(
				MASSEntry, Func.getFunctionType(), Func.getAttributes());

				CI.setCalledFunction(FCache);

				return true;
				}

				bool PPCGenScalarMASSEntries::runOnModule(Module &M) {
				bool Changed = false;

				auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
				if (!TPC || skipModule(M))
				return false;

				for (Function &Func : M) {
				if (!Func.isDeclaration())
				continue;

				auto Iter = ScalarMASSFuncs.find(Func.getName());
				if (Iter == ScalarMASSFuncs.end())
				continue;

				// The call to createScalarMASSCall() invalidates the iterator over users
				// upon replacing the users. Precomputing the current list of users allows
				// us to replace all the call sites.
				SmallVector<User *, 4> TheUsers;
				for (auto *User : Func.users())
				TheUsers.push_back(User);

				for (auto *User : TheUsers)
				if (auto *CI = dyn_cast_or_null<CallInst>(User)) {
				if (isCandidateSafeToLower(*CI))
				Changed |= createScalarMASSCall(Iter->second, *CI, Func);
				}
				}

				return Changed;
				}
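The comment in `runOnModule` about precomputing the user list describes a general pattern: take a snapshot of a collection before an operation that mutates it. A minimal standalone sketch follows, with a `std::set` standing in for LLVM's use list; `CalleeModel` and `retargetAll` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a function and the call sites that use it.
struct CalleeModel {
  std::set<int> Users; // IDs of call sites currently targeting this callee
};

// Moves every user of From over to To and returns how many were moved.
// Erasing from From.Users while iterating over it directly would invalidate
// the iterator, so the users are copied into a snapshot first.
inline int retargetAll(CalleeModel &From, CalleeModel &To) {
  std::vector<int> Snapshot(From.Users.begin(), From.Users.end());
  int Moved = 0;
  for (int U : Snapshot) {
    From.Users.erase(U); // mutates the original container
    To.Users.insert(U);
    ++Moved;
  }
  return Moved;
}
```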

				char PPCGenScalarMASSEntries::ID = 0;

				char &llvm::PPCGenScalarMASSEntriesID = PPCGenScalarMASSEntries::ID;

				INITIALIZE_PASS(PPCGenScalarMASSEntries, DEBUG_TYPE,
				"Generate Scalar MASS entries", false, false)

				ModulePass *llvm::createPPCGenScalarMASSEntriesPass() {
				return new PPCGenScalarMASSEntries();
				}

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

[366 lines not shown]	PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FCOS , MVT::f32, Expand);		setOperationAction(ISD::FCOS , MVT::f32, Expand);
setOperationAction(ISD::FSINCOS, MVT::f32, Expand);		setOperationAction(ISD::FSINCOS, MVT::f32, Expand);
setOperationAction(ISD::FREM , MVT::f32, Expand);		setOperationAction(ISD::FREM , MVT::f32, Expand);
setOperationAction(ISD::FPOW , MVT::f32, Expand);		setOperationAction(ISD::FPOW , MVT::f32, Expand);
if (Subtarget.hasSPE()) {		if (Subtarget.hasSPE()) {
setOperationAction(ISD::FMA , MVT::f64, Expand);		setOperationAction(ISD::FMA , MVT::f64, Expand);
setOperationAction(ISD::FMA , MVT::f32, Expand);		setOperationAction(ISD::FMA , MVT::f32, Expand);
} else {		} else {
setOperationAction(ISD::FMA , MVT::f64, Legal);		setOperationAction(ISD::FMA , MVT::f64, Legal);
		bmahjour (Done): what about tan, acos, and the others?
		masoud.ataei (Done): These are the list of math functions that llvm creates intrinsic call for them. There is no llvm intrinsic for tan, acos and other math functions which (exist in MASS and) are not in this list.
setOperationAction(ISD::FMA , MVT::f32, Legal);		setOperationAction(ISD::FMA , MVT::f32, Legal);
}		}

if (Subtarget.hasSPE())		if (Subtarget.hasSPE())
setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);		setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);

setOperationAction(ISD::FLT_ROUNDS_, MVT::i32, Custom);		setOperationAction(ISD::FLT_ROUNDS_, MVT::i32, Custom);

[969 lines not shown]
setLibcallName(RTLIB::ROUND_F128, "roundf128");		setLibcallName(RTLIB::ROUND_F128, "roundf128");
setLibcallName(RTLIB::LROUND_F128, "lroundf128");		setLibcallName(RTLIB::LROUND_F128, "lroundf128");
setLibcallName(RTLIB::LLROUND_F128, "llroundf128");		setLibcallName(RTLIB::LLROUND_F128, "llroundf128");
setLibcallName(RTLIB::RINT_F128, "rintf128");		setLibcallName(RTLIB::RINT_F128, "rintf128");
setLibcallName(RTLIB::LRINT_F128, "lrintf128");		setLibcallName(RTLIB::LRINT_F128, "lrintf128");
setLibcallName(RTLIB::LLRINT_F128, "llrintf128");		setLibcallName(RTLIB::LLRINT_F128, "llrintf128");
setLibcallName(RTLIB::NEARBYINT_F128, "nearbyintf128");		setLibcallName(RTLIB::NEARBYINT_F128, "nearbyintf128");
setLibcallName(RTLIB::FMA_F128, "fmaf128");		setLibcallName(RTLIB::FMA_F128, "fmaf128");
		if (TM.Options.PPCGenScalarMASSEntries && TM.Options.UnsafeFPMath) {
		bmahjour (Done): why are these being handled here instead of `PPCGenScalarMASSEntries.cpp`?
		masoud.ataei (Done): We are not handling llvm intrinsics in `PPCGenScalarMASSEntries.cpp` because we don't want to block any type of existing optimizations (like pow(x,0.5) --> sqrt(x)) and future optimizations (like https://reviews.llvm.org/D94543 ?).
		bmahjour (Done): I see, could you please put a comment in the code to explain that? Alternatively you can put the comment at the top of `llvm/include/llvm/Analysis/ScalarFuncs.def`.
		bmahjour (Done): Instead of `TM.Options.UnsafeFPMath` we should test for the individual fast-math flags that are required for safety. Checking for "unsafe-fp-math" has a few drawbacks:
		- To make clang enable that flag it is necessary but not enough to specify `-funsafe-math-optimizations`! You'd have to specify `-fno-math-errno` as well.
		- Clang sets the "unsafe-fp-math" flag when all four of `-fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros` are specified, regardless of other flags... For example this command does the conversion to the _finite calls despite the user request to honor NaNs. `clang t.c -c -O3 -fno-math-errno -fassociative-math -freciprocal-math -fno-signed-zeros -fhonor-nans`
		- Even if the clang inconsistencies/issues are resolved, it would still be better to check for the individual flags for finer control and for consistency with other front-ends.
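The mismatch described in the last comment can be made concrete with a small standalone model. It encodes only what the comment states about when "unsafe-fp-math" gets set; it is not clang's actual implementation, and `DriverFlagsModel`, `setsUnsafeFPMath` and `finiteEntryIsSafe` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical model of the driver options named in the comment.
struct DriverFlagsModel {
  bool NoMathErrno = false;     // -fno-math-errno
  bool AssociativeMath = false; // -fassociative-math
  bool ReciprocalMath = false;  // -freciprocal-math
  bool NoSignedZeros = false;   // -fno-signed-zeros
  bool HonorNaNs = true;        // -fhonor-nans (NaNs are honored by default)
  bool HonorInfs = true;        // -fhonor-infinities
};

// Condition under which, according to the comment, "unsafe-fp-math" is set:
// all four options present, regardless of anything else.
inline bool setsUnsafeFPMath(const DriverFlagsModel &D) {
  return D.NoMathErrno && D.AssociativeMath && D.ReciprocalMath &&
         D.NoSignedZeros;
}

// What a *_finite entry needs (an assumption for this sketch): the user has
// given up both NaNs and infinities.
inline bool finiteEntryIsSafe(const DriverFlagsModel &D) {
  return !D.HonorNaNs && !D.HonorInfs;
}
```

With the four options set and NaNs still honored, `setsUnsafeFPMath` is true while `finiteEntryIsSafe` is false, so gating the libcall renaming on "unsafe-fp-math" alone would pick a `_finite` entry the user did not ask for.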
		setLibcallName(RTLIB::COS_F64, "__xl_cos_finite");
		setLibcallName(RTLIB::COS_F32, "__xl_cosf_finite");
		setLibcallName(RTLIB::EXP_F64, "__xl_exp_finite");
		setLibcallName(RTLIB::EXP_F32, "__xl_expf_finite");
		setLibcallName(RTLIB::LOG_F64, "__xl_log_finite");
		setLibcallName(RTLIB::LOG_F32, "__xl_logf_finite");
		setLibcallName(RTLIB::LOG10_F64, "__xl_log10_finite");
		setLibcallName(RTLIB::LOG10_F32, "__xl_log10f_finite");
		setLibcallName(RTLIB::POW_F64, "__xl_pow_finite");
		setLibcallName(RTLIB::POW_F32, "__xl_powf_finite");
		setLibcallName(RTLIB::SIN_F64, "__xl_sin_finite");
		setLibcallName(RTLIB::SIN_F32, "__xl_sinf_finite");
		}

// With 32 condition bits, we don't need to sink (and duplicate) compares		// With 32 condition bits, we don't need to sink (and duplicate) compares
// aggressively in CodeGenPrep.		// aggressively in CodeGenPrep.
if (Subtarget.useCRBits()) {		if (Subtarget.useCRBits()) {
		bmahjour (Done): Why do you still check for `TM.Options.UnsafeFPMath` ? If you do it out of concerns for `-fno-math-errno`, then it's not needed. Note that these llvm intrinsics already mention that their semantics are identical to their libm counter parts but "without trapping or setting errno".
		bmahjour (Done): if someone compiles with -Ofast without any extra options, would `TM.Options.ApproxFuncFPMath` be true here?
		masoud.ataei (Done): In clang changes, I had `Options.ApproxFuncFPMath = LangOpts.ApproxFunc;` in `clang/lib/CodeGen/BackendUtil.cpp`. That was responsible to update this TM option based on the clang approximate func option. And clang approximate func option will be set with -Ofast. Then, the answer for your question is yes.
setHasMultipleConditionRegisters();		setHasMultipleConditionRegisters();
setJumpIsExpensive();		setJumpIsExpensive();
}		}

setMinFunctionAlignment(Align(4));		setMinFunctionAlignment(Align(4));

switch (Subtarget.getCPUDirective()) {		switch (Subtarget.getCPUDirective()) {
default: break;		default: break;
[15,822 lines not shown]	case PPC::AM_DQForm: {
// register and the displacement will be the immediate unless it		// register and the displacement will be the immediate unless it
// isn't sufficiently aligned.		// isn't sufficiently aligned.
if (Flags & PPC::MOF_RPlusSImm16) {		if (Flags & PPC::MOF_RPlusSImm16) {
SDValue Op0 = N.getOperand(0);		SDValue Op0 = N.getOperand(0);
SDValue Op1 = N.getOperand(1);		SDValue Op1 = N.getOperand(1);
ConstantSDNode *CN = dyn_cast<ConstantSDNode>(Op1);		ConstantSDNode *CN = dyn_cast<ConstantSDNode>(Op1);
int16_t Imm = CN->getAPIntValue().getZExtValue();		int16_t Imm = CN->getAPIntValue().getZExtValue();
if (!Align || isAligned(*Align, Imm)) {		if (!Align || isAligned(*Align, Imm)) {
Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());		Disp = DAG.getTargetConstant(Imm, DL, N.getValueType());
		bmahjour (Not Done): [nit] a better name would be `lowerLibCallBasedOnType`
Base = Op0;		Base = Op0;
if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) {		if (FrameIndexSDNode *FI = dyn_cast<FrameIndexSDNode>(Op0)) {
Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());		Base = DAG.getTargetFrameIndex(FI->getIndex(), N.getValueType());
fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());		fixupFuncForFI(DAG, FI->getIndex(), N.getValueType());
}		}
break;		break;
}		}
}		}
[63 lines not shown]

llvm/lib/Target/PowerPC/PPCTargetMachine.cpp

[91 lines not shown]
EnableMachineCombinerPass("ppc-machine-combiner",		EnableMachineCombinerPass("ppc-machine-combiner",
cl::desc("Enable the machine combiner pass"),		cl::desc("Enable the machine combiner pass"),
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

static cl::opt<bool>		static cl::opt<bool>
ReduceCRLogical("ppc-reduce-cr-logicals",		ReduceCRLogical("ppc-reduce-cr-logicals",
cl::desc("Expand eligible cr-logical binary ops to branches"),		cl::desc("Expand eligible cr-logical binary ops to branches"),
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

		static cl::opt<bool> EnablePPCGenScalarMASSEntries(
		"enable-ppc-gen-scalar-mass", cl::init(false),
		cl::desc("Enable lowering math functions to their corresponding MASS "
		"(scalar) entries"),
		cl::Hidden);

extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializePowerPCTarget() {		extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializePowerPCTarget() {
// Register the targets		// Register the targets
RegisterTargetMachine<PPCTargetMachine> A(getThePPC32Target());		RegisterTargetMachine<PPCTargetMachine> A(getThePPC32Target());
RegisterTargetMachine<PPCTargetMachine> B(getThePPC32LETarget());		RegisterTargetMachine<PPCTargetMachine> B(getThePPC32LETarget());
RegisterTargetMachine<PPCTargetMachine> C(getThePPC64Target());		RegisterTargetMachine<PPCTargetMachine> C(getThePPC64Target());
RegisterTargetMachine<PPCTargetMachine> D(getThePPC64LETarget());		RegisterTargetMachine<PPCTargetMachine> D(getThePPC64LETarget());

PassRegistry &PR = *PassRegistry::getPassRegistry();		PassRegistry &PR = *PassRegistry::getPassRegistry();
[10 lines not shown]	#endif
initializePPCBSelPass(PR);		initializePPCBSelPass(PR);
initializePPCBranchCoalescingPass(PR);		initializePPCBranchCoalescingPass(PR);
initializePPCBoolRetToIntPass(PR);		initializePPCBoolRetToIntPass(PR);
initializePPCExpandISELPass(PR);		initializePPCExpandISELPass(PR);
initializePPCPreEmitPeepholePass(PR);		initializePPCPreEmitPeepholePass(PR);
initializePPCTLSDynamicCallPass(PR);		initializePPCTLSDynamicCallPass(PR);
initializePPCMIPeepholePass(PR);		initializePPCMIPeepholePass(PR);
initializePPCLowerMASSVEntriesPass(PR);		initializePPCLowerMASSVEntriesPass(PR);
		initializePPCGenScalarMASSEntriesPass(PR);
initializeGlobalISel(PR);		initializeGlobalISel(PR);
}		}

static bool isLittleEndianTriple(const Triple &T) {		static bool isLittleEndianTriple(const Triple &T) {
return T.getArch() == Triple::ppc64le || T.getArch() == Triple::ppcle;		return T.getArch() == Triple::ppc64le || T.getArch() == Triple::ppcle;
}		}

/// Return the datalayout string of a subtarget.		/// Return the datalayout string of a subtarget.
[288 lines not shown]
void PPCPassConfig::addIRPasses() {		void PPCPassConfig::addIRPasses() {
if (TM->getOptLevel() != CodeGenOpt::None)		if (TM->getOptLevel() != CodeGenOpt::None)
addPass(createPPCBoolRetToIntPass());		addPass(createPPCBoolRetToIntPass());
addPass(createAtomicExpandPass());		addPass(createAtomicExpandPass());

// Lower generic MASSV routines to PowerPC subtarget-specific entries.		// Lower generic MASSV routines to PowerPC subtarget-specific entries.
addPass(createPPCLowerMASSVEntriesPass());		addPass(createPPCLowerMASSVEntriesPass());

		// Generate PowerPC target-specific entries for scalar math functions
		// that are available in IBM MASS (scalar) library.
		if (TM->getOptLevel() != CodeGenOpt::None && EnablePPCGenScalarMASSEntries) {
		TM->Options.PPCGenScalarMASSEntries = EnablePPCGenScalarMASSEntries;
		addPass(createPPCGenScalarMASSEntriesPass());
		}

// If explicitly requested, add explicit data prefetch intrinsics.		// If explicitly requested, add explicit data prefetch intrinsics.
if (EnablePrefetch.getNumOccurrences() > 0)		if (EnablePrefetch.getNumOccurrences() > 0)
addPass(createLoopDataPrefetchPass());		addPass(createLoopDataPrefetchPass());

if (TM->getOptLevel() >= CodeGenOpt::Default && EnableGEPOpt) {		if (TM->getOptLevel() >= CodeGenOpt::Default && EnableGEPOpt) {
// Call SeparateConstOffsetFromGEP pass to extract constants within indices		// Call SeparateConstOffsetFromGEP pass to extract constants within indices
// and lower a GEP with multiple indices to either arithmetic operations or		// and lower a GEP with multiple indices to either arithmetic operations or
// multiple GEPs with single index.		// multiple GEPs with single index.
[145 lines not shown]

llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s

				declare float @llvm.cos.f32(float)
				declare float @llvm.exp.f32(float)
				declare float @llvm.log10.f32(float)
				declare float @llvm.log.f32(float)
				declare float @llvm.pow.f32(float, float)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.cos.f64(double)
				declare double @llvm.exp.f64(double)
				declare double @llvm.log.f64(double)
				declare double @llvm.log10.f64(double)
				declare double @llvm.pow.f64(double, double)
				declare double @llvm.sin.f64(double)

				; With fast math flag specified per-function
				define float @cosf_f32(float %a) #1 {
				; CHECK-LABEL: cosf_f32
				; CHECK: bl __xl_cosf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.cos.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @expf_f32(float %a) #1 {
				; CHECK-LABEL: expf_f32
				; CHECK: bl __xl_expf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.exp.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @log10f_f32(float %a) #1 {
				; CHECK-LABEL: log10f_f32
				; CHECK: bl __xl_log10f_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.log10.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @logf_f32(float %a) #1 {
				; CHECK-LABEL: logf_f32
				; CHECK: bl __xl_logf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.log.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @powf_f32(float %a, float %b) #1 {
				; CHECK-LABEL: powf_f32
				; CHECK: bl __xl_powf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @rintf_f32(float %a) #1 {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: bl __xl_rintf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.rint.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define float @sinf_f32(float %a) #1 {
				; CHECK-LABEL: sinf_f32
				; CHECK: bl __xl_sinf_finite
				; CHECK: blr
				entry:
				%0 = tail call fast float @llvm.sin.f32(float %a)
				ret float %0
				}

				; With fast math flag specified per-function
				define double @cos_f64(double %a) #1 {
				; CHECK-LABEL: cos_f64
				; CHECK: bl __xl_cos_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.cos.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @exp_f64(double %a) #1 {
				; CHECK-LABEL: exp_f64
				; CHECK: bl __xl_exp_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.exp.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @log_f64(double %a) #1 {
				; CHECK-LABEL: log_f64
				; CHECK: bl __xl_log_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.log.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @log10_f64(double %a) #1 {
				; CHECK-LABEL: log10_f64
				; CHECK: bl __xl_log10_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.log10.f64(double %a)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @pow_f64(double %a, double %b) #1 {
				; CHECK-LABEL: pow_f64
				; CHECK: bl __xl_pow_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				; With fast math flag specified per-function
				define double @sin_f64(double %a) #1 {
				; CHECK-LABEL: sin_f64
				; CHECK: bl __xl_sin_finite
				; CHECK: blr
				entry:
				%0 = tail call fast double @llvm.sin.f64(double %a)
				ret double %0
				}

				attributes #1 = { "unsafe-fp-math"="true" }
				bmahjour (Done): See above comment and remove "unsafe-fp-math".
				bmahjour (Done): do we need this attribute? Can we remove it or have separate tests for functions with attributes?
				masoud.ataei (Done): Removed

llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck -check-prefixes CHECK-ALL,CHECK-LWR %s
				bmahjour (Done): why not just use the default `CHECK` prefix? `CHECK-ALL` and `CHECK-LWR` don't distinguish anything based on this run command.

				bmahjour (Done): We don't really need a separate aix file. Can we just add a run line with the aix triple to `llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll`?
				masoud.ataei (Done): Done
				declare float @llvm.cos.f32(float)
				declare double @llvm.cos.f64(double)
				declare float @llvm.exp.f32(float)
				declare double @llvm.exp.f64(double)
				declare float @llvm.log10.f32(float)
				declare double @llvm.log10.f64(double)
				declare float @llvm.log.f32(float)
				declare double @llvm.log.f64(double)
				declare float @llvm.pow.f32(float, float)
				declare double @llvm.pow.f64(double, double)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.sin.f64(double)

				define float @cosf_f32(float %a) {
				; CHECK-ALL-LABEL: cosf_f32
				; CHECK-DFLT: bl __xl_cosf_finite
				bmahjour (Done): CHECK-DFLT is not in the list of prefixes defined.
				; CHECK-LWR-NOT: bl __xl_cosf_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.cos.f32(float %a)
				ret float %0
				}

				define double @cos_f64(double %a) {
				; CHECK-ALL-LABEL: cos_f64
				; CHECK-DFLT: bl __xl_cos_finite
				; CHECK-LWR-NOT: bl __xl_cos_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call double @llvm.cos.f64(double %a)
				ret double %0
				}

				define float @expf_f32(float %a) {
				; CHECK-ALL-LABEL: expf_f32
				; CHECK-DFLT: bl __xl_expf_finite
				; CHECK-LWR-NOT: bl __xl_expf_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.exp.f32(float %a)
				ret float %0
				}

				define double @exp_f64(double %a) {
				; CHECK-ALL-LABEL: exp_f64
				; CHECK-DFLT: bl __xl_exp_finite
				; CHECK-LWR-NOT: bl __xl_exp_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call double @llvm.exp.f64(double %a)
				ret double %0
				}

				define float @log10f_f32(float %a) {
				; CHECK-ALL-LABEL: log10f_f32
				; CHECK-DFLT: bl __xl_log10f_finite
				; CHECK-LWR-NOT: bl __xl_log10f_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.log10.f32(float %a)
				ret float %0
				}

				define double @log_f64(double %a) {
				; CHECK-ALL-LABEL: log_f64
				; CHECK-DFLT: bl __xl_log_finite
				; CHECK-LWR-NOT: bl __xl_log_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call double @llvm.log.f64(double %a)
				ret double %0
				}

				define float @logf_f32(float %a) {
				; CHECK-ALL-LABEL: logf_f32
				; CHECK-DFLT: bl __xl_logf_finite
				; CHECK-LWR-NOT: bl __xl_logf_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.log.f32(float %a)
				ret float %0
				}

				define double @log10_f64(double %a) {
				; CHECK-ALL-LABEL: log10_f64
				; CHECK-DFLT: bl __xl_log10_finite
				; CHECK-LWR-NOT: bl __xl_log10_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call double @llvm.log10.f64(double %a)
				ret double %0
				}

				define float @powf_f32(float %a, float %b) {
				; CHECK-ALL-LABEL: powf_f32
				; CHECK-DFLT: bl __xl_powf_finite
				; CHECK-LWR-NOT: bl __xl_powf_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				define double @pow_f64(double %a, double %b) {
				; CHECK-ALL-LABEL: pow_f64
				; CHECK-DFLT: bl __xl_pow_finite
				; CHECK-LWR-NOT: bl __xl_pow_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				define float @rintf_f32(float %a) {
				; CHECK-ALL-LABEL: rintf_f32
				; CHECK-DFLT: bl __xl_rintf_finite
				; CHECK-LWR-NOT: bl __xl_rintf_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.rint.f32(float %a)
				ret float %0
				}

				define float @sinf_f32(float %a) {
				; CHECK-ALL-LABEL: sinf_f32
				; CHECK-DFLT: bl __xl_sinf_finite
				; CHECK-LWR-NOT: bl __xl_sinf_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call float @llvm.sin.f32(float %a)
				ret float %0
				}

				define double @sin_f64(double %a) {
				; CHECK-ALL-LABEL: sin_f64
				; CHECK-DFLT: bl __xl_sin_finite
				; CHECK-LWR-NOT: bl __xl_sin_finite
				; CHECK-ALL: blr
				entry:
				%0 = tail call double @llvm.sin.f64(double %a)
				ret double %0
				}

llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s

				declare float @llvm.cos.f32(float)
				declare float @llvm.exp.f32(float)
				declare float @llvm.log10.f32(float)
				declare float @llvm.log.f32(float)
				declare float @llvm.pow.f32(float, float)
				declare float @llvm.rint.f32(float)
				declare float @llvm.sin.f32(float)
				declare double @llvm.cos.f64(double)
				declare double @llvm.exp.f64(double)
				declare double @llvm.log.f64(double)
				declare double @llvm.log10.f64(double)
				declare double @llvm.pow.f64(double, double)
				declare double @llvm.sin.f64(double)


				; With no fast math flag specified per-function
				define float @cosf_f32_nofast(float %a) {
				; CHECK-LABEL: cosf_f32_nofast
				; CHECK-NOT: bl __xl_cosf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.cos.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @expf_f32_nofast(float %a) {
				; CHECK-LABEL: expf_f32_nofast
				; CHECK-NOT: bl __xl_expf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.exp.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @log10f_f32_nofast(float %a) {
				; CHECK-LABEL: log10f_f32_nofast
				; CHECK-NOT: bl __xl_log10f_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.log10.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @logf_f32_nofast(float %a) {
				; CHECK-LABEL: logf_f32_nofast
				; CHECK-NOT: bl __xl_logf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.log.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @powf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: powf_f32_nofast
				; CHECK-NOT: bl __xl_powf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.pow.f32(float %a, float %b)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @rintf_f32_nofast(float %a) {
				; CHECK-LABEL: rintf_f32_nofast
				; CHECK-NOT: bl __xl_rintf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.rint.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define float @sinf_f32_nofast(float %a) {
				; CHECK-LABEL: sinf_f32_nofast
				; CHECK-NOT: bl __xl_sinf_finite
				; CHECK: blr
				entry:
				%0 = tail call float @llvm.sin.f32(float %a)
				ret float %0
				}

				; With no fast math flag specified per-function
				define double @cos_f64_nofast(double %a) {
				; CHECK-LABEL: cos_f64_nofast
				; CHECK-NOT: bl __xl_cos_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.cos.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @exp_f64_nofast(double %a) {
				; CHECK-LABEL: exp_f64_nofast
				; CHECK-NOT: bl __xl_exp_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.exp.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @log_f64_nofast(double %a) {
				; CHECK-LABEL: log_f64_nofast
				; CHECK-NOT: bl __xl_log_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.log.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @log10_f64_nofast(double %a) {
				; CHECK-LABEL: log10_f64_nofast
				; CHECK-NOT: bl __xl_log10_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.log10.f64(double %a)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @pow_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: pow_f64_nofast
				; CHECK-NOT: bl __xl_pow_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.pow.f64(double %a, double %b)
				ret double %0
				}

				; With no fast math flag specified per-function
				define double @sin_f64_nofast(double %a) {
				; CHECK-LABEL: sin_f64_nofast
				; CHECK-NOT: bl __xl_sin_finite
				; CHECK: blr
				entry:
				%0 = tail call double @llvm.sin.f64(double %a)
				ret double %0
				}
				attributes #1 = { "unsafe-fp-math"="true" }
				bmahjour: Remove this line, `#1` is unused.

llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll

This file was added.

				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s
				; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck %s

				declare float @acosf (float);
				declare float @acoshf (float);
				declare float @asinf (float);
				declare float @asinhf (float);
				declare float @atan2f (float);
				declare float @atanf (float);
				declare float @atanhf (float);
				declare float @cbrtf (float);
				declare float @copysignf (float, float);
				declare float @cosf (float);
				declare float @coshf (float);
				declare float @erfcf (float);
				declare float @erff (float);
				declare float @expf (float);
				declare float @expm1f (float);
				declare float @hypotf (float, float);
				declare float @lgammaf (float);
				declare float @log10f (float);
				declare float @log1pf (float);
				declare float @logf (float);
				declare float @powf (float, float);
				declare float @rintf (float);
				declare float @sinf (float);
				declare float @sinhf (float);
				declare float @tanf (float);
				declare float @tanhf (float);
				declare double @acos (double);
				declare double @acosh (double);
				declare double @anint (double);
				declare double @asin (double);
				declare double @asinh (double);
				declare double @atan (double);
				declare double @atan2 (double);
				declare double @atanh (double);
				declare double @cbrt (double);
				declare double @copysign (double, double);
				declare double @cos (double);
				declare double @cosh (double);
				declare double @cosisin (double);
				declare double @dnint (double);
				declare double @erf (double);
				declare double @erfc (double);
				declare double @exp (double);
				declare double @expm1 (double);
				declare double @hypot (double, double);
				declare double @lgamma (double);
				declare double @log (double);
				declare double @log10 (double);
				declare double @log1p (double);
				declare double @pow (double, double);
				declare double @rsqrt (double);
				declare double @sin (double);
				declare double @sincos (double);
				declare double @sinh (double);
				declare double @sqrt (double);
				declare double @tan (double);
				declare double @tanh (double);

				define float @acosf_f32(float %a) {
				; CHECK-LABEL: acosf_f32
				; CHECK: __xl_acosf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @acosf(float %a)
				ret float %call
				}

				define float @acoshf_f32(float %a) {
				; CHECK-LABEL: acoshf_f32
				; CHECK: __xl_acoshf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @acoshf(float %a)
				ret float %call
				}

				define float @asinf_f32(float %a) {
				; CHECK-LABEL: asinf_f32
				; CHECK: __xl_asinf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @asinf(float %a)
				ret float %call
				}

				define float @asinhf_f32(float %a) {
				; CHECK-LABEL: asinhf_f32
				; CHECK: __xl_asinhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @asinhf(float %a)
				ret float %call
				}

				define float @atan2f_f32(float %a) {
				; CHECK-LABEL: atan2f_f32
				; CHECK: __xl_atan2f_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @atan2f(float %a)
				ret float %call
				}

				define float @atanf_f32(float %a) {
				; CHECK-LABEL: atanf_f32
				; CHECK: __xl_atanf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @atanf(float %a)
				ret float %call
				}

				define float @atanhf_f32(float %a) {
				; CHECK-LABEL: atanhf_f32
				; CHECK: __xl_atanhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @atanhf(float %a)
				ret float %call
				}

				define float @cbrtf_f32(float %a) {
				; CHECK-LABEL: cbrtf_f32
				; CHECK: __xl_cbrtf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @cbrtf(float %a)
				ret float %call
				}

				define float @copysignf_f32(float %a, float %b) {
				; CHECK-LABEL: copysignf_f32
				; CHECK: copysignf
				; CHECK: blr
				entry:
				%call = tail call fast float @copysignf(float %a, float %b)
				ret float %call
				}

				define float @cosf_f32(float %a) {
				; CHECK-LABEL: cosf_f32
				; CHECK: __xl_cosf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @cosf(float %a)
				ret float %call
				}

				define float @coshf_f32(float %a) {
				; CHECK-LABEL: coshf_f32
				; CHECK: __xl_coshf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @coshf(float %a)
				ret float %call
				}

				define float @erfcf_f32(float %a) {
				; CHECK-LABEL: erfcf_f32
				; CHECK: __xl_erfcf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @erfcf(float %a)
				ret float %call
				}

				define float @erff_f32(float %a) {
				; CHECK-LABEL: erff_f32
				; CHECK: __xl_erff_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @erff(float %a)
				ret float %call
				}

				define float @expf_f32(float %a) {
				; CHECK-LABEL: expf_f32
				; CHECK: __xl_expf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @expf(float %a)
				ret float %call
				}

				define float @expm1f_f32(float %a) {
				; CHECK-LABEL: expm1f_f32
				; CHECK: __xl_expm1f_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @expm1f(float %a)
				ret float %call
				}

				define float @hypotf_f32(float %a, float %b) {
				; CHECK-LABEL: hypotf_f32
				; CHECK: __xl_hypotf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @hypotf(float %a, float %b)
				ret float %call
				}

				define float @lgammaf_f32(float %a) {
				; CHECK-LABEL: lgammaf_f32
				; CHECK: __xl_lgammaf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @lgammaf(float %a)
				ret float %call
				}

				define float @log10f_f32(float %a) {
				; CHECK-LABEL: log10f_f32
				; CHECK: __xl_log10f_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @log10f(float %a)
				ret float %call
				}

				define float @log1pf_f32(float %a) {
				; CHECK-LABEL: log1pf_f32
				; CHECK: __xl_log1pf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @log1pf(float %a)
				ret float %call
				}

				define float @logf_f32(float %a) {
				; CHECK-LABEL: logf_f32
				; CHECK: __xl_logf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @logf(float %a)
				ret float %call
				}

				define float @powf_f32(float %a, float %b) {
				; CHECK-LABEL: powf_f32
				; CHECK: __xl_powf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @powf(float %a, float %b)
				ret float %call
				}

				define float @rintf_f32(float %a) {
				; CHECK-LABEL: rintf_f32
				; CHECK-NOT: __xl_rintf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @rintf(float %a)
				ret float %call
				}

				define float @sinf_f32(float %a) {
				; CHECK-LABEL: sinf_f32
				; CHECK: __xl_sinf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @sinf(float %a)
				ret float %call
				}

				define float @sinhf_f32(float %a) {
				; CHECK-LABEL: sinhf_f32
				; CHECK: __xl_sinhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @sinhf(float %a)
				ret float %call
				}

				define float @tanf_f32(float %a) {
				; CHECK-LABEL: tanf_f32
				; CHECK: __xl_tanf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @tanf(float %a)
				ret float %call
				}

				define float @tanhf_f32(float %a) {
				; CHECK-LABEL: tanhf_f32
				; CHECK: __xl_tanhf_finite
				; CHECK: blr
				entry:
				%call = tail call fast float @tanhf(float %a)
				ret float %call
				}

				define double @acos_f64(double %a) {
				; CHECK-LABEL: acos_f64
				; CHECK: __xl_acos_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @acos(double %a)
				ret double %call
				}

				define double @acosh_f64(double %a) {
				; CHECK-LABEL: acosh_f64
				; CHECK: __xl_acosh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @acosh(double %a)
				ret double %call
				}

				define double @anint_f64(double %a) {
				; CHECK-LABEL: anint_f64
				; CHECK-NOT: __xl_anint_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @anint(double %a)
				ret double %call
				}

				define double @asin_f64(double %a) {
				; CHECK-LABEL: asin_f64
				; CHECK: __xl_asin_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @asin(double %a)
				ret double %call
				}

				define double @asinh_f64(double %a) {
				; CHECK-LABEL: asinh_f64
				; CHECK: __xl_asinh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @asinh(double %a)
				ret double %call
				}

				define double @atan_f64(double %a) {
				; CHECK-LABEL: atan_f64
				; CHECK: __xl_atan_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @atan(double %a)
				ret double %call
				}

				define double @atan2_f64(double %a) {
				; CHECK-LABEL: atan2_f64
				; CHECK: __xl_atan2_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @atan2(double %a)
				ret double %call
				}

				define double @atanh_f64(double %a) {
				; CHECK-LABEL: atanh_f64
				; CHECK: __xl_atanh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @atanh(double %a)
				ret double %call
				}

				define double @cbrt_f64(double %a) {
				; CHECK-LABEL: cbrt_f64
				; CHECK: __xl_cbrt_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cbrt(double %a)
				ret double %call
				}

				define double @copysign_f64(double %a, double %b) {
				; CHECK-LABEL: copysign_f64
				; CHECK: copysign
				; CHECK: blr
				entry:
				%call = tail call fast double @copysign(double %a, double %b)
				ret double %call
				}

				define double @cos_f64(double %a) {
				; CHECK-LABEL: cos_f64
				; CHECK: __xl_cos_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cos(double %a)
				ret double %call
				}

				define double @cosh_f64(double %a) {
				; CHECK-LABEL: cosh_f64
				; CHECK: __xl_cosh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cosh(double %a)
				ret double %call
				}

				define double @cosisin_f64(double %a) {
				; CHECK-LABEL: cosisin_f64
				; CHECK: __xl_cosisin_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @cosisin(double %a)
				ret double %call
				}

				define double @dnint_f64(double %a) {
				; CHECK-LABEL: dnint_f64
				; CHECK-NOT: __xl_dnint_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @dnint(double %a)
				ret double %call
				}

				define double @erf_f64(double %a) {
				; CHECK-LABEL: erf_f64
				; CHECK: __xl_erf_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @erf(double %a)
				ret double %call
				}

				define double @erfc_f64(double %a) {
				; CHECK-LABEL: erfc_f64
				; CHECK: __xl_erfc_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @erfc(double %a)
				ret double %call
				}

				define double @exp_f64(double %a) {
				; CHECK-LABEL: exp_f64
				; CHECK: __xl_exp_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @exp(double %a)
				ret double %call
				}

				define double @expm1_f64(double %a) {
				; CHECK-LABEL: expm1_f64
				; CHECK: __xl_expm1_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @expm1(double %a)
				ret double %call
				}

				define double @hypot_f64(double %a, double %b) {
				; CHECK-LABEL: hypot_f64
				; CHECK: __xl_hypot_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @hypot(double %a, double %b)
				ret double %call
				}

				define double @lgamma_f64(double %a) {
				; CHECK-LABEL: lgamma_f64
				; CHECK: __xl_lgamma_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @lgamma(double %a)
				ret double %call
				}

				define double @log_f64(double %a) {
				; CHECK-LABEL: log_f64
				; CHECK: __xl_log_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @log(double %a)
				ret double %call
				}

				define double @log10_f64(double %a) {
				; CHECK-LABEL: log10_f64
				; CHECK: __xl_log10_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @log10(double %a)
				ret double %call
				}

				define double @log1p_f64(double %a) {
				; CHECK-LABEL: log1p_f64
				; CHECK: __xl_log1p_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @log1p(double %a)
				ret double %call
				}

				define double @pow_f64(double %a, double %b) {
				; CHECK-LABEL: pow_f64
				; CHECK: __xl_pow_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @pow(double %a, double %b)
				ret double %call
				}

				define double @rsqrt_f64(double %a) {
				; CHECK-LABEL: rsqrt_f64
				; CHECK: __xl_rsqrt_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @rsqrt(double %a)
				ret double %call
				}

				define double @sin_f64(double %a) {
				; CHECK-LABEL: sin_f64
				; CHECK: __xl_sin_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sin(double %a)
				ret double %call
				}

				define double @sincos_f64(double %a) {
				; CHECK-LABEL: sincos_f64
				; CHECK: __xl_sincos_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sincos(double %a)
				ret double %call
				}

				define double @sinh_f64(double %a) {
				; CHECK-LABEL: sinh_f64
				; CHECK: __xl_sinh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sinh(double %a)
				ret double %call
				}

				define double @sqrt_f64(double %a) {
				; CHECK-LABEL: sqrt_f64
				; CHECK: __xl_sqrt_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @sqrt(double %a)
				ret double %call
				}

				define double @tan_f64(double %a) {
				; CHECK-LABEL: tan_f64
				; CHECK: __xl_tan_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @tan(double %a)
				ret double %call
				}

				define double @tanh_f64(double %a) {
				; CHECK-LABEL: tanh_f64
				; CHECK: __xl_tanh_finite
				; CHECK: blr
				entry:
				%call = tail call fast double @tanh(double %a)
				ret double %call
				}


				; Without fast flag on the call instruction
				define float @acosf_f32_nofast(float %a) {
				; CHECK-LABEL: acosf_f32_nofast
				; CHECK-NOT: __xl_acosf_finite
				; CHECK: blr
				entry:
				%call = tail call float @acosf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @acoshf_f32_nofast(float %a) {
				; CHECK-LABEL: acoshf_f32_nofast
				; CHECK-NOT: __xl_acoshf_finite
				; CHECK: blr
				entry:
				%call = tail call float @acoshf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @asinf_f32_nofast(float %a) {
				; CHECK-LABEL: asinf_f32_nofast
				; CHECK-NOT: __xl_asinf_finite
				; CHECK: blr
				entry:
				%call = tail call float @asinf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @asinhf_f32_nofast(float %a) {
				; CHECK-LABEL: asinhf_f32_nofast
				; CHECK-NOT: __xl_asinhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @asinhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @atan2f_f32_nofast(float %a) {
				; CHECK-LABEL: atan2f_f32_nofast
				; CHECK-NOT: __xl_atan2f_finite
				; CHECK: blr
				entry:
				%call = tail call float @atan2f(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @atanf_f32_nofast(float %a) {
				; CHECK-LABEL: atanf_f32_nofast
				; CHECK-NOT: __xl_atanf_finite
				; CHECK: blr
				entry:
				%call = tail call float @atanf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @atanhf_f32_nofast(float %a) {
				; CHECK-LABEL: atanhf_f32_nofast
				; CHECK-NOT: __xl_atanhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @atanhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @cbrtf_f32_nofast(float %a) {
				; CHECK-LABEL: cbrtf_f32_nofast
				; CHECK-NOT: __xl_cbrtf_finite
				; CHECK: blr
				entry:
				%call = tail call float @cbrtf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @copysignf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: copysignf_f32_nofast
				; CHECK-NOT: __xl_copysignf_finite
				; CHECK: blr
				entry:
				%call = tail call float @copysignf(float %a, float %b)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @cosf_f32_nofast(float %a) {
				; CHECK-LABEL: cosf_f32_nofast
				; CHECK-NOT: __xl_cosf_finite
				; CHECK: blr
				entry:
				%call = tail call float @cosf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @coshf_f32_nofast(float %a) {
				; CHECK-LABEL: coshf_f32_nofast
				; CHECK-NOT: __xl_coshf_finite
				; CHECK: blr
				entry:
				%call = tail call float @coshf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @erfcf_f32_nofast(float %a) {
				; CHECK-LABEL: erfcf_f32_nofast
				; CHECK-NOT: __xl_erfcf_finite
				; CHECK: blr
				entry:
				%call = tail call float @erfcf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @erff_f32_nofast(float %a) {
				; CHECK-LABEL: erff_f32_nofast
				; CHECK-NOT: __xl_erff_finite
				; CHECK: blr
				entry:
				%call = tail call float @erff(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @expf_f32_nofast(float %a) {
				; CHECK-LABEL: expf_f32_nofast
				; CHECK-NOT: __xl_expf_finite
				; CHECK: blr
				entry:
				%call = tail call float @expf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @expm1f_f32_nofast(float %a) {
				; CHECK-LABEL: expm1f_f32_nofast
				; CHECK-NOT: __xl_expm1f_finite
				; CHECK: blr
				entry:
				%call = tail call float @expm1f(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @hypotf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: hypotf_f32_nofast
				; CHECK-NOT: __xl_hypotf_finite
				; CHECK: blr
				entry:
				%call = tail call float @hypotf(float %a, float %b)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @lgammaf_f32_nofast(float %a) {
				; CHECK-LABEL: lgammaf_f32_nofast
				; CHECK-NOT: __xl_lgammaf_finite
				; CHECK: blr
				entry:
				%call = tail call float @lgammaf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @log10f_f32_nofast(float %a) {
				; CHECK-LABEL: log10f_f32_nofast
				; CHECK-NOT: __xl_log10f_finite
				; CHECK: blr
				entry:
				%call = tail call float @log10f(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @log1pf_f32_nofast(float %a) {
				; CHECK-LABEL: log1pf_f32_nofast
				; CHECK-NOT: __xl_log1pf_finite
				; CHECK: blr
				entry:
				%call = tail call float @log1pf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @logf_f32_nofast(float %a) {
				; CHECK-LABEL: logf_f32_nofast
				; CHECK-NOT: __xl_logf_finite
				; CHECK: blr
				entry:
				%call = tail call float @logf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @powf_f32_nofast(float %a, float %b) {
				; CHECK-LABEL: powf_f32_nofast
				; CHECK-NOT: __xl_powf_finite
				; CHECK: blr
				entry:
				%call = tail call float @powf(float %a, float %b)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @rintf_f32_nofast(float %a) {
				; CHECK-LABEL: rintf_f32_nofast
				; CHECK-NOT: __xl_rintf_finite
				; CHECK: blr
				entry:
				%call = tail call float @rintf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @sinf_f32_nofast(float %a) {
				bmahjour: Shouldn't the tests starting from here move to a different file? This test file is called ...mass-fast.ll so one would expect it only contains tests with fast-math flag on.
				masoud.ataei: Done
				; CHECK-LABEL: sinf_f32_nofast
				; CHECK-NOT: __xl_sinf_finite
				; CHECK: blr
				entry:
				%call = tail call float @sinf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @sinhf_f32_nofast(float %a) {
				; CHECK-LABEL: sinhf_f32_nofast
				; CHECK-NOT: __xl_sinhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @sinhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @tanf_f32_nofast(float %a) {
				; CHECK-LABEL: tanf_f32_nofast
				; CHECK-NOT: __xl_tanf_finite
				; CHECK: blr
				entry:
				%call = tail call float @tanf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define float @tanhf_f32_nofast(float %a) {
				; CHECK-LABEL: tanhf_f32_nofast
				; CHECK-NOT: __xl_tanhf_finite
				; CHECK: blr
				entry:
				%call = tail call float @tanhf(float %a)
				ret float %call
				}

				; Without fast flag on the call instruction
				define double @acos_f64_nofast(double %a) {
				; CHECK-LABEL: acos_f64_nofast
				; CHECK-NOT: __xl_acos_finite
				; CHECK: blr
				entry:
				%call = tail call double @acos(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @acosh_f64_nofast(double %a) {
				; CHECK-LABEL: acosh_f64_nofast
				; CHECK-NOT: __xl_acosh_finite
				; CHECK: blr
				entry:
				%call = tail call double @acosh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @anint_f64_nofast(double %a) {
				; CHECK-LABEL: anint_f64_nofast
				; CHECK-NOT: __xl_anint_finite
				; CHECK: blr
				entry:
				%call = tail call double @anint(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @asin_f64_nofast(double %a) {
				; CHECK-LABEL: asin_f64_nofast
				; CHECK-NOT: __xl_asin_finite
				; CHECK: blr
				entry:
				%call = tail call double @asin(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @asinh_f64_nofast(double %a) {
				; CHECK-LABEL: asinh_f64_nofast
				; CHECK-NOT: __xl_asinh_finite
				; CHECK: blr
				entry:
				%call = tail call double @asinh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @atan_f64_nofast(double %a) {
				; CHECK-LABEL: atan_f64_nofast
				; CHECK-NOT: __xl_atan_finite
				; CHECK: blr
				entry:
				%call = tail call double @atan(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @atan2_f64_nofast(double %a) {
				; CHECK-LABEL: atan2_f64_nofast
				; CHECK-NOT: __xl_atan2_finite
				; CHECK: blr
				entry:
				%call = tail call double @atan2(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @atanh_f64_nofast(double %a) {
				; CHECK-LABEL: atanh_f64_nofast
				; CHECK-NOT: __xl_atanh_finite
				; CHECK: blr
				entry:
				%call = tail call double @atanh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cbrt_f64_nofast(double %a) {
				; CHECK-LABEL: cbrt_f64_nofast
				; CHECK-NOT: __xl_cbrt_finite
				; CHECK: blr
				entry:
				%call = tail call double @cbrt(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @copysign_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: copysign_f64_nofast
				; CHECK-NOT: __xl_copysign_finite
				; CHECK: blr
				entry:
				%call = tail call double @copysign(double %a, double %b)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cos_f64_nofast(double %a) {
				; CHECK-LABEL: cos_f64_nofast
				; CHECK-NOT: __xl_cos_finite
				; CHECK: blr
				entry:
				%call = tail call double @cos(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cosh_f64_nofast(double %a) {
				; CHECK-LABEL: cosh_f64_nofast
				; CHECK-NOT: __xl_cosh_finite
				; CHECK: blr
				entry:
				%call = tail call double @cosh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @cosisin_f64_nofast(double %a) {
				; CHECK-LABEL: cosisin_f64_nofast
				; CHECK-NOT: __xl_cosisin_finite
				; CHECK: blr
				entry:
				%call = tail call double @cosisin(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @dnint_f64_nofast(double %a) {
				; CHECK-LABEL: dnint_f64_nofast
				; CHECK-NOT: __xl_dnint_finite
				; CHECK: blr
				entry:
				%call = tail call double @dnint(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @erf_f64_nofast(double %a) {
				; CHECK-LABEL: erf_f64_nofast
				; CHECK-NOT: __xl_erf_finite
				; CHECK: blr
				entry:
				%call = tail call double @erf(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @erfc_f64_nofast(double %a) {
				; CHECK-LABEL: erfc_f64_nofast
				; CHECK-NOT: __xl_erfc_finite
				; CHECK: blr
				entry:
				%call = tail call double @erfc(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @exp_f64_nofast(double %a) {
				; CHECK-LABEL: exp_f64_nofast
				; CHECK-NOT: __xl_exp_finite
				; CHECK: blr
				entry:
				%call = tail call double @exp(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @expm1_f64_nofast(double %a) {
				; CHECK-LABEL: expm1_f64_nofast
				; CHECK-NOT: __xl_expm1_finite
				; CHECK: blr
				entry:
				%call = tail call double @expm1(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @hypot_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: hypot_f64_nofast
				; CHECK-NOT: __xl_hypot_finite
				; CHECK: blr
				entry:
				%call = tail call double @hypot(double %a, double %b)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @lgamma_f64_nofast(double %a) {
				; CHECK-LABEL: lgamma_f64_nofast
				; CHECK-NOT: __xl_lgamma_finite
				; CHECK: blr
				entry:
				%call = tail call double @lgamma(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @log_f64_nofast(double %a) {
				; CHECK-LABEL: log_f64_nofast
				; CHECK-NOT: __xl_log_finite
				; CHECK: blr
				entry:
				%call = tail call double @log(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @log10_f64_nofast(double %a) {
				; CHECK-LABEL: log10_f64_nofast
				; CHECK-NOT: __xl_log10_finite
				; CHECK: blr
				entry:
				%call = tail call double @log10(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @log1p_f64_nofast(double %a) {
				; CHECK-LABEL: log1p_f64_nofast
				; CHECK-NOT: __xl_log1p_finite
				; CHECK: blr
				entry:
				%call = tail call double @log1p(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @pow_f64_nofast(double %a, double %b) {
				; CHECK-LABEL: pow_f64_nofast
				; CHECK-NOT: __xl_pow_finite
				; CHECK: blr
				entry:
				%call = tail call double @pow(double %a, double %b)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @rsqrt_f64_nofast(double %a) {
				; CHECK-LABEL: rsqrt_f64_nofast
				; CHECK-NOT: __xl_rsqrt_finite
				; CHECK: blr
				entry:
				%call = tail call double @rsqrt(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sin_f64_nofast(double %a) {
				; CHECK-LABEL: sin_f64_nofast
				; CHECK-NOT: __xl_sin_finite
				; CHECK: blr
				entry:
				%call = tail call double @sin(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sincos_f64_nofast(double %a) {
				; CHECK-LABEL: sincos_f64_nofast
				; CHECK-NOT: __xl_sincos_finite
				; CHECK: blr
				entry:
				%call = tail call double @sincos(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sinh_f64_nofast(double %a) {
				; CHECK-LABEL: sinh_f64_nofast
				; CHECK-NOT: __xl_sinh_finite
				; CHECK: blr
				entry:
				%call = tail call double @sinh(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @sqrt_f64_nofast(double %a) {
				; CHECK-LABEL: sqrt_f64_nofast
				; CHECK-NOT: __xl_sqrt_finite
				; CHECK: blr
				entry:
				%call = tail call double @sqrt(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @tan_f64_nofast(double %a) {
				; CHECK-LABEL: tan_f64_nofast
				; CHECK-NOT: __xl_tan_finite
				; CHECK: blr
				entry:
				%call = tail call double @tan(double %a)
				ret double %call
				}

				; Without fast flag on the call instruction
				define double @tanh_f64_nofast(double %a) {
				; CHECK-LABEL: tanh_f64_nofast
				; CHECK-NOT: __xl_tanh_finite
				; CHECK: blr
				entry:
				%call = tail call double @tanh(double %a)
				ret double %call
				}

llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck --check-prefix=CHECK-LNX %s
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck --check-prefix=CHECK-AIX %s

				declare float @llvm.pow.f32 (float, float);
				declare double @llvm.pow.f64 (double, double);

				; fast-math powf with 0.25
				define float @llvmintr_powf_f32_fast025(float %a) #1 {
				;
				; CHECK-LNX-LABEL: llvmintr_powf_f32_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: fmr 0, 1
				; CHECK-LNX-NEXT: xsabsdp 1, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_2@toc@ha
				; CHECK-LNX-NEXT: addis 4, 2, .LCPI0_1@toc@ha
				; CHECK-LNX-NEXT: xxlxor 5, 5, 5
				; CHECK-LNX-NEXT: lfs 4, .LCPI0_2@toc@l(3)
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_0@toc@ha
				; CHECK-LNX-NEXT: lfs 3, .LCPI0_1@toc@l(4)
				; CHECK-LNX-NEXT: lfs 2, .LCPI0_0@toc@l(3)
				; CHECK-LNX-NEXT: fcmpu 0, 1, 4
				; CHECK-LNX-NEXT: xxlxor 1, 1, 1
				; CHECK-LNX-NEXT: blt 0, .LBB0_2
				; CHECK-LNX-NEXT: # %bb.1: # %entry
				; CHECK-LNX-NEXT: xsrsqrtesp 5, 0
				; CHECK-LNX-NEXT: fmr 6, 2
				; CHECK-LNX-NEXT: xsmulsp 0, 0, 5
				; CHECK-LNX-NEXT: xsmaddasp 6, 0, 5
				; CHECK-LNX-NEXT: xsmulsp 0, 0, 3
				; CHECK-LNX-NEXT: xsmulsp 5, 0, 6
				; CHECK-LNX-NEXT: .LBB0_2: # %entry
				; CHECK-LNX-NEXT: xsabsdp 0, 5
				; CHECK-LNX-NEXT: fcmpu 0, 0, 4
				; CHECK-LNX-NEXT: bltlr 0
				; CHECK-LNX-NEXT: # %bb.3: # %entry
				; CHECK-LNX-NEXT: xsrsqrtesp 0, 5
				; CHECK-LNX-NEXT: xsmulsp 1, 5, 0
				; CHECK-LNX-NEXT: xsmaddasp 2, 1, 0
				; CHECK-LNX-NEXT: xsmulsp 0, 1, 3
				; CHECK-LNX-NEXT: xsmulsp 1, 0, 2
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C0(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @llvm.pow.f32(float %a, float 2.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.25
				define double @llvmintr_pow_f64_fast025(double %a) #1 {
				;
				; CHECK-LNX-LABEL: llvmintr_pow_f64_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI1_0@toc@ha
				; CHECK-LNX-NEXT: addis 4, 2, .LCPI1_1@toc@ha
				; CHECK-LNX-NEXT: lfs 0, .LCPI1_0@toc@l(3)
				; CHECK-LNX-NEXT: lfs 2, .LCPI1_1@toc@l(4)
				; CHECK-LNX-NEXT: bc 12, 2, .LBB1_3
				; CHECK-LNX-NEXT: # %bb.1: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: fmr 4, 0
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 4, 1, 3
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 2
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 4
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 4, 2, .LBB1_4
				; CHECK-LNX-NEXT: .LBB1_2:
				; CHECK-LNX-NEXT: xssqrtdp 1, 1
				; CHECK-LNX-NEXT: blr
				; CHECK-LNX-NEXT: .LBB1_3:
				; CHECK-LNX-NEXT: xssqrtdp 1, 1
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 12, 2, .LBB1_2
				; CHECK-LNX-NEXT: .LBB1_4: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 0, 1, 3
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 2
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C1(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @llvm.pow.f64(double %a, double 2.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.75
				define float @llvmintr_powf_f32_fast075(float %a) #1 {
				;
				; CHECK-LNX-LABEL: llvmintr_powf_f32_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xsabsdp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_2@toc@ha
				; CHECK-LNX-NEXT: addis 4, 2, .LCPI2_1@toc@ha
				; CHECK-LNX-NEXT: xxlxor 2, 2, 2
				; CHECK-LNX-NEXT: lfs 5, .LCPI2_2@toc@l(3)
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_0@toc@ha
				; CHECK-LNX-NEXT: lfs 3, .LCPI2_1@toc@l(4)
				; CHECK-LNX-NEXT: xxlxor 4, 4, 4
				; CHECK-LNX-NEXT: fcmpu 0, 0, 5
				; CHECK-LNX-NEXT: lfs 0, .LCPI2_0@toc@l(3)
				; CHECK-LNX-NEXT: blt 0, .LBB2_2
				; CHECK-LNX-NEXT: # %bb.1: # %entry
				; CHECK-LNX-NEXT: xsrsqrtesp 4, 1
				; CHECK-LNX-NEXT: fmr 6, 0
				; CHECK-LNX-NEXT: xsmulsp 1, 1, 4
				; CHECK-LNX-NEXT: xsmaddasp 6, 1, 4
				; CHECK-LNX-NEXT: xsmulsp 1, 1, 3
				; CHECK-LNX-NEXT: xsmulsp 4, 1, 6
				; CHECK-LNX-NEXT: .LBB2_2: # %entry
				; CHECK-LNX-NEXT: xsabsdp 1, 4
				; CHECK-LNX-NEXT: fcmpu 0, 1, 5
				; CHECK-LNX-NEXT: blt 0, .LBB2_4
				; CHECK-LNX-NEXT: # %bb.3: # %entry
				; CHECK-LNX-NEXT: xsrsqrtesp 1, 4
				; CHECK-LNX-NEXT: xsmulsp 2, 4, 1
				; CHECK-LNX-NEXT: xsmaddasp 0, 2, 1
				; CHECK-LNX-NEXT: xsmulsp 1, 2, 3
				; CHECK-LNX-NEXT: xsmulsp 2, 1, 0
				; CHECK-LNX-NEXT: .LBB2_4: # %entry
				; CHECK-LNX-NEXT: xsmulsp 1, 4, 2
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C2(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @llvm.pow.f32(float %a, float 7.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.75
				define double @llvmintr_pow_f64_fast075(double %a) #1 {
				;
				; CHECK-LNX-LABEL: llvmintr_pow_f64_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI3_0@toc@ha
				; CHECK-LNX-NEXT: addis 4, 2, .LCPI3_1@toc@ha
				; CHECK-LNX-NEXT: lfs 0, .LCPI3_0@toc@l(3)
				; CHECK-LNX-NEXT: lfs 2, .LCPI3_1@toc@l(4)
				; CHECK-LNX-NEXT: bc 12, 2, .LBB3_3
				; CHECK-LNX-NEXT: # %bb.1: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: fmr 4, 0
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 4, 1, 3
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 2
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 4
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 4, 2, .LBB3_4
				; CHECK-LNX-NEXT: .LBB3_2:
				; CHECK-LNX-NEXT: xssqrtdp 0, 1
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 0
				; CHECK-LNX-NEXT: blr
				; CHECK-LNX-NEXT: .LBB3_3:
				; CHECK-LNX-NEXT: xssqrtdp 1, 1
				; CHECK-LNX-NEXT: xstsqrtdp 0, 1
				; CHECK-LNX-NEXT: bc 12, 2, .LBB3_2
				; CHECK-LNX-NEXT: .LBB3_4: # %entry
				; CHECK-LNX-NEXT: xsrsqrtedp 3, 1
				; CHECK-LNX-NEXT: fmr 5, 0
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 5, 4, 3
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 2
				; CHECK-LNX-NEXT: xsmuldp 3, 3, 5
				; CHECK-LNX-NEXT: xsmuldp 4, 1, 3
				; CHECK-LNX-NEXT: xsmaddadp 0, 4, 3
				; CHECK-LNX-NEXT: xsmuldp 2, 4, 2
				; CHECK-LNX-NEXT: xsmuldp 0, 2, 0
				; CHECK-LNX-NEXT: xsmuldp 1, 1, 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C3(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @llvm.pow.f64(double %a, double 7.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.50
				define float @llvmintr_powf_f32_fast050(float %a) #1 {
				;
				; CHECK-LNX-LABEL: llvmintr_powf_f32_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI4_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_powf_f32_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C4(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @llvm.pow.f32(float %a, float 5.000000e-01)
				ret float %call
				}

				; fast-math pow with 0.50
				define double @llvmintr_pow_f64_fast050(double %a) #1 {
				;
				; CHECK-LNX-LABEL: llvmintr_pow_f64_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI5_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: llvmintr_pow_f64_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				bmahjour: See above comment and remove "unsafe-fp-math".
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C5(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @llvm.pow.f64(double %a, double 5.000000e-01)
				ret double %call
				}
				attributes #1 = { "unsafe-fp-math"="true" }

llvm/test/CodeGen/PowerPC/pow-025-075-scalar-mass.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck --check-prefix=CHECK-LNX %s
				; RUN: llc -verify-machineinstrs -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck --check-prefix=CHECK-AIX %s

				declare float @powf (float, float);
				declare double @pow (double, double);
				declare float @__powf_finite (float, float);
				declare double @__pow_finite (double, double);

				; fast-math powf with 0.25
				define float @powf_f32_fast025(float %a) {
				;
				; CHECK-LNX-LABEL: powf_f32_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI0_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI0_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: powf_f32_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C0(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @powf(float %a, float 2.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.25
				define double @pow_f64_fast025(double %a) {
				;
				; CHECK-LNX-LABEL: pow_f64_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI1_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI1_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: pow_f64_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C1(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @pow(double %a, double 2.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.75
				define float @powf_f32_fast075(float %a) {
				;
				; CHECK-LNX-LABEL: powf_f32_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI2_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI2_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: powf_f32_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C2(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @powf(float %a, float 7.500000e-01)
				ret float %call
				}

				; fast-math pow with 0.75
				define double @pow_f64_fast075(double %a) {
				;
				; CHECK-LNX-LABEL: pow_f64_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI3_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI3_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: pow_f64_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C3(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @pow(double %a, double 7.500000e-01)
				ret double %call
				}

				; fast-math powf with 0.50
				define float @powf_f32_fast050(float %a) {
				;
				; CHECK-LNX-LABEL: powf_f32_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI4_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI4_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: powf_f32_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C4(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @powf(float %a, float 5.000000e-01)
				ret float %call
				}

				; fast-math pow with 0.50
				define double @pow_f64_fast050(double %a) {
				;
				; CHECK-LNX-LABEL: pow_f64_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI5_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI5_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: pow_f64_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C5(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @pow(double %a, double 5.000000e-01)
				ret double %call
				}

				;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

				; fast-math __powf_finite with 0.25
				define float @__powf_finite_f32_fast025(float %a) {
				;
				; CHECK-LNX-LABEL: __powf_finite_f32_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI6_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI6_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __powf_finite_f32_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C6(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @__powf_finite(float %a, float 2.500000e-01)
				ret float %call
				}

				; fast-math __pow_finite with 0.25
				define double @__pow_finite_f64_fast025(double %a) {
				;
				; CHECK-LNX-LABEL: __pow_finite_f64_fast025:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI7_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI7_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __pow_finite_f64_fast025:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C7(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @__pow_finite(double %a, double 2.500000e-01)
				ret double %call
				}

				; fast-math __powf_finite with 0.75
				define float @__powf_finite_f32_fast075(float %a) {
				;
				; CHECK-LNX-LABEL: __powf_finite_f32_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI8_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI8_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __powf_finite_f32_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C8(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @__powf_finite(float %a, float 7.500000e-01)
				ret float %call
				}

				; fast-math __pow_finite with 0.75
				define double @__pow_finite_f64_fast075(double %a) {
				;
				; CHECK-LNX-LABEL: __pow_finite_f64_fast075:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI9_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI9_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __pow_finite_f64_fast075:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C9(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @__pow_finite(double %a, double 7.500000e-01)
				ret double %call
				}

				; fast-math __powf_finite with 0.50
				define float @__powf_finite_f32_fast050(float %a) {
				;
				; CHECK-LNX-LABEL: __powf_finite_f32_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI10_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI10_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_powf_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __powf_finite_f32_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C10(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_powf_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast float @__powf_finite(float %a, float 5.000000e-01)
				ret float %call
				}

				; fast-math __pow_finite with 0.50
				define double @__pow_finite_f64_fast050(double %a) {
				;
				; CHECK-LNX-LABEL: __pow_finite_f64_fast050:
				; CHECK-LNX: # %bb.0: # %entry
				; CHECK-LNX-NEXT: mflr 0
				; CHECK-LNX-NEXT: std 0, 16(1)
				; CHECK-LNX-NEXT: stdu 1, -32(1)
				; CHECK-LNX-NEXT: .cfi_def_cfa_offset 32
				; CHECK-LNX-NEXT: .cfi_offset lr, 16
				; CHECK-LNX-NEXT: addis 3, 2, .LCPI11_0@toc@ha
				; CHECK-LNX-NEXT: lfs 2, .LCPI11_0@toc@l(3)
				; CHECK-LNX-NEXT: bl __xl_pow_finite
				; CHECK-LNX-NEXT: nop
				; CHECK-LNX-NEXT: addi 1, 1, 32
				; CHECK-LNX-NEXT: ld 0, 16(1)
				; CHECK-LNX-NEXT: mtlr 0
				; CHECK-LNX-NEXT: blr
				;
				; CHECK-AIX-LABEL: __pow_finite_f64_fast050:
				; CHECK-AIX: # %bb.0: # %entry
				; CHECK-AIX-NEXT: mflr 0
				; CHECK-AIX-NEXT: stw 0, 8(1)
				; CHECK-AIX-NEXT: stwu 1, -64(1)
				; CHECK-AIX-NEXT: lwz 3, L..C11(2)
				; CHECK-AIX-NEXT: lfs 2, 0(3)
				; CHECK-AIX-NEXT: bl .__xl_pow_finite[PR]
				; CHECK-AIX-NEXT: nop
				; CHECK-AIX-NEXT: addi 1, 1, 64
				; CHECK-AIX-NEXT: lwz 0, 8(1)
				; CHECK-AIX-NEXT: mtlr 0
				; CHECK-AIX-NEXT: blr
				entry:
				%call = tail call fast double @__pow_finite(double %a, double 5.000000e-01)
				ret double %call
				}