This is an archive of the discontinued LLVM Phabricator instance.

Differential D128385

[flang] Lower Fortran math intrinsic operations into MLIR ops or libm calls.
ClosedPublic

Authored by vzakhari on Jun 22 2022, 2:57 PM.

Details

Reviewers

schweitz
jeanPerier
klausler
sscalpone
clementval
ftynse
kiranchandramohan
awarzynski

Commits

rG9f35657983c5: [flang] Lower Fortran math intrinsic operations into MLIR ops or libm calls.

Summary

Added a new -lower-math-early option that defaults to 'true', which matches
the current math lowering scheme. If set to 'false', the intrinsic math
operations will be lowered to MLIR operations, which should enable more
MLIR optimizations, or to libm calls if there is no corresponding MLIR
operation or if "precise" mode is requested.
The generated math MLIR operations are then converted to the LLVM dialect
during the codegen phase.

The -lower-math-early option is not exposed to users currently. I plan to
get rid of the "early" lowering completely, when "late" lowering
is robust enough to support all math intrinsics that are currently
supported via pgmath. So "late" mode will become default and -lower-math-early
option will not be needed. This will effectively eliminate the mandatory
dependency on pgmath in Fortran lowering, but this is WIP.
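
As a reading aid, the decision described in this summary can be sketched in plain C++. This is only a minimal sketch under stated assumptions: `MathTarget`, `hasMlirMathOp` and `selectMathTarget` are hypothetical names that do not come from the patch, and the set of intrinsics with an MLIR counterpart is an illustrative subset rather than the list used by flang.

```cpp
#include <set>
#include <string>

// Hypothetical names throughout; this is not the code from the patch.
enum class MathTarget {
  EarlyRuntimeCall, // the existing "early" scheme
  MlirMathOp,       // "late" scheme: emit an MLIR operation
  LibmCall          // "late" scheme: emit a call to libm
};

// Illustrative subset of intrinsics assumed to have an MLIR math operation.
inline bool hasMlirMathOp(const std::string &intrinsic) {
  static const std::set<std::string> ops = {"atan", "cos", "exp", "log",
                                            "sin",  "sqrt", "tanh"};
  return ops.count(intrinsic) != 0;
}

// Mirrors the summary: early lowering keeps the current behavior; late
// lowering prefers an MLIR operation and uses libm when no operation
// exists or when "precise" mode is requested.
inline MathTarget selectMathTarget(const std::string &intrinsic,
                                   bool lowerMathEarly, bool preciseMode) {
  if (lowerMathEarly)
    return MathTarget::EarlyRuntimeCall;
  if (preciseMode || !hasMlirMathOp(intrinsic))
    return MathTarget::LibmCall;
  return MathTarget::MlirMathOp;
}
```

With these assumptions, `selectMathTarget("sin", false, false)` picks the MLIR operation, while the same intrinsic in "precise" mode, or an intrinsic outside the illustrative set, goes to libm.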

Event Timeline

vzakhari created this revision. Jun 22 2022, 2:57 PM

vzakhari requested review of this revision. Jun 22 2022, 2:57 PM

Please review.

I can extract Math dialect changes into a separate commit, if needed.

kiranchandramohan added a subscriber: kiranchandramohan. Jun 22 2022, 3:06 PM

Harbormaster completed remote builds in B171432: Diff 439168. Jun 22 2022, 3:25 PM

clementval added a reviewer: kiranchandramohan. Jun 22 2022, 9:56 PM

In D128385#3603117, @vzakhari wrote:

Please review.

I can extract Math dialect changes into a separate commit, if needed.

+1. Yes, that would be a fast way to get that portion of the patch in.

This is really great to see. I like using the MLIR math dialect operations and it also provides a pathway to not depend on external libraries.

Also, is this a one-off patch or do you plan to work further on this? Would we have to extend the math dialect to handle all the fortran math intrinsics? Have you explored using the complex dialect operations directly, like e.g. https://mlir.llvm.org/docs/Dialects/ComplexOps/#complexsin-mlircomplexsinop?

awarzynski added a subscriber: awarzynski. Jun 23 2022, 2:33 AM

Looks great to me, thanks!

If you do not split the patch, please get someone from the MLIR core team to also approve the Math dialect change.

This revision is now accepted and ready to land. Jun 23 2022, 2:44 AM

Hi @vzakhari !

Thanks for working on this. Before diving into details, I have a couple of design questions.

Question 1.
What's the end goal here? I can see how this could allow us to drop the dependency on libpgmath. Is that what you are aiming for here? What's the intended workflow for end users? We've discussed this before, see #1377. It would be good to clarify which points from that discussion are being addressed here.

Question 2.

Added new -math-lowering option that defaults to 'early' that matches the current math lowering scheme.

Could you clarify what you mean by this? There are 4 drivers in Flang (bbc, tco, flang-new and flang-new -fc1). From what I can see, you've used the llvm::cl API, but flang-new and flang-new -fc1 options are defined in Clang's Options.td. I've noticed that you've used -mmlir to forward these options, but -mmlir is intended for developers rather than end-users. So what would be the path for enabling this for end users? If this effort is to remove the dependency on libpgmath, then I think that it would make sense to make this available sooner rather than later.

Question 3.
It's good to see that you are testing with both flang-new and bbc, but why not test with both everywhere? For example, in "late-math-lowering.f90"? No worries if this is just an omission (could you update accordingly?), but if this is deliberate then it would be good to document the rationale for using only one driver.

Question 4.
AFAIK, fast, relaxed and precise are names from pgmath. I was under the impression that in #1377 we agreed to converge towards Clang's -ffp-model=<strict|precise|fast>. Like @kiranchandramohan points out, folks in Clang took some time to unify and to clarify their design in this area. I think that we should aim for consistency with Clang. If it's decided otherwise, it would be good to clarify the rationale for diverging from that.

Thanks for taking a look,
Andrzej

vzakhari added a parent revision: D128454: [mlir][math] Lower atan to libm. Jun 23 2022, 10:18 AM

vzakhari updated this revision to Diff 439457. Jun 23 2022, 10:26 AM

Harbormaster completed remote builds in B171649: Diff 439457. Jun 23 2022, 10:27 AM

In D128385#3604073, @kiranchandramohan wrote:

+1. Yes, that would be a fast way to get that portion of the patch in.

Thanks. I extracted the math change into https://reviews.llvm.org/D128454

This is really great to see. I like using the MLIR math dialect operations and it also provides a pathway to not depend on external libraries.

Also, is this a one-off patch or do you plan to work further on this? Would we have to extend the math dialect to handle all the fortran math intrinsics? Have you explored using the complex dialect operations directly, like e.g. https://mlir.llvm.org/docs/Dialects/ComplexOps/#complexsin-mlircomplexsinop?

Correct, I am going to work further on this and using complex dialect is the next step. The intention of this initial patch is to agree on the direction of using more MLIR operations, which should enable more MLIR optimizations.

Regarding adding MLIR operations for Fortran intrinsics or other math "primitives" that Fortran intrinsics expand into, I think we have to decide this case by case. I suppose there might be some math operations for which we do not expect many optimizations to happen right now. For example, I do not expect any optimizations for the group of BESSEL intrinsics currently, so it may not make much sense to introduce math operations for them right now. We may reconsider this in the future, when math dialect optimizations (e.g.) support more operations. Of course, there are other optimizations that may be applied even to BESSEL intrinsics, e.g. CSE, but I would like to enable them by setting proper SideEffects for the libm calls (which is TBD: e.g. libm's j0 may have no side effects on the FP control/status word unless FP strict mode is requested, which may enable CSEing j0 calls under fast). So I would expect that some operations are going to be represented as libm calls for a while.

Hi @awarzynski,

Thank you for reviewing!

In D128385#3604257, @awarzynski wrote:

Hi @vzakhari !

Thanks for working on this. Before diving into details, I have a couple of design questions.

Question 1.
What's the end goal here? I can see how this could allow us to drop the dependency on libpgmath. Is that what you are aiming for here? What's the intended workflow for end users? We've discussed this before, see #1377. It would be good to clarify which points from that discussion are being addressed here.

The initial goal was to expose as many MLIR operations as possible for the purpose of enabling more optimizations. There are existing math dialect optimizations and there will be more going forward, so lowering to MLIR operations seems like a good idea. Moreover, the MLIR operations used have no side effects compared to generic library calls, so using MLIR operations seems to be a good way of enabling generic optimizations like CSE, LICM, etc. (though we have to keep in mind that FP strict does imply side effects for math operations and this is not currently modelled in MLIR, so generating MLIR operations may be considered a "regression" compared to generic library calls under FP strict).

With that said, I did not plan to generate libm calls instead of pgmath calls, but after discussion with the team it seemed like a good opportunity to also get rid of pgmath dependency under late math lowering option. This should simplify flang toolchain setup for the upstream users.

Question 2.

Added new -math-lowering option that defaults to 'early' that matches the current math lowering scheme.

Could you clarify what you mean by this? There are 4 drivers in Flang (bbc, tco, flang-new and flang-new -fc1). From what I can see, you've used the llvm::cl API, but flang-new and flang-new -fc1 options are defined in Clang's Options.td. I've noticed that you've used -mmlir to forward these options, but -mmlir is intended for developers rather than end-users. So what would be the path for enabling this for end users? If this effort is to remove the dependency on libpgmath, then I think that it would make sense to make this available sooner rather than later.

Right, this commit only adds initial support for lowering into MLIR math operations or libm calls. I did not want to expose user option(s) to select late mode before confirming that the proposed change seems reasonable/useful to everyone. For the time being, I use -mllvm to enable this "experimental" mode, but I think it will have to become default and the pgmath dependency should go away completely (at least in the lowering implementation). Note that pgmath dependency may still be optional, e.g. for vectorizing libm calls into pgmath vector calls as in here (see TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib() changes in lib/Analysis/TargetLibraryInfo.cpp).

To summarize, I expect that we will enable late lowering by default and an option will not be needed.

Question 3.
It's good to see that you are testing with both flang-new and bbc, but why not test with both everywhere? For example, in "late-math-lowering.f90"? No worries if this is just an omission (could you update accordingly?), but if this is deliberate then it would be good to document the rationale for using only one driver.

Thank you for pointing it out! I do not see any problems with testing it with both drivers, so I will make changes.

Question 4.
AFAIK, fast, relaxed and precise are names from pgmath. I was under the impression that in #1377 we agreed to converge towards Clang's -ffp-model=<strict|precise|fast>. Like @kiranchandramohan points out, folks in Clang took some time to unify and to clarify their design in this area. I think that we should aim for consistency with Clang. If it's decided otherwise, it would be good to clarify the rationale for diverging from that.

Thank you for bringing up this point! Yes, I agree that it makes great sense to align flang behavior with clang, and -ffp-model is the right option to control how MLIR operations and the libm calls are handled in the compiler.

I believe the different versions of pgmath can be described as:

precise - matches libm functions behavior (e.g. matching errno, exception behavior). With regards to math library calls it seems to match clang's -ffp-model=strict.
fast - matching accuracy for scalar and vector versions; does not set errno; exception behavior may not match precise in some cases. I guess we can say that it matches -ffp-model=fast, but I am not 100% sure: e.g. I think llvm.sqrt with Fast-Math Flag 'afn' (which is set under -ffp-model=fast) will be less accurate than pgmath's fast sqrt call. So maybe -ffp-model=fast actually matches pgmath's relaxed, and pgmath's fast matches -ffp-model=fast with some Fast-Math Flags disabled (which may be controlled by clang's other options).
relaxed - more inaccurate than fast; no specified accuracy for any entry point; scalar and vector versions may provide different accuracy.

If we switch to libm calls, I think it makes sense to align with clang completely, i.e.:

-ffp-model=fast - use math/complex MLIR operations or libm calls (if there is no corresponding MLIR operation) with all MLIR Fast-Math Flags set. Note that the 'afn' flag should probably enable "unsafe" optimizations such as math polynomial approximations. To avoid too much inaccuracy, users will be able to specify -fno-approx-func.
-ffp-model=strict - until MLIR operations are able to model FP strict behavior, I guess, we will have to generate generic library calls for all math operations and hope that the optimizations do not violate it too much. Note that we currently fail to honor FP strict behavior not only for math functions but for user calls as well, e.g. LLVM IR uses the strictfp call attribute to control inlining, and there is nothing like that in MLIR currently.
-ffp-model=precise - it must support errno and IEEE denormal-fp-math, which are currently not modelled in MLIR, so at the current point in time we can treat it the same as -ffp-model=strict.

So my current changes are just trying to match the existing precise/fast/relaxed behavior for pgmath, and more work will be needed to properly follow clang's -ffp-model within MLIR. I will be glad to hear your thoughts on this topic. FYI, I think SideEffectsInterface is one of the ways to model fenv_access and -ffp-exception-behavior=strict, but we also need to model the rounding mode and denormal behavior in MLIR.
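
The proposed mapping from -ffp-model values to lowering behavior can be written down as a small table-like function. This is a sketch of the proposal as described above, not of anything implemented in this patch; `MathLoweringPolicy`, its fields and `policyForFpModel` are hypothetical names, and the `noApproxFunc` parameter stands in for the -fno-approx-func option mentioned above.

```cpp
#include <string>

// Hypothetical description of how math intrinsics would be lowered.
struct MathLoweringPolicy {
  bool preferMlirOps; // use math/complex MLIR operations when they exist
  bool fastMathFlags; // attach the fast-math flags
  bool approxFuncs;   // keep the 'afn' flag (approximate functions)
};

// Fills `policy` for a known -ffp-model value and returns true; returns
// false for an unknown value. `noApproxFunc` models -fno-approx-func.
inline bool policyForFpModel(const std::string &model, bool noApproxFunc,
                             MathLoweringPolicy &policy) {
  if (model == "fast") {
    policy = {true, true, !noApproxFunc};
    return true;
  }
  // Per the proposal, "precise" is treated like "strict" until MLIR can
  // model errno and denormal behavior: generic library calls only.
  if (model == "strict" || model == "precise") {
    policy = {false, false, false};
    return true;
  }
  return false;
}
```

Keeping "precise" and "strict" in one branch makes the temporary equivalence explicit, so the two can be separated later without touching the "fast" path.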

Thanks for taking a look,
Andrzej

vzakhari updated this revision to Diff 439532. Jun 23 2022, 2:09 PM

Harbormaster completed remote builds in B171697: Diff 439532. Jun 23 2022, 2:39 PM

Thanks for looking into this. Looks good.

Hi @vzakhari, thank you for your very comprehensive reply and for taking the time to address my questions, much appreciated!

I've left a few inline comments, but that's mostly asking for clarifications.

I've noticed that in "late-math-lowering.f90" you test lowering and in "late-math-codegen.f90" you test lowering + code-gen. Since "late-math-lowering.f90" is there to make sure that lowering works as expected, why not focus on code-gen in "late-math-codegen.f90"? Wouldn't that make future triaging easier? Otherwise, any failure in "late-math-codegen.f90" could originate from 2 different compiler stages (either lowering or code-gen).

More comments below.

In D128385#3606162, @vzakhari wrote:

With that said, I did not plan to generate libm calls instead of pgmath calls, but after discussion with the team it seemed like a good opportunity to also get rid of pgmath dependency under late math lowering option. This should simplify flang toolchain setup for the upstream users.

+1 Great to see this being included :) In my view, it would be good to advertise this a bit (perhaps on Discourse). It's a non-trivial design change that will affect people. I don't expect anyone to oppose, but we should strive to make this discussion inclusive. It doesn't have to happen immediately - you may want to wait until there's enough support to drop the pgmath dependency.

Right, this commit only adds initial support for lowering into MLIR math operations or libm calls. I did not want to expose user option(s) to select late mode before confirming that the proposed change seems reasonable/useful to everyone. For the time being, I use -mllvm to enable this "experimental" mode, but I think it will have to become default and the pgmath dependency should go away completely (at least in the lowering implementation).

That's fine, but it wasn't clear from the summary :) This is quite an important functional change and folks might be interested to learn more about it. I think that it's worth pointing all this out in the commit msg.

Note that pgmath dependency may still be optional, e.g. for vectorizing libm calls into pgmath vector calls as in here (see TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib() changes in lib/Analysis/TargetLibraryInfo.cpp).

I agree that we should allow users to select their preferred Math library. Probably through a compiler driver flag.

To summarize, I expect that we will enable late lowering by default and an option will not be needed.

+1 Thanks for clarifying!

Question 4.
AFAIK, fast, relaxed and precise are names from pgmath. I was under the impression that in #1377 we agreed to converge towards Clang's -ffp-model=<strict|precise|fast>. Like @kiranchandramohan points out, folks in Clang took some time to unify and to clarify their design in this area. I think that we should aim for consistency with Clang. If it's decided otherwise, it would be good to clarify the rationale for diverging from that.

Thank you for bringing up this point! Yes, I agree that it makes great sense to align flang behavior with clang, and -ffp-model is the right option to control how MLIR operations and the libm calls are handled in the compiler.

Great!

If we switch to libm calls

Does libm implement all Math functions that we need? Also, perhaps we should discuss the mapping between FP modes/options in a separate thread?

Thank you,
-Andrzej

flang/lib/Lower/IntrinsicCall.cpp
964–965	[nit] The command line option is defined on line 991 rather than 981.
968–970	"Basically, the mathRuntimeVersion generation will happen for these math operations late during FIR conversion." It won't happen late, right? It will happen either late or early and that will be determined by the value of `mathLowering`.
972–980	I feel that this comment mixes overall justification for the approach taken here _with_ the documentation for what `MathLoweringMode` represents. For example: "In order to preserve strict FP behavior with late math lowering we have to extend the dialects used by the late lowering such that they model strict FP behavior properly." I think that this is a generic design challenge here that's orthogonal to the meaning of `MathLoweringMode`. Would you mind expanding/splitting this a bit? Easier said than done, I know!
982–983	I'm a bit confused. In your tests you use `-math-lowering=late -mllvm -math-runtime=precise`, which implies that `mathRuntimeVersion` applies to both `earlyLowering` and `lateLowering`. But this comment suggests that `mathRuntimeVersion` is only relevant for `earlyLowering`.
986–987	[nit] `lateLowering` suggests that this defines when Math intrinsics are going to be lowered. But the comment focuses on what Math intrinsics are going to be lowered to instead of when it's going to happen.
1107
1171–1173	[nit] "Map mathematical intrinsic operations into MLIR operations" is a bit misleading - "map" is a verb, but this is a variable that represents state and I would expect a noun.
1174	[nit] Rather than saying "more", could you clarify where to look for the complete list? Or perhaps "remaining Fortran Math intrinsics"?
1234	Why can't it be removed now?
1463–1464	Could you add doxygen?
1489	Function names should be verb phrases (as they represent actions), and command-like functions should be imperative. https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
1981	So this is effectively `earlyLowering`, right? Which will also happen if `lateLowering` fails?

It would be nice if you could lower most of the Fortran math intrinsics to the Math dialect. You can extend it as needed. LLVM backends can then decide whether they need libcalls.

It should give you some optimisation opportunities in MLIR.

Hi @awarzynski,

Thank you for the thorough review! Please see my comments below.

In D128385#3608587, @awarzynski wrote:

Hi @vzakhari, thank you for your very comprehensive reply and for taking the time to address my questions, much appreciated!

I've left a few inline comments, but that's mostly asking for clarifications.

I've noticed that in "late-math-lowering.f90" you test lowering and in "late-math-codegen.f90" you test lowering + code-gen. Since "late-math-lowering.f90" is there to make sure that lowering works as expected, why not focus on code-gen in "late-math-codegen.f90"? Wouldn't that make future triaging easier? Otherwise, any failure in "late-math-codegen.f90" could originate from 2 different compiler stages (either lowering or code-gen).

Sounds reasonable to me. I changed the test.

More comments below.

In D128385#3606162, @vzakhari wrote:

With that said, I did not plan to generate libm calls instead of pgmath calls, but after discussion with the team it seemed like a good opportunity to also get rid of pgmath dependency under late math lowering option. This should simplify flang toolchain setup for the upstream users.

+1 Great to see this being included :) In my view, it would be good to advertise this a bit (perhaps on Discourse). It's a non-trivial design change that will affect people. I don't expect anyone to oppose, but we should strive to make this discussion inclusive. It doesn't have to happen immediately - you may want to wait until there's enough support to drop the pgmath dependency.

Sounds good. I will post a message on Discourse, when late lowering is sound. Is there any particular "channel" on Discourse or should I use special tags?

Right, this commit only adds initial support for lowering into MLIR math operations or libm calls. I did not want to expose user option(s) to select late mode before confirming that the proposed change seems reasonable/useful to everyone. For the time being, I use -mllvm to enable this "experimental" mode, but I think it will have to become default and the pgmath dependency should go away completely (at least in the lowering implementation).

That's fine, but it wasn't clear from the summary :) This is quite an important functional change and folks might be interested to learn more about it. I think that it's worth pointing all this out in the commit msg.

I will update the commit message.

Note that pgmath dependency may still be optional, e.g. for vectorizing libm calls into pgmath vector calls as in here (see TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib() changes in lib/Analysis/TargetLibraryInfo.cpp).

I agree that we should allow users to select their preferred Math library. Probably through a compiler driver flag.

To summarize, I expect that we will enable late lowering by default and an option will not be needed.

+1 Thanks for clarifying!

Question 4.
AFAIK, fast, relaxed and precise are names from pgmath. I was under the impression that in #1377 we agreed to converge towards Clang's -ffp-model=<strict|precise|fast>. Like @kiranchandramohan points out, folks in Clang took some time to unify and to clarify their design in this area. I think that we should aim for consistency with Clang. If it's decided otherwise, it would be good to clarify the rationale for diverging from that.

Thank you for bringing up this point! Yes, I agree that it makes great sense to align flang behavior with clang, and -ffp-model is the right option to control how MLIR operations and the libm calls are handled in the compiler.

Great!

If we switch to libm calls

Does libm implement all Math functions that we need?

Good question. No, libm is definitely missing some functions, e.g. there is no pow with integer exponent. There is a set of powi* functions in libgcc, but I do not think we can rely on this. We may have PowI operation in math and complex dialects, but then we have to lower it into LLVM IR dialect somehow. One of the solutions is to have all needed powi versions in Fortran numeric runtime and generate calls to them late in codegen.
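
To make the runtime option concrete, here is a minimal sketch of what one such helper could look like. The name `powiF64` and its signature are assumptions made for illustration; this is not an existing Fortran runtime or libgcc entry point. It uses exponentiation by squaring and follows the usual convention that a negative exponent yields the reciprocal.

```cpp
#include <cstdint>

// Hypothetical runtime helper: raises `base` to an integer power using
// exponentiation by squaring (O(log |exponent|) multiplications).
inline double powiF64(double base, std::int32_t exponent) {
  // Widen before negating so that INT32_MIN does not overflow.
  const std::int64_t wide = exponent;
  std::uint64_t n = static_cast<std::uint64_t>(wide < 0 ? -wide : wide);

  double result = 1.0;
  double factor = base;
  while (n != 0) {
    if (n & 1)
      result *= factor; // this bit of the exponent contributes a factor
    factor *= factor;   // square for the next bit
    n >>= 1;
  }
  // A negative exponent yields the reciprocal of the positive power.
  return exponent < 0 ? 1.0 / result : result;
}
```

A real runtime would need one entry point per combination of base and exponent kinds, which is the "all needed powi versions" point above.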

Also, perhaps we should discuss the mapping between FP modes/options in a separate thread?

Yes, the options mapping and (arith, math, complex, etc.) MLIR operations support for different FP modes need to be discussed so that we can match user expectations regarding FP behavior.

Thank you,
-Andrzej

flang/lib/Lower/IntrinsicCall.cpp
968–970	You are correct. This (and other comments) is a leftover from the initial implementation that tried to convert MLIR operations into pgmath calls late in codegen. I will fix it.
972–980	Right, I think I will just remove this comment and we will discuss the general issue on Discourse.
982–983	Right, `mathRuntimeVersion` applies to both `early` and `late` lowering.
986–987	I guess the options should be renamed to something else, but I cannot come up with a good name :)
1171–1173	Fixed.
1174	Fixed.
1234	It is still used in the `early` mode both under `-math-runtime=llvm` and under any other `-math-runtime` if the corresponding operation is not found in pgmath list in `flang/include/flang/Evaluate/pgmath.h.inc`.
1463–1464	Done.
1489	Renamed to `checkPrecisionLoss`.
1981	Exactly. The `late` lowering falls back to `early` lowering currently.

In D128385#3608828, @tschuett wrote:

It would be nice if you could lower most of the Fortran math intrinsics to the Math dialect. You can extend it as needed. LLVM backends can then decide whether they need libcalls.

It should give you some optimisation opportunities in MLIR.

@tschuett, thank you for the comment. In general, I agree with you, but please see my comment regarding BESSEL functions here: https://reviews.llvm.org/D128385#3605776
I think having MLIR operations may not make much sense for math functions, which have no math-specific optimization opportunities. As I said before, if we can model SideEffects for generic calls such that, for example, a math call has NoSideEffect under -ffp-model=fast, then non-math-specific optimizations (e.g. CSE, LICM) should just start working.
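
The point about side effects can be illustrated with a toy common-subexpression elimination over a flat list of calls. This is a deliberately simplified sketch, not MLIR's CSE pass: `Call` and `eliminateCommonCalls` are hypothetical, each call takes a single operand identified by an integer, and a call marked as side-effect free is assumed to depend only on that operand.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a call in a straight-line block.
struct Call {
  std::string callee; // e.g. "j0"
  int operand;        // id of the value passed as the only argument
  bool noSideEffect;  // true if the call only depends on its operand
};

// For each call, returns the index of the earlier identical call whose
// result can be reused, or the call's own index if it must be kept.
inline std::vector<std::size_t>
eliminateCommonCalls(const std::vector<Call> &calls) {
  std::map<std::pair<std::string, int>, std::size_t> seen;
  std::vector<std::size_t> replacement(calls.size());
  for (std::size_t i = 0; i < calls.size(); ++i) {
    replacement[i] = i;
    if (!calls[i].noSideEffect)
      continue; // calls with side effects are never merged
    const auto key = std::make_pair(calls[i].callee, calls[i].operand);
    const auto it = seen.find(key);
    if (it != seen.end())
      replacement[i] = it->second; // reuse the earlier result
    else
      seen.emplace(key, i);
  }
  return replacement;
}
```

In this model, two `j0` calls on the same operand collapse into one only when they are marked side-effect free, which is the property the comment above would like to attach to libm calls under -ffp-model=fast.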

vzakhari updated this revision to Diff 439858. Jun 24 2022, 12:43 PM

vzakhari edited the summary of this revision.

Harbormaster completed remote builds in B171925: Diff 439858. Jun 24 2022, 12:58 PM

Thanks for all the updates, @vzakhari !

Is there any reason not to split your tests in "late-math-lowering.f90" and "late-math-codegen.f90" into smaller testable units? Similarly to e.g. convert-to-llvm-ir.fir?

With the way the tests are written right now, it's hard to match the input with the output (both are very long). This becomes much easier when testing one intrinsic at a time (i.e. there's a separate function to test a particular intrinsic call).
By splitting the tests, you could also leverage CHECK-SAME and CHECK-LABEL more. This would help verify that a particular intrinsic receives correct results.
Also, the current approach makes adding new tests rather hard? More specifically, this is a very complex test input for somebody who is only interested in the abs intrinsic:

function test_real4(x, y, c, s, i)
  real :: x, y, test_real4
  complex(4) :: c
  integer(2) :: s
  integer(4) :: i
  test_real4 = abs(x) + abs(c) + aint(x) + anint(x) + atan(x) + atan2(x, y) + &
       ceiling(x) + cos(x) + erf(x) + exp(x) + floor(x) + log(x) + log10(x) + &
       nint(x, 4) + nint(x, 8) + x ** s + x ** y + x ** i + sign(x, y) + &
       sin(x) + tanh(x)
end function

Would you be open to refactoring the tests to match the style adopted in other test files? More comments inline.

-Andrzej

flang/lib/Lower/IntrinsicCall.cpp
986–987	Naming is hard! Since there are only two values, you may want to change this to a `bool` option instead, e.g.: `static llvm::cl::opt<bool> lowerEarlyToLibCall("lower-early", llvm::cl::desc("Controls when to lower Math intrinsics to library calls"), llvm::cl::init(true));` That would probably simplify the code and communicate the overall intent a bit better. And with only one option to document, you will require fewer comments :)
1083	Shouldn't this be `genF64IntF64FuncType` instead of `genF64F64IntFuncType`? As in, the arguments are `F64` and `Int` and the return type is `F64`. Same for `genF32F32IntFuncType`.
1110–1111	[nit] Perhaps `runtimeFunc` so that the intent is clearer?
1114	Where is this condition checked? (i.e. `if (!funcGenerator)` or something similar). Also, I'm a bit confused about the relationship between `symbol` and `funcGenerator`. To me this comment suggests that either `symbol` or `funcGenerator` is used, but in fact `funcGenerator` will use `symbol` to generate the correct call: `result = mathOp->funcGenerator(builder, loc, mathOp->symbol, actualFuncType, convertedArguments);` Could you clarify?
1121–1134	Does my suggestion make sense? Perhaps I'm confusing things here. My point is that `name` is quite enigmatic and it's tricky to follow this function without more descriptive names. Same comment for `genMathOp`.
1137–1139
1234	Thanks! [nit] it would be helpful to have a comment like this here :) (basically, so that we know when to delete this)

@awarzynski, thank you for the review!

I modified the tests so that each split part tests particular Fortran operation. I will upload it shortly. Please also see my replies to your comments inlined.

flang/lib/Lower/IntrinsicCall.cpp
1083	I followed the same convention as in `genIntF32FuncType` above: the first type is the result type, and the following types are the argument types, so `genF64F64IntFuncType` stands for `F64 (*)(F64, Int<Bits>)`, which seems to be a straightforward mapping between the function type (as written in C) and the type generator name.
1110–1111	Fixed.
1114	Yes, the comment does not make sense any more. I fixed it. Thanks for catching!
1121–1134	Your suggested code looks good to me. I applied it.
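
The naming convention described for line 1083 (result type first, argument types after) can be spelled out mechanically. The helper below is purely illustrative and hypothetical; nothing like `funcTypeGeneratorName` exists in the patch. It only shows how the two generator names quoted in this thread follow from the convention.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a generator name from a result type and
// argument types, result first, matching the convention described above.
inline std::string funcTypeGeneratorName(const std::string &result,
                                         const std::vector<std::string> &args) {
  std::string name = "gen" + result;
  for (const std::string &arg : args)
    name += arg;
  return name + "FuncType";
}
```

For a function type written in C as `F64 (*)(F64, Int<Bits>)`, the result `F64` comes first, followed by the arguments `F64` and `Int`, which gives `genF64F64IntFuncType`.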

vzakhari updated this revision to Diff 440438. Jun 27 2022, 5:05 PM

vzakhari edited the summary of this revision.

Harbormaster completed remote builds in B172360: Diff 440438. Jun 27 2022, 6:33 PM

Latest revision is rebase + Lower/array-expression-slice-1.f90 fix.

Thanks again for working on this. In particular, for refactoring the tests. It would be nice if you could reduce them a bit (specific suggestions inline). I won't have any more comments after that!

Btw, I finally had a chance to go over the most interesting part (thanks for writing this down!):

In D128385#3606162, @vzakhari wrote:

Thank you for bringing up this point! Yes, I agree that it makes great sense to align flang behavior with clang, and -ffp-model is the right option to control how MLIR operations and the libm calls are handled in the compiler.

I believe the different versions of pgmath can be described as:

precise - matches libm functions behavior (e.g. matching errno, exception behavior). With regards to math library calls it seems to match clang's -ffp-model=strict.

fast - matching accuracy for scalar and vector versions; does not set errno; exception behavior may not match precise in some cases. I guess we can say that it matches -ffp-model=fast, but I am not 100% sure: e.g. I think llvm.sqrt with Fast-Math Flag 'afn' (which is set under -ffp-model=fast) will be less accurate than pgmath's fast sqrt call. So maybe -ffp-model=fast actually matches pgmath's relaxed, and pgmath's fast matches -ffp-model=fast with some Fast-Math Flags disabled (which may be controlled by clang's other options).

relaxed - more inaccurate than fast; no specified accuracy for any entry point; scalar and vector versions may provide different accuracy.

If we switch to libm calls, I think it makes sense to align with clang completely, i.e.:

-ffp-model=fast - use math/complex MLIR operations or libm calls (if there is no corresponding MLIR operation) with all MLIR Fast-Math Flags set. Note that the 'afn' flag should probably enable "unsafe" optimizations such as math polynomial approximations. To avoid too much inaccuracy, users will be able to specify -fno-approx-func.

-ffp-model=strict - until MLIR operations are able to model FP strict behavior, I guess, we will have to generate generic library calls for all math operations and hope that the optimizations do not violate it too much. Note that we fail to honor FP strict behavior not only for math functions but for user calls as well: e.g. LLVM IR uses the strictfp call attribute to control inlining, and there is nothing like that in MLIR currently.

-ffp-model=precise - it must support errno and IEEE denormal-fp-math, which are currently not modelled in MLIR, so at the current point in time we can treat it the same as -ffp-model=strict.

So my current changes are just trying to match the existing precise/fast/relaxed behavior for pgmath, and more work will be needed to properly follow clang's -ffp-model within MLIR. I will be glad to hear your thoughts on this topic. FYI, I think SideEffectsInterface is one of the ways to model fenv_access and -ffp-exception-behavior=strict, but we also need to model the rounding mode and denormal behavior in MLIR.

I've not investigated the precise meaning of these flags myself - sadly that's always been on a back-burner. Your summary makes a lot of sense to me - thanks for taking all this into consideration while working on this patch. I guess that we can dive into the fine details once follow-up patches are up for review?

-Andrzej
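The -ffp-model mapping proposed in the quoted comment can be read as a small decision table. The following is only a sketch of that reading, not code from the patch: `FpModel`, `Lowering` and `chooseLowering` are hypothetical names, and `hasMlirOp` is an assumed input saying whether a math/complex dialect operation exists for the intrinsic.

```cpp
// Hypothetical model of the proposal; none of these names exist in flang.
enum class FpModel { Fast, Precise, Strict };

enum class Lowering {
  MlirOpWithFastMathFlags,   // math/complex dialect op, all Fast-Math Flags set
  LibmCallWithFastMathFlags, // libm call, used when no MLIR op exists
  GenericLibmCall            // plain library call, no Fast-Math Flags
};

// hasMlirOp: assumed input, true if an MLIR operation exists for the intrinsic.
Lowering chooseLowering(FpModel model, bool hasMlirOp) {
  switch (model) {
  case FpModel::Fast:
    return hasMlirOp ? Lowering::MlirOpWithFastMathFlags
                     : Lowering::LibmCallWithFastMathFlags;
  case FpModel::Precise: // treated the same as strict for now, per the comment
  case FpModel::Strict:
    return Lowering::GenericLibmCall;
  }
  return Lowering::GenericLibmCall;
}
```

In this reading, precise and strict collapse into one case only because MLIR cannot yet model errno, denormal handling or strict FP behavior; the comment expects them to diverge once it can.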

flang/lib/Lower/IntrinsicCall.cpp
1083	Makes sense, thanks for the explanation!
1113	Did I get this right?
1116	Did I get this right?
flang/test/Intrinsics/late-math-codegen.fir
15	Remove unused args. Similar suggestion for other tests.
flang/test/Lower/late-math-lowering.f90
24	`hypotf` is for `abs(c)`, right? Why not move it to a dedicated function/test? (e.g. `test_complex(c)`)
26–32	I would remove the code that's not needed for this particular test. Similar suggestion for other tests.

Harbormaster completed remote builds in B172507: Diff 440651. Jun 28 2022, 10:05 AM

Still LGTM

Thank you all for the reviews! I will upload the final changes shortly and merge them after pre-merge checks pass.

flang/lib/Lower/IntrinsicCall.cpp
1113	Thanks! Fixed.
1116	Thanks!
flang/test/Intrinsics/late-math-codegen.fir
15	Fixed.
flang/test/Lower/late-math-lowering.f90
24	Makes sense. Fixed.
26–32	Fixed.

vzakhari updated this revision to Diff 440691. Jun 28 2022, 10:40 AM

vzakhari edited the summary of this revision.

LGTM, many thanks! Could you give @kiranchandramohan a chance to give his blessing too? (i.e. could you wait another day before merging?)

In D128385#3616334, @awarzynski wrote:

LGTM, many thanks! Could you give @kiranchandramohan a chance to give his blessing too? (i.e. could you wait another day before merging?)

You can go ahead. I am unlikely to get a chance to have a look today or tomorrow.

flang/lib/Lower/IntrinsicCall.cpp
969	Nit: know -> known

Harbormaster completed remote builds in B172536: Diff 440691. Jun 28 2022, 12:28 PM

vzakhari retitled this revision from "Lower Fortran math intrinsic operations into MLIR ops or libm calls." to "[flang] Lower Fortran math intrinsic operations into MLIR ops or libm calls.". Jun 28 2022, 1:25 PM

vzakhari edited the summary of this revision.

vzakhari added inline comments.

flang/lib/Lower/IntrinsicCall.cpp
969	Thanks. Fixed. Feel free to post more comments after the merge.

Closed by commit rG9f35657983c5: [flang] Lower Fortran math intrinsic operations into MLIR ops or libm calls. (authored by vzakhari). Jun 28 2022, 1:38 PM

This revision was automatically updated to reflect the committed changes.

vzakhari added a commit: rG9f35657983c5: [flang] Lower Fortran math intrinsic operations into MLIR ops or libm calls.

awarzynski mentioned this in D129497: [flang] Lower TRANSPOSE without using runtime. Jul 20 2022, 3:05 AM

awarzynski mentioned this in D130048: [flang] Support late math lowering for intrinsics from the llvm table. Jul 20 2022, 3:40 AM

Revision Contents

Path    Size

flang/include/flang/Optimizer/Support/InitFIR.h    3 lines
flang/lib/Lower/IntrinsicCall.cpp    349 lines
flang/lib/Optimizer/CodeGen/CMakeLists.txt    2 lines
flang/lib/Optimizer/CodeGen/CodeGen.cpp    6 lines
flang/test/Intrinsics/late-math-codegen.fir    741 lines
flang/test/Lower/Intrinsics/exp.f90    4 lines
flang/test/Lower/Intrinsics/log.f90    4 lines
flang/test/Lower/Intrinsics/math-runtime-options.f90    12 lines
flang/test/Lower/late-math-lowering.f90    137 lines
flang/test/Lower/llvm-math.f90    3 lines
flang/test/Lower/sqrt.f90    4 lines
flang/test/Lower/trigonometric-intrinsics.f90    4 lines

Diff 439858

flang/include/flang/Optimizer/Support/InitFIR.h

(22 lines not shown)
#include "mlir/Transforms/Passes.h"

namespace fir::support {

#define FLANG_NONCODEGEN_DIALECT_LIST \
mlir::AffineDialect, FIROpsDialect, mlir::acc::OpenACCDialect, \
mlir::omp::OpenMPDialect, mlir::scf::SCFDialect, \
mlir::arith::ArithmeticDialect, mlir::cf::ControlFlowDialect, \
- mlir::func::FuncDialect, mlir::vector::VectorDialect
+ mlir::func::FuncDialect, mlir::vector::VectorDialect, \
+ mlir::math::MathDialect

// The definitive list of dialects used by flang.
#define FLANG_DIALECT_LIST \
FLANG_NONCODEGEN_DIALECT_LIST, FIRCodeGenDialect, mlir::LLVM::LLVMDialect

inline void registerNonCodegenDialects(mlir::DialectRegistry &registry) {
registry.insert<FLANG_NONCODEGEN_DIALECT_LIST>();
}
(51 lines not shown)

flang/lib/Lower/IntrinsicCall.cpp

(29 lines not shown)
#include "flang/Optimizer/Builder/Runtime/RTBuilder.h"
#include "flang/Optimizer/Builder/Runtime/Reduction.h"
#include "flang/Optimizer/Builder/Runtime/Stop.h"
#include "flang/Optimizer/Builder/Runtime/Transformational.h"
#include "flang/Optimizer/Builder/Todo.h"
#include "flang/Optimizer/Dialect/FIROpsSupport.h"
#include "flang/Optimizer/Support/FatalError.h"
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+ #include "mlir/Dialect/Math/IR/Math.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#define DEBUG_TYPE "flang-lower-intrinsic"
#define PGMATH_DECLARE
#include "flang/Evaluate/pgmath.h.inc"

(909 lines not shown; the diff resumes inside the following definition)
static llvm::cl::opt<bool> outlineAllIntrinsics(
llvm::cl::desc(
"Lower all intrinsic procedure implementation in their own functions"),
llvm::cl::init(false));

//===----------------------------------------------------------------------===//
// Math runtime description and matching utility
//===----------------------------------------------------------------------===//

- /// Command line option to modify math runtime version used to implement
- /// intrinsics.
+ /// Command line option to control how math operations are lowered
+ /// into MLIR.

awarzynskiUnsubmitted

Not Done

[nit] The command line option is defined on line 991 rather than 981.

/// Going forward, most of the math operations have to be lowered

/// to some MLIR dialect operations or libm calls, if the corresponding

/// MLIR operation is not available or not reasonable to create

/// (e.g. there are no know optimization opportunities for the math

kiranchandramohanUnsubmitted

Not Done

Nit: know -> known

vzakhariAuthorUnsubmitted

Done

Thanks. Fixed.

Feel free to post more comments after the merge.

/// operation in MLIR).

awarzynskiUnsubmitted

Not Done

"Basically, the mathRuntimeVersion generation will happen for these math operations late during FIR conversion."

It won't happen late, right? It will happen either late or early and that will be determined by the value of mathLowering.

vzakhariAuthorUnsubmitted

Done

You are correct. This (and other comments) is a leftover from the initial implementation that tried to convert MLIR operations into pgmath calls late in codegen. I will fix it.

///

/// In general, exposing MLIR operations early can potentially enable more

/// MLIR optimizations.

enum MathLoweringMode {

// Most operations will be lowered to pgmath calls in this mode.

earlyLowering,

// Lower math operations into operations of MLIR dialects,

// such as mlir::math, mlir::complex, etc.

lateLowering,

awarzynskiUnsubmitted

Not Done

I feel that this comment mixes overall justification for the approach taken here _with_ the documentation for what MathLoweringMode represents. For example:

"In order to preserve strict FP behavior with late math lowering we have to extend the dialects used by the late lowering such that they model strict FP behavior properly."

I think that this is a generic design challenge here that's orthogonal to the meaning of MathLoweringMode. Would you mind expanding/splitting this a bit? Easier said than done, I know!

vzakhariAuthorUnsubmitted

Done

Right, I think I will just remove this comment and we will discuss the general issue on Discourse.

};

llvm::cl::opt<MathLoweringMode> mathLowering(

awarzynskiUnsubmitted

Not Done

I'm a bit confused. In your tests you use -math-lowering=late -mllvm -math-runtime=precise, which implies that mathRuntimeVersion applies to both earlyLowering and lateLowering. But this comment suggests that mathRuntimeVersion is only relevant for earlyLowering.

vzakhariAuthorUnsubmitted

Done

Right, mathRuntimeVersion applies to both early and late lowering.

"math-lowering", llvm::cl::desc("Select math operations lowering mode:"),

llvm::cl::values(

clEnumValN(earlyLowering, "early", "lower to library calls early"),

clEnumValN(lateLowering, "late", "lower to MLIR dialect operations")),

awarzynskiUnsubmitted

Not Done

[nit] lateLowering suggests that this defines when Math intrinsics are going to be lowered. But the comment focuses on what Math intrinsics are going to be lowered to instead of when it's going to happen.

vzakhariAuthorUnsubmitted

Done

I guess the options should be renamed to something else, but I cannot come up with a good name :)

awarzynskiUnsubmitted

Not Done

Naming is hard!

Since there are only two values, you may want to change this to a bool option instead, e.g.:

static llvm::cl::opt<bool> lowerEarlyToLibCall(
"lower-early",
llvm::cl::desc("Controls when to lower Math intrinsics to library calls"),
llvm::cl::init(true));

That would probably simplify the code and communicate the overall intent a bit better. And with only one option to document, you will require fewer comments :)

llvm::cl::init(earlyLowering));
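To make the behavior of the two-valued option concrete, here is a minimal stand-alone sketch of how `-math-lowering=early|late` selects a mode with `early` as the default. It deliberately avoids `llvm::cl` so that it compiles on its own; `parseMathLowering` and `resolveMathLowering` are hypothetical helpers, and letting the last occurrence win is a simplification of this sketch, not a statement about how `llvm::cl::opt` treats repeated options.

```cpp
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the patch's MathLoweringMode enum.
enum class MathLoweringMode { Early, Late };

// Parses one argument of the form -math-lowering=<early|late>.
// Returns std::nullopt for any other argument or an unknown value.
std::optional<MathLoweringMode> parseMathLowering(std::string_view arg) {
  constexpr std::string_view prefix = "-math-lowering=";
  if (arg.substr(0, prefix.size()) != prefix)
    return std::nullopt;
  std::string_view value = arg.substr(prefix.size());
  if (value == "early")
    return MathLoweringMode::Early;
  if (value == "late")
    return MathLoweringMode::Late;
  return std::nullopt;
}

// Scans all arguments. The default is Early, mirroring
// llvm::cl::init(earlyLowering); the last valid occurrence wins
// (an assumption made only for this sketch).
MathLoweringMode
resolveMathLowering(const std::vector<std::string_view> &args) {
  MathLoweringMode mode = MathLoweringMode::Early;
  for (std::string_view arg : args)
    if (auto parsed = parseMathLowering(arg))
      mode = *parsed;
  return mode;
}
```

With only two values, the same information fits in a single boolean, which is the simplification suggested in the review thread.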

+ /// Command line option to modify math runtime behavior used to implement
+ /// intrinsics. This option applies both to early and late math-lowering modes.
enum MathRuntimeVersion {
fastVersion,
relaxedVersion,
preciseVersion,
llvmOnly
};
llvm::cl::opt<MathRuntimeVersion> mathRuntimeVersion(
- "math-runtime", llvm::cl::desc("Select math runtime version:"),
+ "math-runtime", llvm::cl::desc("Select math operations' runtime behavior:"),
llvm::cl::values(
- clEnumValN(fastVersion, "fast", "use pgmath fast runtime"),
- clEnumValN(relaxedVersion, "relaxed", "use pgmath relaxed runtime"),
- clEnumValN(preciseVersion, "precise", "use pgmath precise runtime"),
+ clEnumValN(fastVersion, "fast", "use fast runtime behavior"),
+ clEnumValN(relaxedVersion, "relaxed", "use relaxed runtime behavior"),
+ clEnumValN(preciseVersion, "precise", "use precise runtime behavior"),
clEnumValN(llvmOnly, "llvm",
"only use LLVM intrinsics (may be incomplete)")),
llvm::cl::init(fastVersion));

struct RuntimeFunction {
// llvm::StringRef comparison operator are not constexpr, so use string_view.
using Key = std::string_view;
// Needed for implicit compare with keys.
constexpr operator Key() const { return key; }
Key key; // intrinsic name
+ // Name of a runtime function that implements the operation.
llvm::StringRef symbol;
fir::runtime::FuncTypeBuilderFunc typeGenerator;
};

#define RUNTIME_STATIC_DESCRIPTION(name, func) \
{#name, #func, fir::runtime::RuntimeTableKey<decltype(func)>::getTypeModel()},

static constexpr RuntimeFunction pgmathFast[] = {
#define PGMATH_FAST

(50 lines not shown)
template <int Bits>
static mlir::FunctionType genIntF32FuncType(mlir::MLIRContext *context) {
auto t = mlir::FloatType::getF32(context);
auto r = mlir::IntegerType::get(context, Bits);
return mlir::FunctionType::get(context, {t}, {r});
}

- // TODO : Fill-up this table with more intrinsic.
+ template <int Bits>

static mlir::FunctionType genF64F64IntFuncType(mlir::MLIRContext *context) {

awarzynskiUnsubmitted

Not Done

Shouldn't this be genF64IntF64FuncType instead of genF64F64IntFuncType? As in, the arguments are F64 and Int and the return type is F64. Same for genF32F32IntFuncType.

vzakhariAuthorUnsubmitted

Done

I followed the same convention as in genIntF32FuncType above: the first type is the result type, and the following types are the argument types, so genF64F64IntFuncType stands for F64 (*)(F64, Int<Bits>), which seems to be a straightforward mapping between the function type (as written in C) and the type generator name.

awarzynskiUnsubmitted

Not Done

Makes sense, thanks for the explanation!

auto ftype = mlir::FloatType::getF64(context);

auto itype = mlir::IntegerType::get(context, Bits);

return mlir::FunctionType::get(context, {ftype, itype}, {ftype});

}

template <int Bits>

static mlir::FunctionType genF32F32IntFuncType(mlir::MLIRContext *context) {

auto ftype = mlir::FloatType::getF32(context);

auto itype = mlir::IntegerType::get(context, Bits);

return mlir::FunctionType::get(context, {ftype, itype}, {ftype});

}
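The naming convention discussed in the thread above (result type first, then the argument types) can be illustrated without MLIR. The sketch below is not part of the patch: `FuncTypeModel`, `genF64F64IntFuncTypeModel` and `toCSignature` are hypothetical names, and types are modeled as plain strings rather than `mlir::Type` values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::FunctionType: type names only.
struct FuncTypeModel {
  std::vector<std::string> inputs;
  std::vector<std::string> results;
};

// Mirrors the shape of genF64F64IntFuncType<Bits>:
// result f64, arguments (f64, i<Bits>).
template <int Bits> FuncTypeModel genF64F64IntFuncTypeModel() {
  std::string ftype = "f64";
  std::string itype = "i" + std::to_string(Bits);
  return {{ftype, itype}, {ftype}};
}

// Renders the type in C-like notation: result (*)(arg, arg, ...).
std::string toCSignature(const FuncTypeModel &t) {
  std::string s = t.results.empty() ? "void" : t.results.front();
  s += " (*)(";
  for (std::size_t i = 0; i < t.inputs.size(); ++i) {
    if (i != 0)
      s += ", ";
    s += t.inputs[i];
  }
  s += ")";
  return s;
}
```

Reading the generator name left to right therefore gives the C-style signature directly: `F64` (result), then `F64` and `Int` (arguments).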

/// Callback type for generating lowering for a math operation.

using MathGeneratorTy = mlir::Value (*)(fir::FirOpBuilder &, mlir::Location,

llvm::StringRef name,

mlir::FunctionType funcType,

llvm::ArrayRef<mlir::Value>);

struct MathOperation {

// llvm::StringRef comparison operator are not constexpr, so use string_view.

using Key = std::string_view;

// Needed for implicit compare with keys.

constexpr operator Key() const { return key; }

// Intrinsic name.

awarzynskiUnsubmitted

Not Done

constexpr operator Key() const { return key; }

- Key key; // intrinsic name

+ // Intrinsic name

+ Key key;

// Name of a runtime function that implements the operation.

Key key;

// Name of a runtime function that implements the operation.

llvm::StringRef symbol;

awarzynskiUnsubmitted

Not Done

[nit] Perhaps runtimeFunc so that the intent is clearer?

vzakhariAuthorUnsubmitted

Done

Fixed.

fir::runtime::FuncTypeBuilderFunc typeGenerator;

awarzynskiUnsubmitted

Not Done

fir::runtime::FuncTypeBuilderFunc typeGenerator;

- // A callback to generate FIR for the intrinsic defined by 'name'.

+ // A callback to generate FIR for the intrinsic defined by 'Key'.

// A callback may generate either dedicated MLIR operation(s) or

Did I get this right?

vzakhariAuthorUnsubmitted

Done

Thanks! Fixed.

// If funcGenerator is non null, then it is generating

awarzynskiUnsubmitted

Not Done

Where is this condition checked? (i.e. if (!funcGenerator) or something similar).

Also, I'm a bit confused about the relationship between symbol and funcGenerator. To me this comment suggests that either symbol or funcGenerator is used, but in fact funcGenerator will use symbol to generate the correct call:

result = mathOp->funcGenerator(builder, loc, mathOp->symbol, actualFuncType, convertedArguments);

Could you clarify?

vzakhariAuthorUnsubmitted

Done

Yes, the comment does not make sense any more. I fixed it. Thanks for catching!

// the lowering code, otherwise - the lowering is done

// as a call to a runtime function named as specified

awarzynskiUnsubmitted

Not Done

// a function call to a runtime function with name defined by

- // 'symbol'.

+ // 'runtimeFunc'.

MathGeneratorTy funcGenerator;

Did I get this right?

vzakhariAuthorUnsubmitted

Done

Thanks!

// in 'symbol' member.

MathGeneratorTy funcGenerator;

};

static mlir::Value genLibCall(fir::FirOpBuilder &builder, mlir::Location loc,

llvm::StringRef name, mlir::FunctionType funcType,

llvm::ArrayRef<mlir::Value> args) {

LLVM_DEBUG(llvm::dbgs() << "Generating '" << name << "' call with type ";

funcType.dump(); llvm::dbgs() << "\n");

mlir::func::FuncOp funcOp = builder.addNamedFunction(loc, name, funcType);

// TODO: ensure 'strictfp' setting on the call for "precise/strict"

// FP mode. Set appropriate Fast-Math Flags otherwise.

// TODO: we should also mark as many libm function as possible

// with 'pure' attribute (of course, not in strict FP mode).

auto libCall = builder.create<fir::CallOp>(loc, funcOp, args);

LLVM_DEBUG(libCall.dump(); llvm::dbgs() << "\n");

return libCall.getResult(0);

}

awarzynskiUnsubmitted

Not Done

MathGeneratorTy funcGenerator;

};

static mlir::Value genLibCall(fir::FirOpBuilder &builder, mlir::Location loc,

- llvm::StringRef name, mlir::FunctionType funcType,

+ llvm::StringRef libFuncName, mlir::FunctionType libFuncType,

llvm::ArrayRef<mlir::Value> args) {

LLVM_DEBUG(llvm::dbgs() << "Generating '" << name << "' call with type ";

- funcType.dump(); llvm::dbgs() << "\n");

- mlir::func::FuncOp funcOp = builder.addNamedFunction(loc, name, funcType);

+ libFuncType.dump(); llvm::dbgs() << "\n");

+ mlir::func::FuncOp funcOp = builder.addNamedFunction(loc, libFuncName, funcType);

// TODO: ensure 'strictfp' setting on the call for "precise/strict"

// FP mode. Set appropriate Fast-Math Flags otherwise.

// TODO: we should also mark as many libm function as possible

// with 'pure' attribute (of course, not in strict FP mode).

auto libCall = builder.create<fir::CallOp>(loc, funcOp, args);

LLVM_DEBUG(libCall.dump(); llvm::dbgs() << "\n");

return libCall.getResult(0);

}

template <typename T>

Does my suggestion make sense? Perhaps I'm confusing things here. My point is that name is quite enigmatic and it's tricky to follow this function without more descriptive names. Same comment for genMathOp.

vzakhariAuthorUnsubmitted

Done

Your suggested code looks good to me. I applied it.

template <typename T>

static mlir::Value genMathOp(fir::FirOpBuilder &builder, mlir::Location loc,

llvm::StringRef name, mlir::FunctionType funcType,

llvm::ArrayRef<mlir::Value> args) {

awarzynskiUnsubmitted

Not Done

template <typename T>

static mlir::Value genMathOp(fir::FirOpBuilder &builder, mlir::Location loc,

- llvm::StringRef name, mlir::FunctionType funcType,

+ llvm::StringRef mathLibFuncName, mlir::FunctionType mathLibfuncType,

llvm::ArrayRef<mlir::Value> args) {

// TODO: we have to annotate the math operations with flags

// TODO: we have to annotate the math operations with flags

// that will allow to define FP accuracy/exception

// behavior per operation, so that after early multi-module

// MLIR inlining we can distinguish operation that were

// compiled with different settings.

// Suggestion:

// * For "relaxed" FP mode set all Fast-Math Flags

// (see "[RFC] FastMath flags support in MLIR (arith dialect)"

// topic at discourse.llvm.org).

// * For "fast" FP mode set all Fast-Math Flags except 'afn'.

// * For "precise/strict" FP mode generate fir.calls to libm

// entries and annotate them with an attribute that will

// end up transformed into 'strictfp' LLVM attribute (TBD).

// Elsewhere, "precise/strict" FP mode should also set

// 'strictfp' for all user functions and calls so that

// LLVM backend does the right job.

// * Operations that cannot be reasonably optimized in MLIR

// can be also lowered to libm calls for "fast" and "relaxed"

// modes.

mlir::Value result;

if (mathRuntimeVersion == preciseVersion) {

result = genLibCall(builder, loc, name, funcType, args);

} else {

LLVM_DEBUG(llvm::dbgs()

<< "Generating '" << name << "' operation with type ";

funcType.dump(); llvm::dbgs() << "\n");

result = builder.create<T>(loc, args);

}

LLVM_DEBUG(result.dump(); llvm::dbgs() << "\n");

return result;

}
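The control flow of genMathOp above reduces to one decision: in "precise" mode emit a library call, otherwise create the dialect operation. A stand-alone sketch of just that decision follows. It builds strings instead of IR, and `MathRuntimeMode` and `lowerMathModel` are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical mirror of the MathRuntimeVersion values in the patch.
enum class MathRuntimeMode { Fast, Relaxed, Precise, LlvmOnly };

// Returns a textual description of what would be generated, instead of
// building IR: a call to the libm entry in precise mode, the dialect
// operation otherwise.
std::string lowerMathModel(MathRuntimeMode mode, const std::string &opName,
                           const std::string &libFuncName) {
  if (mode == MathRuntimeMode::Precise)
    return "call @" + libFuncName;
  return opName;
}
```

For example, under these assumptions a single-precision SQRT would become a call to `sqrtf` in precise mode and a `math.sqrt` operation in the other modes.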

/// Mapping between mathematical intrinsic operations and MLIR operations

/// of some appropriate dialect (math, complex, etc.) or libm calls.

awarzynskiUnsubmitted

Not Done

return result;

}

- /// Map mathematical intrinsic operations into MLIR operations

- /// of some appropriate dialect (math, complex, etc.) or libm

- /// calls.

+ /// An array of maps for mathematical intrinsic operations into MLIR operations

+ /// of some appropriate dialect (math, complex, etc.) or C's libm calls.

/// TODO: support more operations here.

[nit] "Map mathematical intrinsic operations into MLIR operations" is a bit misleading - "map" is a verb, but this is a variable that represents state and I would expect a noun.

vzakhariAuthorUnsubmitted

Done

Fixed.

/// TODO: support remaining Fortran math intrinsics.

awarzynskiUnsubmitted

Not Done

[nit] Rather than saying "more", could you clarify where to look for the complete list? Or perhaps "remaining Fortran Math intrinsics"?

vzakhariAuthorUnsubmitted

Done

Fixed.

/// See https://gcc.gnu.org/onlinedocs/gcc-12.1.0/gfortran/\

/// Intrinsic-Procedures.html for a reference.

static constexpr MathOperation mathOperations[] = {

{"abs", "fabsf", genF32F32FuncType, genMathOp<mlir::math::AbsOp>},

{"abs", "fabs", genF64F64FuncType, genMathOp<mlir::math::AbsOp>},

// llvm.trunc behaves the same way as libm's trunc.

{"aint", "llvm.trunc.f32", genF32F32FuncType, genLibCall},

{"aint", "llvm.trunc.f64", genF64F64FuncType, genLibCall},

// llvm.round behaves the same way as libm's round.

{"anint", "llvm.round.f32", genF32F32FuncType,

genMathOp<mlir::LLVM::RoundOp>},

{"anint", "llvm.round.f64", genF64F64FuncType,

genMathOp<mlir::LLVM::RoundOp>},

{"atan", "atanf", genF32F32FuncType, genMathOp<mlir::math::AtanOp>},

{"atan", "atan", genF64F64FuncType, genMathOp<mlir::math::AtanOp>},

{"atan2", "atan2f", genF32F32F32FuncType, genMathOp<mlir::math::Atan2Op>},

{"atan2", "atan2", genF64F64F64FuncType, genMathOp<mlir::math::Atan2Op>},

// math::CeilOp returns a real, while Fortran CEILING returns integer.

{"ceil", "ceilf", genF32F32FuncType, genMathOp<mlir::math::CeilOp>},

{"ceil", "ceil", genF64F64FuncType, genMathOp<mlir::math::CeilOp>},

{"cos", "cosf", genF32F32FuncType, genMathOp<mlir::math::CosOp>},

{"cos", "cos", genF64F64FuncType, genMathOp<mlir::math::CosOp>},

{"erf", "erff", genF32F32FuncType, genMathOp<mlir::math::ErfOp>},

{"erf", "erf", genF64F64FuncType, genMathOp<mlir::math::ErfOp>},

{"exp", "expf", genF32F32FuncType, genMathOp<mlir::math::ExpOp>},

{"exp", "exp", genF64F64FuncType, genMathOp<mlir::math::ExpOp>},

// math::FloorOp returns a real, while Fortran FLOOR returns integer.

{"floor", "floorf", genF32F32FuncType, genMathOp<mlir::math::FloorOp>},

{"floor", "floor", genF64F64FuncType, genMathOp<mlir::math::FloorOp>},

{"hypot", "hypotf", genF32F32F32FuncType, genLibCall},

{"hypot", "hypot", genF64F64F64FuncType, genLibCall},

{"log", "logf", genF32F32FuncType, genMathOp<mlir::math::LogOp>},

{"log", "log", genF64F64FuncType, genMathOp<mlir::math::LogOp>},

{"log10", "log10f", genF32F32FuncType, genMathOp<mlir::math::Log10Op>},

{"log10", "log10", genF64F64FuncType, genMathOp<mlir::math::Log10Op>},

// llvm.lround behaves the same way as libm's lround.

{"nint", "llvm.lround.i64.f64", genIntF64FuncType<64>, genLibCall},

{"nint", "llvm.lround.i64.f32", genIntF32FuncType<64>, genLibCall},

{"nint", "llvm.lround.i32.f64", genIntF64FuncType<32>, genLibCall},

{"nint", "llvm.lround.i32.f32", genIntF32FuncType<32>, genLibCall},

{"pow", "powf", genF32F32F32FuncType, genMathOp<mlir::math::PowFOp>},

{"pow", "pow", genF64F64F64FuncType, genMathOp<mlir::math::PowFOp>},

// TODO: add PowIOp in math and complex dialects.

{"pow", "llvm.powi.f32.i32", genF32F32IntFuncType<32>, genLibCall},

{"pow", "llvm.powi.f64.i32", genF64F64IntFuncType<32>, genLibCall},

{"sign", "copysignf", genF32F32F32FuncType,

genMathOp<mlir::math::CopySignOp>},

{"sign", "copysign", genF64F64F64FuncType,

genMathOp<mlir::math::CopySignOp>},

{"sin", "sinf", genF32F32FuncType, genMathOp<mlir::math::SinOp>},

{"sin", "sin", genF64F64FuncType, genMathOp<mlir::math::SinOp>},

{"sqrt", "sqrtf", genF32F32FuncType, genMathOp<mlir::math::SqrtOp>},

{"sqrt", "sqrt", genF64F64FuncType, genMathOp<mlir::math::SqrtOp>},

{"tanh", "tanhf", genF32F32FuncType, genMathOp<mlir::math::TanhOp>},

{"tanh", "tanh", genF64F64FuncType, genMathOp<mlir::math::TanhOp>},

};
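A table like mathOperations, with several entries per intrinsic name that differ only in type, is typically queried in two steps: find the run of entries with the requested key, then pick the one whose type matches. The sketch below shows that pattern with the standard library only. It is an assumption about the general technique, not the patch's implementation: `MathOpEntry`, `kTable`, `KeyLess` and `findExact` are hypothetical, and the `signature` strings stand in for the type generators.

```cpp
#include <algorithm>
#include <iterator>
#include <string_view>

struct MathOpEntry {
  std::string_view key;         // intrinsic name
  std::string_view signature;   // textual stand-in for the type generator
  std::string_view runtimeFunc; // runtime function implementing the operation
};

// Must stay sorted by key so that binary search is valid.
constexpr MathOpEntry kTable[] = {
    {"abs", "f32(f32)", "fabsf"},
    {"abs", "f64(f64)", "fabs"},
    {"pow", "f32(f32,f32)", "powf"},
    {"pow", "f32(f32,i32)", "llvm.powi.f32.i32"},
    {"sqrt", "f32(f32)", "sqrtf"},
    {"sqrt", "f64(f64)", "sqrt"},
};

// Heterogeneous comparator: compares table entries against a bare key.
struct KeyLess {
  bool operator()(const MathOpEntry &e, std::string_view k) const {
    return e.key < k;
  }
  bool operator()(std::string_view k, const MathOpEntry &e) const {
    return k < e.key;
  }
};

// Returns the entry with the given name and exactly matching signature,
// or nullptr if there is none.
const MathOpEntry *findExact(std::string_view name,
                             std::string_view signature) {
  auto range = std::equal_range(std::begin(kTable), std::end(kTable), name,
                                KeyLess{});
  for (auto it = range.first; it != range.second; ++it)
    if (it->signature == signature)
      return it;
  return nullptr;
}
```

The two `pow` rows show why the key alone is not enough: the real-exponent and integer-exponent forms share a name and are told apart only by their types.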

// Note: These are also defined as operations in LLVM dialect. See if this
// can be use and has advantages.

// TODO: remove this table, since the late math lowering should

awarzynskiUnsubmitted

Not Done

Why can't it be removed now?

vzakhariAuthorUnsubmitted

Done

It is still used in the early mode both under -math-runtime=llvm and under any other -math-runtime if the corresponding operation is not found in pgmath list in flang/include/flang/Evaluate/pgmath.h.inc.

awarzynskiUnsubmitted

Not Done

Thanks!

[nit] it would be helpful to have a comment like this here :) (basically, so that we know when to delete this)

// replace it and generate proper MLIR operations rather

// than llvm intrinsic calls, which still look like generic

// calls to MLIR and do not enable many optimizations.

static constexpr RuntimeFunction llvmIntrinsics[] = { static constexpr RuntimeFunction llvmIntrinsics[] = {

{"abs", "llvm.fabs.f32", genF32F32FuncType}, {"abs", "llvm.fabs.f32", genF32F32FuncType},

{"abs", "llvm.fabs.f64", genF64F64FuncType}, {"abs", "llvm.fabs.f64", genF64F64FuncType},

{"aint", "llvm.trunc.f32", genF32F32FuncType}, {"aint", "llvm.trunc.f32", genF32F32FuncType},

{"aint", "llvm.trunc.f64", genF64F64FuncType}, {"aint", "llvm.trunc.f64", genF64F64FuncType},

{"anint", "llvm.round.f32", genF32F32FuncType}, {"anint", "llvm.round.f32", genF32F32FuncType},

{"anint", "llvm.round.f64", genF64F64FuncType}, {"anint", "llvm.round.f64", genF64F64FuncType},

{"atan", "atanf", genF32F32FuncType}, {"atan", "atanf", genF32F32FuncType},

▲ Show 20 Lines • Show All 182 Lines • ▼ Show 20 Lines static mlir::func::FuncOp getFuncOp(mlir::Location loc,

function->setAttr("fir.runtime", builder.getUnitAttr()); function->setAttr("fir.runtime", builder.getUnitAttr());

return function; return function;

} }

/// Select runtime function that has the smallest distance to the intrinsic /// Select runtime function that has the smallest distance to the intrinsic

/// function type and that will not imply narrowing arguments or extending the /// function type and that will not imply narrowing arguments or extending the

/// result. /// result.

/// If nothing is found, the mlir::func::FuncOp will contain a nullptr. /// If nothing is found, the mlir::func::FuncOp will contain a nullptr.

mlir::func::FuncOp searchFunctionInLibrary( static mlir::func::FuncOp searchFunctionInLibrary(

mlir::Location loc, fir::FirOpBuilder &builder, mlir::Location loc, fir::FirOpBuilder &builder,

const Fortran::common::StaticMultimapView<RuntimeFunction> &lib, const Fortran::common::StaticMultimapView<RuntimeFunction> &lib,

llvm::StringRef name, mlir::FunctionType funcType, llvm::StringRef name, mlir::FunctionType funcType,

const RuntimeFunction **bestNearMatch, const RuntimeFunction **bestNearMatch,

FunctionDistance &bestMatchDistance) { FunctionDistance &bestMatchDistance) {

std::pair<const RuntimeFunction *, const RuntimeFunction *> range = std::pair<const RuntimeFunction *, const RuntimeFunction *> range =

lib.equal_range(name); lib.equal_range(name);

for (auto iter = range.first; iter != range.second && iter; ++iter) { for (auto iter = range.first; iter != range.second && iter; ++iter) {

const RuntimeFunction &impl = *iter; const RuntimeFunction &impl = *iter;

mlir::FunctionType implType = impl.typeGenerator(builder.getContext()); mlir::FunctionType implType = impl.typeGenerator(builder.getContext());

if (funcType == implType) if (funcType == implType)

return getFuncOp(loc, builder, impl); // exact match return getFuncOp(loc, builder, impl); // exact match

FunctionDistance distance(funcType, implType); FunctionDistance distance(funcType, implType);

if (distance.isSmallerThan(bestMatchDistance)) { if (distance.isSmallerThan(bestMatchDistance)) {

*bestNearMatch = &impl; *bestNearMatch = &impl;

bestMatchDistance = std::move(distance); bestMatchDistance = std::move(distance);

} }

} }

return {}; return {};

} }

using RtMap = Fortran::common::StaticMultimapView<MathOperation>;

static constexpr RtMap mathOps(mathOperations);

static_assert(mathOps.Verify() && "map must be sorted");

/// Look for a MathOperation entry specifying how to lower a mathematical

/// operation defined by \p name with its result' and operands' types

awarzynski: Could you add doxygen?

vzakhari: Done.

/// specified in the form of a FunctionType \p funcType.

/// If exact match for the given types is found, then the function

/// returns a pointer to the corresponding MathOperation.

/// Otherwise, the function returns nullptr.

/// If there is a MathOperation that can be used with additional

/// type casts for the operands or/and result (non-exact match),

/// then it is returned via \p bestNearMatch argument, and

/// \p bestMatchDistance specifies the FunctionDistance between

/// the requested operation and the non-exact match.

static const MathOperation *

searchMathOperation(fir::FirOpBuilder &builder, llvm::StringRef name,

mlir::FunctionType funcType,

const MathOperation **bestNearMatch,

FunctionDistance &bestMatchDistance) {

auto range = mathOps.equal_range(name);

for (auto iter = range.first; iter != range.second && iter; ++iter) {

const auto &impl = *iter;

auto implType = impl.typeGenerator(builder.getContext());

if (funcType == implType)

return &impl; // exact match

FunctionDistance distance(funcType, implType);

if (distance.isSmallerThan(bestMatchDistance)) {

*bestNearMatch = &impl;

bestMatchDistance = std::move(distance);

awarzynski: > Function names should be verb phrases (as they represent actions), and command-like function should be imperative.

https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

vzakhari: Renamed to `checkPrecisionLoss`.

}

}

return nullptr;

}
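
The lookup contract described in the doxygen comment above (return an exact match, otherwise report the closest candidate through an out-parameter) can be illustrated outside of MLIR. The following is only a sketch under stated assumptions: `Candidate`, `Distance`, and `searchCandidate` are hypothetical stand-ins, and plain bit widths replace the `mlir::FunctionType` comparison that the patch actually performs.

```cpp
#include <cassert>
#include <climits>
#include <string>
#include <vector>

// Hypothetical stand-in for MathOperation: one table entry per
// (intrinsic name, operand width) pair.
struct Candidate {
  std::string name;   // generic intrinsic name, e.g. "sin"
  int argBits;        // operand width, e.g. 32 for f32
  std::string symbol; // symbol that would be called, e.g. "sinf"
};

// Hypothetical stand-in for FunctionDistance.
struct Distance {
  bool valid = false;          // false until a candidate was recorded
  bool losesPrecision = false; // true if operands would be narrowed
  int delta = INT_MAX;         // absolute width difference

  bool isSmallerThan(const Distance &other) const {
    if (!other.valid)
      return true; // anything beats "no candidate yet"
    if (losesPrecision != other.losesPrecision)
      return !losesPrecision; // prefer candidates that keep precision
    return delta < other.delta;
  }
};

// Returns the exact match or nullptr. A non-exact candidate, if any, is
// reported through bestNearMatch and bestDistance.
const Candidate *searchCandidate(const std::vector<Candidate> &table,
                                 const std::string &name, int wantedBits,
                                 const Candidate **bestNearMatch,
                                 Distance &bestDistance) {
  assert(bestNearMatch && "caller must provide storage for the near match");
  for (const Candidate &impl : table) {
    if (impl.name != name)
      continue;
    if (impl.argBits == wantedBits)
      return &impl; // exact match
    Distance distance;
    distance.valid = true;
    distance.losesPrecision = impl.argBits < wantedBits;
    distance.delta = impl.argBits > wantedBits ? impl.argBits - wantedBits
                                               : wantedBits - impl.argBits;
    if (distance.isSmallerThan(bestDistance)) {
      *bestNearMatch = &impl;
      bestDistance = distance;
    }
  }
  return nullptr;
}
```

As in the patch, the caller decides what to do with a near match; the search itself never reports an error.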

/// Implementation of the operation defined by \p name with type

/// \p funcType is not precise, and the actual available implementation

/// is \p distance away from the requested. If using the available

/// implementation results in a precision loss, emit an error message

/// with the given code location \p loc.

static void checkPrecisionLoss(llvm::StringRef name,

mlir::FunctionType funcType,

const FunctionDistance &distance,

mlir::Location loc) {

if (!distance.isLosingPrecision())

return;

// Using this runtime version requires narrowing the arguments

// or extending the result. It is not numerically safe. There

// is currently no quad math library that was described in

// lowering and could be used here. Emit an error and continue

// generating the code with the narrowing cast so that the user

// can get a complete list of the problematic intrinsic calls.

std::string message("TODO: no math runtime available for '");

llvm::raw_string_ostream sstream(message);

if (name == "pow") {

assert(funcType.getNumInputs() == 2 && "power operator has two arguments");

sstream << funcType.getInput(0) << " ** " << funcType.getInput(1);

} else {

sstream << name << "(";

if (funcType.getNumInputs() > 0)

sstream << funcType.getInput(0);

for (mlir::Type argType : funcType.getInputs().drop_front())

sstream << ", " << argType;

sstream << ")";

}

sstream << "'";

mlir::emitError(loc, message);

}
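
To make the shape of the diagnostic easier to see, here is a minimal standalone sketch of the message construction in `checkPrecisionLoss`. It assumes the operand types have already been printed to strings; `formatMissingRuntimeMessage` is a hypothetical helper name, whereas the patch streams `mlir::Type` values into an `llvm::raw_string_ostream`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper mirroring the message layout used above.
// argTypes holds already-printed type names such as "f128".
std::string formatMissingRuntimeMessage(const std::string &name,
                                        const std::vector<std::string> &argTypes) {
  std::string message("TODO: no math runtime available for '");
  if (name == "pow") {
    // The power operator is printed in Fortran operator form.
    assert(argTypes.size() == 2 && "power operator has two arguments");
    message += argTypes[0] + " ** " + argTypes[1];
  } else {
    // Everything else is printed as a call: name(type0, type1, ...).
    message += name + "(";
    for (std::size_t i = 0; i < argTypes.size(); ++i) {
      if (i > 0)
        message += ", ";
      message += argTypes[i];
    }
    message += ")";
  }
  message += "'";
  return message;
}
```

With these assumptions, a quad-precision `sin` would be reported as `TODO: no math runtime available for 'sin(f128)'`.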

/// Search runtime for the best runtime function given an intrinsic name /// Search runtime for the best runtime function given an intrinsic name

/// and interface. The interface may not be a perfect match in which case /// and interface. The interface may not be a perfect match in which case

/// the caller is responsible to insert argument and return value conversions. /// the caller is responsible to insert argument and return value conversions.

/// If nothing is found, the mlir::func::FuncOp will contain a nullptr. /// If nothing is found, the mlir::func::FuncOp will contain a nullptr.

static mlir::func::FuncOp getRuntimeFunction(mlir::Location loc, static mlir::func::FuncOp getRuntimeFunction(mlir::Location loc,

fir::FirOpBuilder &builder, fir::FirOpBuilder &builder,

llvm::StringRef name, llvm::StringRef name,

mlir::FunctionType funcType) { mlir::FunctionType funcType) {

const RuntimeFunction *bestNearMatch = nullptr; const RuntimeFunction *bestNearMatch = nullptr;

FunctionDistance bestMatchDistance{}; FunctionDistance bestMatchDistance{};

mlir::func::FuncOp match; mlir::func::FuncOp match;

using RtMap = Fortran::common::StaticMultimapView<RuntimeFunction>; using RtMap = Fortran::common::StaticMultimapView<RuntimeFunction>;

static constexpr RtMap pgmathF(pgmathFast); static constexpr RtMap pgmathF(pgmathFast);

static_assert(pgmathF.Verify() && "map must be sorted"); static_assert(pgmathF.Verify() && "map must be sorted");

static constexpr RtMap pgmathR(pgmathRelaxed); static constexpr RtMap pgmathR(pgmathRelaxed);

static_assert(pgmathR.Verify() && "map must be sorted"); static_assert(pgmathR.Verify() && "map must be sorted");

static constexpr RtMap pgmathP(pgmathPrecise); static constexpr RtMap pgmathP(pgmathPrecise);

static_assert(pgmathP.Verify() && "map must be sorted"); static_assert(pgmathP.Verify() && "map must be sorted");

if (mathRuntimeVersion == fastVersion) { if (mathRuntimeVersion == fastVersion) {

match = searchFunctionInLibrary(loc, builder, pgmathF, name, funcType, match = searchFunctionInLibrary(loc, builder, pgmathF, name, funcType,

&bestNearMatch, bestMatchDistance); &bestNearMatch, bestMatchDistance);

} else if (mathRuntimeVersion == relaxedVersion) { } else if (mathRuntimeVersion == relaxedVersion) {

match = searchFunctionInLibrary(loc, builder, pgmathR, name, funcType, match = searchFunctionInLibrary(loc, builder, pgmathR, name, funcType,

&bestNearMatch, bestMatchDistance); &bestNearMatch, bestMatchDistance);

} else if (mathRuntimeVersion == preciseVersion) { } else if (mathRuntimeVersion == preciseVersion) {

match = searchFunctionInLibrary(loc, builder, pgmathP, name, funcType, match = searchFunctionInLibrary(loc, builder, pgmathP, name, funcType,

[9 lines not shown] static mlir::func::FuncOp getRuntimeFunction(mlir::Location loc,

static constexpr RtMap llvmIntr(llvmIntrinsics); static constexpr RtMap llvmIntr(llvmIntrinsics);

static_assert(llvmIntr.Verify() && "map must be sorted"); static_assert(llvmIntr.Verify() && "map must be sorted");

if (mlir::func::FuncOp exactMatch = if (mlir::func::FuncOp exactMatch =

searchFunctionInLibrary(loc, builder, llvmIntr, name, funcType, searchFunctionInLibrary(loc, builder, llvmIntr, name, funcType,

&bestNearMatch, bestMatchDistance)) &bestNearMatch, bestMatchDistance))

return exactMatch; return exactMatch;

if (bestNearMatch != nullptr) { if (bestNearMatch != nullptr) {

if (bestMatchDistance.isLosingPrecision()) { checkPrecisionLoss(name, funcType, bestMatchDistance, loc);

// Using this runtime version requires narrowing the arguments

// or extending the result. It is not numerically safe. There

// is currently no quad math library that was described in

// lowering and could be used here. Emit an error and continue

// generating the code with the narrowing cast so that the user

// can get a complete list of the problematic intrinsic calls.

std::string message("TODO: no math runtime available for '");

llvm::raw_string_ostream sstream(message);

if (name == "pow") {

assert(funcType.getNumInputs() == 2 &&

"power operator has two arguments");

sstream << funcType.getInput(0) << " ** " << funcType.getInput(1);

} else {

sstream << name << "(";

if (funcType.getNumInputs() > 0)

sstream << funcType.getInput(0);

for (mlir::Type argType : funcType.getInputs().drop_front())

sstream << ", " << argType;

sstream << ")";

}

sstream << "'";

mlir::emitError(loc, message);

}

return getFuncOp(loc, builder, *bestNearMatch); return getFuncOp(loc, builder, *bestNearMatch);

} }

return {}; return {};

} }

/// Helpers to get function type from arguments and result type. /// Helpers to get function type from arguments and result type.

static mlir::FunctionType getFunctionType(llvm::Optional<mlir::Type> resultType, static mlir::FunctionType getFunctionType(llvm::Optional<mlir::Type> resultType,

llvm::ArrayRef<mlir::Value> arguments, llvm::ArrayRef<mlir::Value> arguments,

[217 lines not shown] for (const fir::ExtendedValue &extendedVal : args) {

mlirArgs.emplace_back(val); mlirArgs.emplace_back(val);

} }

mlir::FunctionType soughtFuncType = mlir::FunctionType soughtFuncType =

getFunctionType(*resultType, mlirArgs, builder); getFunctionType(*resultType, mlirArgs, builder);

IntrinsicLibrary::RuntimeCallGenerator runtimeCallGenerator = IntrinsicLibrary::RuntimeCallGenerator runtimeCallGenerator =

getRuntimeCallGenerator(name, soughtFuncType); getRuntimeCallGenerator(name, soughtFuncType);

return genElementalCall(runtimeCallGenerator, name, *resultType, args, return genElementalCall(runtimeCallGenerator, name, *resultType, args,

/* outline */ true); /*outline=*/outlineAllIntrinsics);

} }

mlir::Value mlir::Value

IntrinsicLibrary::invokeGenerator(ElementalGenerator generator, IntrinsicLibrary::invokeGenerator(ElementalGenerator generator,

mlir::Type resultType, mlir::Type resultType,

llvm::ArrayRef<mlir::Value> args) { llvm::ArrayRef<mlir::Value> args) {

return std::invoke(generator, *this, resultType, args); return std::invoke(generator, *this, resultType, args);

} }

[135 lines not shown] if (resultType)

return toExtendedValue(call.getResult(0), builder, loc); return toExtendedValue(call.getResult(0), builder, loc);

// Subroutine calls // Subroutine calls

return mlir::Value{}; return mlir::Value{};

} }

IntrinsicLibrary::RuntimeCallGenerator IntrinsicLibrary::RuntimeCallGenerator

IntrinsicLibrary::getRuntimeCallGenerator(llvm::StringRef name, IntrinsicLibrary::getRuntimeCallGenerator(llvm::StringRef name,

mlir::FunctionType soughtFuncType) { mlir::FunctionType soughtFuncType) {

mlir::func::FuncOp funcOp = mlir::func::FuncOp funcOp;

getRuntimeFunction(loc, builder, name, soughtFuncType); mlir::FunctionType actualFuncType;

if (!funcOp) { const MathOperation *mathOp = nullptr;

if (mathLowering == lateLowering) {

// Look for a dedicated math operation generator, which

// normally produces a single MLIR operation implementing

// the math operation.

// If not found fall back to a runtime function lookup.

const MathOperation *bestNearMatch = nullptr;

FunctionDistance bestMatchDistance;

mathOp = searchMathOperation(builder, name, soughtFuncType, &bestNearMatch,

bestMatchDistance);

if (!mathOp && bestNearMatch) {

// Use the best near match, optionally issuing an error,

// if types conversions cause precision loss.

checkPrecisionLoss(name, soughtFuncType, bestMatchDistance, loc);

mathOp = bestNearMatch;

}

if (mathOp)

actualFuncType = mathOp->typeGenerator(builder.getContext());

}

if (!mathOp)

awarzynski: So this is effectively `earlyLowering`, right? Which will also happen if `lateLowering` fails?

vzakhari: Exactly. The `late` lowering falls back to `early` lowering currently.

if ((funcOp = getRuntimeFunction(loc, builder, name, soughtFuncType)))

actualFuncType = funcOp.getFunctionType();

if (!mathOp && !funcOp) {

std::string nameAndType; std::string nameAndType;

llvm::raw_string_ostream sstream(nameAndType); llvm::raw_string_ostream sstream(nameAndType);

sstream << name << "\nrequested type: " << soughtFuncType; sstream << name << "\nrequested type: " << soughtFuncType;

crashOnMissingIntrinsic(loc, nameAndType); crashOnMissingIntrinsic(loc, nameAndType);

} }

mlir::FunctionType actualFuncType = funcOp.getFunctionType();

assert(actualFuncType.getNumResults() == soughtFuncType.getNumResults() && assert(actualFuncType.getNumResults() == soughtFuncType.getNumResults() &&

actualFuncType.getNumInputs() == soughtFuncType.getNumInputs() && actualFuncType.getNumInputs() == soughtFuncType.getNumInputs() &&

actualFuncType.getNumResults() == 1 && "Bad intrinsic match"); actualFuncType.getNumResults() == 1 && "Bad intrinsic match");

return [funcOp, actualFuncType, return [funcOp, actualFuncType, mathOp,

soughtFuncType](fir::FirOpBuilder &builder, mlir::Location loc, soughtFuncType](fir::FirOpBuilder &builder, mlir::Location loc,

llvm::ArrayRef<mlir::Value> args) { llvm::ArrayRef<mlir::Value> args) {

llvm::SmallVector<mlir::Value> convertedArguments; llvm::SmallVector<mlir::Value> convertedArguments;

for (auto [fst, snd] : llvm::zip(actualFuncType.getInputs(), args)) for (auto [fst, snd] : llvm::zip(actualFuncType.getInputs(), args))

convertedArguments.push_back(builder.createConvert(loc, fst, snd)); convertedArguments.push_back(builder.createConvert(loc, fst, snd));

auto call = builder.create<fir::CallOp>(loc, funcOp, convertedArguments); mlir::Value result;

// Use math operation generator, if available.

if (mathOp)

result = mathOp->funcGenerator(builder, loc, mathOp->symbol,

actualFuncType, convertedArguments);

else

result = builder.create<fir::CallOp>(loc, funcOp, convertedArguments)

.getResult(0);

mlir::Type soughtType = soughtFuncType.getResult(0); mlir::Type soughtType = soughtFuncType.getResult(0);

return builder.createConvert(loc, soughtType, call.getResult(0)); return builder.createConvert(loc, soughtType, result);

}; };

} }
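
The generator returned above wraps a non-exact match in conversions: operands are converted to the implementation's types, the implementation is invoked, and the result is converted back to the requested type. A rough scalar analogy, assuming a request on `float` that is served by a `double` implementation, might look as follows; `makeWidenedCall` is a hypothetical name, and the patch itself emits `fir.convert` operations through `builder.createConvert` rather than C++ casts.

```cpp
#include <cassert>
#include <functional>

// Hypothetical illustration: serve a float request with a double
// implementation by converting the operand and the result.
std::function<float(float)> makeWidenedCall(double (*impl)(double)) {
  assert(impl && "implementation must not be null");
  return [impl](float arg) {
    double widened = static_cast<double>(arg); // operand conversion
    double result = impl(widened);             // call the actual implementation
    return static_cast<float>(result);         // convert back to requested type
  };
}
```

Widening the operand, as here, is the numerically safe direction; the opposite direction is the case that `checkPrecisionLoss` reports.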

mlir::SymbolRefAttr IntrinsicLibrary::getUnrestrictedIntrinsicSymbolRefAttr( mlir::SymbolRefAttr IntrinsicLibrary::getUnrestrictedIntrinsicSymbolRefAttr(

llvm::StringRef name, mlir::FunctionType signature) { llvm::StringRef name, mlir::FunctionType signature) {

// Unrestricted intrinsics signature follows implicit rules: argument // Unrestricted intrinsics signature follows implicit rules: argument

// are passed by references. But the runtime versions expect values. // are passed by references. But the runtime versions expect values.

// So instead of duplicating the runtime, just have the wrappers loading // So instead of duplicating the runtime, just have the wrappers loading

[149 lines not shown] return res.match(

}); });

} }

// AIMAG // AIMAG

mlir::Value IntrinsicLibrary::genAimag(mlir::Type resultType, mlir::Value IntrinsicLibrary::genAimag(mlir::Type resultType,

llvm::ArrayRef<mlir::Value> args) { llvm::ArrayRef<mlir::Value> args) {

assert(args.size() == 1); assert(args.size() == 1);

return fir::factory::Complex{builder, loc}.extractComplexPart( return fir::factory::Complex{builder, loc}.extractComplexPart(

args[0], true /* isImagPart */); args[0], /*isImagPart=*/true);

} }

// AINT // AINT

mlir::Value IntrinsicLibrary::genAint(mlir::Type resultType, mlir::Value IntrinsicLibrary::genAint(mlir::Type resultType,

llvm::ArrayRef<mlir::Value> args) { llvm::ArrayRef<mlir::Value> args) {

assert(args.size() >= 1 && args.size() <= 2); assert(args.size() >= 1 && args.size() <= 2);

// Skip optional kind argument to search the runtime; it is already reflected // Skip optional kind argument to search the runtime; it is already reflected

// in result type. // in result type.

[2,056 lines not shown] mlir::Value Fortran::lower::genMin(fir::FirOpBuilder &builder,

return IntrinsicLibrary{builder, loc} return IntrinsicLibrary{builder, loc}

.genExtremum<Extremum::Min, ExtremumBehavior::MinMaxss>(args[0].getType(), .genExtremum<Extremum::Min, ExtremumBehavior::MinMaxss>(args[0].getType(),

args); args);

} }

mlir::Value Fortran::lower::genPow(fir::FirOpBuilder &builder, mlir::Value Fortran::lower::genPow(fir::FirOpBuilder &builder,

mlir::Location loc, mlir::Type type, mlir::Location loc, mlir::Type type,

mlir::Value x, mlir::Value y) { mlir::Value x, mlir::Value y) {

// TODO: since there is no libm version of pow with integer exponent,

// we have to provide an alternative implementation for

// "precise/strict" FP mode and (mathLowering == lateLowering).

// One option is to generate internal function with inlined

// implementation and mark it 'strictfp'.

// Another option is to implement it in Fortran runtime library

// (just like matmul).

return IntrinsicLibrary{builder, loc}.genRuntimeCall("pow", type, {x, y}); return IntrinsicLibrary{builder, loc}.genRuntimeCall("pow", type, {x, y});

} }
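
The TODO above mentions an inlined implementation of `pow` with an integer exponent as one option. The patch does not provide one; purely as an illustration of what such a routine could compute, here is a sketch of exponentiation by repeated squaring. The name `powInt` and the choice of `double` with a 32-bit exponent are assumptions, and this sketch makes no attempt to address the 'strictfp' requirements the TODO refers to.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: base ** exponent for an integer exponent,
// using O(log |exponent|) multiplications.
double powInt(double base, std::int32_t exponent) {
  // Use an unsigned magnitude so that negating INT32_MIN cannot overflow.
  std::uint32_t n = exponent < 0 ? 0u - static_cast<std::uint32_t>(exponent)
                                 : static_cast<std::uint32_t>(exponent);
  double result = 1.0;
  double factor = base;
  while (n != 0) {
    if (n & 1u)
      result *= factor; // this bit of the exponent contributes
    n >>= 1;
    if (n != 0)
      factor *= factor; // square for the next bit
  }
  // A negative exponent yields the reciprocal.
  return exponent < 0 ? 1.0 / result : result;
}
```

For example, `powInt(2.0, 10)` multiplies only the factors for the set bits of 10 (binary 1010) and yields 1024.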

mlir::SymbolRefAttr Fortran::lower::getUnrestrictedIntrinsicSymbolRefAttr( mlir::SymbolRefAttr Fortran::lower::getUnrestrictedIntrinsicSymbolRefAttr(

fir::FirOpBuilder &builder, mlir::Location loc, llvm::StringRef name, fir::FirOpBuilder &builder, mlir::Location loc, llvm::StringRef name,

mlir::FunctionType signature) { mlir::FunctionType signature) {

return IntrinsicLibrary{builder, loc}.getUnrestrictedIntrinsicSymbolRefAttr( return IntrinsicLibrary{builder, loc}.getUnrestrictedIntrinsicSymbolRefAttr(

name, signature); name, signature);

} }

flang/lib/Optimizer/CodeGen/CMakeLists.txt

[11 lines not shown]	add_flang_library(FIRCodeGen
FIRSupport		FIRSupport
FIROptCodeGenPassIncGen		FIROptCodeGenPassIncGen
CGOpsIncGen		CGOpsIncGen

LINK_LIBS		LINK_LIBS
FIRBuilder		FIRBuilder
FIRDialect		FIRDialect
FIRSupport		FIRSupport
		MLIRMathToLLVM
		MLIRMathToLibm
MLIROpenMPToLLVM		MLIROpenMPToLLVM
MLIRLLVMToLLVMIRTranslation		MLIRLLVMToLLVMIRTranslation
MLIRTargetLLVMIRExport		MLIRTargetLLVMIRExport

LINK_COMPONENTS		LINK_COMPONENTS
AsmParser		AsmParser
AsmPrinter		AsmPrinter
Remarks		Remarks
)		)

flang/lib/Optimizer/CodeGen/CodeGen.cpp

[17 lines not shown]
#include "flang/Optimizer/Dialect/FIROps.h"		#include "flang/Optimizer/Dialect/FIROps.h"
#include "flang/Optimizer/Support/InternalNames.h"		#include "flang/Optimizer/Support/InternalNames.h"
#include "flang/Optimizer/Support/TypeCode.h"		#include "flang/Optimizer/Support/TypeCode.h"
#include "flang/Semantics/runtime-type-info.h"		#include "flang/Semantics/runtime-type-info.h"
#include "mlir/Conversion/ArithmeticToLLVM/ArithmeticToLLVM.h"		#include "mlir/Conversion/ArithmeticToLLVM/ArithmeticToLLVM.h"
#include "mlir/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.h"		#include "mlir/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.h"
#include "mlir/Conversion/FuncToLLVM/ConvertFuncToLLVM.h"		#include "mlir/Conversion/FuncToLLVM/ConvertFuncToLLVM.h"
#include "mlir/Conversion/LLVMCommon/Pattern.h"		#include "mlir/Conversion/LLVMCommon/Pattern.h"
		#include "mlir/Conversion/MathToLLVM/MathToLLVM.h"
		#include "mlir/Conversion/MathToLibm/MathToLibm.h"
#include "mlir/Conversion/OpenMPToLLVM/ConvertOpenMPToLLVM.h"		#include "mlir/Conversion/OpenMPToLLVM/ConvertOpenMPToLLVM.h"
#include "mlir/IR/BuiltinTypes.h"		#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/Matchers.h"		#include "mlir/IR/Matchers.h"
#include "mlir/Pass/Pass.h"		#include "mlir/Pass/Pass.h"
#include "mlir/Target/LLVMIR/ModuleTranslation.h"		#include "mlir/Target/LLVMIR/ModuleTranslation.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"

#define DEBUG_TYPE "flang-codegen"		#define DEBUG_TYPE "flang-codegen"
[3,341 lines not shown]	pattern.insert<
XEmboxOpConversion, XReboxOpConversion, ZeroOpConversion>(typeConverter,		XEmboxOpConversion, XReboxOpConversion, ZeroOpConversion>(typeConverter,
options);		options);
mlir::populateFuncToLLVMConversionPatterns(typeConverter, pattern);		mlir::populateFuncToLLVMConversionPatterns(typeConverter, pattern);
mlir::populateOpenMPToLLVMConversionPatterns(typeConverter, pattern);		mlir::populateOpenMPToLLVMConversionPatterns(typeConverter, pattern);
mlir::arith::populateArithmeticToLLVMConversionPatterns(typeConverter,		mlir::arith::populateArithmeticToLLVMConversionPatterns(typeConverter,
pattern);		pattern);
mlir::cf::populateControlFlowToLLVMConversionPatterns(typeConverter,		mlir::cf::populateControlFlowToLLVMConversionPatterns(typeConverter,
pattern);		pattern);
		// Convert math-like dialect operations, which can be produced
		// when late math lowering mode is used, into llvm dialect.
		mlir::populateMathToLLVMConversionPatterns(typeConverter, pattern);
		mlir::populateMathToLibmConversionPatterns(pattern, /*benefit=*/0);
mlir::ConversionTarget target{*context};		mlir::ConversionTarget target{*context};
target.addLegalDialect<mlir::LLVM::LLVMDialect>();		target.addLegalDialect<mlir::LLVM::LLVMDialect>();
// The OpenMP dialect is legal for Operations without regions, for those		// The OpenMP dialect is legal for Operations without regions, for those
// which contains regions it is legal if the region contains only the		// which contains regions it is legal if the region contains only the
// LLVM dialect. Add OpenMP dialect as a legal dialect for conversion and		// LLVM dialect. Add OpenMP dialect as a legal dialect for conversion and
// legalize conversion of OpenMP operations without regions.		// legalize conversion of OpenMP operations without regions.
mlir::configureOpenMPToLLVMConversionLegality(target, typeConverter);		mlir::configureOpenMPToLLVMConversionLegality(target, typeConverter);
target.addLegalDialect<mlir::omp::OpenMPDialect>();		target.addLegalDialect<mlir::omp::OpenMPDialect>();
[61 lines not shown]

flang/test/Intrinsics/late-math-codegen.fir

This file was added.

// RUN: split-file %s %t

// TODO: verify that Fast-Math Flags and 'strictfp' are properly set.

// RUN: fir-opt %t/fast --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck --check-prefixes=ALL,FAST %s

// RUN: fir-opt %t/relaxed --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck --check-prefixes=ALL,RELAXED %s

// RUN: fir-opt %t/precise --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck --check-prefixes=ALL,PRECISE %s

//Fortran original source:

//function test_real4(x, y, c, s, i)

// real :: x, y, test_real4

// complex(4) :: c

// integer(2) :: s

// integer(4) :: i

// test_real4 = abs(x) + abs(c) + aint(x) + anint(x) + atan(x) + atan2(x, y) + &

// ceiling(x) + cos(x) + erf(x) + exp(x) + floor(x) + log(x) + log10(x) + &

// nint(x, 4) + nint(x, 8) + x ** s + x ** y + x ** i + sign(x, y) + &

awarzynski:

// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @hypot({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

- func.func @_QPtest_real4(%arg0: !fir.ref<f32> {fir.bindc_name = "x"}, %arg1: !fir.ref<f32> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<4>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f32 {

+ func.func @_QPtest_real4(%arg0: !fir.ref<f32> {fir.bindc_name = "x"}, %arg2: !fir.ref<!fir.complex<4>> {fir.bindc_name = "c"}) -> f32 {

%0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"}

Remove unused args. Similar suggestion for other tests.

vzakhari: Fixed.

// sin(x) + tanh(x)

//end function

//function test_real8(x, y, c, s, i)

// real(8) :: x, y, test_real8

// complex(8) :: c

// integer(2) :: s

// integer(4) :: i

// test_real8 = abs(x) + abs(c) + aint(x) + anint(x) + atan(x) + atan2(x, y) + &

// ceiling(x) + cos(x) + erf(x) + exp(x) + floor(x) + log(x) + log10(x) + &

// nint(x, 4) + nint(x, 8) + x ** s + x ** y + x ** i + sign(x, y) + &

// sin(x) + tanh(x)

//end function

// ALL-LABEL: @_QPtest_real4

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.fabs"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.fabs"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @fabsf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @hypotf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @hypotf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @hypotf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.trunc.f32({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.round.f32({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.ceil"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.ceil"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @ceilf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.cos"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.cos"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @cosf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @erff({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @erff({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @erff({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.exp"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.exp"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @expf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.floor"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.floor"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @floorf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.log"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.log"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @logf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.log10"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.log10"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @log10f({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.lround.i32.f32({{%[A-Za-z0-9._]+}}) : (f32) -> i32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.lround.i64.f32({{%[A-Za-z0-9._]+}}) : (f32) -> i64

// ALL: [[STOI:%[A-Za-z0-9._]+]] = llvm.sext {{%[A-Za-z0-9._]+}} : i16 to i32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.powi.f32.i32({{%[A-Za-z0-9._]+}}, [[STOI]]) : (f32, i32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.pow"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.pow"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @powf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.powi.f32.i32({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, i32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.copysign"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.copysign"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @copysignf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.sin"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.sin"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @sinf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @tanhf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @tanhf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @tanhf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

// ALL-LABEL: @_QPtest_real8

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.fabs"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.fabs"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @fabs({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @hypot({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @hypot({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @hypot({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.trunc.f64({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.round.f64({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.ceil"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.ceil"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @ceil({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.cos"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.cos"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @cos({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @erf({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @erf({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @erf({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.exp"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.exp"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @exp({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.floor"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.floor"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @floor({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.log"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.log"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @log({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.log10"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.log10"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @log10({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.lround.i32.f64({{%[A-Za-z0-9._]+}}) : (f64) -> i32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.lround.i64.f64({{%[A-Za-z0-9._]+}}) : (f64) -> i64

// ALL: [[STOI:%[A-Za-z0-9._]+]] = llvm.sext {{%[A-Za-z0-9._]+}} : i16 to i32

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.powi.f64.i32({{%[A-Za-z0-9._]+}}, [[STOI]]) : (f64, i32) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.pow"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.pow"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @pow({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// ALL: {{%[A-Za-z0-9._]+}} = llvm.call @llvm.powi.f64.i32({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, i32) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.copysign"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.copysign"({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @copysign({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.sin"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.sin"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @sin({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// FAST: {{%[A-Za-z0-9._]+}} = llvm.call @tanh({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// RELAXED: {{%[A-Za-z0-9._]+}} = llvm.call @tanh({{%[A-Za-z0-9._]+}}) : (f64) -> f64

// PRECISE: {{%[A-Za-z0-9._]+}} = llvm.call @tanh({{%[A-Za-z0-9._]+}}) : (f64) -> f64

//--- fast

func.func @_QPtest_real4(%arg0: !fir.ref<f32> {fir.bindc_name = "x"}, %arg1: !fir.ref<f32> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<4>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f32 {

%0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"}

%1 = fir.load %arg0 : !fir.ref<f32>

%2 = math.abs %1 : f32

%3 = fir.load %arg2 : !fir.ref<!fir.complex<4>>

%4 = fir.extract_value %3, [0 : index] : (!fir.complex<4>) -> f32

%5 = fir.extract_value %3, [1 : index] : (!fir.complex<4>) -> f32

%6 = fir.call @hypotf(%4, %5) : (f32, f32) -> f32

%7 = arith.addf %2, %6 : f32

%8 = fir.load %arg0 : !fir.ref<f32>

%9 = fir.call @llvm.trunc.f32(%8) : (f32) -> f32

%10 = arith.addf %7, %9 : f32

%11 = fir.load %arg0 : !fir.ref<f32>

%12 = "llvm.intr.round"(%11) : (f32) -> f32

%13 = arith.addf %10, %12 : f32

%14 = fir.load %arg0 : !fir.ref<f32>

%15 = math.atan %14 : f32

%16 = arith.addf %13, %15 : f32

%17 = fir.load %arg0 : !fir.ref<f32>

%18 = fir.load %arg1 : !fir.ref<f32>

%19 = math.atan2 %17, %18 : f32

%20 = arith.addf %16, %19 : f32

%21 = fir.load %arg0 : !fir.ref<f32>

%22 = math.ceil %21 : f32

%23 = fir.convert %22 : (f32) -> i32

%24 = fir.convert %23 : (i32) -> f32

%25 = arith.addf %20, %24 : f32

%26 = fir.load %arg0 : !fir.ref<f32>

%27 = math.cos %26 : f32

%28 = arith.addf %25, %27 : f32

%29 = fir.load %arg0 : !fir.ref<f32>

%30 = math.erf %29 : f32

%31 = arith.addf %28, %30 : f32

%32 = fir.load %arg0 : !fir.ref<f32>

%33 = math.exp %32 : f32

%34 = arith.addf %31, %33 : f32

%35 = fir.load %arg0 : !fir.ref<f32>

%36 = math.floor %35 : f32

%37 = fir.convert %36 : (f32) -> i32

%38 = fir.convert %37 : (i32) -> f32

%39 = arith.addf %34, %38 : f32

%40 = fir.load %arg0 : !fir.ref<f32>

%41 = math.log %40 : f32

%42 = arith.addf %39, %41 : f32

%43 = fir.load %arg0 : !fir.ref<f32>

%44 = math.log10 %43 : f32

%45 = arith.addf %42, %44 : f32

%46 = fir.load %arg0 : !fir.ref<f32>

%47 = fir.call @llvm.lround.i32.f32(%46) : (f32) -> i32

%48 = fir.convert %47 : (i32) -> f32

%49 = arith.addf %45, %48 : f32

%50 = fir.load %arg0 : !fir.ref<f32>

%51 = fir.call @llvm.lround.i64.f32(%50) : (f32) -> i64

%52 = fir.convert %51 : (i64) -> f32

%53 = arith.addf %49, %52 : f32

%54 = fir.load %arg0 : !fir.ref<f32>

%55 = fir.load %arg3 : !fir.ref<i16>

%56 = fir.convert %55 : (i16) -> i32

%57 = fir.call @llvm.powi.f32.i32(%54, %56) : (f32, i32) -> f32

%58 = arith.addf %53, %57 : f32

%59 = fir.load %arg0 : !fir.ref<f32>

%60 = fir.load %arg1 : !fir.ref<f32>

%61 = math.powf %59, %60 : f32

%62 = arith.addf %58, %61 : f32

%63 = fir.load %arg0 : !fir.ref<f32>

%64 = fir.load %arg4 : !fir.ref<i32>

%65 = fir.call @llvm.powi.f32.i32(%63, %64) : (f32, i32) -> f32

%66 = arith.addf %62, %65 : f32

%67 = fir.load %arg0 : !fir.ref<f32>

%68 = fir.load %arg1 : !fir.ref<f32>

%69 = math.copysign %67, %68 : f32

%70 = arith.addf %66, %69 : f32

%71 = fir.load %arg0 : !fir.ref<f32>

%72 = math.sin %71 : f32

%73 = arith.addf %70, %72 : f32

%74 = fir.load %arg0 : !fir.ref<f32>

%75 = math.tanh %74 : f32

%76 = arith.addf %73, %75 : f32

fir.store %76 to %0 : !fir.ref<f32>

%77 = fir.load %0 : !fir.ref<f32>

return %77 : f32

}

func.func @_QPtest_real8(%arg0: !fir.ref<f64> {fir.bindc_name = "x"}, %arg1: !fir.ref<f64> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<8>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f64 {

%0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"}

%1 = fir.load %arg0 : !fir.ref<f64>

%2 = math.abs %1 : f64

%3 = fir.load %arg2 : !fir.ref<!fir.complex<8>>

%4 = fir.extract_value %3, [0 : index] : (!fir.complex<8>) -> f64

%5 = fir.extract_value %3, [1 : index] : (!fir.complex<8>) -> f64

%6 = fir.call @hypot(%4, %5) : (f64, f64) -> f64

%7 = arith.addf %2, %6 : f64

%8 = fir.load %arg0 : !fir.ref<f64>

%9 = fir.call @llvm.trunc.f64(%8) : (f64) -> f64

%10 = arith.addf %7, %9 : f64

%11 = fir.load %arg0 : !fir.ref<f64>

%12 = "llvm.intr.round"(%11) : (f64) -> f64

%13 = arith.addf %10, %12 : f64

%14 = fir.load %arg0 : !fir.ref<f64>

%15 = math.atan %14 : f64

%16 = arith.addf %13, %15 : f64

%17 = fir.load %arg0 : !fir.ref<f64>

%18 = fir.load %arg1 : !fir.ref<f64>

%19 = math.atan2 %17, %18 : f64

%20 = arith.addf %16, %19 : f64

%21 = fir.load %arg0 : !fir.ref<f64>

%22 = math.ceil %21 : f64

%23 = fir.convert %22 : (f64) -> i32

%24 = fir.convert %23 : (i32) -> f64

%25 = arith.addf %20, %24 : f64

%26 = fir.load %arg0 : !fir.ref<f64>

%27 = math.cos %26 : f64

%28 = arith.addf %25, %27 : f64

%29 = fir.load %arg0 : !fir.ref<f64>

%30 = math.erf %29 : f64

%31 = arith.addf %28, %30 : f64

%32 = fir.load %arg0 : !fir.ref<f64>

%33 = math.exp %32 : f64

%34 = arith.addf %31, %33 : f64

%35 = fir.load %arg0 : !fir.ref<f64>

%36 = math.floor %35 : f64

%37 = fir.convert %36 : (f64) -> i32

%38 = fir.convert %37 : (i32) -> f64

%39 = arith.addf %34, %38 : f64

%40 = fir.load %arg0 : !fir.ref<f64>

%41 = math.log %40 : f64

%42 = arith.addf %39, %41 : f64

%43 = fir.load %arg0 : !fir.ref<f64>

%44 = math.log10 %43 : f64

%45 = arith.addf %42, %44 : f64

%46 = fir.load %arg0 : !fir.ref<f64>

%47 = fir.call @llvm.lround.i32.f64(%46) : (f64) -> i32

%48 = fir.convert %47 : (i32) -> f64

%49 = arith.addf %45, %48 : f64

%50 = fir.load %arg0 : !fir.ref<f64>

%51 = fir.call @llvm.lround.i64.f64(%50) : (f64) -> i64

%52 = fir.convert %51 : (i64) -> f64

%53 = arith.addf %49, %52 : f64

%54 = fir.load %arg0 : !fir.ref<f64>

%55 = fir.load %arg3 : !fir.ref<i16>

%56 = fir.convert %55 : (i16) -> i32

%57 = fir.call @llvm.powi.f64.i32(%54, %56) : (f64, i32) -> f64

%58 = arith.addf %53, %57 : f64

%59 = fir.load %arg0 : !fir.ref<f64>

%60 = fir.load %arg1 : !fir.ref<f64>

%61 = math.powf %59, %60 : f64

%62 = arith.addf %58, %61 : f64

%63 = fir.load %arg0 : !fir.ref<f64>

%64 = fir.load %arg4 : !fir.ref<i32>

%65 = fir.call @llvm.powi.f64.i32(%63, %64) : (f64, i32) -> f64

%66 = arith.addf %62, %65 : f64

%67 = fir.load %arg0 : !fir.ref<f64>

%68 = fir.load %arg1 : !fir.ref<f64>

%69 = math.copysign %67, %68 : f64

%70 = arith.addf %66, %69 : f64

%71 = fir.load %arg0 : !fir.ref<f64>

%72 = math.sin %71 : f64

%73 = arith.addf %70, %72 : f64

%74 = fir.load %arg0 : !fir.ref<f64>

%75 = math.tanh %74 : f64

%76 = arith.addf %73, %75 : f64

fir.store %76 to %0 : !fir.ref<f64>

%77 = fir.load %0 : !fir.ref<f64>

return %77 : f64

}

func.func private @hypotf(f32, f32) -> f32

func.func private @llvm.trunc.f32(f32) -> f32

func.func private @llvm.lround.i32.f32(f32) -> i32

func.func private @llvm.lround.i64.f32(f32) -> i64

func.func private @llvm.powi.f32.i32(f32, i32) -> f32

func.func private @hypot(f64, f64) -> f64

func.func private @llvm.trunc.f64(f64) -> f64

func.func private @llvm.lround.i32.f64(f64) -> i32

func.func private @llvm.lround.i64.f64(f64) -> i64

func.func private @llvm.powi.f64.i32(f64, i32) -> f64

//--- relaxed

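func.func @_QPtest_real4(%arg0: !fir.ref<f32> {fir.bindc_name = "x"}, %arg1: !fir.ref<f32> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<4>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f32 {

%0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"}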

%1 = fir.load %arg0 : !fir.ref<f32>

%2 = math.abs %1 : f32

%3 = fir.load %arg2 : !fir.ref<!fir.complex<4>>

%4 = fir.extract_value %3, [0 : index] : (!fir.complex<4>) -> f32

%5 = fir.extract_value %3, [1 : index] : (!fir.complex<4>) -> f32

%6 = fir.call @hypotf(%4, %5) : (f32, f32) -> f32

%7 = arith.addf %2, %6 : f32

%8 = fir.load %arg0 : !fir.ref<f32>

%9 = fir.call @llvm.trunc.f32(%8) : (f32) -> f32

%10 = arith.addf %7, %9 : f32

%11 = fir.load %arg0 : !fir.ref<f32>

%12 = "llvm.intr.round"(%11) : (f32) -> f32

%13 = arith.addf %10, %12 : f32

%14 = fir.load %arg0 : !fir.ref<f32>

%15 = math.atan %14 : f32

%16 = arith.addf %13, %15 : f32

%17 = fir.load %arg0 : !fir.ref<f32>

%18 = fir.load %arg1 : !fir.ref<f32>

%19 = math.atan2 %17, %18 : f32

%20 = arith.addf %16, %19 : f32

%21 = fir.load %arg0 : !fir.ref<f32>

%22 = math.ceil %21 : f32

%23 = fir.convert %22 : (f32) -> i32

%24 = fir.convert %23 : (i32) -> f32

%25 = arith.addf %20, %24 : f32

%26 = fir.load %arg0 : !fir.ref<f32>

%27 = math.cos %26 : f32

%28 = arith.addf %25, %27 : f32

%29 = fir.load %arg0 : !fir.ref<f32>

%30 = math.erf %29 : f32

%31 = arith.addf %28, %30 : f32

%32 = fir.load %arg0 : !fir.ref<f32>

%33 = math.exp %32 : f32

%34 = arith.addf %31, %33 : f32

%35 = fir.load %arg0 : !fir.ref<f32>

%36 = math.floor %35 : f32

%37 = fir.convert %36 : (f32) -> i32

%38 = fir.convert %37 : (i32) -> f32

%39 = arith.addf %34, %38 : f32

%40 = fir.load %arg0 : !fir.ref<f32>

%41 = math.log %40 : f32

%42 = arith.addf %39, %41 : f32

%43 = fir.load %arg0 : !fir.ref<f32>

%44 = math.log10 %43 : f32

%45 = arith.addf %42, %44 : f32

%46 = fir.load %arg0 : !fir.ref<f32>

%47 = fir.call @llvm.lround.i32.f32(%46) : (f32) -> i32

%48 = fir.convert %47 : (i32) -> f32

%49 = arith.addf %45, %48 : f32

%50 = fir.load %arg0 : !fir.ref<f32>

%51 = fir.call @llvm.lround.i64.f32(%50) : (f32) -> i64

%52 = fir.convert %51 : (i64) -> f32

%53 = arith.addf %49, %52 : f32

%54 = fir.load %arg0 : !fir.ref<f32>

%55 = fir.load %arg3 : !fir.ref<i16>

%56 = fir.convert %55 : (i16) -> i32

%57 = fir.call @llvm.powi.f32.i32(%54, %56) : (f32, i32) -> f32

%58 = arith.addf %53, %57 : f32

%59 = fir.load %arg0 : !fir.ref<f32>

%60 = fir.load %arg1 : !fir.ref<f32>

%61 = math.powf %59, %60 : f32

%62 = arith.addf %58, %61 : f32

%63 = fir.load %arg0 : !fir.ref<f32>

%64 = fir.load %arg4 : !fir.ref<i32>

%65 = fir.call @llvm.powi.f32.i32(%63, %64) : (f32, i32) -> f32

%66 = arith.addf %62, %65 : f32

%67 = fir.load %arg0 : !fir.ref<f32>

%68 = fir.load %arg1 : !fir.ref<f32>

%69 = math.copysign %67, %68 : f32

%70 = arith.addf %66, %69 : f32

%71 = fir.load %arg0 : !fir.ref<f32>

%72 = math.sin %71 : f32

%73 = arith.addf %70, %72 : f32

%74 = fir.load %arg0 : !fir.ref<f32>

%75 = math.tanh %74 : f32

%76 = arith.addf %73, %75 : f32

fir.store %76 to %0 : !fir.ref<f32>

%77 = fir.load %0 : !fir.ref<f32>

return %77 : f32

}

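func.func @_QPtest_real8(%arg0: !fir.ref<f64> {fir.bindc_name = "x"}, %arg1: !fir.ref<f64> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<8>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f64 {

%0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"}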

%1 = fir.load %arg0 : !fir.ref<f64>

%2 = math.abs %1 : f64

%3 = fir.load %arg2 : !fir.ref<!fir.complex<8>>

%4 = fir.extract_value %3, [0 : index] : (!fir.complex<8>) -> f64

%5 = fir.extract_value %3, [1 : index] : (!fir.complex<8>) -> f64

%6 = fir.call @hypot(%4, %5) : (f64, f64) -> f64

%7 = arith.addf %2, %6 : f64

%8 = fir.load %arg0 : !fir.ref<f64>

%9 = fir.call @llvm.trunc.f64(%8) : (f64) -> f64

%10 = arith.addf %7, %9 : f64

%11 = fir.load %arg0 : !fir.ref<f64>

%12 = "llvm.intr.round"(%11) : (f64) -> f64

%13 = arith.addf %10, %12 : f64

%14 = fir.load %arg0 : !fir.ref<f64>

%15 = math.atan %14 : f64

%16 = arith.addf %13, %15 : f64

%17 = fir.load %arg0 : !fir.ref<f64>

%18 = fir.load %arg1 : !fir.ref<f64>

%19 = math.atan2 %17, %18 : f64

%20 = arith.addf %16, %19 : f64

%21 = fir.load %arg0 : !fir.ref<f64>

%22 = math.ceil %21 : f64

%23 = fir.convert %22 : (f64) -> i32

%24 = fir.convert %23 : (i32) -> f64

%25 = arith.addf %20, %24 : f64

%26 = fir.load %arg0 : !fir.ref<f64>

%27 = math.cos %26 : f64

%28 = arith.addf %25, %27 : f64

%29 = fir.load %arg0 : !fir.ref<f64>

%30 = math.erf %29 : f64

%31 = arith.addf %28, %30 : f64

%32 = fir.load %arg0 : !fir.ref<f64>

%33 = math.exp %32 : f64

%34 = arith.addf %31, %33 : f64

%35 = fir.load %arg0 : !fir.ref<f64>

%36 = math.floor %35 : f64

%37 = fir.convert %36 : (f64) -> i32

%38 = fir.convert %37 : (i32) -> f64

%39 = arith.addf %34, %38 : f64

%40 = fir.load %arg0 : !fir.ref<f64>

%41 = math.log %40 : f64

%42 = arith.addf %39, %41 : f64

%43 = fir.load %arg0 : !fir.ref<f64>

%44 = math.log10 %43 : f64

%45 = arith.addf %42, %44 : f64

%46 = fir.load %arg0 : !fir.ref<f64>

%47 = fir.call @llvm.lround.i32.f64(%46) : (f64) -> i32

%48 = fir.convert %47 : (i32) -> f64

%49 = arith.addf %45, %48 : f64

%50 = fir.load %arg0 : !fir.ref<f64>

%51 = fir.call @llvm.lround.i64.f64(%50) : (f64) -> i64

%52 = fir.convert %51 : (i64) -> f64

%53 = arith.addf %49, %52 : f64

%54 = fir.load %arg0 : !fir.ref<f64>

%55 = fir.load %arg3 : !fir.ref<i16>

%56 = fir.convert %55 : (i16) -> i32

%57 = fir.call @llvm.powi.f64.i32(%54, %56) : (f64, i32) -> f64

%58 = arith.addf %53, %57 : f64

%59 = fir.load %arg0 : !fir.ref<f64>

%60 = fir.load %arg1 : !fir.ref<f64>

%61 = math.powf %59, %60 : f64

%62 = arith.addf %58, %61 : f64

%63 = fir.load %arg0 : !fir.ref<f64>

%64 = fir.load %arg4 : !fir.ref<i32>

%65 = fir.call @llvm.powi.f64.i32(%63, %64) : (f64, i32) -> f64

%66 = arith.addf %62, %65 : f64

%67 = fir.load %arg0 : !fir.ref<f64>

%68 = fir.load %arg1 : !fir.ref<f64>

%69 = math.copysign %67, %68 : f64

%70 = arith.addf %66, %69 : f64

%71 = fir.load %arg0 : !fir.ref<f64>

%72 = math.sin %71 : f64

%73 = arith.addf %70, %72 : f64

%74 = fir.load %arg0 : !fir.ref<f64>

%75 = math.tanh %74 : f64

%76 = arith.addf %73, %75 : f64

fir.store %76 to %0 : !fir.ref<f64>

%77 = fir.load %0 : !fir.ref<f64>

return %77 : f64

}

func.func private @hypotf(f32, f32) -> f32

func.func private @llvm.trunc.f32(f32) -> f32

func.func private @llvm.lround.i32.f32(f32) -> i32

func.func private @llvm.lround.i64.f32(f32) -> i64

func.func private @llvm.powi.f32.i32(f32, i32) -> f32

func.func private @hypot(f64, f64) -> f64

func.func private @llvm.trunc.f64(f64) -> f64

func.func private @llvm.lround.i32.f64(f64) -> i32

func.func private @llvm.lround.i64.f64(f64) -> i64

func.func private @llvm.powi.f64.i32(f64, i32) -> f64

//--- precise

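func.func @_QPtest_real4(%arg0: !fir.ref<f32> {fir.bindc_name = "x"}, %arg1: !fir.ref<f32> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<4>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f32 {

%0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"}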

%1 = fir.load %arg0 : !fir.ref<f32>

%2 = fir.call @fabsf(%1) : (f32) -> f32

%3 = fir.load %arg2 : !fir.ref<!fir.complex<4>>

%4 = fir.extract_value %3, [0 : index] : (!fir.complex<4>) -> f32

%5 = fir.extract_value %3, [1 : index] : (!fir.complex<4>) -> f32

%6 = fir.call @hypotf(%4, %5) : (f32, f32) -> f32

%7 = arith.addf %2, %6 : f32

%8 = fir.load %arg0 : !fir.ref<f32>

%9 = fir.call @llvm.trunc.f32(%8) : (f32) -> f32

%10 = arith.addf %7, %9 : f32

%11 = fir.load %arg0 : !fir.ref<f32>

%12 = fir.call @llvm.round.f32(%11) : (f32) -> f32

%13 = arith.addf %10, %12 : f32

%14 = fir.load %arg0 : !fir.ref<f32>

%15 = fir.call @atanf(%14) : (f32) -> f32

%16 = arith.addf %13, %15 : f32

%17 = fir.load %arg0 : !fir.ref<f32>

%18 = fir.load %arg1 : !fir.ref<f32>

%19 = fir.call @atan2f(%17, %18) : (f32, f32) -> f32

%20 = arith.addf %16, %19 : f32

%21 = fir.load %arg0 : !fir.ref<f32>

%22 = fir.call @ceilf(%21) : (f32) -> f32

%23 = fir.convert %22 : (f32) -> i32

%24 = fir.convert %23 : (i32) -> f32

%25 = arith.addf %20, %24 : f32

%26 = fir.load %arg0 : !fir.ref<f32>

%27 = fir.call @cosf(%26) : (f32) -> f32

%28 = arith.addf %25, %27 : f32

%29 = fir.load %arg0 : !fir.ref<f32>

%30 = fir.call @erff(%29) : (f32) -> f32

%31 = arith.addf %28, %30 : f32

%32 = fir.load %arg0 : !fir.ref<f32>

%33 = fir.call @expf(%32) : (f32) -> f32

%34 = arith.addf %31, %33 : f32

%35 = fir.load %arg0 : !fir.ref<f32>

%36 = fir.call @floorf(%35) : (f32) -> f32

%37 = fir.convert %36 : (f32) -> i32

%38 = fir.convert %37 : (i32) -> f32

%39 = arith.addf %34, %38 : f32

%40 = fir.load %arg0 : !fir.ref<f32>

%41 = fir.call @logf(%40) : (f32) -> f32

%42 = arith.addf %39, %41 : f32

%43 = fir.load %arg0 : !fir.ref<f32>

%44 = fir.call @log10f(%43) : (f32) -> f32

%45 = arith.addf %42, %44 : f32

%46 = fir.load %arg0 : !fir.ref<f32>

%47 = fir.call @llvm.lround.i32.f32(%46) : (f32) -> i32

%48 = fir.convert %47 : (i32) -> f32

%49 = arith.addf %45, %48 : f32

%50 = fir.load %arg0 : !fir.ref<f32>

%51 = fir.call @llvm.lround.i64.f32(%50) : (f32) -> i64

%52 = fir.convert %51 : (i64) -> f32

%53 = arith.addf %49, %52 : f32

%54 = fir.load %arg0 : !fir.ref<f32>

%55 = fir.load %arg3 : !fir.ref<i16>

%56 = fir.convert %55 : (i16) -> i32

%57 = fir.call @llvm.powi.f32.i32(%54, %56) : (f32, i32) -> f32

%58 = arith.addf %53, %57 : f32

%59 = fir.load %arg0 : !fir.ref<f32>

%60 = fir.load %arg1 : !fir.ref<f32>

%61 = fir.call @powf(%59, %60) : (f32, f32) -> f32

%62 = arith.addf %58, %61 : f32

%63 = fir.load %arg0 : !fir.ref<f32>

%64 = fir.load %arg4 : !fir.ref<i32>

%65 = fir.call @llvm.powi.f32.i32(%63, %64) : (f32, i32) -> f32

%66 = arith.addf %62, %65 : f32

%67 = fir.load %arg0 : !fir.ref<f32>

%68 = fir.load %arg1 : !fir.ref<f32>

%69 = fir.call @copysignf(%67, %68) : (f32, f32) -> f32

%70 = arith.addf %66, %69 : f32

%71 = fir.load %arg0 : !fir.ref<f32>

%72 = fir.call @sinf(%71) : (f32) -> f32

%73 = arith.addf %70, %72 : f32

%74 = fir.load %arg0 : !fir.ref<f32>

%75 = fir.call @tanhf(%74) : (f32) -> f32

%76 = arith.addf %73, %75 : f32

fir.store %76 to %0 : !fir.ref<f32>

%77 = fir.load %0 : !fir.ref<f32>

return %77 : f32

}

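func.func @_QPtest_real8(%arg0: !fir.ref<f64> {fir.bindc_name = "x"}, %arg1: !fir.ref<f64> {fir.bindc_name = "y"}, %arg2: !fir.ref<!fir.complex<8>> {fir.bindc_name = "c"}, %arg3: !fir.ref<i16> {fir.bindc_name = "s"}, %arg4: !fir.ref<i32> {fir.bindc_name = "i"}) -> f64 {

%0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"}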

%1 = fir.load %arg0 : !fir.ref<f64>

%2 = fir.call @fabs(%1) : (f64) -> f64

%3 = fir.load %arg2 : !fir.ref<!fir.complex<8>>

%4 = fir.extract_value %3, [0 : index] : (!fir.complex<8>) -> f64

%5 = fir.extract_value %3, [1 : index] : (!fir.complex<8>) -> f64

%6 = fir.call @hypot(%4, %5) : (f64, f64) -> f64

%7 = arith.addf %2, %6 : f64

%8 = fir.load %arg0 : !fir.ref<f64>

%9 = fir.call @llvm.trunc.f64(%8) : (f64) -> f64

%10 = arith.addf %7, %9 : f64

%11 = fir.load %arg0 : !fir.ref<f64>

%12 = fir.call @llvm.round.f64(%11) : (f64) -> f64

%13 = arith.addf %10, %12 : f64

%14 = fir.load %arg0 : !fir.ref<f64>

%15 = fir.call @atan(%14) : (f64) -> f64

%16 = arith.addf %13, %15 : f64

%17 = fir.load %arg0 : !fir.ref<f64>

%18 = fir.load %arg1 : !fir.ref<f64>

%19 = fir.call @atan2(%17, %18) : (f64, f64) -> f64

%20 = arith.addf %16, %19 : f64

%21 = fir.load %arg0 : !fir.ref<f64>

%22 = fir.call @ceil(%21) : (f64) -> f64

%23 = fir.convert %22 : (f64) -> i32

%24 = fir.convert %23 : (i32) -> f64

%25 = arith.addf %20, %24 : f64

%26 = fir.load %arg0 : !fir.ref<f64>

%27 = fir.call @cos(%26) : (f64) -> f64

%28 = arith.addf %25, %27 : f64

%29 = fir.load %arg0 : !fir.ref<f64>

%30 = fir.call @erf(%29) : (f64) -> f64

%31 = arith.addf %28, %30 : f64

%32 = fir.load %arg0 : !fir.ref<f64>

%33 = fir.call @exp(%32) : (f64) -> f64

%34 = arith.addf %31, %33 : f64

%35 = fir.load %arg0 : !fir.ref<f64>

%36 = fir.call @floor(%35) : (f64) -> f64

%37 = fir.convert %36 : (f64) -> i32

%38 = fir.convert %37 : (i32) -> f64

%39 = arith.addf %34, %38 : f64

%40 = fir.load %arg0 : !fir.ref<f64>

%41 = fir.call @log(%40) : (f64) -> f64

%42 = arith.addf %39, %41 : f64

%43 = fir.load %arg0 : !fir.ref<f64>

%44 = fir.call @log10(%43) : (f64) -> f64

%45 = arith.addf %42, %44 : f64

%46 = fir.load %arg0 : !fir.ref<f64>

%47 = fir.call @llvm.lround.i32.f64(%46) : (f64) -> i32

%48 = fir.convert %47 : (i32) -> f64

%49 = arith.addf %45, %48 : f64

%50 = fir.load %arg0 : !fir.ref<f64>

%51 = fir.call @llvm.lround.i64.f64(%50) : (f64) -> i64

%52 = fir.convert %51 : (i64) -> f64

%53 = arith.addf %49, %52 : f64

%54 = fir.load %arg0 : !fir.ref<f64>

%55 = fir.load %arg3 : !fir.ref<i16>

%56 = fir.convert %55 : (i16) -> i32

%57 = fir.call @llvm.powi.f64.i32(%54, %56) : (f64, i32) -> f64

%58 = arith.addf %53, %57 : f64

%59 = fir.load %arg0 : !fir.ref<f64>

%60 = fir.load %arg1 : !fir.ref<f64>

%61 = fir.call @pow(%59, %60) : (f64, f64) -> f64

%62 = arith.addf %58, %61 : f64

%63 = fir.load %arg0 : !fir.ref<f64>

%64 = fir.load %arg4 : !fir.ref<i32>

%65 = fir.call @llvm.powi.f64.i32(%63, %64) : (f64, i32) -> f64

%66 = arith.addf %62, %65 : f64

%67 = fir.load %arg0 : !fir.ref<f64>

%68 = fir.load %arg1 : !fir.ref<f64>

%69 = fir.call @copysign(%67, %68) : (f64, f64) -> f64

%70 = arith.addf %66, %69 : f64

%71 = fir.load %arg0 : !fir.ref<f64>

%72 = fir.call @sin(%71) : (f64) -> f64

%73 = arith.addf %70, %72 : f64

%74 = fir.load %arg0 : !fir.ref<f64>

%75 = fir.call @tanh(%74) : (f64) -> f64

%76 = arith.addf %73, %75 : f64

fir.store %76 to %0 : !fir.ref<f64>

%77 = fir.load %0 : !fir.ref<f64>

return %77 : f64

}

func.func private @fabsf(f32) -> f32

func.func private @hypotf(f32, f32) -> f32

func.func private @llvm.trunc.f32(f32) -> f32

func.func private @llvm.round.f32(f32) -> f32

func.func private @atanf(f32) -> f32

func.func private @atan2f(f32, f32) -> f32

func.func private @ceilf(f32) -> f32

func.func private @cosf(f32) -> f32

func.func private @erff(f32) -> f32

func.func private @expf(f32) -> f32

func.func private @floorf(f32) -> f32

func.func private @logf(f32) -> f32

func.func private @log10f(f32) -> f32

func.func private @llvm.lround.i32.f32(f32) -> i32

func.func private @llvm.lround.i64.f32(f32) -> i64

func.func private @llvm.powi.f32.i32(f32, i32) -> f32

func.func private @powf(f32, f32) -> f32

func.func private @copysignf(f32, f32) -> f32

func.func private @sinf(f32) -> f32

func.func private @tanhf(f32) -> f32

func.func private @fabs(f64) -> f64

func.func private @hypot(f64, f64) -> f64

func.func private @llvm.trunc.f64(f64) -> f64

func.func private @llvm.round.f64(f64) -> f64

func.func private @atan(f64) -> f64

func.func private @atan2(f64, f64) -> f64

func.func private @ceil(f64) -> f64

func.func private @cos(f64) -> f64

func.func private @erf(f64) -> f64

func.func private @exp(f64) -> f64

func.func private @floor(f64) -> f64

func.func private @log(f64) -> f64

func.func private @log10(f64) -> f64

func.func private @llvm.lround.i32.f64(f64) -> i32

func.func private @llvm.lround.i64.f64(f64) -> i64

func.func private @llvm.powi.f64.i32(f64, i32) -> f64

func.func private @pow(f64, f64) -> f64

func.func private @copysign(f64, f64) -> f64

func.func private @sin(f64) -> f64

func.func private @tanh(f64) -> f64

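As a reading aid for the `late-math-codegen.fir` inputs above: the `fast` and `relaxed` sections keep the `math` dialect operations, while the `precise` section replaces each of them with a `fir.call` to the corresponding libm function, using an `f` suffix for `f32`. The Python sketch below only restates that pattern as it appears in the test inputs. It is not flang's lowering code, and the `LIBM_NAME` table and the `lowered_op` helper are hypothetical names introduced for illustration.

```python
# Illustrative summary of the test inputs only; not flang's implementation.
# LIBM_NAME and lowered_op are hypothetical names made up for this sketch.

# math dialect op -> libm base name (double-precision spelling), paired up
# from the fast/relaxed inputs and the precise inputs of the test.
LIBM_NAME = {
    "math.abs": "fabs",
    "math.atan": "atan",
    "math.atan2": "atan2",
    "math.ceil": "ceil",
    "math.cos": "cos",
    "math.erf": "erf",
    "math.exp": "exp",
    "math.floor": "floor",
    "math.log": "log",
    "math.log10": "log10",
    "math.powf": "pow",
    "math.copysign": "copysign",
    "math.sin": "sin",
    "math.tanh": "tanh",
}


def lowered_op(math_op: str, bits: int, mode: str) -> str:
    """Return the operation text the test inputs use for `math_op`."""
    if math_op not in LIBM_NAME:
        raise ValueError(f"operation not covered by this sketch: {math_op}")
    if bits not in (32, 64):
        raise ValueError(f"only f32 and f64 appear in the test: {bits}")
    if mode in ("fast", "relaxed"):
        # Both modes keep the MLIR math operation as is.
        return math_op
    if mode == "precise":
        # Precise mode calls libm; single precision takes the 'f' suffix.
        suffix = "f" if bits == 32 else ""
        return f"fir.call @{LIBM_NAME[math_op]}{suffix}"
    raise ValueError(f"unknown math runtime mode: {mode}")


if __name__ == "__main__":
    for mode in ("fast", "relaxed", "precise"):
        print(mode, lowered_op("math.powf", 32, mode))
```

Operations that are calls in every mode (`@hypot`/`@hypotf`, `@llvm.trunc.*`, `@llvm.lround.*`, `@llvm.powi.*`) are left out of the table because they do not vary with the mode. The rounding operation is also left out, since it switches between `"llvm.intr.round"` and `fir.call @llvm.round.*` rather than following the libm naming pattern.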
flang/test/Lower/Intrinsics/exp.f90

- ! RUN: bbc -emit-fir %s -o - | FileCheck %s
- ! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s
+ ! RUN: bbc -emit-fir -outline-intrinsics %s -o - | FileCheck %s
+ ! RUN: %flang_fc1 -emit-fir -mllvm -outline-intrinsics %s -o - | FileCheck %s

! CHECK-LABEL: exp_testr
! CHECK-SAME: (%[[AREF:.*]]: !fir.ref<f32> {{.*}}, %[[BREF:.*]]: !fir.ref<f32> {{.*}})
subroutine exp_testr(a, b)
real :: a, b
! CHECK: %[[A:.*]] = fir.load %[[AREF:.*]] : !fir.ref<f32>
! CHECK: %[[RES:.*]] = fir.call @fir.exp.f32.f32(%[[A]]) : (f32) -> f32
! CHECK: fir.store %[[RES]] to %[[BREF]] : !fir.ref<f32>

flang/test/Lower/Intrinsics/log.f90

- ! RUN: bbc -emit-fir %s -o - | FileCheck %s
- ! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s
+ ! RUN: bbc -emit-fir -outline-intrinsics %s -o - | FileCheck %s
+ ! RUN: %flang_fc1 -emit-fir -mllvm -outline-intrinsics %s -o - | FileCheck %s

! CHECK-LABEL: log_testr
! CHECK-SAME: (%[[AREF:.*]]: !fir.ref<f32> {{.*}}, %[[BREF:.*]]: !fir.ref<f32> {{.*}})
subroutine log_testr(a, b)
real :: a, b
! CHECK: %[[A:.*]] = fir.load %[[AREF:.*]] : !fir.ref<f32>
! CHECK: %[[RES:.*]] = fir.call @fir.log.f32.f32(%[[A]]) : (f32) -> f32
! CHECK: fir.store %[[RES]] to %[[BREF]] : !fir.ref<f32>

flang/test/Lower/Intrinsics/math-runtime-options.f90

- ! RUN: bbc -emit-fir --math-runtime=fast %s -o - | FileCheck %s --check-prefixes="FIR,FAST"
- ! RUN: bbc -emit-fir --math-runtime=relaxed %s -o - | FileCheck %s --check-prefixes="FIR,RELAXED"
- ! RUN: bbc -emit-fir --math-runtime=precise %s -o - | FileCheck %s --check-prefixes="FIR,PRECISE"
- ! RUN: bbc -emit-fir --math-runtime=llvm %s -o - | FileCheck %s --check-prefixes="FIR,LLVM"
+ ! RUN: bbc -emit-fir --math-runtime=fast -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,FAST"
+ ! RUN: %flang_fc1 -emit-fir -mllvm -math-runtime=fast -mllvm -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,FAST"
+ ! RUN: bbc -emit-fir --math-runtime=relaxed -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,RELAXED"
+ ! RUN: %flang_fc1 -emit-fir -mllvm -math-runtime=relaxed -mllvm -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,RELAXED"
+ ! RUN: bbc -emit-fir --math-runtime=precise -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,PRECISE"
+ ! RUN: %flang_fc1 -emit-fir -mllvm -math-runtime=precise -mllvm -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,PRECISE"
+ ! RUN: bbc -emit-fir --math-runtime=llvm -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,LLVM"
+ ! RUN: %flang_fc1 -emit-fir -mllvm -math-runtime=llvm -mllvm -outline-intrinsics %s -o - | FileCheck %s --check-prefixes="FIR,LLVM"

! CHECK-LABEL: cos_testr
subroutine cos_testr(a, b)
real :: a, b
! FIR: fir.call @fir.cos.f32.f32
b = cos(a)
end subroutine


flang/test/Lower/late-math-lowering.f90

This file was added.

! RUN: bbc -emit-fir %s -o - --math-lowering=late --math-runtime=fast | FileCheck --check-prefixes=ALL,FAST %s

! RUN: %flang_fc1 -emit-fir -mllvm -math-lowering=late -mllvm -math-runtime=fast %s -o - | FileCheck --check-prefixes=ALL,FAST %s

! 'relaxed' matches 'fast' exactly right now, but this will change:

! RUN: bbc -emit-fir %s -o - --math-lowering=late --math-runtime=relaxed | FileCheck --check-prefixes=ALL,RELAXED %s

! RUN: %flang_fc1 -emit-fir -mllvm -math-lowering=late -mllvm -math-runtime=relaxed %s -o - | FileCheck --check-prefixes=ALL,RELAXED %s

! RUN: bbc -emit-fir %s -o - --math-lowering=late --math-runtime=precise | FileCheck --check-prefixes=ALL,PRECISE %s

! RUN: %flang_fc1 -emit-fir -mllvm -math-lowering=late -mllvm -math-runtime=precise %s -o - | FileCheck --check-prefixes=ALL,PRECISE %s

! ALL-LABEL: @_QPtest_real4

! FAST: {{%[A-Za-z0-9._]+}} = math.abs {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.abs {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @fabsf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @hypotf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.trunc.f32({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @llvm.round.f32({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.atan {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.atan {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.atan2 {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.atan2 {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.ceil {{%[A-Za-z0-9._]+}} : f32

awarzynski: `hypotf` is for `abs(c)`, right? Why not move it to a dedicated function/test (e.g. `test_complex(c)`)?

vzakhari: Makes sense. Fixed.

! RELAXED: {{%[A-Za-z0-9._]+}} = math.ceil {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @ceilf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.cos {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.cos {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @cosf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.erf {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.erf {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @erff({{%[A-Za-z0-9._]+}}) : (f32) -> f32

awarzynskiUnsubmitted

Not Done

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @hypotf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

- function test_real8(x, y, c, s, i)

- real(8) :: x, y, test_real8

+ function test_real8(x, c)

+ real(8) :: x, test_real8

complex(8) :: c

- integer(2) :: s

- integer(4) :: i

test_real8 = abs(x) + abs(c)

end function

! ALL-LABEL: @_QPtest_real8

awarzynski: I would remove the code that's not needed for this particular test. Similar suggestion for other tests.

vzakhari: Fixed.

! FAST: {{%[A-Za-z0-9._]+}} = math.exp {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.exp {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @expf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.floor {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.floor {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @floorf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.log {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.log {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @logf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.log10 {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.log10 {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @log10f({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.lround.i32.f32({{%[A-Za-z0-9._]+}}) : (f32) -> i32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.lround.i64.f32({{%[A-Za-z0-9._]+}}) : (f32) -> i64

! ALL: [[STOI:%[A-Za-z0-9._]+]] = fir.convert {{%[A-Za-z0-9._]+}} : (i16) -> i32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.powi.f32.i32({{%[A-Za-z0-9._]+}}, [[STOI]]) : (f32, i32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.powf {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.powf {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @powf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.powi.f32.i32({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, i32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.copysign {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.copysign {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @copysignf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.sin {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.sin {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @sinf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

! FAST: {{%[A-Za-z0-9._]+}} = math.tanh {{%[A-Za-z0-9._]+}} : f32

! RELAXED: {{%[A-Za-z0-9._]+}} = math.tanh {{%[A-Za-z0-9._]+}} : f32

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @tanhf({{%[A-Za-z0-9._]+}}) : (f32) -> f32

function test_real4(x, y, c, s, i)

real :: x, y, test_real4

complex(4) :: c

integer(2) :: s

integer(4) :: i

test_real4 = abs(x) + abs(c) + aint(x) + anint(x) + atan(x) + atan2(x, y) + &

ceiling(x) + cos(x) + erf(x) + exp(x) + floor(x) + log(x) + log10(x) + &

nint(x, 4) + nint(x, 8) + x ** s + x ** y + x ** i + sign(x, y) + &

sin(x) + tanh(x)

end function
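
The review thread on this file suggests moving the `abs(c)` check into a dedicated function. A minimal sketch of what such a test might look like is given here. The function name `test_complex4` and the mangled label `@_QPtest_complex4` are assumptions made for illustration (the label follows the `_QP` prefix pattern seen in the other labels of this revision), and the check line simply reuses the `hypotf` pattern from this test. This is not the code that was committed.

```fortran
! Hypothetical dedicated test for abs() on a complex(4) argument.
! The name test_complex4 is assumed; the review comment only proposes
! "test_complex(c)" as an example.

! ALL-LABEL: @_QPtest_complex4
! ALL: {{%[A-Za-z0-9._]+}} = fir.call @hypotf({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32
function test_complex4(c)
  real :: test_complex4
  complex(4) :: c
  ! abs() of a complex value is expected to lower to a hypotf call,
  ! per the ALL check above.
  test_complex4 = abs(c)
end function
```

Keeping the complex case in its own function means the `ALL` check for `hypotf` no longer has to sit between the per-mode checks of the real-valued intrinsics.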

! ALL-LABEL: @_QPtest_real8

! FAST: {{%[A-Za-z0-9._]+}} = math.abs {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.abs {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @fabs({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @hypot({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.trunc.f64({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! RELAXED: {{%[A-Za-z0-9._]+}} = "llvm.intr.round"({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @llvm.round.f64({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.atan {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.atan {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.atan2 {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.atan2 {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.ceil {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.ceil {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @ceil({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.cos {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.cos {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @cos({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.erf {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.erf {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @erf({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.exp {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.exp {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @exp({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.floor {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.floor {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @floor({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.log {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.log {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @log({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.log10 {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.log10 {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @log10({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.lround.i32.f64({{%[A-Za-z0-9._]+}}) : (f64) -> i32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.lround.i64.f64({{%[A-Za-z0-9._]+}}) : (f64) -> i64

! ALL: [[STOI:%[A-Za-z0-9._]+]] = fir.convert {{%[A-Za-z0-9._]+}} : (i16) -> i32

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.powi.f64.i32({{%[A-Za-z0-9._]+}}, [[STOI]]) : (f64, i32) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.powf {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.powf {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @pow({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

! ALL: {{%[A-Za-z0-9._]+}} = fir.call @llvm.powi.f64.i32({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, i32) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.copysign {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.copysign {{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @copysign({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.sin {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.sin {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @sin({{%[A-Za-z0-9._]+}}) : (f64) -> f64

! FAST: {{%[A-Za-z0-9._]+}} = math.tanh {{%[A-Za-z0-9._]+}} : f64

! RELAXED: {{%[A-Za-z0-9._]+}} = math.tanh {{%[A-Za-z0-9._]+}} : f64

! PRECISE: {{%[A-Za-z0-9._]+}} = fir.call @tanh({{%[A-Za-z0-9._]+}}) : (f64) -> f64

function test_real8(x, y, c, s, i)

real(8) :: x, y, test_real8

complex(8) :: c

integer(2) :: s

integer(4) :: i

test_real8 = abs(x) + abs(c) + aint(x) + anint(x) + atan(x) + atan2(x, y) + &

ceiling(x) + cos(x) + erf(x) + exp(x) + floor(x) + log(x) + log10(x) + &

nint(x, 4) + nint(x, 8) + x ** s + x ** y + x ** i + sign(x, y) + &

sin(x) + tanh(x)

end function

flang/test/Lower/llvm-math.f90

- ! RUN: bbc -emit-fir %s -o - --math-runtime=llvm | FileCheck %s
+ ! RUN: bbc -emit-fir %s -o - --math-runtime=llvm --outline-intrinsics | FileCheck %s
+ ! RUN: %flang_fc1 -emit-fir -mllvm -math-runtime=llvm -mllvm -outline-intrinsics %s -o - | FileCheck %s

SUBROUTINE POW_WRAPPER(IN, IN2, OUT)
DOUBLE PRECISION IN, IN2
OUT = IN ** IN2
RETURN
END

! CHECK-LABEL: func @_QPpow_wrapper(

flang/test/Lower/sqrt.f90

- ! RUN: bbc -emit-fir %s -o - | FileCheck %s
- ! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s
+ ! RUN: bbc -emit-fir -outline-intrinsics %s -o - | FileCheck %s
+ ! RUN: %flang_fc1 -emit-fir -mllvm -outline-intrinsics %s -o - | FileCheck %s

! CHECK-LABEL: sqrt_testr
subroutine sqrt_testr(a, b)
real :: a, b
! CHECK: fir.call @fir.sqrt.f32.f32
b = sqrt(a)
end subroutine


flang/test/Lower/trigonometric-intrinsics.f90

- ! RUN: bbc -emit-fir %s -o - | FileCheck %s
- ! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s
+ ! RUN: bbc -emit-fir -outline-intrinsics %s -o - | FileCheck %s
+ ! RUN: %flang_fc1 -emit-fir -mllvm -outline-intrinsics %s -o - | FileCheck %s

! CHECK-LABEL: atan_testr
subroutine atan_testr(a, b)
real :: a, b
! CHECK: fir.call @fir.atan.f32.f32
b = atan(a)
end subroutine



Revision Contents

Diff 439858

flang/include/flang/Optimizer/Support/InitFIR.h

flang/lib/Lower/IntrinsicCall.cpp

flang/lib/Optimizer/CodeGen/CMakeLists.txt

flang/lib/Optimizer/CodeGen/CodeGen.cpp

flang/test/Intrinsics/late-math-codegen.fir

flang/test/Lower/Intrinsics/exp.f90

flang/test/Lower/Intrinsics/log.f90

flang/test/Lower/Intrinsics/math-runtime-options.f90

flang/test/Lower/late-math-lowering.f90

flang/test/Lower/llvm-math.f90

flang/test/Lower/sqrt.f90

flang/test/Lower/trigonometric-intrinsics.f90
