This patch introduces an option to enable conversions from math function calls
to MASS library calls. To resolves calls generated with these conversions, one
need to link libxlopt.a library.
This patch is tested on PowerPC Linux and AIX.
Paths
| Differential D101759
[PowerPC] Scalar IBM MASS library conversion pass ClosedPublic Authored by masoud.ataei on May 3 2021, 7:46 AM.
Details Summary This patch introduces an option to enable conversions from math function calls This patch is tested on PowerPC Linux and AIX.
Diff Detail
Event TimelineHerald added subscribers: steven.zhang, shchenz, kbarton and 3 others. · View Herald TranscriptMay 3 2021, 7:46 AM
Comment Actions Sorry it took me so long to update this patch -- I think I addressed all reviews till now.
Comment Actions Removed clang changes from this PR. Adding another PR for clang changes on approx-func option. masoud.ataei added inline comments.
Comment Actions Do we *really* need -enable-approx-func-fp-math? Comment Actions
I am handling LLVM intrinsic math functions in PPCISelLowering.cpp, so I need to check for TM.Options.ApproxFuncFPMath. This is the only place that I think I need it. Comment Actions
How is this going to work e.g. in LTO when not all TU's are compiled with fast-math flags? I'm not familiar with those llc flags, but i'm quite sure that e.g. DAGCombiner
Comment Actions I'm not familiar with this library, and I haven't looked at current state of how we enable/map optional libs in a while... Comment Actions errno handling for math library functions is a mess. Currently, we don't model it properly; we just mark the calls "readnone" and hope for the best. If you don't want to fix that, just check for readnone for now. I don't think we want to be querying function attributes or options here; afn plus enabling MASS should be enough. The function attributes are the old mechanism; we just haven't completely migrated some parts of SelectionDAG yet.
Comment Actions
I think using readnone would work fine. It seems that clang marks math functions with that attribute when -fno-math-errno is in effect. To get the non-finite MASS lowerings at -O3 one would have to compile with both -fapprox-func and -fno-math-errno, which seems reasonable to me.
I agree. I think the problem is that this patch is trying to decide on a global lowering strategy for llvm.* math intrinsics in llvm/lib/Target/PowerPC/PPCISelLowering.cpp but such global decision making does not go well with finer granularity of fast-math flags. My understanding is that the reason we need to handle intrinsic math functions later is because of strength-reduction transformations like pow(x,0.5) --> sqrt(x) that currently operate on intrinsic calls only. If we could apply those operations on things like __xl_pow_finite and produce calls to __xl_sqrt_finite then we wouldn't have this problem. Another possibility might be to have two versions of PPCGenScalarMASSEntries one that handles non-intrinsics and runs earlier, and another one that handles intrinsics after transformations likes pow(x,0.5) --> sqrt(x) are done. Comment Actions
Hmm. Instead of using setLibcallName() and letting the legalizer generate the calls, it should be possible to use custom lowering to generate the appropriate calls, at the cost of writing a little more code.
instcombine should be primarily responsible for this sort of optimization. See LibCallSimplifier::optimizePow. I guess a few transforms (D51630 etc.) landed in DAGCombine; probably we could move them earlier. Comment Actions As suggested before, I removed dependency to the global option to convert math functions to MASS for all intrinsic and non-intrinsic functions.
Comment Actions This update will fix the type of arguments passing to the converted math function in PPCISelLowing.cpp.
masoud.ataei added inline comments.
Comment Actions Apart from some minor inline comments this revision addresses all my outstanding comments. LGTM.
This revision is now accepted and ready to land.Feb 1 2022, 11:21 AM
Revision Contents
Diff 359385 llvm/include/llvm/Analysis/ScalarFuncs.def
llvm/include/llvm/CodeGen/CommandFlags.h
llvm/include/llvm/Target/TargetOptions.h
llvm/lib/CodeGen/CommandFlags.cpp
llvm/lib/Target/PowerPC/CMakeLists.txt
llvm/lib/Target/PowerPC/PPC.h
llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
llvm/test/CodeGen/PowerPC/lower-intrinsics-afn-mass.ll
llvm/test/CodeGen/PowerPC/lower-intrinsics-fast-mass.ll
llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll
llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll
llvm/test/CodeGen/PowerPC/lower-scalar-mass-afn.ll
llvm/test/CodeGen/PowerPC/lower-scalar-mass-fast.ll
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-afn.ll
llvm/test/CodeGen/PowerPC/pow-025-075-intrinsic-scalar-mass-fast.ll
llvm/test/CodeGen/PowerPC/pow-025-075-nointrinsic-scalar-mass-fast.ll
|
[nit] ISelLowing -> PPCISelLowering