
masoud.ataei (Masoud Ataei)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 9 2018, 8:48 AM (196 w, 2 d)

Recent Activity

Jul 11 2022

masoud.ataei added a reviewer for D128222: [PowerPC] Activate MASSV for Linux P10 and add support for finite math calls : rzurob.
Jul 11 2022, 6:15 AM · Restricted Project, Restricted Project, Restricted Project

Jul 6 2022

masoud.ataei committed rGfe06b9f02ccd: Bringing back the test with the required target related to commit… (authored by masoud.ataei).
Jul 6 2022, 1:05 PM · Restricted Project, Restricted Project
masoud.ataei added a comment to D128653: [PowerPC] Fix the check for scalar MASS conversion.

You likely need // REQUIRES: powerpc-registered-target at the top of the test, as -enable-ppc-gen-scalar-mass is only present if the PowerPC target has been compiled into LLVM.

Jul 6 2022, 12:57 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
masoud.ataei added a comment to D128653: [PowerPC] Fix the check for scalar MASS conversion.

Just took it down temporarily. -- Looking at it...

Jul 6 2022, 12:43 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
masoud.ataei committed rGd34315e71adf: Removing this test temporarily beacuse of a failure in x86_64 (authored by masoud.ataei).
Jul 6 2022, 12:42 PM · Restricted Project, Restricted Project
masoud.ataei closed D128653: [PowerPC] Fix the check for scalar MASS conversion.
Jul 6 2022, 11:45 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
masoud.ataei committed rG96515df816eb: [PowerPC] Fix the check for scalar MASS conversion (authored by masoud.ataei).
Jul 6 2022, 11:45 AM · Restricted Project, Restricted Project, Restricted Project

Jun 29 2022

masoud.ataei updated the diff for D128653: [PowerPC] Fix the check for scalar MASS conversion.

Update the test

Jun 29 2022, 2:25 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Jun 28 2022

masoud.ataei updated the diff for D128653: [PowerPC] Fix the check for scalar MASS conversion.

Test updated.

Jun 28 2022, 7:48 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Jun 27 2022

masoud.ataei updated the summary of D128653: [PowerPC] Fix the check for scalar MASS conversion.
Jun 27 2022, 11:41 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
masoud.ataei updated the diff for D128653: [PowerPC] Fix the check for scalar MASS conversion.

Add more tests.

Jun 27 2022, 11:30 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
masoud.ataei requested review of D128653: [PowerPC] Fix the check for scalar MASS conversion.
Jun 27 2022, 8:18 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Jun 20 2022

masoud.ataei requested review of D128222: [PowerPC] Activate MASSV for Linux P10 and add support for finite math calls .
Jun 20 2022, 1:29 PM · Restricted Project, Restricted Project, Restricted Project

Mar 8 2022

masoud.ataei closed D121016: [PowerPC] Fix the none tail call in scalar MASS conversion.
Mar 8 2022, 9:16 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei committed rG30f30e1c12fa: [PowerPC] Fix the none tail call in scalar MASS conversion (authored by masoud.ataei).
Mar 8 2022, 9:02 AM · Restricted Project
masoud.ataei added inline comments to D121016: [PowerPC] Fix the none tail call in scalar MASS conversion.
Mar 8 2022, 7:45 AM · Restricted Project, Restricted Project, Restricted Project

Mar 7 2022

masoud.ataei added reviewers for D121016: [PowerPC] Fix the none tail call in scalar MASS conversion: jsji, Whitney.
Mar 7 2022, 10:45 AM · Restricted Project, Restricted Project, Restricted Project

Mar 4 2022

masoud.ataei requested review of D121016: [PowerPC] Fix the none tail call in scalar MASS conversion.
Mar 4 2022, 11:44 AM · Restricted Project, Restricted Project, Restricted Project

Feb 4 2022

masoud.ataei committed rG8ce13bc93be4: [PowerPC] Option controling scalar MASS convertion (authored by masoud.ataei).
Feb 4 2022, 1:28 PM
masoud.ataei requested review of D119035: [PowerPC] Option controling scalar MASS convertion.
Feb 4 2022, 12:57 PM · Restricted Project

Feb 2 2022

masoud.ataei committed rG70066dd0e855: [PowerPC] Fixing buildbod failure ppc64le-lld-multistage-test (authored by masoud.ataei).
Feb 2 2022, 10:30 AM
masoud.ataei closed D101759: [PowerPC] Scalar IBM MASS library conversion pass.
Feb 2 2022, 8:35 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei committed rG256d2533322c: [PowerPC] Scalar IBM MASS library conversion pass (authored by masoud.ataei).
Feb 2 2022, 7:55 AM

Jan 28 2022

masoud.ataei added a comment to D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Ready for another round of review.

Jan 28 2022, 10:49 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Changing function name: lowerLibCall() -> lowerLibCallType()

Jan 28 2022, 10:47 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Fix test cases.

Jan 28 2022, 10:26 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei added inline comments to D101759: [PowerPC] Scalar IBM MASS library conversion pass.
Jan 28 2022, 10:25 AM · Restricted Project, Restricted Project, Restricted Project

Jan 24 2022

masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

This update fixes the type of the arguments passed to the converted math function in PPCISelLowering.cpp.

Jan 24 2022, 6:59 AM · Restricted Project, Restricted Project, Restricted Project

Jan 19 2022

masoud.ataei closed D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.
Jan 19 2022, 8:08 AM · Restricted Project
masoud.ataei committed rGd261660af96d: Fix the use of -fno-approx-func along with -Ofast or -ffast-math (authored by masoud.ataei).
Jan 19 2022, 8:06 AM

Jan 11 2022

masoud.ataei added a comment to D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.

A gentle reminder for reviewers. -- This patch will fix the bug reported here: https://bugs.llvm.org/show_bug.cgi?id=52565

Jan 11 2022, 9:19 AM · Restricted Project

Jan 10 2022

masoud.ataei abandoned D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline.

This PR was created in support of the IBM scalar MASS library pass, but with the new approach to handling LLVM intrinsic math functions in https://reviews.llvm.org/D101759, there is no longer any need for this PR.

Jan 10 2022, 8:21 AM · Restricted Project

Jan 7 2022

masoud.ataei added inline comments to D101759: [PowerPC] Scalar IBM MASS library conversion pass.
Jan 7 2022, 10:54 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

As suggested before, I removed the dependency on the global option to convert math functions to MASS for all intrinsic and non-intrinsic functions.
The main change here with respect to the last proposal is in the PPCISelLowering.cpp file, in how LLVM intrinsic math functions are handled.

Jan 7 2022, 10:50 AM · Restricted Project, Restricted Project, Restricted Project

Dec 9 2021

masoud.ataei added inline comments to D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.
Dec 9 2021, 11:41 AM · Restricted Project

Dec 7 2021

masoud.ataei added a comment to D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.

Gentle reminder for reviewers. -- This PR is ready for review.

Dec 7 2021, 7:38 AM · Restricted Project

Nov 25 2021

masoud.ataei added inline comments to D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.
Nov 25 2021, 9:29 AM · Restricted Project
masoud.ataei updated the diff for D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.

Updated the test combining -ffast-math and -fno-approx-func options.
@andrew.w.kaylor I hope you don't mind that I put those tests in clang/test/Driver/fast-math.c instead.

Nov 25 2021, 9:15 AM · Restricted Project

Nov 24 2021

masoud.ataei requested review of D114564: Fix the use of -fno-approx-func along with -Ofast or -ffast-math.
Nov 24 2021, 2:01 PM · Restricted Project

Oct 18 2021

masoud.ataei added a comment to D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline.

A gentle reminder for reviewers.

Oct 18 2021, 7:59 AM · Restricted Project

Oct 8 2021

masoud.ataei closed D106191: [clang] Option control afn flag.
Oct 8 2021, 11:54 AM · Restricted Project, Restricted Project
masoud.ataei committed rGb0f68791f0ad: [clang] Option control afn flag (authored by masoud.ataei).
Oct 8 2021, 11:27 AM
masoud.ataei updated the diff for D106191: [clang] Option control afn flag.

Update the documentation.

Oct 8 2021, 9:39 AM · Restricted Project, Restricted Project
masoud.ataei added inline comments to D106191: [clang] Option control afn flag.
Oct 8 2021, 8:26 AM · Restricted Project, Restricted Project

Oct 7 2021

masoud.ataei updated the diff for D106191: [clang] Option control afn flag.
Oct 7 2021, 2:06 PM · Restricted Project, Restricted Project
masoud.ataei added inline comments to D106191: [clang] Option control afn flag.
Oct 7 2021, 1:38 PM · Restricted Project, Restricted Project
masoud.ataei updated the diff for D106191: [clang] Option control afn flag.

Update the documentation.

Oct 7 2021, 10:43 AM · Restricted Project, Restricted Project
masoud.ataei updated the diff for D106191: [clang] Option control afn flag.

Description and driver test are added.

Oct 7 2021, 10:31 AM · Restricted Project, Restricted Project

Oct 6 2021

masoud.ataei added a comment to D106191: [clang] Option control afn flag.

Reminder for reviewers.

Oct 6 2021, 2:49 PM · Restricted Project, Restricted Project
masoud.ataei added inline comments to D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline.
Oct 6 2021, 2:47 PM · Restricted Project
masoud.ataei updated the diff for D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline.

Ready for review.

Oct 6 2021, 2:45 PM · Restricted Project

Sep 28 2021

masoud.ataei added a comment to D106191: [clang] Option control afn flag.

Sorry that it took me so long to reply to the reviews. Thank you for reviewing this patch.

Sep 28 2021, 9:41 AM · Restricted Project, Restricted Project

Sep 22 2021

masoud.ataei requested review of D110288: Move pow transformations to sqrt/cbrt to earlier in the compiler pipeline.
Sep 22 2021, 1:27 PM · Restricted Project

Aug 26 2021

masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Removing the dependency on the global option to convert math functions to MASS.

Aug 26 2021, 2:22 PM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei added a comment to D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Do we *really* need -enable-approx-func-fp-math?
I'm pretty sure we are moving away from such global options, onto relying only on the per-instruction fast-math flags.

Aug 26 2021, 8:22 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei added a comment to D106191: [clang] Option control afn flag.

Making a new option mapped to another float op flag looks reasonable, but is there any clearer motivation for this? (such as the need for -Ofast -fno-approx-func)

Aug 26 2021, 7:30 AM · Restricted Project, Restricted Project

Jul 26 2021

masoud.ataei closed D106678: [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX.
Jul 26 2021, 4:27 PM · Restricted Project, Restricted Project
masoud.ataei committed rG45951ad3231c: [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX (authored by masoud.ataei).
Jul 26 2021, 4:23 PM
masoud.ataei added inline comments to D106678: [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX.
Jul 26 2021, 10:45 AM · Restricted Project, Restricted Project

Jul 23 2021

masoud.ataei requested review of D106678: [PowerPC] Add pwr7 and pwr10 support to IBM MASSV pass on AIX.
Jul 23 2021, 9:17 AM · Restricted Project, Restricted Project

Jul 22 2021

masoud.ataei added a comment to D106091: [PowerPC] Improve error message on MASSV pass.

It seems the patch was committed in https://reviews.llvm.org/rGee2068b30ecf297c004c06eedd7e1063c67a279c, so the revision should be closed now?

Jul 22 2021, 8:35 AM · Restricted Project, Restricted Project

Jul 16 2021

masoud.ataei updated the diff for D106191: [clang] Option control afn flag.

Remove extra function declaration.

Jul 16 2021, 2:30 PM · Restricted Project, Restricted Project
masoud.ataei requested review of D106191: [clang] Option control afn flag.
Jul 16 2021, 2:14 PM · Restricted Project, Restricted Project
masoud.ataei added inline comments to D101759: [PowerPC] Scalar IBM MASS library conversion pass.
Jul 16 2021, 11:10 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Removed clang changes from this PR.
Removed extra option for MASS pass.
The MASS pass is now active with -O3 and the approx-func option.

Jul 16 2021, 11:03 AM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei committed rGee2068b30ecf: [PowerPC] Updated the error message of MASSV pass to mention vectorization (authored by masoud.ataei).
Jul 16 2021, 7:46 AM
masoud.ataei updated the diff for D106091: [PowerPC] Improve error message on MASSV pass.
Jul 16 2021, 7:38 AM · Restricted Project, Restricted Project

Jul 15 2021

masoud.ataei added inline comments to D106091: [PowerPC] Improve error message on MASSV pass.
Jul 15 2021, 1:55 PM · Restricted Project, Restricted Project
masoud.ataei requested review of D106091: [PowerPC] Improve error message on MASSV pass.
Jul 15 2021, 12:51 PM · Restricted Project, Restricted Project

Jul 14 2021

masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.
Jul 14 2021, 2:38 PM · Restricted Project, Restricted Project, Restricted Project
masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Removed the dependency on unsafe-fp-math and added a clang option to control the afn flag.

Jul 14 2021, 2:07 PM · Restricted Project, Restricted Project, Restricted Project

Jun 29 2021

masoud.ataei updated the diff for D101759: [PowerPC] Scalar IBM MASS library conversion pass.

Sorry it took me so long to update this patch -- I think I have addressed all review comments so far.

Jun 29 2021, 1:26 PM · Restricted Project, Restricted Project, Restricted Project

May 19 2021

masoud.ataei added inline comments to D101759: [PowerPC] Scalar IBM MASS library conversion pass.
May 19 2021, 1:07 PM · Restricted Project, Restricted Project, Restricted Project

May 3 2021

masoud.ataei requested review of D101759: [PowerPC] Scalar IBM MASS library conversion pass.
May 3 2021, 7:46 AM · Restricted Project, Restricted Project, Restricted Project

Mar 8 2021

masoud.ataei committed rG820f508b08d7: [PowerPC] Removing _massv place holder (authored by masoud.ataei).
Mar 8 2021, 1:46 PM
masoud.ataei closed D98064: [PowerPC] Removing _massv place holder .
Mar 8 2021, 1:46 PM · Restricted Project, Restricted Project

Mar 5 2021

masoud.ataei requested review of D98064: [PowerPC] Removing _massv place holder .
Mar 5 2021, 11:41 AM · Restricted Project, Restricted Project

Mar 1 2021

masoud.ataei committed rG5fe0cab79e18: [PowerPC] Removing sqrtd2 and sqrtf4 from list of vectorizable function with… (authored by masoud.ataei).
Mar 1 2021, 7:43 AM
masoud.ataei closed D97487: Removing sqrtd2 and sqrtf4 from list of vectorizable function with MASSV.
Mar 1 2021, 7:43 AM · Restricted Project, Restricted Project

Feb 26 2021

masoud.ataei added a comment to D97487: Removing sqrtd2 and sqrtf4 from list of vectorizable function with MASSV.

Addressed.

Feb 26 2021, 2:12 PM · Restricted Project, Restricted Project
masoud.ataei updated the diff for D97487: Removing sqrtd2 and sqrtf4 from list of vectorizable function with MASSV.

Checking that sqrtd2_massv and sqrtf4_massv are not generated.

Feb 26 2021, 2:10 PM · Restricted Project, Restricted Project

Feb 25 2021

masoud.ataei requested review of D97487: Removing sqrtd2 and sqrtf4 from list of vectorizable function with MASSV.
Feb 25 2021, 10:14 AM · Restricted Project, Restricted Project

Dec 8 2020

masoud.ataei committed rGfc750f609dfb: [PPC] Fixing a typo in altivec.h. Commenting out an unnecessary macro (authored by masoud.ataei).
Dec 8 2020, 11:24 AM

Nov 24 2020

masoud.ataei committed rGb86a1cd2f854: [PowerPC] dyn_cast should be dyn_cast_or_null in MASSV pass (authored by masoud.ataei).
Nov 24 2020, 8:22 AM
masoud.ataei closed D91729: [PowerPC] dyn_cast should be dyn_cast_or_null in MASSV pass.
Nov 24 2020, 8:22 AM · Restricted Project

Nov 19 2020

masoud.ataei added inline comments to D91729: [PowerPC] dyn_cast should be dyn_cast_or_null in MASSV pass.
Nov 19 2020, 10:14 AM · Restricted Project

Nov 18 2020

masoud.ataei requested review of D91729: [PowerPC] dyn_cast should be dyn_cast_or_null in MASSV pass.
Nov 18 2020, 11:22 AM · Restricted Project

Jun 12 2020

masoud.ataei committed rG2d038370bb6b: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single… (authored by masoud.ataei).
Jun 12 2020, 7:34 AM
masoud.ataei closed D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.
Jun 12 2020, 7:33 AM · Restricted Project, Restricted Project

Jun 10 2020

masoud.ataei updated the diff for D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

Sorry, I forgot the clang-format.

Jun 10 2020, 12:46 PM · Restricted Project, Restricted Project
masoud.ataei retitled D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked from DAGCombiner optimization for pow(x,0.75) even in case massv function is asked to DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.
Jun 10 2020, 12:46 PM · Restricted Project, Restricted Project
masoud.ataei updated the diff for D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

I added support for the case where the exponent is 0.25, in addition to support for the double-precision cases.

Jun 10 2020, 12:14 PM · Restricted Project, Restricted Project

Jun 9 2020

masoud.ataei updated the diff for D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

Moving the changes completely from PPCISelLowering.cpp to PPCLowerMASSVEntries.cpp (the MASSV pass) to address the reviewer comments.

Jun 9 2020, 5:59 AM · Restricted Project, Restricted Project

Jun 4 2020

masoud.ataei updated the diff for D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

@steven.zhang
I think I made a terrible mistake. When I tested with my C code, I didn't have if (ClVectorLibrary == TargetLibraryInfoImpl::MASSV) on

if (ClVectorLibrary == TargetLibraryInfoImpl::MASSV)
  setOperationAction(ISD::FPOW, MVT::v4f32, Custom);

and for some reason, if I move this check inside the function LowerFPOWMASSV, it works correctly. So I am updating the patch. Thank you for catching it.

Jun 4 2020, 11:33 AM · Restricted Project, Restricted Project

Jun 3 2020

masoud.ataei added a comment to D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

What we are doing is as follows:
llvm.pow(IR) --> FPOW(ISD) --> __powf4_P8/9(ISD/IR)

It makes more sense to do it in the IR pass from what I see. And then, you can query the TargetLibraryInfoImpl::isFunctionVectorizable(name) instead of the option. Maybe, we can do it in ppc pass: PPCLowerMASSVEntries which did something like:

__sind2_massv --> __sind2_P9 for a Power9 subtarget.

Does it make sense?

I agree that in general it makes more sense to do this kind of conversion in an IR pass like PPCLowerMASSVEntries. This is what we currently do in LLVM, but there is a problem. If we change the LLVM intrinsic to a libcall before a later optimization (such as one in DAGCombiner) runs, that optimization won't be triggered. In the case where I am proposing the change, if we have pow(x,0.75) in the code, the PPCLowerMASSVEntries pass currently changes it to a __powf4_P* libcall, and later, in DAGCombiner, we do not get the optimization pow(x,0.75) --> sqrt(x)*sqrt(sqrt(x)). That's a problem, because sqrt(x)*sqrt(sqrt(x)) is faster.

What I am proposing is to move the conversion for powf4 to later in the compiler pipeline. With this change we will get the above optimization when the exponent is 0.75 and MASSV calls otherwise.
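
For illustration only (this is not code from the patch): a minimal sketch of the algebraic rewrite discussed above, assuming x >= 0 and that small rounding differences relative to pow are acceptable, as under fast-math. The function names are hypothetical. The 0.25 case, sqrt(sqrt(x)), is included by analogy with the revision title and is an assumption here.

```cpp
#include <cmath>

// Hypothetical helper: x^(3/4) computed as sqrt(x) * sqrt(sqrt(x)).
// Assumes x >= 0; results may differ from std::pow in the last bits.
inline double pow075_via_sqrt(double x) {
  const double r = std::sqrt(x);  // x^(1/2)
  return r * std::sqrt(r);        // x^(1/2) * x^(1/4) = x^(3/4)
}

// Hypothetical helper: x^(1/4) computed as sqrt(sqrt(x)). Assumes x >= 0.
inline double pow025_via_sqrt(double x) {
  return std::sqrt(std::sqrt(x));
}

// Relative difference between a value and a nonzero reference.
inline double rel_diff(double value, double reference) {
  return std::fabs(value - reference) / std::fabs(reference);
}
```

Comparing rel_diff(pow075_via_sqrt(x), std::pow(x, 0.75)) over a few inputs is one way to check that the two forms agree to within rounding.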

If we expect llvm.pow(x, 0.75) to be lowered as two sqrts and the other cases to be libcalls, can we just not transform it to a libcall for 0.75 in PPCLowerMASSVEntries? It will then be lowered as two sqrts in DAGCombine.

For pow to become __powf4_P8, there are two steps:

  1. in LoopVectorizePass, pow becomes __powf4_massv, then
  2. in PPCLowerMASSVEntries __powf4_massv becomes __powf4_P8

So when we reach PPCLowerMASSVEntries, the pow intrinsic is already a libcall. I thought it would be really ugly to undo the LoopVectorizePass conversion in the PPCLowerMASSVEntries pass for a special case.
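
A minimal sketch of step 2 as described above, written as a plain string rewrite rather than as the real pass. The helper name lowerMassvEntry and the way the CPU suffix is passed in are assumptions for illustration; they are not taken from PPCLowerMASSVEntries.cpp.

```cpp
#include <string>

// Hypothetical helper: rewrite a generic "<name>_massv" entry to a
// CPU-specific "<name>_<cpuSuffix>" entry, e.g. "__powf4_massv" with
// suffix "P8" becomes "__powf4_P8". Names without the "_massv" tag are
// returned unchanged.
inline std::string lowerMassvEntry(const std::string &name,
                                   const std::string &cpuSuffix) {
  const std::string tag = "_massv";
  const bool hasTag =
      name.size() >= tag.size() &&
      name.compare(name.size() - tag.size(), tag.size(), tag) == 0;
  if (!hasTag)
    return name;
  return name.substr(0, name.size() - tag.size()) + "_" + cpuSuffix;
}
```

By the time this kind of rewrite runs, the call is already a named library entry, which is why a pow(x, 0.75) special case can no longer be recognized from the intrinsic at that point.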

From what I see in your test, we are trying to lower the pow intrinsic (not the pow libcall) to two sqrts or __powf4_P8/9. That is different from the pow -> __powf4_massv -> __powf4_P8 case, which is entirely a libcall path.

If your intention is to have __powf4_massv transformed to two fsqrts when the argument is 0.75, that test is missing from this patch, and I am not sure this patch would work. You might need to transform __powf4_massv to the llvm.pow intrinsic somewhere like PartiallyInlineLibCallsLegacyPass if the argument is 0.75.

Maybe I am missing some background here; I just want to get a clear understanding of why we have to do it in the DAG. Thank you for your patience.

Jun 3 2020, 8:15 AM · Restricted Project, Restricted Project

Jun 2 2020

masoud.ataei added a comment to D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

What we are doing is as follows:
llvm.pow(IR) --> FPOW(ISD) --> __powf4_P8/9(ISD/IR)

It makes more sense to do it in the IR pass from what I see. And then, you can query the TargetLibraryInfoImpl::isFunctionVectorizable(name) instead of the option. Maybe, we can do it in ppc pass: PPCLowerMASSVEntries which did something like:

__sind2_massv --> __sind2_P9 for a Power9 subtarget.

Does it make sense?

I agree that in general it makes more sense to do this kind of conversion in an IR pass like PPCLowerMASSVEntries. This is what we currently do in LLVM, but there is a problem. If we change the LLVM intrinsic to a libcall before a later optimization (such as one in DAGCombiner) runs, that optimization won't be triggered. In the case where I am proposing the change, if we have pow(x,0.75) in the code, the PPCLowerMASSVEntries pass currently changes it to a __powf4_P* libcall, and later, in DAGCombiner, we do not get the optimization pow(x,0.75) --> sqrt(x)*sqrt(sqrt(x)). That's a problem, because sqrt(x)*sqrt(sqrt(x)) is faster.

What I am proposing is to move the conversion for powf4 to later in the compiler pipeline. With this change we will get the above optimization when the exponent is 0.75 and MASSV calls otherwise.

If we expect llvm.pow(x, 0.75) to be lowered as two sqrts and the other cases to be libcalls, can we just not transform it to a libcall for 0.75 in PPCLowerMASSVEntries? It will then be lowered as two sqrts in DAGCombine.

Jun 2 2020, 8:46 AM · Restricted Project, Restricted Project

Jun 1 2020

masoud.ataei added a comment to D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

What we are doing is as follows:
llvm.pow(IR) --> FPOW(ISD) --> __powf4_P8/9(ISD/IR)

It makes more sense to do it in the IR pass from what I see. And then, you can query the TargetLibraryInfoImpl::isFunctionVectorizable(name) instead of the option. Maybe, we can do it in ppc pass: PPCLowerMASSVEntries which did something like:

__sind2_massv --> __sind2_P9 for a Power9 subtarget.

Does it make sense?

Jun 1 2020, 9:10 AM · Restricted Project, Restricted Project
masoud.ataei updated the diff for D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.

Addressing the reviews

Jun 1 2020, 9:06 AM · Restricted Project, Restricted Project

May 28 2020

masoud.ataei created D80744: DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked.
May 28 2020, 10:56 AM · Restricted Project, Restricted Project

Apr 30 2020

masoud.ataei committed rGb4934ae44cf4: [VFDatabase] Testsuite for scalar functions are vector functions with VF =1 (authored by masoud.ataei).
Apr 30 2020, 12:55 PM