Page MenuHomePhabricator

[InstCombine] Expand the simplification of pow() into exp2()
ClosedPublic

Authored by evandro on Jul 12 2018, 4:07 PM.

Details

Summary

Generalize the simplification of pow(2.0, y) to pow(2.0 ** n, y) for all scalar and vector types.

This improvement helps some benchmarks in SPEC CPU2000 and CPU2006, such as 252.eon, 447.dealII, 453.povray. Otherwise, no significant regressions on x86-64 or A64.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
evandro updated this revision to Diff 155480.Jul 13 2018, 1:55 PM

Also consider cases in pow(2.0 ** n, y) where n is negative.

evandro edited the summary of this revision. (Show Details)Jul 13 2018, 1:59 PM

Ping! πŸ””

Β‘Ping! πŸ””πŸ””

evandro updated this revision to Diff 158161.Jul 30 2018, 7:06 PM
evandro edited reviewers, added: efriedma; removed: eli.friedman.
evandro updated this revision to Diff 158439.Jul 31 2018, 6:14 PM
evandro retitled this revision from [SLC] Refactor the simplifications involving pow() and exp{,2,10}() to [SLC] Expand the simplification of pow({e,2,10}, y) to exp{,2,10}().Aug 2 2018, 9:15 AM
evandro retitled this revision from [SLC] Expand the simplification of pow({e,2,10}, y) to exp{,2,10}() to [SLC] Expand the simplification of pow({e,2,10}, y) to exp{,2,10}(y).

Ping! πŸ””

evandro edited the summary of this revision. (Show Details)Aug 9 2018, 2:28 PM
evandro updated this revision to Diff 160018.Aug 9 2018, 3:02 PM
evandro retitled this revision from [SLC] Expand the simplification of pow({e,2,10}, y) to exp{,2,10}(y) to [SLC] Expand the simplification of pow({e,2}, y) to exp{,2}(y).
evandro edited the summary of this revision. (Show Details)

Remove additional simplification when exp10() is the base in pow(), since it's not enabled by many targets.

spatel added inline comments.Aug 14 2018, 8:27 AM
llvm/test/Transforms/InstCombine/pow-sqrt.ll
37–39 β†—(On Diff #160018)

No need to update this file if this is just a cosmetic diff in the value naming.
I updated the other files with auto-generated checks. Please make any IR changes in those tests that are necessary to show functional changes from this patch as a preliminary step. Then, update the assertions using the script.

evandro added inline comments.Aug 14 2018, 8:45 AM
llvm/test/Transforms/InstCombine/pow-sqrt.ll
37–39 β†—(On Diff #160018)

The functionality changes are shown in the tests above.

evandro updated this revision to Diff 160879.Aug 15 2018, 11:47 AM
evandro marked 2 inline comments as done.

I think this is ok, but it's hard to tell exactly what all of the diffs are. Please split it up so:

  1. All of the new tests are committed first with baseline assertions.
  2. Two code improvement pieces: (a) pow(exp(x), y), (b) pow(2.0 ** n, x). (if you can do the refactoring to create the replacePowWithExp() helper as an NFC commit ahead of that, that's even better)
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1197 β†—(On Diff #160879)

Please add a TODO comment to loosen the FMF restriction. I'm not sure what the minimal set will be, but it can't be all of the flags.

evandro added a subscriber: fhahn.Aug 17 2018, 8:00 AM
evandro updated this revision to Diff 161291.Aug 17 2018, 11:12 AM
evandro retitled this revision from [SLC] Expand the simplification of pow({e,2}, y) to exp{,2}(y) to [InstCombine] Expand the simplification of pow() involving exp{,2}().
evandro edited the summary of this revision. (Show Details)
evandro set the repository for this revision to rL LLVM.

Rebase after the preliminary patches rL340060 and rL340061.

This could still be split into 2 patches...
I'm just looking at the first code hunk, and I'm not sure this if this behaving as intended. We should have tests (please add) where both of the calls (exp{2}/pow) in the pattern are actual libcalls (rather than intrinsics). And in that case, this patch will replace a libcall with an intrinsic? Do the FMF allow that?

We should also have some test coverage for long double (fp128).

evandro added a comment.EditedAug 22 2018, 3:45 PM

Increased test coverage in rL340462, where I also identified an issue with pow(exp{,2}(x), y).

evandro updated this revision to Diff 162249.Aug 23 2018, 12:35 PM
evandro retitled this revision from [InstCombine] Expand the simplification of pow() involving exp{,2}() to [InstCombine] Expand the simplification of pow() into exp{,2}().
evandro edited the summary of this revision. (Show Details)
evandro retitled this revision from [InstCombine] Expand the simplification of pow() into exp{,2}() to [InstCombine] Expand the simplification of pow() into exp2().Aug 23 2018, 1:37 PM
evandro marked an inline comment as done.Aug 23 2018, 3:44 PM
evandro updated this revision to Diff 162292.Aug 23 2018, 4:05 PM

Rebase after D51194.

Are there any negative tests where the base is not a power-of-2? If not, please add at least 1 test like that to make sure we're not transforming that.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1239–1240 β†—(On Diff #162292)

isInteger -> IsInteger
isReciprocal -> IsReciprocal

llvm/test/Transforms/InstCombine/pow-1.ll
173 β†—(On Diff #162292)

It would be better to leave this and the similar assertions as-is because they confirm that the attribute statement actually contains the expected strings.

The update script should leave existing CHECK lines alone if you specify "--function=foo" to only change the tests that you are targeting with this patch.

evandro marked 2 inline comments as done.Aug 28 2018, 12:21 PM

Are there any negative tests where the base is not a power-of-2? If not, please add at least 1 test like that to make sure we're not transforming that.

I believe that @test_simplify1, @test_simplify2, @test_simplify9, @test_simplify10, @test_simplify18, @test_simplify19 already function as negative tests for this transformation. Yes?

spatel accepted this revision.Aug 28 2018, 1:33 PM

Are there any negative tests where the base is not a power-of-2? If not, please add at least 1 test like that to make sure we're not transforming that.

I believe that @test_simplify1, @test_simplify2, @test_simplify9, @test_simplify10, @test_simplify18, @test_simplify19 already function as negative tests for this transformation. Yes?

Not sure if most of those actually make it into this function or if they're simplified sooner. I suppose the exp10 case must, so that's fine.

LGTM

This revision is now accepted and ready to land.Aug 28 2018, 1:33 PM
This revision was automatically updated to reflect the committed changes.
a.elovikov added inline comments.
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
1275

Why don't we guard it with hasUnaryFloatFn similar to line 1264?

evandro added a subscriber: rnk.Aug 30 2018, 8:30 AM
evandro added inline comments.
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
1275

I was thinking if this was the reason why this patch was reverted in rL340991. Will investigate it, thank you.

evandro reopened this revision.Aug 30 2018, 9:57 AM

Windows lacks exp2() support.

This revision is now accepted and ready to land.Aug 30 2018, 9:57 AM
evandro updated this revision to Diff 163362.Aug 30 2018, 10:08 AM

Guard this transformation against targets that lack support for exp2(). In this case, refrain from transforming into either an intrinsic or a libcall.

evandro marked 2 inline comments as done.Aug 30 2018, 10:09 AM
evandro requested review of this revision.Aug 30 2018, 10:27 AM
evandro updated this revision to Diff 163369.Aug 30 2018, 10:36 AM

Add more tests.

spatel accepted this revision.Aug 30 2018, 10:55 AM

LGTM

Did someone confirm that producing exp2() was the cause of the bot failure?

This revision is now accepted and ready to land.Aug 30 2018, 10:55 AM

@rnk has kindly reduced the issue to:

target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"                              
target triple = "x86_64-unknown-windows-msvc19.11.0"                                     
@a = dso_local global double 0.000000e+00, align 8                                       
define dso_local double @b() {                                                           
entry:                                                                                   
  %0 = load double, double* @a, align 8                                                  
  %call = call double @pow(double 2.000000e+00, double %0)                               
  ret double %call                                                                       
}                                                                                        
declare dso_local double @pow(double, double)

I added a test to cover this case and variations of it on Windows.

This revision was automatically updated to reflect the committed changes.

Hi,
A late comment about a problem I've noticed with this change.

llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
1276

On my target, exp2f is available, but exp2 isn't, so the

hasUnaryFloatFn(TLI, Ty, LibFunc_exp2, LibFunc_exp2f, LibFunc_exp2l)

guard above returns true, but then when we try to find the name of LibFunc_exp2 we get "".

emitUnaryFloatFnCall doesn't check that the input name is something nice, it just happily adds an "f" to the name, and we end up with code trying to call the function "f" which eventually leads to a linking error.

Any thoughts about such cases?

efriedma added inline comments.Oct 16 2018, 11:04 AM
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
1276

I think the issue you're describing affects every user of emitUnaryFloatFnCall/emitBinaryFloatFnCall.

emitUnaryFloatFnCall should be fixed so it takes the same list of LibFunc enums that hasUnaryFloatFn does, and gets the appropriate name using getName(). The "appendTypeSuffix" helper is clearly inconsistent with the way TargetLibraryInfo is supposed to work. I doubt the issue you're describing affects any in-tree target, but I'd approve the patch anyway as a cleanup.

uabelho added inline comments.Oct 16 2018, 11:50 PM
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
1276

There are a few uses of emitUnaryFloatFnCall that I don't think are problematic atm, e.g. in optimizeDoubleFP. There we don't fetch the name via TLI, but pass in the name of the already existing function (which I suppose is the "double" version), and then I guess it works.

Anyway, it sounds good that you agree this is a problem and that it should work even if the "float" function is available and the "double" isn't like on my target.

I can try to make a cleanup patch that improves this in some way.

Thanks!

uabelho added inline comments.Oct 17 2018, 6:09 AM
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp
1276

I created something here:
https://reviews.llvm.org/D53370
Let's continue the discussion there.

Hello, ldexp seems to be faster here. Did you try it?

pow(2, e) to ldexp(1, e)

Herald added a project: Restricted Project. Β· View Herald TranscriptAug 7 2019, 1:14 PM
evandro marked an inline comment as done.Aug 7 2019, 1:33 PM

Hello, ldexp seems to be faster here. Did you try it?

pow(2, e) to ldexp(1, e)

ldexp() restricts the exponent to integers, so only then is it potentially faster.

Will think about it.

Thank you.

We have exp to ldexp fold, so we indirectly fold pow integer case to ldexp. I think it is okay as is.

Thanks.

We have exp to ldexp fold, so we indirectly fold pow integer case to ldexp. I think it is okay as is.

The test pow_fp_int.ll indicates that it doesn't fold exp2() into ldexp() then.

Oh right, we miss this for pow of two bases other than 2.0.

Yeah, we need a direct fold.

Oh right, we miss this for pow of two bases other than 2.0.

Yeah, we need a direct fold.

Methinks that I got a patch working. The only drawback is that there is no intrinsic ldexp(), which means that it's probably better to use exp2() for vector types.