This is an archive of the discontinued LLVM Phabricator instance.

[SLC] Support expanding pow(x, n+0.5) to x * x * ... * sqrt(x)
ClosedPublic

Authored by fhahn on Aug 29 2018, 8:54 AM.

Diff Detail

Event Timeline

fhahn created this revision.Aug 29 2018, 8:54 AM
evandro added inline comments.Aug 29 2018, 12:52 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1394

May I suggest s/VInt/Expo2/?

1421

Methinks that s/"sqrt"/TLI->getName(LibFunc_exp2)/ is more elegant. Additionally, if pow() is an intrinsic, you want to emit the intrinsic for sqrt() then, rather than a libcall always. Then, please the corresponding tests too.

fhahn updated this revision to Diff 163194.Aug 29 2018, 2:42 PM

Thanks Evandro! Move code to create Sqrt intrinsic/libcall call to helper function and use it. Rename variable as suggested.

fhahn marked 2 inline comments as done.Aug 29 2018, 2:42 PM
spatel added inline comments.Aug 29 2018, 3:01 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1381–1386

There was no real justification for this limit in D13994, but I suppose we're still ok with the transform +1 more instruction?

1395

I didn't catch the doubling of "?.5" followed by an isInteger check when I saw this the first time, so it deserves a code comment.

LGTM, but, please, wait for @spatel's agreement.

fhahn updated this revision to Diff 163280.Aug 30 2018, 1:54 AM
fhahn marked an inline comment as done.

Add comment about how ExpoA == integer + 0.5 is detected.

fhahn added inline comments.Aug 30 2018, 1:59 AM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1381–1386

IIUC the main reason for adding this limit was to avoid generating too long fmul chains. I think adding a call to sqrt() is independent of that (similar to adding a call to fdiv for negative exponents), but I can either update the comment or only generate sqrt() for smaller exponents.

spatel accepted this revision.Aug 30 2018, 5:58 AM

LGTM - see inline for minor improvements.

lib/Transforms/Utils/SimplifyLibCalls.cpp
1381–1386

The sqrt enhancement was mentioned in the original patch, so I won't hold this patch up...
But this entire transform is questionable as an IR canonicalization (instcombine). The limit was chosen arbitrarily, and it applies universally even though the optimal limit will vary based on target. We're also doing this transform regardless of whether we are optimizing for size or not.

There was a suggestion in the original patch that this should be a backend transform, and I think that was correct. Alternatively, there should be a backend reversal of this transform if the target would prefer to use a libcall instead of expanding.

This should be noted in a TODO comment here.

test/Transforms/InstCombine/pow-4.ll
138

It would be nice to get a bit more coverage from these tests by varying the exponent and data types (the transform should work with vectors too?).

This revision is now accepted and ready to land.Aug 30 2018, 5:58 AM
evandro added inline comments.Aug 30 2018, 8:40 AM
test/Transforms/InstCombine/pow-4.ll
138

Good point.

fhahn updated this revision to Diff 163349.Aug 30 2018, 8:57 AM

Thanks! Added a comment and additional test cases. Please let me know if the comment makes sense.

Yes, it makes sense.

lib/Transforms/Utils/SimplifyLibCalls.cpp
1383

I wonder if the additional multiplication and the sqrt() should be counted towards this limit, as arbitrary as it is.

Comment + extra tests look good.

Given the revert of D49273, make the getSqrtCall() diff an NFC commit ahead of this patch + generalize that for any libcall, so we make sure that we're not generating libcalls when we're not allowed to?

Comment + extra tests look good.

Given the revert of D49273, make the getSqrtCall() diff an NFC commit ahead of this patch + generalize that for any libcall, so we make sure that we're not generating libcalls when we're not allowed to?

Will do tomorrow, thanks!

fhahn added a comment.Sep 3 2018, 6:57 AM

Comment + extra tests look good.

Given the revert of D49273, make the getSqrtCall() diff an NFC commit ahead of this patch + generalize that for any libcall, so we make sure that we're not generating libcalls when we're not allowed to?

I had a look at other uses that could benefit from a general getSqrtCall, but I am not entirely sure what the scope of it should be. Also, for sqrt, we use the intrinsic when possible, and a lib call if it is available otherwise. But e.g. for the pow(2.0 ** n, x) -> exp2(n * x) transform we check if an exp2 lib func is available. If it is not, we also do not emit an intrinsic, even if it would be possible. Should the generic function behave similar to the exp2 behavior?

spatel added a comment.Sep 3 2018, 7:27 AM

Comment + extra tests look good.

Given the revert of D49273, make the getSqrtCall() diff an NFC commit ahead of this patch + generalize that for any libcall, so we make sure that we're not generating libcalls when we're not allowed to?

I had a look at other uses that could benefit from a general getSqrtCall, but I am not entirely sure what the scope of it should be. Also, for sqrt, we use the intrinsic when possible, and a lib call if it is available otherwise. But e.g. for the pow(2.0 ** n, x) -> exp2(n * x) transform we check if an exp2 lib func is available. If it is not, we also do not emit an intrinsic, even if it would be possible. Should the generic function behave similar to the exp2 behavior?

That's an interesting question. IIUC, we don't generate an exp2 intrinsic if the libcall is not available because we expect that most targets would end up lowering to that libcall anyway (they don't have hardware support for exp2). But that's probably not true for sqrt - either a compliant or estimate sqrt instruction probably is available even if the libcall is not. That question/problem is noted in the 'TODO' comment for the sqrt libcall test.

This is probably worth asking about on llvm-dev if not via another patch. Either way, I don't think we should hold this patch up while deciding how to fix that.

fhahn added a comment.Sep 3 2018, 8:15 AM

Ok thanks! So I can commit this patch as is?

spatel added a comment.Sep 3 2018, 8:31 AM

Ok thanks! So I can commit this patch as is?

Yes, no objections from me. I think we have sufficient TODOs sprinkled around that make it clear how we can improve things more if needed.

This revision was automatically updated to reflect the committed changes.
fhahn added a comment.Sep 3 2018, 10:39 AM

Great, thanks for the review!