This is an archive of the discontinued LLVM Phabricator instance.

[AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics
ClosedPublic

Authored by spatel on Jul 5 2022, 7:22 PM.

Details

Summary

This is an alternate to D129155 that uses TTI.haveFastSqrt() to avoid a potential miscompile for programs with reads of errno. Moving the transform to AggressiveInstCombine provides access to TTI.

If a sqrt call has "nnan", that implies that the input argument is never negative because sqrt of {negative number} --> NAN.
If the argument is never negative and the call can be lowered without a libcall, then we can assume that errno accesses are unchanged after lowering, so the call can be translated to the LLVM intrinsic (which is expected to become inline code).

This affects codegen for targets like x86 that have sqrt instructions, but still have to conservatively assume that a libcall may be needed to set errno as shown in issue #52620 and issue #56383.

This patch won't solve that example - we will need to extend this to use CannotBeOrderedLessThanZero or similar, enhance that analysis for new operators, and deal with llvm.assume too.

Diff Detail

Event Timeline

spatel created this revision.Jul 5 2022, 7:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2022, 7:22 PM
spatel requested review of this revision.Jul 5 2022, 7:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2022, 7:22 PM
efriedma added inline comments.Jul 26 2022, 10:04 AM
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
445

Checking that CalledFunc->getParent() doesn't make sense; the module is never going to be null. Did you mean !CalledFunc?

Can we use the getLibFunc() overload that takes a CallBase as an argument?

spatel marked an inline comment as done.Jul 26 2022, 11:12 AM
spatel added inline comments.
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
445

I thought I had adapted this line from the checks leading up to LibCallSimplifier::optimizeSqrt(), but you're correct - that parent check is not there.
The version that uses CallBase does look better - it includes the nobuiltin check, so we don't need to repeat it here. I can add a 'nobuiltin' regression test to confirm.

spatel updated this revision to Diff 447774.Jul 26 2022, 11:34 AM
spatel marked an inline comment as done.

Patch updated:

  1. Improved checking of libcall suitability
  2. Added test for 'nobuiltin'
This revision is now accepted and ready to land.Jul 26 2022, 11:39 AM
This revision was landed with ongoing or failed builds.Jul 26 2022, 12:50 PM
This revision was automatically updated to reflect the committed changes.