This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] loosen FMF for sqrt(X) * sqrt(X) --> X
ClosedPublic

Authored by spatel on Feb 26 2018, 7:47 AM.

Details

Summary

If my fast-math understanding is correct, 'reassoc' alone is not enough because that doesn't give us the freedom to get the negative number cases wrong.

I considered that we might not even need 'reassoc' here, but if we eliminate the sqrt calls, then we may differ in the last bit by eliminating the rounding that occurs in those calls.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Feb 26 2018, 7:47 AM
scanon added a comment.Mar 5 2018, 8:55 AM

IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.

arsenm added a comment.Mar 5 2018, 9:02 AM

IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.

Although IMO if we're fixing all of the math in LLVM I would like to fix this. At least do something like ctlz/cttz for whether negative is undef

spatel added a comment.Mar 5 2018, 9:16 AM

IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.

Although IMO if we're fixing all of the math in LLVM I would like to fix this. At least do something like ctlz/cttz for whether negative is undef

Sqrt was fixed:
https://reviews.llvm.org/D28797

IIRC Intrinsic::sqrt is undef for negative inputs (unlike the sqrt libcall), so we don't need FMF.noNaNs to license this transformation.

Although IMO if we're fixing all of the math in LLVM I would like to fix this. At least do something like ctlz/cttz for whether negative is undef

Sqrt was fixed:
https://reviews.llvm.org/D28797

Also, note that we convert sqrt and other libcalls to the LLVM intrinsics when possible in clang:
D39204 (and follow-up commits listed there)
...so we don't have to muddy the code looking for sqrt libcall patterns too. If errno can be set by a libcall, then no amount of FMF should override that; if errno can't be set by the libcall, it should have been converted to the LLVM intrinsic.

efriedma added inline comments.Mar 12 2018, 12:06 PM
test/Transforms/InstSimplify/fast-math.ll
219 ↗(On Diff #135904)

Maybe add a test that we don't do this without the relevant fast-math flags?

spatel updated this revision to Diff 138080.Mar 12 2018, 12:54 PM
spatel marked an inline comment as done.

Patch updated:
Include negative tests to show that we're only doing the transform when all 3 of the required FMF are present.

wristow accepted this revision.Mar 15 2018, 6:42 PM

LGTM

This revision is now accepted and ready to land.Mar 15 2018, 6:42 PM
This revision was automatically updated to reflect the committed changes.