This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Convert rcp to rcp_iflag
ClosedPublic

Authored by rampitec on Jun 25 2018, 2:26 PM.

Details

Summary

If the source of an rcp instruction is the result of any conversion from
an integer, convert it into an rcp_iflag instruction. When the argument of
a single-precision rcp represents an integral number, no FP exception
other than division by zero can ever occur.
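
The reasoning in the summary can be checked with a small host-side sketch. It is not part of the patch: `referenceRcp` is a hypothetical stand-in for the hardware rcp (an ordinary IEEE single-precision division is assumed), and the helper names are invented for illustration. The point is that the largest magnitude an integer conversion can produce is 2^64, so the reciprocal stays well inside the normal range unless the input is zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the hardware rcp: plain IEEE single-precision
// division. The real instruction may differ in rounding; this is an assumption.
inline float referenceRcp(float x) { return 1.0f / x; }

// Returns true when 1/x is "benign" for an integer-valued float x:
// an infinity for x == 0 (the division-by-zero case), otherwise a normal
// finite number (no NaN, no overflow, no denormal result).
inline bool rcpIsBenign(float x) {
  float r = referenceRcp(x);
  if (x == 0.0f)
    return std::isinf(r);
  return std::fpclassify(r) == FP_NORMAL;
}

// Convert an integer to float first, mirroring uint_to_fp / sint_to_fp.
inline bool checkUnsigned(std::uint64_t v) {
  return rcpIsBenign(static_cast<float>(v));
}
inline bool checkSigned(std::int64_t v) {
  return rcpIsBenign(static_cast<float>(v));
}
```

Calling `checkUnsigned` with the maximum 64-bit value, or `checkSigned` with the minimum, exercises the extreme magnitudes; both should report a normal result under the stated assumptions.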

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision. Jun 25 2018, 2:26 PM
arsenm added inline comments. Jun 25 2018, 11:57 PM
lib/Target/AMDGPU/SIISelLowering.cpp
6641 ↗(On Diff #152775)

Preserve fast math flags?

6644 ↗(On Diff #152775)

This can just return SDValue(). The base PerformDAGCombine isn't going to do anything here.

rampitec added inline comments. Jun 26 2018, 12:05 AM
lib/Target/AMDGPU/SIISelLowering.cpp
6641 ↗(On Diff #152775)

Makes sense.

6644 ↗(On Diff #152775)

Actually no: there is combining code in the parent, and omitting this call causes test regressions.

arsenm added inline comments. Jun 26 2018, 12:35 AM
lib/Target/AMDGPU/SIISelLowering.cpp
6644 ↗(On Diff #152775)

Oh, I see there is constant folding there already.

It would probably be better then to either factor that into a function which can be called from here, or to just handle all of this in one performRcpCombine.
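
The suggestion above can be illustrated with a toy model. This is only a sketch on a hypothetical expression tree, not LLVM's SelectionDAG API: `Op`, `Node`, `makeLeaf`, `makeUnary`, and `foldConstantRcp` are invented names, and only `performRcpCombine` echoes the name used in the review. It shows one combine function that first tries the shared constant fold, then rewrites rcp of an integer conversion to the iflag form while carrying the node's flags over.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical opcodes for a toy tree; these are not LLVM's ISD enums.
enum class Op { Constant, IntValue, SIntToFP, UIntToFP, Rcp, RcpIFlag };

struct Node {
  Op op;
  float value;                   // meaningful only for Op::Constant
  unsigned flags;                // stand-in for fast-math flags
  std::shared_ptr<Node> operand; // meaningful only for unary ops
  Node(Op o, float v, unsigned f, std::shared_ptr<Node> x)
      : op(o), value(v), flags(f), operand(std::move(x)) {}
};
using NodePtr = std::shared_ptr<Node>;

NodePtr makeLeaf(Op op, float v = 0.0f) {
  return std::make_shared<Node>(op, v, 0u, NodePtr());
}
NodePtr makeUnary(Op op, NodePtr x, unsigned flags = 0u) {
  return std::make_shared<Node>(op, 0.0f, flags, std::move(x));
}

// Shared helper: fold rcp(constant) to a constant, else return null.
NodePtr foldConstantRcp(const NodePtr &n) {
  if (n->op == Op::Rcp && n->operand && n->operand->op == Op::Constant)
    return makeLeaf(Op::Constant, 1.0f / n->operand->value);
  return nullptr;
}

// Single entry point: constant fold first, then the iflag rewrite.
// Returns null when no combine applies.
NodePtr performRcpCombine(const NodePtr &n) {
  if (n->op != Op::Rcp || !n->operand)
    return nullptr;
  if (NodePtr folded = foldConstantRcp(n))
    return folded;
  Op src = n->operand->op;
  if (src == Op::SIntToFP || src == Op::UIntToFP)
    return makeUnary(Op::RcpIFlag, n->operand, n->flags); // keep flags
  return nullptr;
}
```

Keeping the fold in its own helper means either structure discussed in the review would work: the helper can be called from a target-specific combine, or everything can live in the one function as shown.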

rampitec updated this revision to Diff 152891. Jun 26 2018, 8:07 AM
rampitec marked 2 inline comments as done.
rampitec marked 3 inline comments as done. Jun 26 2018, 3:06 PM
arsenm accepted this revision. Jun 27 2018, 5:16 AM

LGTM

This revision is now accepted and ready to land. Jun 27 2018, 5:16 AM
This revision was automatically updated to reflect the committed changes.