This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Refine rcp/rsq intrinsic folding for modern FP rules
ClosedPublic

Authored by arsenm on May 22 2020, 5:14 AM.

Details

Summary

We have to assume undef could be an snan, which would need quieting, so
returning qnan is safer than returning undef. Also take strictfp into
account, and don't care whether the result was rounded.
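
The reasoning in the summary can be illustrated with a small model. This is only a sketch under stated assumptions, not the code from the patch: f32 values are represented as raw 32-bit patterns, `UNDEF` is a hypothetical stand-in for an IR undef operand, and `fold_rcp_undef` is an illustrative name. It shows that an undef operand may be a signaling NaN, that quieting such a value yields a quiet NaN, and that folding directly to the canonical quiet NaN is therefore consistent with one legal choice for the undef value.

```python
# Hypothetical model: f32 values are raw 32-bit patterns, not LLVM objects.
EXP_MASK = 0x7F800000        # all exponent bits set means Inf or NaN
MANT_MASK = 0x007FFFFF       # a non-zero mantissa distinguishes NaN from Inf
QUIET_BIT = 0x00400000       # top mantissa bit: set for quiet NaNs
CANONICAL_QNAN = 0x7FC00000  # canonical quiet NaN bit pattern

UNDEF = object()  # stand-in for an IR undef operand (assumption)


def is_nan(bits):
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_snan(bits):
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def quiet(bits):
    # Quieting sets the quiet bit of a NaN and leaves other values alone.
    return bits | QUIET_BIT if is_nan(bits) else bits


def fold_rcp_undef(operand):
    """Return the folded bit pattern for rcp(undef), or None if not undef."""
    if operand is UNDEF:
        # undef could have been an snan; the quieted result is a qnan,
        # so fold to a concrete qnan instead of propagating undef.
        return CANONICAL_QNAN
    return None
```

In this model `quiet(0x7F800001)` turns a signaling NaN into a quiet one, and `fold_rcp_undef(UNDEF)` returns a concrete quiet NaN rather than leaving the result undefined.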

Event Timeline

arsenm created this revision. May 22 2020, 5:14 AM
Herald added a project: Restricted Project. May 22 2020, 5:14 AM
foad added a comment. May 22 2020, 6:11 AM

We have to assume undef could be an snan, which would need quieting so returning qnan is safer than undef.

So you've chosen to optimize assuming that an undef input was a nan. Is that better than assuming it was something more ordinary like 0?

qnan matches the current fdiv handling. nan also enables more of the using operations to be folded out, more consistently.
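
The point about folding users can be sketched with ordinary floating-point arithmetic. This is an illustration only, using Python floats as a stand-in for IR values; `fadd`, `fmul`, and the value of `x` are hypothetical. A NaN operand propagates through arithmetic regardless of the other operand, so the users of a NaN result can themselves be folded to NaN, whereas a 0 generally leaves a result that still depends on the other operand.

```python
import math


def fadd(a, b):
    return a + b


def fmul(a, b):
    return a * b


x = 3.5  # arbitrary value standing in for an unknown runtime operand

# With a NaN input, every dependent operation is NaN whatever x is.
nan_result = fmul(fadd(math.nan, x), 2.0)

# With a 0 input, the result still depends on x and cannot be folded
# to a constant without knowing x.
zero_result = fadd(0.0, x)
```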

rampitec accepted this revision. May 22 2020, 10:21 AM
This revision is now accepted and ready to land. May 22 2020, 10:21 AM