This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/EG,CM: Implement fsqrt using recip(rsqrt(x)) instead of x * rsqrt(x)
Closed, Public

Authored by jvesely on Feb 4 2020, 6:10 PM.

Details

Summary

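The old lowering, x * rsqrt(x), might be faster on EG (RECIP_IEEE is Trans only),
but it would need extra corner-case checks: with IEEE arithmetic, x = 0 gives
0 * inf = NaN and x = +inf gives inf * 0 = NaN, where sqrt should return 0 and +inf.
recip(rsqrt(x)) gives correct corner-case behaviour and saves a register.
Fixes the OCL CTS sqrt test (1-thread, scalar) on Turks.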
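
The corner cases can be illustrated with a minimal host-side sketch. It is not the backend code from this revision: rsqrt_model, sqrt_via_mul, sqrt_via_recip and check_corner_cases are hypothetical names, and the hardware rsqrt and reciprocal instructions are assumed to behave like plain IEEE-754 float arithmetic.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the hardware reciprocal square root,
// modelled (as an assumption) with plain IEEE-754 float arithmetic.
static float rsqrt_model(float x) { return 1.0f / std::sqrt(x); }

// Old lowering: x * rsqrt(x).
static float sqrt_via_mul(float x) { return x * rsqrt_model(x); }

// New lowering: recip(rsqrt(x)).
static float sqrt_via_recip(float x) { return 1.0f / rsqrt_model(x); }

// Compares both lowerings on the corner cases; aborts on a mismatch.
static void check_corner_cases() {
  const float inf = std::numeric_limits<float>::infinity();

  // x = 0: rsqrt(0) = +inf, so the product is 0 * inf = NaN,
  // while the reciprocal is 1 / inf = 0.
  assert(std::isnan(sqrt_via_mul(0.0f)));
  assert(sqrt_via_recip(0.0f) == 0.0f);

  // x = +inf: rsqrt(inf) = 0, so the product is inf * 0 = NaN,
  // while the reciprocal is 1 / 0 = +inf.
  assert(std::isnan(sqrt_via_mul(inf)));
  assert(sqrt_via_recip(inf) == inf);

  // Ordinary inputs agree: both forms give sqrt(4) = 2.
  assert(sqrt_via_mul(4.0f) == 2.0f);
  assert(sqrt_via_recip(4.0f) == 2.0f);
}
```

Calling check_corner_cases() from a test driver exercises all three inputs. Under these assumptions the product form would need an explicit select for 0 and +inf, which is the extra checking the summary refers to.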

Event Timeline

jvesely created this revision. Feb 4 2020, 6:10 PM
Herald added a project: Restricted Project. Feb 4 2020, 6:10 PM
arsenm accepted this revision. Feb 4 2020, 6:15 PM

LGTM

This revision is now accepted and ready to land. Feb 4 2020, 6:15 PM
This revision was automatically updated to reflect the committed changes.