This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix more unsafe rsq formation
Closed, Public

Authored by arsenm on Aug 16 2023, 10:18 AM.

Details

Reviewers
foad
rampitec
Group Reviewers
Restricted Project
Summary

Introducing rsq based on contract flags alone is wrong; it also
requires some level of approximate functions. AMDGPUCodeGenPrepare
should already handle the f32 cases with appropriate flags, and I
don't see how new cases would arise during legalization (other than
those involving the rcp intrinsic, which instcombine tries to
handle). AMDGPUCodeGenPrepare does need to learn better handling of
rcp/rsq for f64 though, which we never bothered to handle well.
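
To illustrate the flag requirement described above, here is a minimal
sketch in Python. It is not the code from this diff: the
FastMathFlags class and may_form_rsq function are hypothetical names,
and the rule itself (contract on both operations plus afn on the sqrt)
is one reading of the summary, not a statement of what the backend
implements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FastMathFlags:
    """Hypothetical stand-in for per-instruction fast-math flags."""
    contract: bool = False  # operations may be fused into one
    afn: bool = False       # approximate math functions are allowed


def may_form_rsq(div_flags: FastMathFlags, sqrt_flags: FastMathFlags) -> bool:
    """Assumed rule: fold 1.0 / sqrt(x) into rsq only when both
    operations allow contraction and the sqrt may be approximated."""
    contractable = div_flags.contract and sqrt_flags.contract
    return contractable and sqrt_flags.afn


# contract alone is not enough under this reading of the summary
print(may_form_rsq(FastMathFlags(contract=True), FastMathFlags(contract=True)))
# contract on both operations plus afn on the sqrt permits the fold
print(may_form_rsq(FastMathFlags(contract=True),
                   FastMathFlags(contract=True, afn=True)))
```

Under these assumptions the first call prints False and the second
prints True, which is the distinction the summary draws between
contract and approximate-function permission.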

Removes another obstacle to correctly lowering sqrt.

Diff Detail