This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Implement new 2ulp fdiv lowering
Closed, Public

Authored by arsenm on Jul 19 2023, 12:34 PM.

Details

Reviewers
foad
b-sumner
Pierre-vh
rampitec
Group Reviewers
Restricted Project
Summary

Extends the new frexp-scaled reciprocal lowering to the general
division case. The reciprocal case is the same expansion with the
frexp of 1.0 constant folded, so the code could probably be cleaned
up to rely on that constant folding.
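
For readers unfamiliar with the scaling trick, here is a rough host-side
sketch of the idea in C++. It is not the code from this patch: the names
frexp_div, frexp_rcp and approx_rcp are hypothetical, and approx_rcp uses an
exact 1.0f / x as a stand-in for the hardware reciprocal instruction, so it
does not model the real accuracy. It assumes finite, nonzero inputs.

```cpp
#include <cmath>

// Hypothetical stand-in for the approximate hardware reciprocal.
// Assumption: the real lowering would use the target's rcp instruction here.
static float approx_rcp(float x) { return 1.0f / x; }

// Reciprocal: split x into mant * 2^exp with |mant| in [0.5, 1), take the
// reciprocal of the mantissa only, then undo the scaling with ldexp.
float frexp_rcp(float x) {
  int exp = 0;
  float mant = std::frexp(x, &exp);
  return std::ldexp(approx_rcp(mant), -exp);
}

// General division: the same idea applied to both operands.
// lhs / rhs == (mant_lhs / mant_rhs) * 2^(exp_lhs - exp_rhs)
float frexp_div(float lhs, float rhs) {
  int exp_lhs = 0;
  int exp_rhs = 0;
  float mant_rhs = std::frexp(rhs, &exp_rhs);
  float mant_lhs = std::frexp(lhs, &exp_lhs);
  // Both mantissas are in [0.5, 1), so the reciprocal and the product stay
  // in a safe range with no overflow or denormal intermediate values.
  float quot = mant_lhs * approx_rcp(mant_rhs);
  return std::ldexp(quot, exp_lhs - exp_rhs);
}
```

With lhs == 1.0f, frexp gives a mantissa of 0.5 and an exponent of 1, so
frexp_div(1.0f, x) reduces to ldexp(0.5 * rcp(mant), 1 - exp), which is the
same value as frexp_rcp(x). That is the constant folding mentioned above.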

Improves results on the IEEE path for the default OpenCL division.
Previously we only emitted the fdiv.fast intrinsic, which uses explicit
range checks, at a 2.5 ulp accuracy threshold with DAZ. This gives us
a better fast option with the default IEEE behavior.

Diff Detail

llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fdiv.ll