This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Remove pointless libcall recognition of native_{divide|recip}
ClosedPublic

Authored by arsenm on Jul 31 2023, 5:15 AM.

Details

Reviewers
rampitec
vpykhtin
jhuber6
dfukalov
yaxunl
Group Reviewers
Restricted Project
Summary

This was trying to constant fold these calls, and also turn some of
them into a regular fmul/fdiv. There's no point in doing that; the
underlying library implementation should be using those in the first
place. Even when the library does use the rcp intrinsics, the backend
handles constant folding of those. This was also only performing the
folds under overly strict fast-everything-is-required conditions.
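
For illustration only, here is a rough sketch in plain C of the kind of
library implementation the argument above assumes. The function names
are hypothetical stand-ins, not the real device library entry points,
and the bodies are an assumption about what a trivial implementation
could look like, not a copy of any actual library source.

```c
/* Hypothetical stand-ins for native_recip / native_divide.
 * Assumption: the library body is already an ordinary fdiv/fmul,
 * so once it is linked and inlined the generic optimizer sees plain
 * arithmetic and can constant fold it without libcall recognition. */

float sketch_native_recip(float x) {
    /* A plain reciprocal: a single fdiv. */
    return 1.0f / x;
}

float sketch_native_divide(float x, float y) {
    /* Division written as multiply by reciprocal: fdiv, then fmul. */
    return x * (1.0f / y);
}
```

With bodies like these, a call with constant arguments folds after
inlining just like any other floating-point expression, which is why
recognizing the call by name adds nothing.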

The one possible plus this gained over linking in the library is that,
if you were using all fast math flags, it would propagate them to the
new instructions. We could address this in the library by adding more
fast math flags to the native implementations.

The constant fold case also had no test coverage.

Diff Detail