This is an archive of the discontinued LLVM Phabricator instance.

HIP: Directly use sqrt builtins instead of calling ocml (f32 case)
Abandoned · Public

Authored by arsenm on Nov 22 2022, 8:49 AM.

Details

Summary

The wrappers aren't buying anything and just add complexity.
We do not propagate fast math flags into the linked library functions,
but they'll naturally be applied if the intrinsic is directly emitted.

We also don't need to treat the native case differently, since we just
directly select the generic intrinsic anyway.
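
As a rough illustration of the two header styles discussed above, here is a minimal host-side sketch. It is not the actual HIP header code: `ocml_sqrt_f32_standin`, `sqrtf_via_wrapper`, and `sqrtf_via_builtin` are hypothetical names, and the stand-in simply calls `std::sqrt` so that the example compiles without a device library. The only real compiler feature used is `__builtin_sqrtf`, which Clang and GCC both provide.

```cpp
#include <cmath>

// Hypothetical stand-in for the device library entry point. In a real HIP
// device build this would be a function from the linked OCML bitcode; here it
// is an ordinary host function so the sketch is self-contained.
static float ocml_sqrt_f32_standin(float x) { return std::sqrt(x); }

// "Before" style: the header wrapper forwards to the library function.
// Fast-math flags set at the call site are not propagated into the body of
// an already-compiled library function.
static inline float sqrtf_via_wrapper(float x) {
  return ocml_sqrt_f32_standin(x);
}

// "After" style: the header uses the compiler builtin directly, so the
// compiler sees the square root at the call site and can apply whatever
// floating-point options are in effect there.
static inline float sqrtf_via_builtin(float x) {
  return __builtin_sqrtf(x);
}
```

On the host, with default floating-point options, both functions should return the same value for the same input; the difference the summary describes only shows up in how the device compiler treats the call.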

f64 case requires a backend change, so defer that for now.

Event Timeline

arsenm created this revision. Nov 22 2022, 8:49 AM
Herald added a project: Restricted Project. Nov 22 2022, 8:49 AM
arsenm requested review of this revision. Nov 22 2022, 8:49 AM

__builtin_sqrtf does not produce a correctly rounded result. I don't recommend this change.

It's supposed to. I'm working towards correctly handling these in the backend.

arsenm abandoned this revision. Sep 12 2023, 1:25 PM

Reposted as D158131.