This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Use more accurate fast f64 fdiv
ClosedPublic

Authored by arsenm on Jan 20 2021, 3:30 PM.

Details

Reviewers
rampitec
b-sumner
Summary

A raw v_rcp_f64 isn't accurate enough, so start applying correction.
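
To illustrate what "applying correction" to a reciprocal can look like, here is a minimal host-side sketch of a Newton-Raphson refinement built from fused multiply-adds. It is an assumption-laden illustration, not necessarily the sequence this patch emits: approx_rcp and fast_fdiv are hypothetical names, and the single-precision seed is only a stand-in for a low-accuracy hardware reciprocal, not a model of the actual error of v_rcp_f64.

```cpp
#include <cmath>

// Hypothetical stand-in for an inexact hardware reciprocal: computing the
// reciprocal in single precision leaves only about 24 good bits.
double approx_rcp(double y) {
  float r = 1.0f / static_cast<float>(y);
  return static_cast<double>(r);
}

// Hypothetical fast x / y: refine the reciprocal with two Newton-Raphson
// steps, then correct the quotient once using its residual.
double fast_fdiv(double x, double y) {
  double r = approx_rcp(y);
  double e = std::fma(-y, r, 1.0);  // e = 1 - y*r, error of the estimate
  r = std::fma(e, r, r);            // r = r + r*e
  e = std::fma(-y, r, 1.0);         // recompute the error for the refined r
  r = std::fma(e, r, r);            // second refinement step
  double q = x * r;                 // first quotient estimate
  double rem = std::fma(-y, q, x);  // residual x - y*q
  return std::fma(rem, r, q);       // corrected quotient
}
```

Each Newton-Raphson step roughly doubles the number of correct bits in the reciprocal, which is why a coarse seed followed by a few FMAs can approach full double precision for inputs that stay well inside the normal range.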

Event Timeline

arsenm created this revision. Jan 20 2021, 3:30 PM
arsenm requested review of this revision. Jan 20 2021, 3:30 PM
Herald added a project: Restricted Project. Jan 20 2021, 3:30 PM
Herald added a subscriber: wdng.
rampitec accepted this revision. Jan 20 2021, 3:42 PM

LGTM. Please also check the correction sequence with Brian.

This revision is now accepted and ready to land. Jan 20 2021, 3:42 PM

That sequence will give a very accurate result as long as overflow and underflow are avoided. LGTM.
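
The overflow caveat can be seen in a small host-side sketch. This is an illustration under stated assumptions, not the patch's code: div_via_rcp is a hypothetical name, and an exact 1.0 / y stands in for the reciprocal instruction. When the divisor is subnormal, its reciprocal overflows to infinity and the refinement step turns that into NaN, even though the true quotient is representable.

```cpp
#include <cmath>

// Hypothetical reciprocal-based division with one Newton-Raphson step.
double div_via_rcp(double x, double y) {
  double r = 1.0 / y;               // stand-in for a reciprocal instruction
  double e = std::fma(-y, r, 1.0);  // e = 1 - y*r; -inf when r overflowed
  r = std::fma(e, r, r);            // (-inf * inf) + inf is NaN
  return x * r;
}
```

For example, div_via_rcp(1e-310, 1e-310) yields NaN although the exact quotient is 1.0, while an ordinary input pair such as (6.0, 3.0) comes out within rounding error of 2.0.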

foad added a subscriber: foad. Jan 22 2021, 6:10 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3115

Can you add a comment here and/or in the sdag equivalent showing what the code you're building here will look like, and preferably where it came from and what kind of accuracy you expect from it?