This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Correctly lower llvm.sqrt.f32
ClosedPublic

Authored by arsenm on Aug 16 2023, 4:32 PM.

Details

Reviewers
rampitec
foad
Group Reviewers
Restricted Project
Summary

Make codegen emit correctly rounded sqrt by default.

Emit the fast (but only somewhat fast) expansion in AMDGPUCodeGenPrepare
based on !fpmath, as in the fdiv case. Work around visitation-ordering
problems caused by AMDGPUCodeGenPrepare using forward iteration instead
of a well-behaved combiner.
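
As an illustration of the !fpmath mechanism mentioned above, here is a minimal IR sketch. The function names and the 2.0 ULP bound are assumptions chosen for the example and are not taken from the patch or its tests; the accuracy threshold at which AMDGPUCodeGenPrepare actually picks the faster expansion is not stated in this summary.

```llvm
declare float @llvm.sqrt.f32(float)

; No !fpmath metadata: per the summary, codegen should now emit a
; correctly rounded sqrt for this call by default.
define float @sqrt_default(float %x) {
  %r = call float @llvm.sqrt.f32(float %x)
  ret float %r
}

; !fpmath attached: the caller states that a result within the given
; number of ULPs is acceptable, which permits a less accurate lowering.
define float @sqrt_relaxed(float %x) {
  %r = call float @llvm.sqrt.f32(float %x), !fpmath !0
  ret float %r
}

; Hypothetical bound for illustration: maximum relative error of 2.0 ULPs.
!0 = !{float 2.0}
```

The metadata only relaxes the accuracy requirement; a backend is still free to emit the correctly rounded sequence for the second function.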

Event Timeline

arsenm created this revision. Aug 16 2023, 4:32 PM
Herald added a project: Restricted Project. Aug 16 2023, 4:32 PM
arsenm requested review of this revision. Aug 16 2023, 4:32 PM
Herald added a subscriber: wdng.
rampitec accepted this revision. Aug 30 2023, 10:00 AM
This revision is now accepted and ready to land. Aug 30 2023, 10:00 AM

Failed tests on the llvm-clang-x86_64-expensive-checks-ubuntu builder:

  • LLVM::amdgpu-codegenprepare-fdiv.ll
  • LLVM::fsqrt.f32.ll

https://lab.llvm.org/buildbot/#/builders/104/builds/13361

llvm/test/CodeGen/AMDGPU/fdiv_flags.f32.ll