This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Correctly lower llvm.log.f32 and llvm.log10.f32
ClosedPublic

Authored by arsenm on Jun 15 2023, 7:01 AM.

Details

Reviewers
foad
jhuber6
rampitec
Pierre-vh
cdevadas
b-sumner
Group Reviewers
Restricted Project
Summary

Previously we expanded these in a fast-math way and the device
libraries were relying on this behavior. The libraries have a pending
change to switch to the new target intrinsic.

Unlike the library version, this lowering takes advantage of
no-infinities in the result overflow check.
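
As a rough illustration of the accuracy issue, the sketch below contrasts a single-multiply expansion of log(x) with one that carries ln(2) as a two-part constant and recovers the multiply's rounding error with fma. This is an assumption about the general technique, not code taken from the patch: the function names `log_fast` and `log_split` are hypothetical, and the host `std::log2` stands in for the hardware log2 instruction. It also shows why a finiteness check on the log2 result is needed unless infinities are ruled out.

```cpp
#include <cmath>

// Fast-math style expansion (assumed form): log(x) = log2(x) * ln(2),
// using one rounded float multiply.
float log_fast(float x) {
  const float ln2 = 0.693147180559945309417f;
  return std::log2(x) * ln2;
}

// More careful expansion: represent ln(2) as c + cc, where cc holds the
// bits of ln(2) that do not fit in the float c.
float log_split(float x) {
  const double ln2 = 0.693147180559945309417;
  const float c = static_cast<float>(ln2);
  const float cc = static_cast<float>(ln2 - static_cast<double>(c));

  const float y = std::log2(x);
  const float r = y * c;
  const float err = std::fma(y, c, -r);    // rounding error of y * c
  const float corr = std::fma(y, cc, err); // add the low part of ln(2)
  const float result = r + corr;

  // If y is +/-inf or NaN, the fma steps above produce NaN (inf - inf),
  // so return y itself. With no-infinities and no-NaNs guaranteed, this
  // select could be dropped.
  return std::isfinite(y) ? result : y;
}
```

Calling `log_split(INFINITY)` returns infinity through the final select. Without that select the result would be NaN, which is the kind of check a finite-only lowering can omit.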

Diff Detail

Event Timeline

arsenm created this revision. Jun 15 2023, 7:01 AM
Herald added a project: Restricted Project. Jun 15 2023, 7:01 AM
arsenm requested review of this revision. Jun 15 2023, 7:01 AM
Herald added a subscriber: wdng.
arsenm edited the summary of this revision. Jun 15 2023, 7:03 AM

Coding is fine. Does it pass OpenCL conformance?

I tested all the permutations on math_brute_force (the combinations of DAZ, fast fma, log/log2/log10, and DAG/gisel). I don't believe conformance covers the finite-only cases, though.

arsenm added a comment. Jul 5 2023, 5:32 AM

ping

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2516

TODO: Use ldexp instead

This revision is now accepted and ready to land. Jul 5 2023, 11:50 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fpow.mir