This is an archive of the discontinued LLVM Phabricator instance.

[amdgpu] Revise the conversion from i64 to f32.
ClosedPublic

Authored by hliao on Aug 4 2021, 3:17 PM.

Details

Summary
  • Replace 'cmp+sel' with 'umin' if possible.

Diff Detail

Event Timeline

hliao created this revision.Aug 4 2021, 3:17 PM
hliao requested review of this revision.Aug 4 2021, 3:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2021, 3:17 PM
foad added inline comments.Aug 5 2021, 6:50 AM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2469

The comment says 33 if Hi and Lo have the same sign, but the expression gives 33 if they have different sign. Do you want 33 - ((Lo ^ Hi) >> 31)?

hliao updated this revision to Diff 364683.Aug 5 2021, 8:12 PM

Fix the calculation of MaxShAmt.

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2469

Good catch. That's wrong. Corrected in the latest revision.

foad accepted this revision.Aug 6 2021, 2:09 AM

LGTM, thanks!

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2496–2498

Another possible micro-optimization is to push the -1 into the umin, to shorten the critical path: umin((ffbh hi)-1, ((hi^lo) ashr 31)+32)

This revision is now accepted and ready to land.Aug 6 2021, 2:09 AM
This revision was automatically updated to reflect the committed changes.
hliao marked an inline comment as done.Aug 6 2021, 2:02 PM