This is an archive of the discontinued LLVM Phabricator instance.

[amdgpu] Improve the conversion from f32 to i64.
ClosedPublic

Authored by hliao on Jun 16 2021, 4:39 PM.

Details

Summary
  • Apply the same principle as the f64-to-i64 conversion, with the extra pre- and post-processing that f32 requires. This reduces the conversion sequence by half compared to the legacy lowering.
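
For readers unfamiliar with the approach, here is a scalar C++ sketch of the idea as I understand it from the summary and the review discussion: truncate, take the absolute value, split the value into high and low 32-bit halves with a floor and a fused multiply-add, then restore the sign. The function name `f32_to_i64_sketch` and the constants are illustrative assumptions, not identifiers from the patch, and the actual lowering emits DAG/MIR nodes rather than scalar code. Out-of-range inputs are not handled here.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of the signed f32 -> i64 lowering; this is an
// illustration only, not the code from AMDGPUISelLowering.cpp.
int64_t f32_to_i64_sketch(float val) {
    const float kTwo32 = 4294967296.0f;        // 2^32, exact in f32
    const float kInvTwo32 = 1.0f / kTwo32;     // 2^-32, exact in f32

    // Pre-processing: record the sign as an all-ones or all-zeros mask.
    uint32_t bits;
    std::memcpy(&bits, &val, sizeof bits);
    const uint64_t sign = (bits >> 31) ? ~uint64_t{0} : uint64_t{0};

    // Work on the magnitude so the low half stays exactly representable.
    const float tf = std::fabs(std::trunc(val));

    // Same split as the f64 path: hi = floor(tf * 2^-32), lo = tf - hi * 2^32.
    const float hif = std::floor(tf * kInvTwo32);
    const float lof = std::fma(hif, -kTwo32, tf);

    const uint32_t hi = static_cast<uint32_t>(hif);
    const uint32_t lo = static_cast<uint32_t>(lof);
    const uint64_t mag = (static_cast<uint64_t>(hi) << 32) | lo;

    // Post-processing: conditional two's-complement negate, (mag ^ s) - s.
    return static_cast<int64_t>((mag ^ sign) - sign);
}
```

The absolute value matters for f32 because, without it, a small negative input such as -3 would produce a low half of 2^32 - 3, which does not fit in the 24-bit f32 significand. For example, `f32_to_i64_sketch(-3.7f)` yields -3 and `f32_to_i64_sketch(5e9f)` yields 5000000000.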

Diff Detail

Event Timeline

hliao created this revision. Jun 16 2021, 4:39 PM
hliao requested review of this revision. Jun 16 2021, 4:39 PM
Herald added a project: Restricted Project. Jun 16 2021, 4:39 PM

Can you also apply this to the GlobalISel version? That was a direct port.

foad added a reviewer: foad. Jun 17 2021, 1:25 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2595

LowerFP_TO_INT64 might be a better name?

2614

Only do this if Signed?

2653

Only do this if Signed?

hliao updated this revision to Diff 352748. Jun 17 2021, 9:01 AM
  • Add GlobalISel support.
  • Revise the method name following the suggestion.
hliao added inline comments. Jun 17 2021, 9:05 AM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2614

I need to verify this with the OpenCL (OCL) compliance tests. Although out-of-range conversion is undefined behavior, I still want to double-check.

hliao updated this revision to Diff 352802. Jun 17 2021, 11:24 AM

Apply the abs/sign-flip only to fptosi on f32.

hliao marked 3 inline comments as done. Jun 17 2021, 11:26 AM
hliao updated this revision to Diff 352809. Jun 17 2021, 11:36 AM

Fix typos.

foad accepted this revision. Jun 18 2021, 2:21 AM

LGTM, thanks!

This revision is now accepted and ready to land. Jun 18 2021, 2:21 AM
This revision was landed with ongoing or failed builds. Jun 19 2021, 9:47 AM
This revision was automatically updated to reflect the committed changes.