This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Select V_CVT_*16_F16 more often
ClosedPublic

Authored by jpages on Apr 28 2021, 12:42 PM.

Details

Summary

Improve the code generation of fp_to_sint
and fp_to_uint for integer on 16-bits.

Diff Detail

Event Timeline

jpages created this revision.Apr 28 2021, 12:42 PM
jpages requested review of this revision.Apr 28 2021, 12:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2021, 12:42 PM

All of the patterns you added look to me like workarounds for the DAGCombiner or legalizer not doing its job correctly

llvm/lib/Target/AMDGPU/SIInstructions.td
918 ↗(On Diff #341292)

This is identical to the pattern attached to the instruction definition, so this shouldn't be doing anything

923 ↗(On Diff #341292)

Ditto

928–929 ↗(On Diff #341292)

I don't see the advantage of just selecting the 16-bit result op to the 32-bit result. The legalizer should just take care of this

933 ↗(On Diff #341292)

Ditto

938 ↗(On Diff #341292)

Ditto

943 ↗(On Diff #341292)

Ditto

949–955 ↗(On Diff #341292)

Would these sexts ever reach the selector? I would expect the combiner to take care of these. I don't see tests for this

foad added a subscriber: foad.Apr 28 2021, 3:03 PM
foad added inline comments.Apr 29 2021, 2:59 AM
llvm/test/CodeGen/AMDGPU/fp_to_uint.ll
242–243

Just use a single GCN: check line.

llvm/test/CodeGen/AMDGPU/fptosi.f16.ll
2

Don't do this. Instead change GCN: to SI: on lines 5-8.

GCN is supposed to be used for checks that are the same for all architectures (gfx6 onwards).

llvm/test/CodeGen/AMDGPU/fptoui.f16.ll
2

Don't do this.

jpages updated this revision to Diff 342057.Apr 30 2021, 3:08 PM

Thank you for your inputs.

Rebased the patch to do the same thing in the Legalizer. It seems more natural indeed.
I took the occasion to do the suggested refactoring around LowerFP_TO_SINT/LowerFP_TO_UINT to merge them into one.

For @foad, I changed the tests back to the original version, thanks for the comment.

jpages marked 3 inline comments as done.Apr 30 2021, 3:11 PM

LGTM with nits

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
2734–2737

Combine into one if?

2759

Don't need the ternary operator, just directly pass the bool expression

jpages updated this revision to Diff 342406.May 3 2021, 8:36 AM

Updated for code style

jpages marked 2 inline comments as done.May 3 2021, 8:36 AM
foad accepted this revision.May 4 2021, 1:14 AM
This revision is now accepted and ready to land.May 4 2021, 1:14 AM
jpages added a comment.May 4 2021, 1:47 PM

Thanks for the review. I don't have commit access to the repo, could someone do it for me?

This revision was landed with ongoing or failed builds.May 5 2021, 12:58 AM
Closed by commit rGa1ed39df96bc: [AMDGPU] Select V_CVT_*16_F16 more often (authored by jpages, committed by foad). · Explain Why
This revision was automatically updated to reflect the committed changes.