Improve the code generation of fp_to_sint
and fp_to_uint for integer on 16-bits.
Details
Diff Detail
Event Timeline
All of the patterns you added look to me like workarounds for the DAGCombiner or legalizer not doing its job correctly
| llvm/lib/Target/AMDGPU/SIInstructions.td | ||
|---|---|---|
| 918 | This is identical to the pattern attached to the instruction definition, so this shouldn't be doing anything | |
| 923 | Ditto | |
| 928–929 | I don't see the advantage of just selecting the 16-bit result op to the 32-bit result. The legalizer should just take care of this | |
| 933 | Ditto | |
| 938 | Ditto | |
| 943 | Ditto | |
| 949–955 | Would these sexts ever reach the selector? I would expect the combiner to take care of these. I don't see tests for this | |
| llvm/test/CodeGen/AMDGPU/fp_to_uint.ll | ||
|---|---|---|
| 243–244 | Just use a single GCN: check line. | |
| llvm/test/CodeGen/AMDGPU/fptosi.f16.ll | ||
| 2 | Don't do this. Instead change GCN: to SI: on lines 5-8. GCN is supposed to be used for checks that are the same for all architectures (gfx6 onwards). | |
| llvm/test/CodeGen/AMDGPU/fptoui.f16.ll | ||
| 2 | Don't do this. | |
Thank you for your inputs.
Rebased the patch to do the same thing in the Legalizer. It seems more natural indeed.
I took the occasion to do the suggested refactoring around LowerFP_TO_SINT/LowerFP_TO_UINT to merge them into one.
For @foad, I changed the tests back to the original version, thanks for the comment.
LGTM with nits
| llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
|---|---|---|
| 2734–2737 ↗ | (On Diff #342057) | Combine into one if? |
| 2759 ↗ | (On Diff #342057) | Don't need the ternary operator, just directly pass the bool expression |
Thanks for the review. I don't have commit access to the repo, could someone do it for me?
This is identical to the pattern attached to the instruction definition, so this shouldn't be doing anything