Improve the code generation of fp_to_sint
and fp_to_uint for integer on 16-bits.
Details
Diff Detail
Event Timeline
All of the patterns you added look to me like workarounds for the DAGCombiner or legalizer not doing its job correctly
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
918 ↗ | (On Diff #341292) | This is identical to the pattern attached to the instruction definition, so this shouldn't be doing anything |
923 ↗ | (On Diff #341292) | Ditto |
928–929 ↗ | (On Diff #341292) | I don't see the advantage of just selecting the 16-bit result op to the 32-bit result. The legalizer should just take care of this |
933 ↗ | (On Diff #341292) | Ditto |
938 ↗ | (On Diff #341292) | Ditto |
943 ↗ | (On Diff #341292) | Ditto |
949–955 ↗ | (On Diff #341292) | Would these sexts ever reach the selector? I would expect the combiner to take care of these. I don't see tests for this |
llvm/test/CodeGen/AMDGPU/fp_to_uint.ll | ||
---|---|---|
242–243 | Just use a single GCN: check line. | |
llvm/test/CodeGen/AMDGPU/fptosi.f16.ll | ||
2 | Don't do this. Instead change GCN: to SI: on lines 5-8. GCN is supposed to be used for checks that are the same for all architectures (gfx6 onwards). | |
llvm/test/CodeGen/AMDGPU/fptoui.f16.ll | ||
2 | Don't do this. |
Thank you for your inputs.
Rebased the patch to do the same thing in the Legalizer. It seems more natural indeed.
I took the occasion to do the suggested refactoring around LowerFP_TO_SINT/LowerFP_TO_UINT to merge them into one.
For @foad, I changed the tests back to the original version, thanks for the comment.
Thanks for the review. I don't have commit access to the repo, could someone do it for me?
clang-tidy: warning: invalid case style for function 'LowerFP_TO_INT' [readability-identifier-naming]
not useful