Instead of expanding G_BUILD_VECTOR_TRUNC into G_OR + G_AND + G_SHL, transform it into a G_BITCAST, or simply replace the registers,
when one operand of G_BUILD_VECTOR_TRUNC is undef or is derived directly from the other operand.
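To make the summary concrete, here is a minimal sketch in plain C++ (not LLVM code) of the arithmetic behind the fold. It assumes a `<2 x s16>` result packed into 32 bits, and it assumes the "derived from the other operand" case means the high operand is a 16-bit logical right shift of the same source, which the later G_LSHR discussion suggests. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models the generic expansion: G_OR(G_AND(lo, 0xffff), G_SHL(hi, 16)).
constexpr uint32_t buildVectorTrunc(uint32_t lo, uint32_t hi) {
  return (lo & 0xffffu) | (hi << 16);
}

// Models the special case where the high operand is G_LSHR(src, 16) and the
// low operand is src itself (assumed pattern, see the note above).
constexpr uint32_t buildFromSameSource(uint32_t src) {
  return buildVectorTrunc(src, src >> 16);
}

// The packed result is bit-identical to the source, so the whole
// expansion can be replaced by a bitcast of src.
void checkFoldIsIdentity(uint32_t src) {
  assert(buildFromSameSource(src) == src);
}
```

Because the low half keeps bits 0..15 and the shifted high half restores bits 16..31, no mask, shift, or OR is needed in this case.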
Reviewers: foad, arsenm, mbrkusanin, Pierre-vh
I think it would be better to do this by using a selection of existing combines if possible. For example binop_left_undef_to_zero + undef_to_int_zero + right_identity_zero should do most of this.
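A rough sketch of how chaining those generic rules could collapse the expansion, using a toy expression tree in plain C++ rather than the real combiner. The rule behaviour modelled here is an assumption based on my reading of the generic combines (`binop_left_undef_to_zero` on G_SHL, `undef_to_int_zero` on G_AND, `right_identity_zero` on G_OR); the `Expr` type and all helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy expression tree for illustration only; this is not LLVM's MachineInstr.
enum class Op { Reg, Undef, Const, And, Shl, Or };

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
  Op op;
  uint32_t value; // register id for Reg, immediate for Const, unused otherwise
  ExprPtr lhs, rhs;
};

ExprPtr leaf(Op op, uint32_t value = 0) {
  return std::make_shared<Expr>(Expr{op, value, nullptr, nullptr});
}

ExprPtr bin(Op op, ExprPtr lhs, ExprPtr rhs) {
  return std::make_shared<Expr>(Expr{op, 0, std::move(lhs), std::move(rhs)});
}

bool isZero(const ExprPtr &e) { return e->op == Op::Const && e->value == 0; }

// Generic expansion of a <2 x s16> build_vector_trunc:
// or(and(lo, 0xffff), shl(hi, 16)).
ExprPtr expand(ExprPtr lo, ExprPtr hi) {
  return bin(Op::Or, bin(Op::And, std::move(lo), leaf(Op::Const, 0xffff)),
             bin(Op::Shl, std::move(hi), leaf(Op::Const, 16)));
}

// Applies the three modelled rules bottom-up.
ExprPtr simplify(const ExprPtr &e) {
  assert(e && "null expression");
  if (!e->lhs)
    return e; // leaves are already simplified
  ExprPtr l = simplify(e->lhs);
  ExprPtr r = simplify(e->rhs);
  // Modelled binop_left_undef_to_zero: shl(undef, x) -> 0.
  if (e->op == Op::Shl && l->op == Op::Undef)
    return leaf(Op::Const, 0);
  // Modelled undef_to_int_zero: and with an undef operand -> 0.
  if (e->op == Op::And && (l->op == Op::Undef || r->op == Op::Undef))
    return leaf(Op::Const, 0);
  // Modelled right_identity_zero: or(x, 0) -> x.
  if (e->op == Op::Or && isZero(r))
    return l;
  return bin(e->op, l, r);
}
```

In this model an undef high operand reduces the expansion to `and(lo, 0xffff)`. An undef low operand only reaches `or(0, shl(hi, 16))`, since the zero ends up on the left; the model has no rule that moves constants to the right-hand side, which may be one reason the suggestion says "most of this" rather than all of it.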
llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp (On Diff #403615)
| Lines | Comment |
|---|---|
| 334–335 | This combine feels way too targeted. We shouldn't have to have combines that look for undefs in the operands of other things. Those operations on undef should have folded out on their own. |
| 345–346 | Shouldn't hardcode these specific values, should compute based on the type bitwidth. |
| 351 | No else after return. |
| 355 | No else after return. |
| 362 | Don't need this check. |
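As an illustration of the bitwidth and else-after-return comments, a minimal sketch in plain C++ that derives the shift amount and mask from the element width instead of hardcoding 16 and 0xffff. The struct and function names are hypothetical, and the sketch assumes a 32-bit packed value; the actual patch would query the LLT of the operands.

```cpp
#include <cassert>
#include <cstdint>

// Shift amount and mask needed to pack two elements of eltBits each.
struct PackConstants {
  uint32_t shiftAmt;
  uint32_t mask;
};

// eltBits is the width of each truncated element (16 for <2 x s16>).
PackConstants getPackConstants(uint32_t eltBits) {
  assert(eltBits > 0 && eltBits < 32 && "element must be narrower than 32 bits");
  return {eltBits, (uint32_t{1} << eltBits) - 1};
}

// Returns true if shiftImm selects exactly the high element.
// Written with early returns, so no else follows a return.
bool isHighHalfShift(uint64_t shiftImm, uint32_t eltBits) {
  if (eltBits == 0 || eltBits >= 32)
    return false;
  return shiftImm == eltBits;
}
```

For `eltBits == 16` this gives a shift of 16 and a mask of 0xffff, matching the hardcoded values, while the same code stays correct for other element widths.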
Instead of using a combiner in the legalizer, change the implementation of apply mapping for G_BUILD_VECTOR_TRUNC in regbankselect.
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (On Diff #405948)
| Lines | Comment |
|---|---|
| 2612 | I think we should have combined out any implicit def inputs into something else. Trying to optimize as part of a lowering expansion is generally a last resort strategy. |
Instead of a combiner in the regbankselect pass, add a combiner in the amdgpu-postlegalizer-combiner pass.
llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp
| Lines | Comment |
|---|---|
| 386–396 | I guess you did track that from before, so it doesn't really matter. |
| 322 | If the second operand of G_LSHR is a copy instruction, mi_match would return false; this way we don't have to worry about that. |
| 322 | Extra copies should be separately folded out by copy combines. |
| 322 | No, you were right, no need for getDefIgnoringCopies. |
Remove a lambda that is used only once and a part of the code that covers only a specially constructed test.
clang-format: please reformat the code