Make divergent fshr legal since it is selected to v_alignbit, but
expand uniform fshr since there is no s_alignbit instruction.
Details
- Reviewers
- None
- Group Reviewers
Restricted Project
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/test/CodeGen/AMDGPU/bf16.ll | ||
---|---|---|
631 | There are lots of minor regressions like this. I will investigate. | |
llvm/test/CodeGen/AMDGPU/build_vector.ll | ||
74 | This is the intended change. In general we should select SALU instructions for uniform calculations, even if it's more instructions. In this case it's a shame that the result gets copied to a VGPR anyway... |
llvm/test/CodeGen/AMDGPU/bf16.ll | ||
---|---|---|
631 | The problem here is that really need to combine shifts and ORs into fshr post-legalization. This no longer happens automatically because we have marked fshr as Custom instead of Legal. I could do it with a target-specific OR combine, but I can't find any way to call back into helper code like MatchRotate in the generic DAGCombiner from a target-specific combine. |
llvm/test/CodeGen/AMDGPU/bf16.ll | ||
---|---|---|
631 | is there just an isLegal call that needs to be isLegalOrCustom? |
llvm/test/CodeGen/AMDGPU/bf16.ll | ||
---|---|---|
631 |
I can make that change in DAGCombiner::MatchRotate but it does not work: --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -8339,10 +8339,10 @@ SDValue DAGCombiner::MatchRotate(SDValue LHS, SDValue RHS, const SDLoc &DL) { // The target must have at least one rotate/funnel flavor. // We still try to match rotate by constant pre-legalization. // TODO: Support pre-legalization funnel-shift by constant. - bool HasROTL = hasOperation(ISD::ROTL, VT); - bool HasROTR = hasOperation(ISD::ROTR, VT); - bool HasFSHL = hasOperation(ISD::FSHL, VT); - bool HasFSHR = hasOperation(ISD::FSHR, VT); + bool HasROTL = TLI.isOperationLegalOrCustom(ISD::ROTL, VT); + bool HasROTR = TLI.isOperationLegalOrCustom(ISD::ROTR, VT); + bool HasFSHL = TLI.isOperationLegalOrCustom(ISD::FSHL, VT); + bool HasFSHR = TLI.isOperationLegalOrCustom(ISD::FSHR, VT); // If the type is going to be promoted and the target has enabled custom // lowering for rotate, allow matching rotate by non-constants. Only allow The problem is that during legalization a uniform fshr will be legalized by lowering it to shifts and ORs, but this combine will immediately kick in and combine it back into a fshr. That causes an infinite loop. Maybe the whole premise of this patch is flawed? Is it OK to say that fshr is only legal if it is divergent? Or do I have to say fshr is always legal, and then lower uniform fshr back into shift and ORs at some later stage? |
There are lots of minor regressions like this. I will investigate.