We don't have real demanded bits support for MULHU, but we can
still use the known bits based constant folding support at the end
of SimplifyDemandedBits to simplify a MULHU. This helps with cases
where we know the LHS and RHS have enough leading zeros so that
the high multiply result is always 0.
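The leading-zeros argument can be checked with a small standalone sketch. This is not the DAGCombiner code; `countLeadingZeros32`, `mulhu32`, and `highHalfKnownZero` are hypothetical helpers, and a 32-bit width is assumed for illustration. If the leading-zero counts of both operands sum to at least the bit width, the product fits in the low half and the high half must be 0.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading zero bits in a 32-bit value (32 when V is 0).
unsigned countLeadingZeros32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Reference MULHU: high 32 bits of the unsigned 64-bit product.
uint32_t mulhu32(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// Hypothetical helper: true when the leading-zero counts alone prove that
// mulhu32(A, B) == 0, since A < 2^(32-lzA) and B < 2^(32-lzB) imply
// A * B < 2^(64-lzA-lzB) <= 2^32.
bool highHalfKnownZero(uint32_t A, uint32_t B) {
  return countLeadingZeros32(A) + countLeadingZeros32(B) >= 32;
}

// Usage sketch: whenever the predicate holds, the high half really is 0.
void demoHighHalfKnownZero() {
  assert(highHalfKnownZero(0xFFFFu, 0xFFFFu));
  assert(mulhu32(0xFFFFu, 0xFFFFu) == 0);
}
```

The predicate is conservative: it can return false for operand pairs whose high half happens to be 0, but it never returns true when the high half is nonzero.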
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AMDGPU/sdiv64.ll | ||
---|---|---|
1397–1398 | This also points to a failure to canonicalize constants to the RHS so the isNullValue check in visitMULHU would work. I tried to add the canonicalization without this patch, but ended up with a verifier failure on some AMDGPU tests. It appears the simplification introduced in this patch catches something even earlier and produces simpler code that doesn't hit the verifier error. |
1397–1398 | *** Bad machine code: VOP* instruction violates constant bus restriction *** - function: v_test_sdiv_k_num_i64 - basic block: %bb.0 (0x238f3f455f0) - instruction: %160:vgpr_32 = V_ADDC_U32_e32 %70:sreg_32, %161:vgpr_32, implicit-def dead $vcc, implicit $vcc, implicit $exec |
1397–1398 | There appears to be a verifier bug, where VCC is being counted toward constant bus usage for V_ADDC (where it is implicit). |
1397–1398 | Please ignore me - I misread the documentation. |
1397–1398 | How did you "add the canonicalization"? Have you got a patch for that? |
1397–1398 | Yes, I'll dig it out and post it on a bug. |
1397–1398 | @craig.topper once D106868 goes in, are you happy to add the canonicalization of mulh constants to rhs (from PR51217) first? |
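To illustrate why canonicalizing constants to the RHS matters for a zero check like the one in visitMULHU, here is a toy sketch. `Operand`, `canonicalizeConstantToRHS`, and `foldMulhuByZero` are hypothetical stand-ins, not LLVM types or functions. Because MULHU is commutative, swapping a constant LHS to the RHS lets a fold that only inspects the RHS fire.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Toy stand-in for a DAG operand; not an LLVM type.
struct Operand {
  bool IsConstant;
  uint32_t Value; // Only meaningful when IsConstant is true.
};

// MULHU is commutative, so a constant LHS can be swapped to the RHS.
void canonicalizeConstantToRHS(Operand &LHS, Operand &RHS) {
  if (LHS.IsConstant && !RHS.IsConstant)
    std::swap(LHS, RHS);
}

// Fold mulhu(x, 0) -> 0, looking only at the RHS (as a simplified
// model of an RHS-only zero check). Returns true if the fold applied.
bool foldMulhuByZero(const Operand &RHS, uint32_t &Result) {
  if (RHS.IsConstant && RHS.Value == 0) {
    Result = 0;
    return true;
  }
  return false;
}

// Usage sketch: mulhu(0, x) is only folded after canonicalization.
void demoCanonicalization() {
  Operand LHS{true, 0}, RHS{false, 0};
  uint32_t Result = 1;
  assert(!foldMulhuByZero(RHS, Result)); // Constant is on the wrong side.
  canonicalizeConstantToRHS(LHS, RHS);
  assert(foldMulhuByZero(RHS, Result) && Result == 0);
}
```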
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
4582 | Out of interest, do we need MULHU support in SimplifyDemandedBits? |
4582 | I'm not sure there's much you can do. If you demand any of the bits from MULHU, then I think you demand all bits of the input. Maybe there's something you can do if you have known bits from one input, but I'd need to think about it a lot more. |
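A concrete case supports the intuition that demanding any result bit can demand every input bit. In the sketch below (32-bit width assumed, helper names hypothetical), flipping only bit 0 of one operand changes every bit of the MULHU result, because the carry out of the low half ripples through the whole high half.

```cpp
#include <cassert>
#include <cstdint>

// Reference MULHU: high 32 bits of the unsigned 64-bit product.
uint32_t mulhu32(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// Result bits that change when only bit 0 of A is flipped.
uint32_t changedResultBits(uint32_t A, uint32_t B) {
  return mulhu32(A, B) ^ mulhu32(A ^ 1u, B);
}

// Usage sketch:
//   0x80000000 * 0xFFFFFFFF = 0x7FFFFFFF80000000 -> high half 0x7FFFFFFF
//   0x80000001 * 0xFFFFFFFF = 0x800000007FFFFFFF -> high half 0x80000000
void demoLowBitAffectsAllResultBits() {
  assert(changedResultBits(0x80000000u, 0xFFFFFFFFu) == 0xFFFFFFFFu);
}
```

This is a single example, not a proof about every demanded-bits mask, so it leaves open whether known bits on one input could still enable something.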
LGTM - cheers
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
4582 | Since a common reason for creating these nodes is division by constants (TargetLowering.BuildUDIV et al.), the RHS is likely to be constant. But yes, this should probably wait until we have some actual examples. |
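For context on why division by a constant produces a MULHU with a constant RHS, here is the well-known unsigned divide-by-3 pattern as a standalone sketch. It does not reproduce what BuildUDIV emits; the 32-bit width and the helper names are assumptions for illustration. The magic constant is 0xAAAAAAAB = (2^33 + 1) / 3.

```cpp
#include <cassert>
#include <cstdint>

// Reference MULHU: high 32 bits of the unsigned 64-bit product.
uint32_t mulhu32(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// x / 3 computed as a multiply-high by a constant followed by a shift:
// (x * 0xAAAAAAAB) >> 33 == x / 3 for every 32-bit unsigned x.
uint32_t udivBy3(uint32_t X) {
  return mulhu32(X, 0xAAAAAAABu) >> 1;
}

// Usage sketch: compare against the plain division on a few values.
void demoUdivBy3() {
  assert(udivBy3(0u) == 0u);
  assert(udivBy3(10u) == 3u);
  assert(udivBy3(0xFFFFFFFFu) == 0xFFFFFFFFu / 3u);
}
```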