Page MenuHomePhabricator

[DAGCombiner][RISCV][AMDGPU] Call SimplifyDemandedBits at the end of visitMULHU to enable known bits contant folding.
ClosedPublic

Authored by craig.topper on Jul 21 2021, 11:53 AM.

Details

Summary

We don't have real demanded bits support for MULHU, but we can
still use the known bits based constant folding support at the end
of SimplifyDemandedBits to simplify a MULHU. This helps with cases
where we know the LHS and RHS have enough leading zeros so that
the high multiply result is always 0.

Diff Detail

Event Timeline

craig.topper created this revision.Jul 21 2021, 11:53 AM
craig.topper requested review of this revision.Jul 21 2021, 11:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2021, 11:53 AM
Herald added subscribers: MaskRay, wdng. · View Herald Transcript
craig.topper added inline comments.Jul 21 2021, 11:57 AM
llvm/test/CodeGen/AMDGPU/sdiv64.ll
1397–1398

This also points to a failure to canonicalize constants to the RHS so the isNullValue check in visitMULHU would work. I tried to add the canonicalization without this patch, but ended up with a verifier failure on some AMDGPU tests.

It appears the simplification introduced in this patch catches something even earlier and produces simpler code that doesn't hit the verifier error.

RKSimon added a subscriber: foad.
RKSimon added inline comments.
llvm/test/CodeGen/AMDGPU/sdiv64.ll
1397–1398

@foad @arsenm - any ideas?

*** Bad machine code: VOP* instruction violates constant bus restriction ***
- function:    v_test_sdiv_k_num_i64
- basic block: %bb.0  (0x238f3f455f0)
- instruction: %160:vgpr_32 = V_ADDC_U32_e32 %70:sreg_32, %161:vgpr_32, implicit-def dead $vcc, implicit $vcc, implicit $exec
critson added inline comments.
llvm/test/CodeGen/AMDGPU/sdiv64.ll
1397–1398

There appears to be a verifier bug, where VCC is being counted toward constant bus usage for V_ADDC (where it is implicit).
The instruction is legal.

critson added inline comments.Jul 21 2021, 8:03 PM
llvm/test/CodeGen/AMDGPU/sdiv64.ll
1397–1398

Please ignore me - I misread the documentation.
The verifier is correct.

foad added inline comments.Jul 26 2021, 6:41 AM
llvm/test/CodeGen/AMDGPU/sdiv64.ll
1397–1398

How did you "add the canonicalization"? Have you got a patch for that?

RKSimon added inline comments.Jul 26 2021, 7:03 AM
llvm/test/CodeGen/AMDGPU/sdiv64.ll
1397–1398

Yes, I'll dig it out and post it on a bug

RKSimon added inline comments.Jul 26 2021, 7:12 AM

@craig.topper once D106868 goes in, are you happy to add the canonicalization of mulh constants to rhs (from PR51217) first?

@craig.topper once D106868 goes in, are you happy to add the canonicalization of mulh constants to rhs (from PR51217) first?

Yes we can add the canonicalization first.

RKSimon added inline comments.Aug 4 2021, 12:31 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4582

Out of interest, do we need MULHU support in SimplifyDemandedBits?

craig.topper added inline comments.Aug 4 2021, 1:07 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4582

I'm not sure there's much you can do. If you demand any of the bits from MULHU, then I think you demand all bits of the input. Maybe there's something you can do if you have known bits from one input, but I'd need to think about it a lot more.

RKSimon accepted this revision.Aug 5 2021, 5:09 AM

LGTM - cheers

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4582

Since a common reason for creating these nodes is divide by constants (TargetLowering.BuildUDIV et al) then RHS is likely to be constant - but yes this should probably wait until we have some actual examples.

This revision is now accepted and ready to land.Aug 5 2021, 5:09 AM