This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs.
Comparison with GCC: https://godbolt.org/z/c5zPdP7j4
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This seems like it would be better as a DAG->DAG transform pre-lowering? Or perhaps a more general peephole to merge the neg(w)+li? Or do it in TableGen? I don't think this warrants custom C++ lowering.
The isel patterns are creating the neg, it does seem better to create that early before isel to give maximum opportunity to combine it. The code in this patch seems very similar to AArch64.
Does this patch improve this too
define signext i32 @atomicrmw_sub_i32_monotonic(ptr %a, i32 %x, i32 %y) nounwind {
  %b = sub i32 %x, %y
  %1 = atomicrmw sub ptr %a, i32 %b monotonic
  ret i32 %1
  }it currently generates
atomicrmw_sub_i32_monotonic:            # @atomicrmw_sub_i32_monotonic
        subw    a1, a1, a2
        neg     a1, a1
        amoadd.w        a0, a1, (a0)
        retBut I think we could swap the operands to the subw to remove the neg.
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 1244 | Why do we need to handle i32 on RV64? Wouldn't it be enough to do XLenLLT and remove the change to ReplaceNodeResults? | |
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 1242 | Drop curly braces | |
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 2913 | I don't think this safe to do this if you don't check that the VT in Operand 1 matches MemoryVT? | |
Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?
(That would allow AArch64 and RISCV to share support for this, which really doesn't need any target-specific knowledge)
I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.
That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.
We can probably do this in LegalizeDAG::ExpandNode. First we need to change very target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction. Then we could add an Expand action for ATOMIC_SUB to ExpandNode that uses NEG+ATOMIC_ADD. RISC-V and AArch64 could use the Expand action. Need to change AArch64TargetLowering constructor to figure out the cases to use Expand.
I think we also need a DAGCombine to call SimplifyDemandedBits on the operand based on the memory VT in order to remove the SIGN_EXTEND_INREG.
I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?
For outlined atomics? That's intentional, they correspond to the available LSE instructions, and behave as if they were instructions.
So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?" since add wouldn't be legal it would be libcall.
- Rebase
- Expand ATOMIC_LOAD_SUB to NEG+ATOMIC_LOAD_LOAD in LegalizeDAG::ExpandNode
Related patch (AArch64): D42477
| llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | ||
|---|---|---|
| 3137 | This should be conditional on what ATOMIC_LOAD_ADD is (see forced-atomics.ll for unnecessary churn, though that won't show that this _adds_ an instruction for non-constant operands, at least I assume) | |
| llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | ||
|---|---|---|
| 3143 | This is not the correct way to find out the sign_extend_inreg can be removed. Operand 0 of SIGN_EXTEND_INREG always matches the destination type which means it always matches VT if its present. So this will delete any SIGN_EXTEND_INREG no matter what. The check you really need is if (RHS->getOpcode() == ISD::SIGN_EXTEND_INREG &&
    cast<VTSDNode>(RHS->getOperand(1))->getVT() == AN->getMemoryVT()) | |
I think this patch would change behavior atomicrmw sub for mips16, but it looks to be untested. llvm/test/CodeGen/Mips/atomicops.ll is the mips16 atomic test but it does not check all operations.
Left a couple of very minor comment. The approach seems sound to me, but there's clear potential interaction with other targets and so I'd rather rely on a LGTM from someone who's been involved in this patch since the beginning, but can take a closer look if no-one has time.
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 1233 | This comment is now out of date and should probably just be something like "Force __sync libcalls to be emitted for atomic rmw/cas operations." | |
| llvm/lib/Target/RISCV/RISCVInstrInfoA.td | ||
| 336 | This "heading" now has nothing under it (and I think was in the wrong place anyway), so best delete it. | |
This should be conditional on what ATOMIC_LOAD_ADD is (see forced-atomics.ll for unnecessary churn, though that won't show that this _adds_ an instruction for non-constant operands, at least I assume)