This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs.
Comparison with GCC: https://godbolt.org/z/c5zPdP7j4
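As a rough illustration of the idea (not the patch's literal code; `Node`, `DAG`, and `DL` are assumed to come from the surrounding custom-lowering context): when the value operand is a constant, the negation can be folded into the constant and the node rewritten as an ATOMIC_LOAD_ADD, so no (sub x0, rhs) is ever created.

```cpp
// Sketch only: fold the negation into a constant RHS and switch the atomic
// opcode, instead of emitting (sub x0, rhs) and keeping ATOMIC_LOAD_SUB.
AtomicSDNode *AN = cast<AtomicSDNode>(Node);
SDValue Val = Node->getOperand(2);
if (auto *C = dyn_cast<ConstantSDNode>(Val)) {
  SDValue NegC =
      DAG.getConstant(-C->getAPIntValue(), DL, Val.getValueType());
  return DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, DL, AN->getMemoryVT(),
                       Node->getOperand(0), Node->getOperand(1), NegC,
                       AN->getMemOperand());
}
```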
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
This seems like it would be better as a DAG->DAG transform pre-lowering? Or perhaps a more general peephole to merge the neg(w)+li? Or do it in TableGen? I don't think this warrants custom C++ lowering.
The isel patterns are creating the neg; it does seem better to create it earlier, before isel, to give maximum opportunity to combine it. The code in this patch looks very similar to AArch64's.
Does this patch improve this case too?

    define signext i32 @atomicrmw_sub_i32_monotonic(ptr %a, i32 %x, i32 %y) nounwind {
      %b = sub i32 %x, %y
      %1 = atomicrmw sub ptr %a, i32 %b monotonic
      ret i32 %1
    }
It currently generates:

    atomicrmw_sub_i32_monotonic:            # @atomicrmw_sub_i32_monotonic
            subw    a1, a1, a2
            neg     a1, a1
            amoadd.w        a0, a1, (a0)
            ret
But I think we could swap the operands to the subw to remove the neg.
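For illustration, the lowering could handle that case by commuting the operands of an incoming sub rather than emitting a fresh negation. This is only a sketch, with `Val`, `DAG`, `DL`, and `VT` assumed from the surrounding lowering code:

```cpp
// atomicrmw sub p, (a - b) == atomicrmw add p, (b - a), so when the value
// operand is itself a sub, swap its operands instead of negating it.
SDValue NegVal;
if (Val.getOpcode() == ISD::SUB)
  NegVal = DAG.getNode(ISD::SUB, DL, VT, Val.getOperand(1), Val.getOperand(0));
else
  NegVal = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, DL, VT), Val);
```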
llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1241
Why do we need to handle i32 on RV64? Wouldn't it be enough to do XLenLLT and remove the change to ReplaceNodeResults?
llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1239
Drop curly braces
llvm/lib/Target/RISCV/RISCVISelLowering.cpp:2914
I don't think it's safe to do this if you don't check that the VT in operand 1 matches the MemoryVT?
Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?
(That would allow AArch64 and RISCV to share support for this, which really doesn't need any target-specific knowledge)
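Concretely, if legalization learned an Expand action for ATOMIC_LOAD_SUB, the target-side opt-in could be as small as this (illustrative; a real target would loop over the relevant MVTs):

```cpp
setOperationAction(ISD::ATOMIC_LOAD_ADD, MVT::i32, Legal);
setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Expand);
```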
I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR return Expand for atomicrmw sub and handle it in RISCVTargetLowering::emitExpandAtomicRMW.
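A minimal sketch of that suggestion, assuming the existing AtomicExpansionKind::Expand / emitExpandAtomicRMW hooks are used (illustrative only, not the patch):

```cpp
TargetLowering::AtomicExpansionKind
RISCVTargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const {
  if (AI->getOperation() == AtomicRMWInst::Sub)
    return AtomicExpansionKind::Expand;
  // ... existing handling for the other operations ...
  return AtomicExpansionKind::None;
}

void RISCVTargetLowering::emitExpandAtomicRMW(AtomicRMWInst *AI) const {
  // Rewrite "atomicrmw sub p, v" as "atomicrmw add p, (0 - v)" at the IR level.
  IRBuilder<> Builder(AI);
  Value *NegVal = Builder.CreateNeg(AI->getValOperand());
  AtomicRMWInst *Add = Builder.CreateAtomicRMW(
      AtomicRMWInst::Add, AI->getPointerOperand(), NegVal, AI->getAlign(),
      AI->getOrdering(), AI->getSyncScopeID());
  Add->setVolatile(AI->isVolatile());
  AI->replaceAllUsesWith(Add);
  AI->eraseFromParent();
}
```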
That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.
We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction. Then we could add an Expand action for ATOMIC_LOAD_SUB to ExpandNode that uses NEG+ATOMIC_LOAD_ADD. RISC-V and AArch64 could use the Expand action. The AArch64TargetLowering constructor would need to be changed to figure out which cases should use Expand.
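A sketch of what that ExpandNode case might look like (illustrative only; `Node`, `dl`, `DAG`, and `Results` follow the surrounding LegalizeDAG code by assumption):

```cpp
case ISD::ATOMIC_LOAD_SUB: {
  // Expand atomic_load_sub into NEG + atomic_load_add when the target asked
  // for Expand rather than a LibCall.
  AtomicSDNode *AN = cast<AtomicSDNode>(Node);
  SDValue Val = Node->getOperand(2);
  EVT VT = Val.getValueType();
  SDValue NegVal =
      DAG.getNode(ISD::SUB, dl, VT, DAG.getConstant(0, dl, VT), Val);
  SDValue Res = DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, dl, AN->getMemoryVT(),
                              Node->getOperand(0), Node->getOperand(1), NegVal,
                              AN->getMemOperand());
  Results.push_back(Res);
  Results.push_back(Res.getValue(1));
  break;
}
```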
I think we also need a DAGCombine to call SimplifyDemandedBits on the operand based on the memory VT in order to remove the SIGN_EXTEND_INREG.
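Roughly what such a combine could look like; the `visitATOMIC_LOAD_ADD` hook here is hypothetical and the whole snippet is just a sketch of the idea:

```cpp
// Hypothetical DAGCombiner hook: only the low MemoryVT bits of the value
// operand can affect the stored result, so ask SimplifyDemandedBits to strip
// anything (e.g. a sign_extend_inreg) that only matters for the high bits.
SDValue DAGCombiner::visitATOMIC_LOAD_ADD(SDNode *N) {
  AtomicSDNode *AN = cast<AtomicSDNode>(N);
  SDValue Val = N->getOperand(2);
  APInt Demanded = APInt::getLowBitsSet(Val.getScalarValueSizeInBits(),
                                        AN->getMemoryVT().getFixedSizeInBits());
  if (SimplifyDemandedBits(Val, Demanded))
    return SDValue(N, 0);
  return SDValue();
}
```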
I don't think you need that part; just check if add is legal, use that if so, and otherwise fall back on a libcall?
For outlined atomics? That's intentional; they correspond to the available LSE instructions and behave as if they were instructions.
So AArch64 would fail your suggestion of "just check if add is legal, use that if so, otherwise fall back on a libcall": add wouldn't be legal, it would be LibCall.
- Rebase
- Expand ATOMIC_LOAD_SUB to NEG+ATOMIC_LOAD_ADD in LegalizeDAG::ExpandNode
Related patch (AArch64): D42477
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:3137 (On Diff #553465)
This should be conditional on what the action for ATOMIC_LOAD_ADD is (see forced-atomics.ll for unnecessary churn, though that won't show that this adds an instruction for non-constant operands, at least I assume).
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:3143 (On Diff #553465)
This is not the correct way to find out whether the sign_extend_inreg can be removed. Operand 0 of SIGN_EXTEND_INREG always matches the destination type, which means it always matches VT if it's present, so this will delete any SIGN_EXTEND_INREG no matter what. The check you really need is:

    if (RHS->getOpcode() == ISD::SIGN_EXTEND_INREG &&
        cast<VTSDNode>(RHS->getOperand(1))->getVT() == AN->getMemoryVT())
I think this patch would change the behavior of atomicrmw sub for mips16, but it looks to be untested. llvm/test/CodeGen/Mips/atomicops.ll is the mips16 atomic test, but it does not check all operations.
Left a couple of very minor comments. The approach seems sound to me, but there's clear potential for interaction with other targets, so I'd rather rely on an LGTM from someone who's been involved in this patch since the beginning; I can take a closer look if no one has time.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1233
This comment is now out of date and should probably just be something like "Force __sync libcalls to be emitted for atomic rmw/cas operations."
llvm/lib/Target/RISCV/RISCVInstrInfoA.td:336
This "heading" now has nothing under it (and I think it was in the wrong place anyway), so it's best to delete it.