This is an archive of the discontinued LLVM Phabricator instance.

Differential D158673

[SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a constant rhs
Closed, Public

Authored by dtcxzyw on Aug 23 2023, 2:05 PM.

Details

Reviewers

asb
craig.topper
jrtc27
olista01

Commits

rGb423e1f05dc3: [SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a…

Summary

This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs.
Comparison with GCC: https://godbolt.org/z/c5zPdP7j4
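
As a rough illustration of the identity behind this change (not code from the patch): atomically subtracting a value is equivalent to atomically adding its negation, and when the value is a constant the negation can be folded ahead of time, so no separate neg instruction is needed. The sketch below uses C++ `std::atomic`; `fetchSubViaAdd`, `subAndAddNegAgree` and `checkSubConstant` are hypothetical names introduced only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: performs an atomic subtraction by adding the negated
// operand. Unsigned arithmetic is used so the negation wraps around instead
// of overflowing.
inline uint32_t fetchSubViaAdd(std::atomic<uint32_t> &Obj, uint32_t Rhs) {
  uint32_t Negated = 0u - Rhs; // foldable when Rhs is a compile-time constant
  return Obj.fetch_add(Negated, std::memory_order_seq_cst);
}

// Returns true when fetch_sub and the add-of-negation form agree on both the
// returned old value and the stored new value.
inline bool subAndAddNegAgree(uint32_t Init, uint32_t Rhs) {
  std::atomic<uint32_t> A(Init), B(Init);
  uint32_t OldA = A.fetch_sub(Rhs, std::memory_order_seq_cst);
  uint32_t OldB = fetchSubViaAdd(B, Rhs);
  return OldA == OldB && A.load() == B.load();
}

// Small usage check for the constant case discussed in the summary.
inline void checkSubConstant() {
  assert(subAndAddNegAgree(10u, 1u));
  assert(subAndAddNegAgree(0u, 1u)); // wraps around in both forms
}
```

With a constant operand of 1, the negated value is simply the constant -1, which is what the `li a1, -1` lines in the new test expect.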

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	60,040 ms	x64 debian > MLIR.Examples/standalone::test.toy

Event Timeline

dtcxzyw created this revision. Aug 23 2023, 2:05 PM

Herald added a project: Restricted Project. Aug 23 2023, 2:05 PM

Herald added subscribers: jobnoorman, luke, sunshaoce and 28 others.

dtcxzyw requested review of this revision. Aug 23 2023, 2:05 PM

Herald added a project: Restricted Project. Aug 23 2023, 2:05 PM

Herald added subscribers: llvm-commits, wangpc, eopXD, MaskRay.

This seems like it would be better as a DAG->DAG transform pre-lowering? Or perhaps a more general peephole to merge the neg(w)+li? Or do it in TableGen? I don't think this warrants custom C++ lowering.

Probably via setTargetDAGCombine(ISD::ATOMIC_LOAD_SUB)?

In D158673#4611704, @jrtc27 wrote:

This seems like it would be better as a DAG->DAG transform pre-lowering? Or perhaps a more general peephole to merge the neg(w)+li? Or do it in TableGen? I don't think this warrants custom C++ lowering.

The isel patterns are creating the neg; it does seem better to create it before isel to give the maximum opportunity to combine it. The code in this patch seems very similar to AArch64's.

Does this patch improve this too?

define signext i32 @atomicrmw_sub_i32_monotonic(ptr %a, i32 %x, i32 %y) nounwind {
  %b = sub i32 %x, %y
  %1 = atomicrmw sub ptr %a, i32 %b monotonic
  ret i32 %1
}

It currently generates:

atomicrmw_sub_i32_monotonic:            # @atomicrmw_sub_i32_monotonic
        subw    a1, a1, a2
        neg     a1, a1
        amoadd.w        a0, a1, (a0)
        ret

But I think we could swap the operands to the subw to remove the neg.
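
The operand swap suggested above relies on -(x - y) being equal to y - x in wrapping 32-bit arithmetic. A minimal sketch that models the two instruction sequences on `uint32_t` values; the helper names are hypothetical and only serve this example:

```cpp
#include <cassert>
#include <cstdint>

// Models the current sequence: subw a1, a1, a2 followed by neg a1, a1.
inline uint32_t subThenNeg(uint32_t X, uint32_t Y) {
  uint32_t B = X - Y; // subw
  return 0u - B;      // neg (only the low 32 bits matter for amoadd.w)
}

// Models the suggested sequence: a single subw with swapped operands.
inline uint32_t swappedSub(uint32_t X, uint32_t Y) { return Y - X; }

// Small usage check, including wrap-around cases.
inline void checkSwap() {
  assert(subThenNeg(7u, 3u) == swappedSub(7u, 3u));
  assert(subThenNeg(0u, 1u) == swappedSub(0u, 1u));
  assert(subThenNeg(0x80000000u, 1u) == swappedSub(0x80000000u, 1u));
}
```

Both functions produce the same value for every input because unsigned arithmetic in C++ is defined modulo 2^32.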

craig.topper added inline comments. Aug 23 2023, 2:43 PM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1241	Why do we need to handle i32 on RV64? Wouldn't it be enough to handle XLenVT and remove the change to ReplaceNodeResults?

Harbormaster completed remote builds in B254468: Diff 552882. Aug 23 2023, 4:19 PM

Rebase
Address comments

dtcxzyw marked an inline comment as done. Aug 23 2023, 9:58 PM

Harbormaster completed remote builds in B254536: Diff 552985. Aug 23 2023, 10:31 PM

craig.topper added inline comments. Aug 24 2023, 11:01 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1239	Drop curly braces

craig.topper added inline comments. Aug 24 2023, 11:03 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2916	I don't think it is safe to do this if you don't check that the VT in operand 1 matches MemoryVT?

Rebase
Address feedback

dtcxzyw marked 2 inline comments as done. Aug 24 2023, 12:21 PM

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

(That would allow AArch64 and RISCV to share support for this, which really doesn't need any target-specific knowledge)

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction. Then we could add an Expand action for ATOMIC_LOAD_SUB to ExpandNode that uses NEG+ATOMIC_LOAD_ADD. RISC-V and AArch64 could use the Expand action. We would need to change the AArch64TargetLowering constructor to figure out the cases that should use Expand.

I think we also need a DAGCombine to call SimplifyDemandedBits on the operand based on the memory VT in order to remove the SIGN_EXTEND_INREG.

Harbormaster completed remote builds in B254697: Diff 553227. Aug 24 2023, 2:28 PM

In D158673#4615052, @craig.topper wrote:

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction.

I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?

In D158673#4615235, @jrtc27 wrote:

In D158673#4615052, @craig.topper wrote:

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction.

I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?

AArch64 seems to create add libcalls from sub. But maybe that isn't intentional?

In D158673#4615347, @craig.topper wrote:

In D158673#4615235, @jrtc27 wrote:

In D158673#4615052, @craig.topper wrote:

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction.

I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?

AArch64 seems to create add libcalls from sub. But maybe that isn't intentional?

For outlined atomics? That's intentional, they correspond to the available LSE instructions, and behave as if they were instructions.

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

In D158673#4615369, @jrtc27 wrote:

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?" since add wouldn't be legal it would be libcall.

In D158673#4615391, @craig.topper wrote:

In D158673#4615369, @jrtc27 wrote:

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?" since add wouldn't be legal it would be libcall.

I guess "not Expand" rather than "is Legal" then?

In D158673#4615395, @jrtc27 wrote:

In D158673#4615391, @craig.topper wrote:

In D158673#4615369, @jrtc27 wrote:

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?" since add wouldn't be legal it would be libcall.

I guess "not Expand" rather than "is Legal" then?

(or "is Legal or LibCall" depending on your view of Custom)
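
The rule this part of the thread converges on can be modelled in a few lines. This is a toy sketch rather than LLVM code: `Action`, `Lowering` and `expandAtomicLoadSub` are made-up stand-ins for the real legalization machinery, and the condition shown is the "not Expand" variant proposed above, which may differ from what finally landed.

```cpp
#include <cassert>

// Toy stand-in for the legalize action a target assigns to ATOMIC_LOAD_ADD.
enum class Action { Legal, Custom, LibCall, Expand };

// Toy stand-in for the two possible outcomes when ATOMIC_LOAD_SUB is Expand.
enum class Lowering { NegPlusAtomicAdd, SubLibCall };

// Rule discussed in the thread: rewrite sub as NEG + ATOMIC_LOAD_ADD unless
// the add itself is marked Expand, in which case keep the sub libcall.
inline Lowering expandAtomicLoadSub(Action AddAction) {
  return AddAction != Action::Expand ? Lowering::NegPlusAtomicAdd
                                     : Lowering::SubLibCall;
}

// Small usage check covering the cases mentioned in the discussion.
inline void checkExpandRule() {
  // A target with a native atomic add instruction.
  assert(expandAtomicLoadSub(Action::Legal) == Lowering::NegPlusAtomicAdd);
  // The AArch64 case raised above, where the add is marked LibCall.
  assert(expandAtomicLoadSub(Action::LibCall) == Lowering::NegPlusAtomicAdd);
  // A target that forces libcalls for every atomic operation.
  assert(expandAtomicLoadSub(Action::Expand) == Lowering::SubLibCall);
}
```

Testing for "not Expand" instead of "is Legal" is what keeps the LibCall case on the NEG + add path in this model.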

Rebase
Expand ATOMIC_LOAD_SUB to NEG+ATOMIC_LOAD_ADD in LegalizeDAG::ExpandNode

Related patch (AArch64): D42477

Harbormaster completed remote builds in B254883: Diff 553465. Aug 25 2023, 8:20 AM

jrtc27 added inline comments. Aug 25 2023, 12:25 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3137 ↗	(On Diff #553465)	This should be conditional on what ATOMIC_LOAD_ADD is (see forced-atomics.ll for unnecessary churn, though that won't show that this _adds_ an instruction for non-constant operands, at least I assume)

craig.topper added inline comments. Aug 25 2023, 12:29 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3143 ↗	(On Diff #553465)	This is not the correct way to find out whether the sign_extend_inreg can be removed. Operand 0 of SIGN_EXTEND_INREG always matches the destination type, which means it always matches VT if it's present. So this will delete any SIGN_EXTEND_INREG no matter what. The check you really need is if (RHS->getOpcode() == ISD::SIGN_EXTEND_INREG && cast<VTSDNode>(RHS->getOperand(1))->getVT() == AN->getMemoryVT())
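
To illustrate why the type comparison matters, here is a simplified model on 32-bit values. It assumes that an atomic operation with an N-bit memory type only consumes the low N bits of its value operand; the helpers `signExtendInReg`, `bitsSeenByAtomic` and `removalIsSafeFor` are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Models SIGN_EXTEND_INREG on a 32-bit register: sign-extends the low
// FromBits bits of V into the full register.
inline uint32_t signExtendInReg(uint32_t V, unsigned FromBits) {
  assert(FromBits >= 1 && FromBits <= 32);
  uint32_t Mask = FromBits == 32 ? 0xFFFFFFFFu : ((1u << FromBits) - 1u);
  uint32_t Low = V & Mask;
  uint32_t SignBit = 1u << (FromBits - 1);
  return (Low ^ SignBit) - SignBit; // wrapping arithmetic performs the extend
}

// Models the bits an atomic with an MemBits-wide memory type consumes.
inline uint32_t bitsSeenByAtomic(uint32_t V, unsigned MemBits) {
  uint32_t Mask = MemBits == 32 ? 0xFFFFFFFFu : ((1u << MemBits) - 1u);
  return V & Mask;
}

// True when dropping the extension leaves the consumed bits unchanged for V.
inline bool removalIsSafeFor(uint32_t V, unsigned ExtBits, unsigned MemBits) {
  return bitsSeenByAtomic(signExtendInReg(V, ExtBits), MemBits) ==
         bitsSeenByAtomic(V, MemBits);
}
```

In this model, removing an extension from 8 bits is harmless when the memory type is also 8 bits wide, but it changes the operand when the memory type is 32 bits wide, which is the situation the suggested check rules out.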

Looks like you didn't update tests for all targets affected by this patch.

Rebase
Fix ARM/RISCV regression tests

dtcxzyw marked 2 inline comments as done. Aug 26 2023, 6:34 AM

Harbormaster completed remote builds in B255078: Diff 553726. Aug 26 2023, 7:30 AM

I think this patch would change the behavior of atomicrmw sub for mips16, but it looks to be untested. llvm/test/CodeGen/Mips/atomicops.ll is the mips16 atomic test, but it does not check all operations.

In D158673#4623269, @craig.topper wrote:

I think this patch would change the behavior of atomicrmw sub for mips16, but it looks to be untested. llvm/test/CodeGen/Mips/atomicops.ll is the mips16 atomic test, but it does not check all operations.

I will add some pre-commit tests later.

Rebase
Fix Mips16 regression tests

Herald added a subscriber: sdardis. Aug 28 2023, 8:46 PM

Harbormaster completed remote builds in B255391: Diff 554154. Aug 28 2023, 9:51 PM

Left a couple of very minor comments. The approach seems sound to me, but there's clear potential interaction with other targets, so I'd rather rely on an LGTM from someone who's been involved in this patch since the beginning. I can take a closer look if no one has time.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1230	This comment is now out of date and should probably just be something like "Force __sync libcalls to be emitted for atomic rmw/cas operations."
llvm/lib/Target/RISCV/RISCVInstrInfoA.td
336	This "heading" now has nothing under it (and I think was in the wrong place anyway), so best delete it.

Rebase
Address feedback

Update diff with full context

Harbormaster completed remote builds in B256919: Diff 556336. Sep 8 2023, 8:56 PM

Ping.

LGTM

This revision is now accepted and ready to land. Sep 15 2023, 1:24 PM

This revision was landed with ongoing or failed builds. Sep 16 2023, 2:10 AM

Closed by commit rGb423e1f05dc3: [SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a… (authored by dtcxzyw).

This revision was automatically updated to reflect the committed changes.

dtcxzyw added a commit: rGb423e1f05dc3: [SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a….

Revision Contents

Path	Size
llvm/lib/Target/RISCV/RISCVISelLowering.cpp	29 lines
llvm/lib/Target/RISCV/RISCVInstrInfoA.td	22 lines
llvm/test/CodeGen/RISCV/atomic-rmw-sub-constant.ll	92 lines
llvm/test/CodeGen/RISCV/atomic-rmw.ll	10 lines
llvm/test/CodeGen/RISCV/atomic-signext.ll	2 lines

Diff 552882

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

[... 1,221 lines not shown; context: if (Subtarget.useRVVForFixedLengthVectors()) { ...]
       if (Subtarget.hasStdExtFOrZfinx())
         setOperationAction(ISD::BITCAST, MVT::f32, Custom);
       if (Subtarget.hasStdExtDOrZdinx())
         setOperationAction(ISD::BITCAST, MVT::f64, Custom);
     }
   }

   if (Subtarget.hasForcedAtomics()) {
     // Set atomic rmw/cas operations to expand to force __sync libcalls.
     setOperationAction(
         {ISD::ATOMIC_CMP_SWAP, ISD::ATOMIC_SWAP, ISD::ATOMIC_LOAD_ADD,
          ISD::ATOMIC_LOAD_SUB, ISD::ATOMIC_LOAD_AND, ISD::ATOMIC_LOAD_OR,
          ISD::ATOMIC_LOAD_XOR, ISD::ATOMIC_LOAD_NAND, ISD::ATOMIC_LOAD_MIN,
          ISD::ATOMIC_LOAD_MAX, ISD::ATOMIC_LOAD_UMIN, ISD::ATOMIC_LOAD_UMAX},
         XLenVT, Expand);
   }

+  if (Subtarget.hasStdExtA()) {
+    setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Custom);
+    if (Subtarget.is64Bit())
+      setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, Custom);
+  }
+
   if (Subtarget.hasVendorXTHeadMemIdx()) {
     for (unsigned im = (unsigned)ISD::PRE_INC; im != (unsigned)ISD::POST_DEC;
          ++im) {
       setIndexedLoadAction(im, MVT::i8, Legal);
       setIndexedStoreAction(im, MVT::i8, Legal);
       setIndexedLoadAction(im, MVT::i16, Legal);
       setIndexedStoreAction(im, MVT::i16, Legal);
       setIndexedLoadAction(im, MVT::i32, Legal);
[... 1,650 lines not shown; context: getVSlideup(SelectionDAG &DAG, const RISCVSubtarget &Subtarget, const SDLoc &DL, ...]
             unsigned Policy = RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED) {
   if (Merge.isUndef())
     Policy = RISCVII::TAIL_AGNOSTIC | RISCVII::MASK_AGNOSTIC;
   SDValue PolicyOp = DAG.getTargetConstant(Policy, DL, Subtarget.getXLenVT());
   SDValue Ops[] = {Merge, Op, Offset, Mask, VL, PolicyOp};
   return DAG.getNode(RISCVISD::VSLIDEUP_VL, DL, VT, Ops);
 }

+static SDValue lowerATOMIC_LOAD_SUB(SDValue Op, SelectionDAG &DAG) {
+  SDLoc DL(Op);
+  MVT VT = Op.getSimpleValueType();
+  SDValue RHS = Op.getOperand(2);
+  AtomicSDNode *AN = cast<AtomicSDNode>(Op.getNode());
+  SDValue NewRHS =
+      DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, DL, VT), RHS);
+  return DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, DL, AN->getMemoryVT(),
+                       Op.getOperand(0), Op.getOperand(1), NewRHS,
+                       AN->getMemOperand());
+}
+
 struct VIDSequence {
   int64_t StepNumerator;
   unsigned StepDenominator;
   int64_t Addend;
 };

 static std::optional<uint64_t> getExactInteger(const APFloat &APF,
                                                uint32_t BitWidth) {
[... 3,175 lines not shown; context: SDValue RISCVTargetLowering::LowerOperation(SDValue Op, ...]
   case ISD::VP_FROUND:
   case ISD::VP_FROUNDEVEN:
   case ISD::VP_FROUNDTOZERO:
     if (Op.getValueType() == MVT::nxv32f16 &&
         (Subtarget.hasVInstructionsF16Minimal() &&
          !Subtarget.hasVInstructionsF16()))
       return SplitVPOp(Op, DAG);
     return lowerVectorFTRUNC_FCEIL_FFLOOR_FROUND(Op, DAG, Subtarget);
+  case ISD::ATOMIC_LOAD_SUB:
+    return lowerATOMIC_LOAD_SUB(Op, DAG);
   }
 }

 static SDValue getTargetNode(GlobalAddressSDNode *N, const SDLoc &DL, EVT Ty,
                              SelectionDAG &DAG, unsigned Flags) {
   return DAG.getTargetGlobalAddress(N->getGlobal(), DL, Ty, 0, Flags);
 }

[... 4,700 lines not shown; context: case ISD::VP_REDUCE_UMIN: ...]
     break;
   case ISD::GET_ROUNDING: {
     SDVTList VTs = DAG.getVTList(Subtarget.getXLenVT(), MVT::Other);
     SDValue Res = DAG.getNode(ISD::GET_ROUNDING, DL, VTs, N->getOperand(0));
     Results.push_back(Res.getValue(0));
     Results.push_back(Res.getValue(1));
     break;
   }
+  case ISD::ATOMIC_LOAD_SUB: {
+    MVT VT = N->getSimpleValueType(0);
+    assert(VT == MVT::i32 && Subtarget.is64Bit() &&
+           "Unexpected custom legalization");
+    SDValue Res = lowerATOMIC_LOAD_SUB(SDValue(N, 0), DAG);
+    Results.push_back(Res);
+    Results.push_back(Res.getValue(1));
+    break;
+  }
   }
 }

 // Try to fold (<bop> x, (reduction.<bop> vec, start))
 static SDValue combineBinOpToReduce(SDNode *N, SelectionDAG &DAG,
                                     const RISCVSubtarget &Subtarget) {
   auto BinOpToRVVReduce = [](unsigned Opc) {
     switch (Opc) {
[... 7,400 lines not shown ...]

llvm/lib/Target/RISCV/RISCVInstrInfoA.td

[... 168 lines not shown ...]
 defm : AMOPat<"atomic_load_and_32", "AMOAND_W">;
 defm : AMOPat<"atomic_load_or_32", "AMOOR_W">;
 defm : AMOPat<"atomic_load_xor_32", "AMOXOR_W">;
 defm : AMOPat<"atomic_load_max_32", "AMOMAX_W">;
 defm : AMOPat<"atomic_load_min_32", "AMOMIN_W">;
 defm : AMOPat<"atomic_load_umax_32", "AMOMAXU_W">;
 defm : AMOPat<"atomic_load_umin_32", "AMOMINU_W">;

-def : Pat<(XLenVT (atomic_load_sub_32_monotonic GPR:$addr, GPR:$incr)),
-          (AMOADD_W GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_acquire GPR:$addr, GPR:$incr)),
-          (AMOADD_W_AQ GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_release GPR:$addr, GPR:$incr)),
-          (AMOADD_W_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_acq_rel GPR:$addr, GPR:$incr)),
-          (AMOADD_W_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_seq_cst GPR:$addr, GPR:$incr)),
-          (AMOADD_W_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-
 /// Pseudo AMOs

 class PseudoAMO : Pseudo<(outs GPR:$res, GPR:$scratch),
                          (ins GPR:$addr, GPR:$incr, ixlenimm:$ordering), []> {
   let Constraints = "@earlyclobber $res,@earlyclobber $scratch";
   let mayLoad = 1;
   let mayStore = 1;
   let hasSideEffects = 0;
[... 143 lines not shown ...]
 defm : AMOPat<"atomic_load_and_64", "AMOAND_D", i64>;
 defm : AMOPat<"atomic_load_or_64", "AMOOR_D", i64>;
 defm : AMOPat<"atomic_load_xor_64", "AMOXOR_D", i64>;
 defm : AMOPat<"atomic_load_max_64", "AMOMAX_D", i64>;
 defm : AMOPat<"atomic_load_min_64", "AMOMIN_D", i64>;
 defm : AMOPat<"atomic_load_umax_64", "AMOMAXU_D", i64>;
 defm : AMOPat<"atomic_load_umin_64", "AMOMINU_D", i64>;

 /// 64-bit AMOs

-def : Pat<(i64 (atomic_load_sub_64_monotonic GPR:$addr, GPR:$incr)),
-          (AMOADD_D GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_acquire GPR:$addr, GPR:$incr)),
-          (AMOADD_D_AQ GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_release GPR:$addr, GPR:$incr)),
-          (AMOADD_D_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_acq_rel GPR:$addr, GPR:$incr)),
-          (AMOADD_D_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_seq_cst GPR:$addr, GPR:$incr)),
-          (AMOADD_D_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-
 /// 64-bit pseudo AMOs

 let Size = 20 in
 def PseudoAtomicLoadNand64 : PseudoAMO;
 // Ordering constants must be kept in sync with the AtomicOrdering enum in
 // AtomicOrdering.h.
 def : Pat<(i64 (atomic_load_nand_64_monotonic GPR:$addr, GPR:$incr)),
           (PseudoAtomicLoadNand64 GPR:$addr, GPR:$incr, 2)>;
[... 36 lines not shown ...]

llvm/test/CodeGen/RISCV/atomic-rmw-sub-constant.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
				; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefix=RV32I %s
				; RUN: llc -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefixes=RV32IA %s
				; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefix=RV64I %s
				; RUN: llc -mtriple=riscv64 -mattr=+a -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefixes=RV64IA %s

				define i32 @atomicrmw_sub_i32_constant(ptr %a) nounwind {
				; RV32I-LABEL: atomicrmw_sub_i32_constant:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi sp, sp, -16
				; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32I-NEXT: li a1, 1
				; RV32I-NEXT: li a2, 5
				; RV32I-NEXT: call __atomic_fetch_sub_4@plt
				; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32I-NEXT: addi sp, sp, 16
				; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomicrmw_sub_i32_constant:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: li a1, -1
				; RV32IA-NEXT: amoadd.w.aqrl a0, a1, (a0)
				; RV32IA-NEXT: ret
				;
				; RV64I-LABEL: atomicrmw_sub_i32_constant:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi sp, sp, -16
				; RV64I-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
				; RV64I-NEXT: li a1, 1
				; RV64I-NEXT: li a2, 5
				; RV64I-NEXT: call __atomic_fetch_sub_4@plt
				; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
				; RV64I-NEXT: addi sp, sp, 16
				; RV64I-NEXT: ret
				;
				; RV64IA-LABEL: atomicrmw_sub_i32_constant:
				; RV64IA: # %bb.0:
				; RV64IA-NEXT: li a1, -1
				; RV64IA-NEXT: amoadd.w.aqrl a0, a1, (a0)
				; RV64IA-NEXT: ret
				%1 = atomicrmw sub ptr %a, i32 1 seq_cst
				ret i32 %1
				}

				define i64 @atomicrmw_sub_i64_constant(ptr %a) nounwind {
				; RV32I-LABEL: atomicrmw_sub_i64_constant:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi sp, sp, -16
				; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32I-NEXT: li a1, 1
				; RV32I-NEXT: li a3, 5
				; RV32I-NEXT: li a2, 0
				; RV32I-NEXT: call __atomic_fetch_sub_8@plt
				; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32I-NEXT: addi sp, sp, 16
				; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomicrmw_sub_i64_constant:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32IA-NEXT: li a1, 1
				; RV32IA-NEXT: li a3, 5
				; RV32IA-NEXT: li a2, 0
				; RV32IA-NEXT: call __atomic_fetch_sub_8@plt
				; RV32IA-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
				;
				; RV64I-LABEL: atomicrmw_sub_i64_constant:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi sp, sp, -16
				; RV64I-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
				; RV64I-NEXT: li a1, 1
				; RV64I-NEXT: li a2, 5
				; RV64I-NEXT: call __atomic_fetch_sub_8@plt
				; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
				; RV64I-NEXT: addi sp, sp, 16
				; RV64I-NEXT: ret
				;
				; RV64IA-LABEL: atomicrmw_sub_i64_constant:
				; RV64IA: # %bb.0:
				; RV64IA-NEXT: li a1, -1
				; RV64IA-NEXT: amoadd.d.aqrl a0, a1, (a0)
				; RV64IA-NEXT: ret
				%1 = atomicrmw sub ptr %a, i64 1 seq_cst
				ret i64 %1
				}

llvm/test/CodeGen/RISCV/atomic-rmw.ll

[... 13,405 lines not shown ...]
 ; RV64I-NEXT: li a2, 0
 ; RV64I-NEXT: call __atomic_fetch_sub_4@plt
 ; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-NEXT: addi sp, sp, 16
 ; RV64I-NEXT: ret
 ;
 ; RV64IA-LABEL: atomicrmw_sub_i32_monotonic:
 ; RV64IA: # %bb.0:
-; RV64IA-NEXT: neg a1, a1
+; RV64IA-NEXT: negw a1, a1
 ; RV64IA-NEXT: amoadd.w a0, a1, (a0)
 ; RV64IA-NEXT: ret
   %1 = atomicrmw sub ptr %a, i32 %b monotonic
   ret i32 %1
 }

 define i32 @atomicrmw_sub_i32_acquire(ptr %a, i32 %b) nounwind {
 ; RV32I-LABEL: atomicrmw_sub_i32_acquire:
[... 19 lines not shown ...]
 ; RV64I-NEXT: li a2, 2
 ; RV64I-NEXT: call __atomic_fetch_sub_4@plt
 ; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-NEXT: addi sp, sp, 16
 ; RV64I-NEXT: ret
 ;
 ; RV64IA-LABEL: atomicrmw_sub_i32_acquire:
 ; RV64IA: # %bb.0:
-; RV64IA-NEXT: neg a1, a1
+; RV64IA-NEXT: negw a1, a1
 ; RV64IA-NEXT: amoadd.w.aq a0, a1, (a0)
 ; RV64IA-NEXT: ret
   %1 = atomicrmw sub ptr %a, i32 %b acquire
   ret i32 %1
 }

 define i32 @atomicrmw_sub_i32_release(ptr %a, i32 %b) nounwind {
 ; RV32I-LABEL: atomicrmw_sub_i32_release:
[... 19 lines not shown ...]
 ; RV64I-NEXT: li a2, 3
 ; RV64I-NEXT: call __atomic_fetch_sub_4@plt
 ; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-NEXT: addi sp, sp, 16
 ; RV64I-NEXT: ret
 ;
 ; RV64IA-LABEL: atomicrmw_sub_i32_release:
 ; RV64IA: # %bb.0:
-; RV64IA-NEXT: neg a1, a1
+; RV64IA-NEXT: negw a1, a1
 ; RV64IA-NEXT: amoadd.w.rl a0, a1, (a0)
 ; RV64IA-NEXT: ret
   %1 = atomicrmw sub ptr %a, i32 %b release
   ret i32 %1
 }

 define i32 @atomicrmw_sub_i32_acq_rel(ptr %a, i32 %b) nounwind {
 ; RV32I-LABEL: atomicrmw_sub_i32_acq_rel:
[... 19 lines not shown ...]
 ; RV64I-NEXT: li a2, 4
 ; RV64I-NEXT: call __atomic_fetch_sub_4@plt
 ; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-NEXT: addi sp, sp, 16
 ; RV64I-NEXT: ret
 ;
 ; RV64IA-LABEL: atomicrmw_sub_i32_acq_rel:
 ; RV64IA: # %bb.0:
-; RV64IA-NEXT: neg a1, a1
+; RV64IA-NEXT: negw a1, a1
 ; RV64IA-NEXT: amoadd.w.aqrl a0, a1, (a0)
 ; RV64IA-NEXT: ret
   %1 = atomicrmw sub ptr %a, i32 %b acq_rel
   ret i32 %1
 }

 define i32 @atomicrmw_sub_i32_seq_cst(ptr %a, i32 %b) nounwind {
 ; RV32I-LABEL: atomicrmw_sub_i32_seq_cst:
[... 19 lines not shown ...]
 ; RV64I-NEXT: li a2, 5
 ; RV64I-NEXT: call __atomic_fetch_sub_4@plt
 ; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-NEXT: addi sp, sp, 16
 ; RV64I-NEXT: ret
 ;
 ; RV64IA-LABEL: atomicrmw_sub_i32_seq_cst:
 ; RV64IA: # %bb.0:
-; RV64IA-NEXT: neg a1, a1
+; RV64IA-NEXT: negw a1, a1
 ; RV64IA-NEXT: amoadd.w.aqrl a0, a1, (a0)
 ; RV64IA-NEXT: ret
   %1 = atomicrmw sub ptr %a, i32 %b seq_cst
   ret i32 %1
 }

 define i32 @atomicrmw_and_i32_monotonic(ptr %a, i32 %b) nounwind {
 ; RV32I-LABEL: atomicrmw_and_i32_monotonic:
[... 7,100 lines not shown ...]

llvm/test/CodeGen/RISCV/atomic-signext.ll

[... 2,324 lines not shown ...]
 ; RV64I-NEXT: call __atomic_fetch_sub_4@plt
 ; RV64I-NEXT: sext.w a0, a0
 ; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-NEXT: addi sp, sp, 16
 ; RV64I-NEXT: ret
 ;
 ; RV64IA-LABEL: atomicrmw_sub_i32_monotonic:
 ; RV64IA: # %bb.0:
-; RV64IA-NEXT: neg a1, a1
+; RV64IA-NEXT: negw a1, a1
 ; RV64IA-NEXT: amoadd.w a0, a1, (a0)
 ; RV64IA-NEXT: ret
   %1 = atomicrmw sub ptr %a, i32 %b monotonic
   ret i32 %1
 }

 define signext i32 @atomicrmw_and_i32_monotonic(ptr %a, i32 %b) nounwind {
 ; RV32I-LABEL: atomicrmw_and_i32_monotonic:
[... 1,854 lines not shown ...]