This is an archive of the discontinued LLVM Phabricator instance.

Differential D158673

[SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a constant rhs
ClosedPublic

Authored by dtcxzyw on Aug 23 2023, 2:05 PM.

Details

Reviewers

asb
craig.topper
jrtc27
olista01

Commits

rGb423e1f05dc3: [SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a…

Summary

This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs.
Comparison with GCC: https://godbolt.org/z/c5zPdP7j4
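
As a rough illustration of the summary above (a standalone C++ sketch, not code from the patch): atomically subtracting a constant is the same as atomically adding its two's-complement negation, so the negation can be folded at compile time rather than emitted as a separate instruction. The helper names fetchSubViaAdd and checkFetchSubViaAdd are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Subtract the compile-time constant C by adding its two's-complement
// negation. Unsigned arithmetic wraps, so 0 - C is well defined.
template <std::uint32_t C>
std::uint32_t fetchSubViaAdd(std::atomic<std::uint32_t> &Mem) {
  constexpr std::uint32_t NegC = 0u - C; // folded by the compiler
  return Mem.fetch_add(NegC, std::memory_order_seq_cst);
}

// Hypothetical self-check: both forms return the same old value and
// leave the same new value in memory.
inline void checkFetchSubViaAdd() {
  std::atomic<std::uint32_t> A{10}, B{10};
  std::uint32_t OldA = A.fetch_sub(1, std::memory_order_seq_cst);
  std::uint32_t OldB = fetchSubViaAdd<1>(B);
  assert(OldA == OldB);
  assert(A.load() == B.load());
}
```

This corresponds to the `li a1, -1` plus `amoadd.w` sequence that the new RISC-V test expects for a constant operand.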

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dtcxzyw created this revision. Aug 23 2023, 2:05 PM

Herald added a project: Restricted Project. Aug 23 2023, 2:05 PM

Herald added subscribers: jobnoorman, luke, sunshaoce and 28 others.

dtcxzyw requested review of this revision. Aug 23 2023, 2:05 PM

Herald added a project: Restricted Project. Aug 23 2023, 2:05 PM

Herald added subscribers: llvm-commits, wangpc, eopXD, MaskRay.

This seems like it would be better as a DAG->DAG transform pre-lowering? Or perhaps a more general peephole to merge the neg(w)+li? Or do it in TableGen? I don't think this warrants custom C++ lowering.

Probably via setTargetDAGCombine(ISD::ATOMIC_LOAD_SUB)?

In D158673#4611704, @jrtc27 wrote:

This seems like it would be better as a DAG->DAG transform pre-lowering? Or perhaps a more general peephole to merge the neg(w)+li? Or do it in TableGen? I don't think this warrants custom C++ lowering.

The isel patterns are creating the neg; it does seem better to create it early, before isel, to give maximum opportunity to combine it. The code in this patch seems very similar to AArch64.

Does this patch improve this too?

define signext i32 @atomicrmw_sub_i32_monotonic(ptr %a, i32 %x, i32 %y) nounwind {
  %b = sub i32 %x, %y
  %1 = atomicrmw sub ptr %a, i32 %b monotonic
  ret i32 %1
}

It currently generates:

atomicrmw_sub_i32_monotonic:            # @atomicrmw_sub_i32_monotonic
        subw    a1, a1, a2
        neg     a1, a1
        amoadd.w        a0, a1, (a0)
        ret

But I think we could swap the operands to the subw to remove the neg.
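
The operand swap suggested above rests on a simple identity in wrapping arithmetic: negating x - y gives y - x. A minimal C++ sketch of that identity follows; the function names are hypothetical and this only models the 32-bit value that would be fed to the atomic add.

```cpp
#include <cassert>
#include <cstdint>

// Value added to memory by the current sequence: neg(sub(x, y)).
constexpr std::uint32_t negOfDifference(std::uint32_t X, std::uint32_t Y) {
  return 0u - (X - Y);
}

// Value produced by a single sub with its operands swapped.
constexpr std::uint32_t swappedDifference(std::uint32_t X, std::uint32_t Y) {
  return Y - X;
}

// Hypothetical self-check covering an ordinary and a wrapping case.
inline void checkSwap() {
  assert(negOfDifference(7u, 3u) == swappedDifference(7u, 3u));
  assert(negOfDifference(0u, 1u) == swappedDifference(0u, 1u));
}
```

The RV32IA and RV64IA check lines for atomicrmw_sub_i32_neg in the new test show this shape: one `sub a2, a2, a1` feeding the amoadd.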

craig.topper added inline comments. Aug 23 2023, 2:43 PM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1241	Why do we need to handle i32 on RV64? Wouldn't it be enough to do XLenLLT and remove the change to ReplaceNodeResults?

Harbormaster completed remote builds in B254468: Diff 552882. Aug 23 2023, 4:19 PM

Rebase
Address comments

dtcxzyw marked an inline comment as done. Aug 23 2023, 9:58 PM

Harbormaster completed remote builds in B254536: Diff 552985. Aug 23 2023, 10:31 PM

craig.topper added inline comments. Aug 24 2023, 11:01 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1239	Drop curly braces

craig.topper added inline comments. Aug 24 2023, 11:03 AM

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2914	I don't think it is safe to do this if you don't check that the VT in Operand 1 matches MemoryVT.

Rebase
Address feedback

dtcxzyw marked 2 inline comments as done. Aug 24 2023, 12:21 PM

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

(That would allow AArch64 and RISCV to share support for this, which really doesn't need any target-specific knowledge)

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction. Then we could add an Expand action for ATOMIC_SUB to ExpandNode that uses NEG+ATOMIC_ADD. RISC-V and AArch64 could use the Expand action. Need to change AArch64TargetLowering constructor to figure out the cases to use Expand.

I think we also need a DAGCombine to call SimplifyDemandedBits on the operand based on the memory VT in order to remove the SIGN_EXTEND_INREG.

Harbormaster completed remote builds in B254697: Diff 553227. Aug 24 2023, 2:28 PM

In D158673#4615052, @craig.topper wrote:

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction.

I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?

In D158673#4615235, @jrtc27 wrote:

In D158673#4615052, @craig.topper wrote:

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction.

I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?

AArch64 seems to create add libcalls from sub. But maybe that isn't intentional?

In D158673#4615347, @craig.topper wrote:

In D158673#4615235, @jrtc27 wrote:

In D158673#4615052, @craig.topper wrote:

In D158673#4614879, @jrtc27 wrote:

In D158673#4614877, @dtcxzyw wrote:

In D158673#4614846, @jrtc27 wrote:

Can we not just teach SelectionDAG to handle ATOMIC_LOAD_SUB=Expand, ATOMIC_LOAD_ADD=Legal?

I think we can let RISCVTargetLowering::shouldExpandAtomicRMWInIR(atomicrmw sub) return Expand and handle it in RISCVTargetLowering::emitExpandAtomicRMW.

That still makes it target-dependent, but the expansion at the IR or SelectionDAG level is target-independent. There is no reason we should have separate code for RISCV and AArch64. Sharing code rather than duplicating functionality is good practice when it makes sense, and I don't see a reason why it wouldn't here.

We can probably do this in LegalizeDAG::ExpandNode. First we need to change every target that really wants a sub LibCall to pass LibCall instead of Expand to setOperationAction.

I don't think you need that part, just check if add is legal, use that if so, otherwise fall back on a libcall?

AArch64 seems to create add libcalls from sub. But maybe that isn't intentional?

For outlined atomics? That's intentional, they correspond to the available LSE instructions, and behave as if they were instructions.

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

In D158673#4615369, @jrtc27 wrote:

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?", since add wouldn't be Legal; it would be LibCall.

In D158673#4615391, @craig.topper wrote:

In D158673#4615369, @jrtc27 wrote:

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?", since add wouldn't be Legal; it would be LibCall.

I guess "not Expand" rather than "is Legal" then?

In D158673#4615395, @jrtc27 wrote:

In D158673#4615391, @craig.topper wrote:

In D158673#4615369, @jrtc27 wrote:

(But AArch64 does mark ATOMIC_LOAD_ADD as LibCall already)

So AArch64 would fail your suggestion "just check if add is legal, use that if so, otherwise fall back on a libcall?", since add wouldn't be Legal; it would be LibCall.

I guess "not Expand" rather than "is Legal" then?

(or "is Legal or LibCall" depending on your view of Custom)
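
To make the three candidate conditions in this exchange easier to compare, here is a toy C++ model. It is only a sketch of the discussion: the enum mirrors the action names used in the thread, while SubExpansion and the three rule functions are hypothetical and do not correspond to any LLVM API or to what the final patch does.

```cpp
#include <cassert>

// Toy stand-in for the legalize actions named in this thread.
enum class Action { Legal, Promote, Expand, LibCall, Custom };

// The two outcomes being debated for an ATOMIC_LOAD_SUB marked Expand.
enum class SubExpansion { NegPlusAtomicAdd, SubLibCall };

// Rule 1: rewrite to neg + add only when ATOMIC_LOAD_ADD is Legal.
inline SubExpansion ifAddIsLegal(Action Add) {
  return Add == Action::Legal ? SubExpansion::NegPlusAtomicAdd
                              : SubExpansion::SubLibCall;
}

// Rule 2: rewrite whenever ATOMIC_LOAD_ADD is anything but Expand.
inline SubExpansion ifAddIsNotExpand(Action Add) {
  return Add != Action::Expand ? SubExpansion::NegPlusAtomicAdd
                               : SubExpansion::SubLibCall;
}

// Rule 3: rewrite when ATOMIC_LOAD_ADD is Legal or LibCall.
inline SubExpansion ifAddIsLegalOrLibCall(Action Add) {
  return (Add == Action::Legal || Add == Action::LibCall)
             ? SubExpansion::NegPlusAtomicAdd
             : SubExpansion::SubLibCall;
}

// Hypothetical self-check of where the rules agree and disagree.
inline void checkRules() {
  // ATOMIC_LOAD_ADD marked LibCall (the AArch64 case raised above):
  // rule 1 falls back to a sub libcall, rules 2 and 3 rewrite.
  assert(ifAddIsLegal(Action::LibCall) == SubExpansion::SubLibCall);
  assert(ifAddIsNotExpand(Action::LibCall) == SubExpansion::NegPlusAtomicAdd);
  assert(ifAddIsLegalOrLibCall(Action::LibCall) ==
         SubExpansion::NegPlusAtomicAdd);
  // Custom is the only action where rules 2 and 3 disagree.
  assert(ifAddIsNotExpand(Action::Custom) == SubExpansion::NegPlusAtomicAdd);
  assert(ifAddIsLegalOrLibCall(Action::Custom) == SubExpansion::SubLibCall);
}
```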

Rebase
Expand ATOMIC_LOAD_SUB to NEG+ATOMIC_LOAD_ADD in LegalizeDAG::ExpandNode

Related patch (AArch64): D42477

Harbormaster completed remote builds in B254883: Diff 553465. Aug 25 2023, 8:20 AM

jrtc27 added inline comments. Aug 25 2023, 12:25 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3137	This should be conditional on what ATOMIC_LOAD_ADD is (see forced-atomics.ll for unnecessary churn, though that won't show that this _adds_ an instruction for non-constant operands, at least I assume)

craig.topper added inline comments. Aug 25 2023, 12:29 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3143	This is not the correct way to find out whether the sign_extend_inreg can be removed. Operand 0 of SIGN_EXTEND_INREG always matches the destination type, which means it always matches VT if it's present. So this will delete any SIGN_EXTEND_INREG no matter what. The check you really need is if (RHS->getOpcode() == ISD::SIGN_EXTEND_INREG && cast<VTSDNode>(RHS->getOperand(1))->getVT() == AN->getMemoryVT())
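
A small C++ model of why the VT comparison matters, assuming a 64-bit register and a 32-bit memory type (the helper names are hypothetical): a 32-bit atomic add only consumes the low 32 bits, so a sign extension from exactly 32 bits cannot change the result of the negation, whereas a sign extension from a narrower type can.

```cpp
#include <cassert>
#include <cstdint>

// Model of SIGN_EXTEND_INREG from 32 bits inside a 64-bit register,
// written with unsigned arithmetic only.
constexpr std::uint64_t signExtendInReg32(std::uint64_t V) {
  return ((V & 0xFFFFFFFFull) ^ 0x80000000ull) - 0x80000000ull;
}

// Same model for a sign extension from 8 bits.
constexpr std::uint64_t signExtendInReg8(std::uint64_t V) {
  return ((V & 0xFFull) ^ 0x80ull) - 0x80ull;
}

// Low 32 bits of (0 - V): all that a 32-bit atomic add would consume.
constexpr std::uint32_t negLow32(std::uint64_t V) {
  return static_cast<std::uint32_t>(0 - V);
}

// Hypothetical self-check.
inline void checkSextRemoval() {
  const std::uint64_t V = 0x123456789ABCDEF0ull;
  // Extension width equals the memory width: dropping it is harmless.
  assert(negLow32(signExtendInReg32(V)) == negLow32(V));
  // Narrower extension: dropping it changes the value that is added.
  assert(negLow32(signExtendInReg8(0x80)) != negLow32(0x80));
}
```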

Looks like you didn't update tests for all targets affected by this patch.

Rebase
Fix ARM/RISCV regression tests

dtcxzyw marked 2 inline comments as done. Aug 26 2023, 6:34 AM

Harbormaster completed remote builds in B255078: Diff 553726. Aug 26 2023, 7:30 AM

I think this patch would change the behavior of atomicrmw sub for mips16, but it looks to be untested. llvm/test/CodeGen/Mips/atomicops.ll is the mips16 atomic test, but it does not check all operations.

In D158673#4623269, @craig.topper wrote:

I think this patch would change the behavior of atomicrmw sub for mips16, but it looks to be untested. llvm/test/CodeGen/Mips/atomicops.ll is the mips16 atomic test, but it does not check all operations.

I will add some pre-commit tests later.

Rebase
Fix Mips16 regression tests

Herald added a subscriber: sdardis. Aug 28 2023, 8:46 PM

Harbormaster completed remote builds in B255391: Diff 554154. Aug 28 2023, 9:51 PM

Left a couple of very minor comments. The approach seems sound to me, but there's clear potential interaction with other targets, so I'd rather rely on an LGTM from someone who's been involved in this patch since the beginning. I can take a closer look if no-one has time.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
1233	This comment is now out of date and should probably just be something like "Force __sync libcalls to be emitted for atomic rmw/cas operations."
llvm/lib/Target/RISCV/RISCVInstrInfoA.td
336	This "heading" now has nothing under it (and I think was in the wrong place anyway), so best delete it.

Rebase
Address feedback

Update diff with full context

Harbormaster completed remote builds in B256919: Diff 556336. Sep 8 2023, 8:56 PM

Ping.

LGTM

This revision is now accepted and ready to land. Sep 15 2023, 1:24 PM

This revision was landed with ongoing or failed builds. Sep 16 2023, 2:10 AM

Closed by commit rGb423e1f05dc3: [SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a… (authored by dtcxzyw).

This revision was automatically updated to reflect the committed changes.

dtcxzyw added a commit: rGb423e1f05dc3: [SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a….

Revision Contents

Path                                              Size
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp     17 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.h     1 line
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp   28 lines
llvm/lib/Target/ARM/ARMISelLowering.cpp           26 lines
llvm/lib/Target/Mips/Mips16ISelLowering.cpp       26 lines
llvm/lib/Target/RISCV/RISCVISelLowering.cpp       7 lines
llvm/lib/Target/RISCV/RISCVInstrInfoA.td          24 lines
llvm/test/CodeGen/Mips/atomicops.ll               11 lines
llvm/test/CodeGen/RISCV/atomic-rmw-sub.ll         181 lines

Diff 556335

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Context not available.
     Results.push_back(Res.getValue(1));
     break;
   }
+  case ISD::ATOMIC_LOAD_SUB: {
+    SDLoc DL(Node);
+    EVT VT = Node->getValueType(0);
+    SDValue RHS = Node->getOperand(2);
+    AtomicSDNode *AN = cast<AtomicSDNode>(Node);
+    if (RHS->getOpcode() == ISD::SIGN_EXTEND_INREG &&
+        cast<VTSDNode>(RHS->getOperand(1))->getVT() == AN->getMemoryVT())
+      RHS = RHS->getOperand(0);
+    SDValue NewRHS =
+        DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, DL, VT), RHS);
+    SDValue Res = DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, DL, AN->getMemoryVT(),
+                                Node->getOperand(0), Node->getOperand(1),
+                                NewRHS, AN->getMemOperand());
+    Results.push_back(Res);
+    Results.push_back(Res.getValue(1));
+    break;
+  }
   case ISD::DYNAMIC_STACKALLOC:
     ExpandDYNAMIC_STACKALLOC(Node, Results);
     break;
Context not available.

llvm/lib/Target/AArch64/AArch64ISelLowering.h

Context not available.
   SDValue LowerVSCALE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerTRUNCATE(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerVECREDUCE(SDValue Op, SelectionDAG &DAG) const;
-  SDValue LowerATOMIC_LOAD_SUB(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerATOMIC_LOAD_AND(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerDYNAMIC_STACKALLOC(SDValue Op, SelectionDAG &DAG) const;
   SDValue LowerWindowsDYNAMIC_STACKALLOC(SDValue Op, SDValue Chain,
Context not available.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Context not available.
   setOperationAction(ISD::SET_ROUNDING, MVT::Other, Custom);

   setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i128, Custom);
-  setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Custom);
-  setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, Custom);
+  if (!Subtarget->hasLSE() && !Subtarget->outlineAtomics()) {
+    setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, LibCall);
+    setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, LibCall);
+  } else {
+    setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Expand);
+    setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i64, Expand);
+  }
   setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i32, Custom);
   setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i64, Custom);

Context not available.
   case ISD::VECREDUCE_FMAXIMUM:
   case ISD::VECREDUCE_FMINIMUM:
     return LowerVECREDUCE(Op, DAG);
-  case ISD::ATOMIC_LOAD_SUB:
-    return LowerATOMIC_LOAD_SUB(Op, DAG);
   case ISD::ATOMIC_LOAD_AND:
     return LowerATOMIC_LOAD_AND(Op, DAG);
   case ISD::DYNAMIC_STACKALLOC:
Context not available.
   }
 }

-SDValue AArch64TargetLowering::LowerATOMIC_LOAD_SUB(SDValue Op,
-                                                    SelectionDAG &DAG) const {
-  auto &Subtarget = DAG.getSubtarget<AArch64Subtarget>();
-  if (!Subtarget.hasLSE() && !Subtarget.outlineAtomics())
-    return SDValue();
-
-  // LSE has an atomic load-add instruction, but not a load-sub.
-  SDLoc dl(Op);
-  MVT VT = Op.getSimpleValueType();
-  SDValue RHS = Op.getOperand(2);
-  AtomicSDNode *AN = cast<AtomicSDNode>(Op.getNode());
-  RHS = DAG.getNode(ISD::SUB, dl, VT, DAG.getConstant(0, dl, VT), RHS);
-  return DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, dl, AN->getMemoryVT(),
-                       Op.getOperand(0), Op.getOperand(1), RHS,
-                       AN->getMemOperand());
-}
-
 SDValue AArch64TargetLowering::LowerATOMIC_LOAD_AND(SDValue Op,
                                                     SelectionDAG &DAG) const {
   auto &Subtarget = DAG.getSubtarget<AArch64Subtarget>();
Context not available.

llvm/lib/Target/ARM/ARMISelLowering.cpp

Context not available.
   setOperationAction(ISD::ATOMIC_FENCE, MVT::Other,
                      Subtarget->hasAnyDataBarrier() ? Custom : Expand);

-  // Set them all for expansion, which will force libcalls.
-  setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_SWAP, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_ADD, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_OR, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_XOR, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_NAND, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_MIN, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_MAX, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_UMIN, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_UMAX, MVT::i32, Expand);
+  // Set them all for libcall, which will force libcalls.
+  setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_SWAP, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_ADD, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_OR, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_XOR, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_NAND, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_MIN, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_MAX, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_UMIN, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_UMAX, MVT::i32, LibCall);
   // Mark ATOMIC_LOAD and ATOMIC_STORE custom so we can handle the
   // Unordered/Monotonic case.
   if (!InsertFencesForAtomic) {
Context not available.

llvm/lib/Target/Mips/Mips16ISelLowering.cpp

Context not available.
   if (!Subtarget.useSoftFloat())
     setMips16HardFloatLibCalls();

-  setOperationAction(ISD::ATOMIC_FENCE, MVT::Other, Expand);
-  setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_SWAP, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_ADD, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_OR, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_XOR, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_NAND, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_MIN, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_MAX, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_UMIN, MVT::i32, Expand);
-  setOperationAction(ISD::ATOMIC_LOAD_UMAX, MVT::i32, Expand);
+  setOperationAction(ISD::ATOMIC_FENCE, MVT::Other, LibCall);
+  setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_SWAP, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_ADD, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_SUB, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_AND, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_OR, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_XOR, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_NAND, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_MIN, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_MAX, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_UMIN, MVT::i32, LibCall);
+  setOperationAction(ISD::ATOMIC_LOAD_UMAX, MVT::i32, LibCall);

   setOperationAction(ISD::ROTR, MVT::i32, Expand);
   setOperationAction(ISD::ROTR, MVT::i64, Expand);
Context not available.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

Context not available.
     }
   }

+  if (Subtarget.hasStdExtA())
+    setOperationAction(ISD::ATOMIC_LOAD_SUB, XLenVT, Expand);
+
   if (Subtarget.hasForcedAtomics()) {
-    // Set atomic rmw/cas operations to expand to force __sync libcalls.
+    // Force __sync libcalls to be emitted for atomic rmw/cas operations.
     setOperationAction(
         {ISD::ATOMIC_CMP_SWAP, ISD::ATOMIC_SWAP, ISD::ATOMIC_LOAD_ADD,
          ISD::ATOMIC_LOAD_SUB, ISD::ATOMIC_LOAD_AND, ISD::ATOMIC_LOAD_OR,
          ISD::ATOMIC_LOAD_XOR, ISD::ATOMIC_LOAD_NAND, ISD::ATOMIC_LOAD_MIN,
          ISD::ATOMIC_LOAD_MAX, ISD::ATOMIC_LOAD_UMIN, ISD::ATOMIC_LOAD_UMAX},
-        XLenVT, Expand);
+        XLenVT, LibCall);
   }

   if (Subtarget.hasVendorXTHeadMemIdx()) {
Context not available.

llvm/lib/Target/RISCV/RISCVInstrInfoA.td

Context not available.
 defm : AMOPat<"atomic_load_umax_32", "AMOMAXU_W">;
 defm : AMOPat<"atomic_load_umin_32", "AMOMINU_W">;

-def : Pat<(XLenVT (atomic_load_sub_32_monotonic GPR:$addr, GPR:$incr)),
-          (AMOADD_W GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_acquire GPR:$addr, GPR:$incr)),
-          (AMOADD_W_AQ GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_release GPR:$addr, GPR:$incr)),
-          (AMOADD_W_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_acq_rel GPR:$addr, GPR:$incr)),
-          (AMOADD_W_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(XLenVT (atomic_load_sub_32_seq_cst GPR:$addr, GPR:$incr)),
-          (AMOADD_W_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-
 /// Pseudo AMOs

 class PseudoAMO : Pseudo<(outs GPR:$res, GPR:$scratch),
Context not available.
 defm : AMOPat<"atomic_load_umax_64", "AMOMAXU_D", i64>;
 defm : AMOPat<"atomic_load_umin_64", "AMOMINU_D", i64>;

-/// 64-bit AMOs
-
-def : Pat<(i64 (atomic_load_sub_64_monotonic GPR:$addr, GPR:$incr)),
-          (AMOADD_D GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_acquire GPR:$addr, GPR:$incr)),
-          (AMOADD_D_AQ GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_release GPR:$addr, GPR:$incr)),
-          (AMOADD_D_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_acq_rel GPR:$addr, GPR:$incr)),
-          (AMOADD_D_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-def : Pat<(i64 (atomic_load_sub_64_seq_cst GPR:$addr, GPR:$incr)),
-          (AMOADD_D_AQ_RL GPR:$addr, (SUB (XLenVT X0), GPR:$incr))>;
-
 /// 64-bit pseudo AMOs

 let Size = 20 in
Context not available.

llvm/test/CodeGen/Mips/atomicops.ll

Context not available.
 ; 16: lw ${{[0-9]+}}, %call16(__sync_fetch_and_add_4)(${{[0-9]+}})
 }

+define i32 @atomic_load_sub(ptr %mem, i32 %val, i32 %c) nounwind {
+; 16-LABEL: atomic_load_sub:
+; 16: lw ${{[0-9]+}}, %call16(__sync_synchronize)(${{[0-9]+}})
+; 16: lw ${{[0-9]+}}, %call16(__sync_fetch_and_sub_4)(${{[0-9]+}})
+entry:
+  %0 = atomicrmw sub ptr %mem, i32 %val seq_cst
+  ret i32 %0
+}
+
 define i32 @main() nounwind {
 entry:
   %x = alloca i32, align 4
Context not available.
	}	}

	declare i32 @printf(ptr nocapture, ...) nounwind	declare i32 @printf(ptr nocapture, ...) nounwind


Context not available.

llvm/test/CodeGen/RISCV/atomic-rmw-sub.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2
				; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefix=RV32I %s
				; RUN: llc -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefixes=RV32IA %s
				; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefix=RV64I %s
				; RUN: llc -mtriple=riscv64 -mattr=+a -verify-machineinstrs < %s \
				; RUN:   | FileCheck -check-prefixes=RV64IA %s

				define i32 @atomicrmw_sub_i32_constant(ptr %a) nounwind {
				; RV32I-LABEL: atomicrmw_sub_i32_constant:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi sp, sp, -16
				; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32I-NEXT: li a1, 1
				; RV32I-NEXT: li a2, 5
				; RV32I-NEXT: call __atomic_fetch_sub_4@plt
				; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32I-NEXT: addi sp, sp, 16
				; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomicrmw_sub_i32_constant:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: li a1, -1
				; RV32IA-NEXT: amoadd.w.aqrl a0, a1, (a0)
				; RV32IA-NEXT: ret
				;
				; RV64I-LABEL: atomicrmw_sub_i32_constant:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi sp, sp, -16
				; RV64I-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
				; RV64I-NEXT: li a1, 1
				; RV64I-NEXT: li a2, 5
				; RV64I-NEXT: call __atomic_fetch_sub_4@plt
				; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
				; RV64I-NEXT: addi sp, sp, 16
				; RV64I-NEXT: ret
				;
				; RV64IA-LABEL: atomicrmw_sub_i32_constant:
				; RV64IA: # %bb.0:
				; RV64IA-NEXT: li a1, -1
				; RV64IA-NEXT: amoadd.w.aqrl a0, a1, (a0)
				; RV64IA-NEXT: ret
				%1 = atomicrmw sub ptr %a, i32 1 seq_cst
				ret i32 %1
				}

				define i64 @atomicrmw_sub_i64_constant(ptr %a) nounwind {
				; RV32I-LABEL: atomicrmw_sub_i64_constant:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi sp, sp, -16
				; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32I-NEXT: li a1, 1
				; RV32I-NEXT: li a3, 5
				; RV32I-NEXT: li a2, 0
				; RV32I-NEXT: call __atomic_fetch_sub_8@plt
				; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32I-NEXT: addi sp, sp, 16
				; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomicrmw_sub_i64_constant:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32IA-NEXT: li a1, 1
				; RV32IA-NEXT: li a3, 5
				; RV32IA-NEXT: li a2, 0
				; RV32IA-NEXT: call __atomic_fetch_sub_8@plt
				; RV32IA-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
				;
				; RV64I-LABEL: atomicrmw_sub_i64_constant:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi sp, sp, -16
				; RV64I-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
				; RV64I-NEXT: li a1, 1
				; RV64I-NEXT: li a2, 5
				; RV64I-NEXT: call __atomic_fetch_sub_8@plt
				; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
				; RV64I-NEXT: addi sp, sp, 16
				; RV64I-NEXT: ret
				;
				; RV64IA-LABEL: atomicrmw_sub_i64_constant:
				; RV64IA: # %bb.0:
				; RV64IA-NEXT: li a1, -1
				; RV64IA-NEXT: amoadd.d.aqrl a0, a1, (a0)
				; RV64IA-NEXT: ret
				%1 = atomicrmw sub ptr %a, i64 1 seq_cst
				ret i64 %1
				}

				define i32 @atomicrmw_sub_i32_neg(ptr %a, i32 %x, i32 %y) nounwind {
				; RV32I-LABEL: atomicrmw_sub_i32_neg:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi sp, sp, -16
				; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32I-NEXT: sub a1, a1, a2
				; RV32I-NEXT: li a2, 5
				; RV32I-NEXT: call __atomic_fetch_sub_4@plt
				; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32I-NEXT: addi sp, sp, 16
				; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomicrmw_sub_i32_neg:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sub a2, a2, a1
				; RV32IA-NEXT: amoadd.w.aqrl a0, a2, (a0)
				; RV32IA-NEXT: ret
				;
				; RV64I-LABEL: atomicrmw_sub_i32_neg:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi sp, sp, -16
				; RV64I-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
				; RV64I-NEXT: subw a1, a1, a2
				; RV64I-NEXT: li a2, 5
				; RV64I-NEXT: call __atomic_fetch_sub_4@plt
				; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
				; RV64I-NEXT: addi sp, sp, 16
				; RV64I-NEXT: ret
				;
				; RV64IA-LABEL: atomicrmw_sub_i32_neg:
				; RV64IA: # %bb.0:
				; RV64IA-NEXT: sub a2, a2, a1
				; RV64IA-NEXT: amoadd.w.aqrl a0, a2, (a0)
				; RV64IA-NEXT: ret
				%b = sub i32 %x, %y
				%1 = atomicrmw sub ptr %a, i32 %b seq_cst
				ret i32 %1
				}

				define i64 @atomicrmw_sub_i64_neg(ptr %a, i64 %x, i64 %y) nounwind {
				; RV32I-LABEL: atomicrmw_sub_i64_neg:
				; RV32I: # %bb.0:
				; RV32I-NEXT: addi sp, sp, -16
				; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32I-NEXT: sltu a5, a1, a3
				; RV32I-NEXT: sub a2, a2, a4
				; RV32I-NEXT: sub a2, a2, a5
				; RV32I-NEXT: sub a1, a1, a3
				; RV32I-NEXT: li a3, 5
				; RV32I-NEXT: call __atomic_fetch_sub_8@plt
				; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32I-NEXT: addi sp, sp, 16
				; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomicrmw_sub_i64_neg:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
				; RV32IA-NEXT: sltu a5, a1, a3
				; RV32IA-NEXT: sub a2, a2, a4
				; RV32IA-NEXT: sub a2, a2, a5
				; RV32IA-NEXT: sub a1, a1, a3
				; RV32IA-NEXT: li a3, 5
				; RV32IA-NEXT: call __atomic_fetch_sub_8@plt
				; RV32IA-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
				;
				; RV64I-LABEL: atomicrmw_sub_i64_neg:
				; RV64I: # %bb.0:
				; RV64I-NEXT: addi sp, sp, -16
				; RV64I-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
				; RV64I-NEXT: sub a1, a1, a2
				; RV64I-NEXT: li a2, 5
				; RV64I-NEXT: call __atomic_fetch_sub_8@plt
				; RV64I-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
				; RV64I-NEXT: addi sp, sp, 16
				; RV64I-NEXT: ret
				;
				; RV64IA-LABEL: atomicrmw_sub_i64_neg:
				; RV64IA: # %bb.0:
				; RV64IA-NEXT: sub a2, a2, a1
				; RV64IA-NEXT: amoadd.d.aqrl a0, a2, (a0)
				; RV64IA-NEXT: ret
				%b = sub i64 %x, %y
				%1 = atomicrmw sub ptr %a, i64 %b seq_cst
				ret i64 %1
				}