This is an archive of the discontinued LLVM Phabricator instance.

[M68k] Add codegen pattern for atomic load / store
ClosedPublic

Authored by 0x59616e on Oct 22 2022, 5:56 AM.

Diff Detail

Event Timeline

0x59616e created this revision.Oct 22 2022, 5:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2022, 5:56 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
0x59616e requested review of this revision.Oct 22 2022, 5:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2022, 5:56 AM
0x59616e updated this revision to Diff 469952.Oct 22 2022, 7:42 PM
0x59616e retitled this revision from [WIP][M68k] Add implementation for atomic load / store to [M68k] Add codegen pattern for atomic load / store.
0x59616e edited the summary of this revision. (Show Details)
0x59616e added reviewers: myhsu, RKSimon, ricky26, glaubitz.
nikic added a subscriber: nikic.Oct 23 2022, 12:38 AM

Preferably this should also include the implementation for atomic RMW/CAS instructions to prove that this lowering is legal. If native or at least kernel-supported CAS is not available, then atomic load/store needs to use libatomic (possibly subtarget dependent).

It would also be good to include relevant quotes from the ISA manual -- atomicity of load/store is usually a given, but do they also guarantee a seq_cst ordering without a memory barrier?

(Disclaimer: I'm not familiar with m68k, just covering the usual atomic lowering legality questions.)

myhsu added a comment.EditedOct 23 2022, 10:51 AM

Agree with @nikic. I think it would be better for this patch to support CmpXchg using CAS (for 68020 and later) and fall back to a library solution for older CPUs.

Preferably this should also include the implementation for atomic RMW/CAS instructions to prove that this lowering is legal. If native or at least kernel-supported CAS is not available, then atomic load/store needs to use libatomic (possibly subtarget dependent).

It would also be good to include relevant quotes from the ISA manual -- atomicity of load/store is usually a given, but do they also guarantee a seq_cst ordering without a memory barrier?

(Disclaimer: I'm not familiar with m68k, just covering the usual atomic lowering legality questions.)

Thanks for replying!

To be honest, I can't find any word regarding atomic ordering in the documentation. I can't find any memory barrier instruction either. I just followed what gcc does.

There is a discussion here: https://github.com/M680x0/M680x0-mono-repo/issues/13

0x59616e planned changes to this revision.Oct 23 2022, 6:07 PM

Working on the CAS/RMW instructions.

Are multi-processor m68k computers a thing? I can't find any reference to such a thing existing, but the manual indicates that the processor was designed to allow it. If it does exist, m68k probably needs to use sequences similar to x86. (x86 didn't have any barrier instruction for a long time, but a "lock" instruction has the right semantics.)

Are multi-processor m68k computers a thing? I can't find any reference to such a thing existing, but the manual indicates that the processor was designed to allow it. If it does exist, m68k probably needs to use sequences similar to x86. (x86 didn't have any barrier instruction for a long time, but a "lock" instruction has the right semantics.)

Take my word with a pinch of salt

We may have to use the compare-and-swap instruction to achieve that (or rely on glibc). But is it necessary on such an old architecture, which may not have modern features such as out-of-order execution or store/write buffers --- the reasons atomic instructions and memory barriers exist in the first place?

Are multi-processor m68k computers a thing? I can't find any reference to such a thing existing, but the manual indicates that the processor was designed to allow it. If it does exist, m68k probably needs to use sequences similar to x86. (x86 didn't have any barrier instruction for a long time, but a "lock" instruction has the right semantics.)

Take my word with a pinch of salt

We may have to use the compare-and-swap instruction to achieve that (or rely on glibc). But is it necessary on such an old architecture, which may not have modern features such as out-of-order execution or store/write buffers --- the reasons atomic instructions and memory barriers exist in the first place?

I can't remember any details (so long ago, sniff) - but wasn't TAS (test-and-set) used as an alternative to CAS, which only appeared on 68020 and later? I vaguely remember that on Amigas, which although not truly multi-processor had a lot of custom chips accessing some of the memory at the same time, you had to be very careful about using TAS/CAS.

RKSimon added inline comments.Oct 25 2022, 5:39 AM
llvm/lib/Target/M68k/M68kTargetMachine.cpp
161

Also - do we need to add a pipeline.ll test file?

Are multi-processor m68k computers a thing? I can't find any reference to such a thing existing, but the manual indicates that the processor was designed to allow it. If it does exist, m68k probably needs to use sequences similar to x86. (x86 didn't have any barrier instruction for a long time, but a "lock" instruction has the right semantics.)

Take my word with a pinch of salt

We may have to use the compare-and-swap instruction to achieve that (or rely on glibc). But is it necessary on such an old architecture, which may not have modern features such as out-of-order execution or store/write buffers --- the reasons atomic instructions and memory barriers exist in the first place?

Store buffers existed in the era of the m68k, I think. OOO was mostly later, though.

That said, we shouldn't try to theorycraft the semantics of a system that doesn't actually exist. I'm okay with assuming we have a single processor if the rest of the ecosystem makes the same assumption (which means we only have to care about interrupts, not memory reordering).

I dug into libatomic.a; here is part of the result:

00000000 <__atomic_store_4>:
   0:   206f 0004       moveal %sp@(4),%a0
   4:   20af 0008       movel %sp@(8),%a0@
   8:   4e75            rts

gcc also produces the same result:

$ echo 'void foo(int *x, int y) { __atomic_store_4(x, y, __ATOMIC_SEQ_CST); }' | m68k-linux-gnu-gcc -m68040 -x c - -o - -S -O2
#NO_APP
	.file	"<stdin>"
	.text
	.align	2
	.globl	foo
	.type	foo, @function
foo:
	move.l 4(%sp),%a0
	move.l 8(%sp),(%a0)
	rts
	.size	foo, .-foo
	.ident	"GCC: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0"
	.section	.note.GNU-stack,"",@progbits

Maybe a normal load / store is sufficient to satisfy the atomic ordering semantics on m68k?

llvm/lib/Target/M68k/M68kTargetMachine.cpp
161

What is pipeline.ll?

RKSimon added inline comments.Oct 27 2022, 4:29 AM
llvm/lib/Target/M68k/M68kTargetMachine.cpp
161
0x59616e marked an inline comment as done.Oct 27 2022, 4:34 AM
0x59616e added inline comments.
llvm/lib/Target/M68k/M68kTargetMachine.cpp
161

Thanks. This looks interesting. As far as I'm concerned, there is no reason not to add this for M68k.

0x59616e updated this revision to Diff 471715.Oct 28 2022, 10:31 PM
0x59616e marked an inline comment as done.

Add support for atomicrmw and cmpxchg

0x59616e updated this revision to Diff 471716.Oct 28 2022, 10:35 PM
0x59616e marked an inline comment as done.

Add pipeline.ll

Please can you pre-commit pipeline.ll for current trunk and then rebase so that the patch shows the diff?

llvm/lib/Target/M68k/M68kInstrAtomics.td
11

Wrap this inside FeatureISA20?

llvm/test/CodeGen/M68k/pipeline.ll
2

; CHECK

105

newline

0x59616e updated this revision to Diff 471889.Oct 30 2022, 6:13 PM
0x59616e marked an inline comment as done.

Address feedback

llvm/lib/Target/M68k/M68kInstrAtomics.td
11

Is there any example I can refer to?

I think it makes sense to assume m68k always runs on uniprocessors. GCC / libgcc and the m68k port of Linux also make the same assumption. For instance, gcc simply lowers an atomic fence into asm volatile("" ::: "memory") (meaning the only thing we need to do is prevent the compiler from reordering across the fence).
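
For illustration only (this snippet is not part of the patch; it just shows the idiom as a plain C/C++ helper), that barrier compiles to no machine code at all:

// A pure compiler barrier: emits no instruction, it only stops the compiler
// from moving memory accesses across this point, which is sufficient on a
// uniprocessor where only interrupts can interleave with the current thread.
static inline void compiler_barrier(void) {
  asm volatile("" ::: "memory");
}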

Though I don't think the changes to getMaxAtomicSizeInBitsSupported can really be justified, given that you can lower unsupported atomic operations to library calls in the legalizer as well.

llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

I think you tried to turn every atomic operation except atomic_load / store / cmpxchg into a libcall here. But even if we don't turn them into libcalls in this pass, we can still do that during legalization by marking the corresponding SDNode as Expand or LibCall, right?

llvm/lib/Target/M68k/M68kInstrAtomics.td
28

Why do we need to wrap these patterns with multiclass?

llvm/test/MC/Disassembler/M68k/atomics.txt
4

what about other sizes?

llvm/test/MC/M68k/Atomics/cas.s
6

ditto other sizes

myhsu added inline comments.Oct 30 2022, 10:07 PM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

*corresponding operation as Expand or LibCall

0x59616e updated this revision to Diff 471948.Oct 31 2022, 3:22 AM

Address feedback

llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

I'll look into it.

llvm/lib/Target/M68k/M68kInstrAtomics.td
28

Yeah that's verbose. I didn't think this through.

0x59616e marked an inline comment as done.Oct 31 2022, 3:22 AM
0x59616e marked 3 inline comments as done.
RKSimon added inline comments.Oct 31 2022, 3:40 AM
llvm/lib/Target/M68k/M68kInstrAtomics.td
28

Probably the best way to make this 020+ only is to wrap the patterns, something like:

let Predicates = [FeatureISA20] {
 foreach size = [8, 16, 32] in {
  ....
 }
}
0x59616e added inline comments.Nov 1 2022, 5:37 AM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

Marking it as LibCall can stop the type legalizer from expanding an AtomicStore with a 64-bit operand to AtomicSwap, which will be transformed to __sync_lock_test_and_set --- a function that I can't find in libatomic.a.

We may need to find a way to stop the type legalizer from doing this so that the legalizer can expand the AtomicStore to a library call. Any ideas?

llvm/lib/Target/M68k/M68kInstrAtomics.td
28

Thanks. I'll look into it.

0x59616e added inline comments.Nov 1 2022, 5:44 AM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)
0x59616e added inline comments.Nov 1 2022, 5:46 AM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

*Marking it as LibCall CANNOT*

0x59616e added inline comments.Nov 1 2022, 8:33 AM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

We may want to mark ATOMIC_CMP_SWAP as LibCall when the target is below 020. But that also turns into a __sync_* library call.

Here is my workaround: You can see from here:

https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/AtomicExpandPass.cpp#L178

that AtomicExpandPass checks the alignment to decide whether it should expand the operation. We could add an IR pass that changes the alignment of cmpxchg to zero so that AtomicExpandPass will do the work for us.

Do you think this is a feasible approach?

0x59616e added inline comments.Nov 1 2022, 8:52 AM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

Another solution is to mark ATOMIC_CMP_SWAP as Custom and lower it to __atomic_compare_swap_* ourselves.

Which one do you prefer?

0x59616e added inline comments.Nov 1 2022, 9:11 AM
llvm/lib/CodeGen/AtomicExpandPass.cpp
176 ↗(On Diff #471716)

Pros of the first solution: it is easier to implement.

Cons of the first solution: it's not very orthodox.

Pros of the second: it follows convention.

Cons of the second: we have to consider not only CAS but also the RMW instructions. It may require more effort.

RKSimon added inline comments.Nov 1 2022, 9:29 AM
llvm/lib/Target/M68k/M68kTargetMachine.cpp
161

So do we need to add TargetPassConfig::addIRPasses() to this call now?

The atomic width is never a property of a specific instruction. Either *all* atomic ops of a given width are lock-free, or *all* atomic ops of a given width need to be transformed into __atomic_* libcalls.

So the right thing to do is something like the following, in M68kTargetLowering::M68kTargetLowering():

if (Subtarget.atLeastM68020())
  setMaxAtomicSizeInBitsSupported(32);
else
  setMaxAtomicSizeInBitsSupported(0);

If you need to emulate rmw ops on top of cmpxchg, implementing shouldExpandAtomicRMWInIR will tell AtomicExpandPass to handle that expansion for you.

It's possible to emulate a lock-free atomic cmpxchg on a uniprocessor, for example, https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt . But the compiler can't do it by itself; it needs some sort of operating system assistance. And as far as I know, Linux requires a 68020 anyway, so I don't see a reason for you to go down that path.

The atomic width is never a property of a specific instruction. Either *all* atomic ops of a given width are lock-free, or *all* atomic ops of a given width need to be transformed into __atomic_* libcalls.

I have a few questions:

  • So I cannot lower atomic_load / atomic_store to native instructions whilst lowering atomic_compare_and_swap to a library call --- all of them should be lowered either to native instructions or to library calls?
  • Why is the atomic width not a property of a specific instruction?

Thanks !

llvm/lib/Target/M68k/M68kTargetMachine.cpp
161

Oh, I forgot. I will address this later. Thanks!

0x59616e added a comment.EditedNov 1 2022, 6:56 PM

One more question:

  • Is it legal to transform atomic_load / atomic_store into a normal load / store before AtomicExpandPass? Does that break any property of the atomic instructions?
myhsu added a comment.Nov 1 2022, 9:58 PM

Let me try to answer most of the questions at once. But first, here is the workflow I would use:

  1. For 68020 and later, call setMaxAtomicSizeInBitsSupported(32); otherwise call setMaxAtomicSizeInBitsSupported(0). AtomicExpandPass then replaces everything outside the supported size range with __atomic_* calls.
  2. AtomicLoad, AtomicStore and AtomicCmpXchg on targets >= 68020 will be lowered to native instructions.
  3. Mark every other ISD::ATOMIC_* as LibCall; this effectively lowers them into library calls to __sync_* (see the sketch after this list).
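
A rough sketch of how steps 1 and 3 could look in M68kTargetLowering's constructor (illustrative only; the exact opcode list and the subtarget check are assumptions rather than the final patch):

// Step 1: sizes above the supported maximum are turned into __atomic_* calls
// by AtomicExpandPass before instruction selection.
if (Subtarget.atLeastM68020())
  setMaxAtomicSizeInBitsSupported(32);
else
  setMaxAtomicSizeInBitsSupported(0);

// Step 3: atomic ops without native patterns become __sync_* libcalls
// during legalization.
for (MVT VT : {MVT::i8, MVT::i16, MVT::i32}) {
  setOperationAction(ISD::ATOMIC_SWAP,     VT, LibCall);
  setOperationAction(ISD::ATOMIC_LOAD_ADD, VT, LibCall);
  setOperationAction(ISD::ATOMIC_LOAD_SUB, VT, LibCall);
  // ...and the remaining ISD::ATOMIC_LOAD_* opcodes as needed...
}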

Now here are the questions:

Marking it as LibCall can stop the type legalizer from expanding an AtomicStore with a 64-bit operand to AtomicSwap,

I'm not sure how you got this, given that a 64-bit AtomicStore should already have been replaced with __atomic_store by AtomicExpandPass. So you don't need to mark ISD::ATOMIC_STORE as LibCall.

transformed to __sync_lock_test_and_set --- a function that I can't find in libatomic.a.

Correct, because __sync_* comes from libgcc, the primary compiler runtime library for the GNU toolchain. In the case of m68k, these __sync_* functions are "basically" lock-free -- they simply make a syscall that provides the cmpxchg feature (though that also means they are less portable than, say, libatomic, which is a pure C implementation). To my best understanding, the reason the legalizer chooses to lower atomic operations into __sync_* rather than __atomic_* is that the former are lower level and, more importantly, libgcc is always available during linking (when using the GNU driver, of course, which is the case for m68k-linux-gnu).

So I cannot lower atomic_load / atomic_store to native instructions whilst lowering atomic_compare_and_swap to a library call --- all of them should be lowered either to native instructions or to library calls?
Why is the atomic width not a property of a specific instruction?

To my best understanding, LLVM decided to segregate native atomic support by size, rather than by specific instruction, simply because most architectures work that way. You either have full or no atomic support for a certain size. Meaning m68k, as an anomaly, needs to do some extra work on this matter, which is not difficult, because I believe the workflow I put at the beginning of this comment can lower atomic load / store to native instructions (on supported sub-architectures) while lowering other atomic operations to libcalls for a given size.

So I cannot lower atomic_load / atomic_store to native instructions whilst lowering atomic_compare_and_swap to a library call --- all of them should be lowered either to native instructions or to library calls?
Why is the atomic width not a property of a specific instruction?

To my best understanding, LLVM decided to segregate native atomic support by size, rather than by specific instruction, simply because most architectures work that way. You either have full or no atomic support for a certain size. Meaning m68k, as an anomaly, needs to do some extra work on this matter, which is not difficult, because I believe the workflow I put at the beginning of this comment can lower atomic load / store to native instructions (on supported sub-architectures) while lowering other atomic operations to libcalls for a given size.

The __atomic_* libcalls are allowed to be implemented using a lock. If you try to mix in lock-free load and store operations, they won't respect that lock.

If m68k has __sync_* in libgcc, those are lock-free; you can use them alongside plain load/store instructions.

https://llvm.org/docs/Atomics.html has more details on various forms of calls.

0x59616e added a comment.EditedNov 2 2022, 2:21 AM

Thanks for all of your edifying comments. The path is getting clearer. Here is my understanding. Correct me if I'm wrong.

  1. __atomic_* is allowed to use a lock, whilst __sync_* is lock-free.
  2. A non-lock-free function must NOT be used simultaneously with a lock-free instruction or function; otherwise the lock may not be respected (i.e. a race condition may occur).
  3. atomic_compare_and_swap needs to be transformed to __atomic_* if we don't have CAS support.

Conclusion 1: according to 1, 2 & 3, if we don't have CAS support, we have to transform atomic_load / atomic_store into __atomic_* library calls, instead of normal loads / stores, to avoid mixing lock-free instructions with non-lock-free library calls.

Conclusion 2: according to 1, 2 & 3, if we have CAS support, we can simply lower atomic_load / atomic_store / atomic_compare_and_swap to native instructions, and expand atomic_rmw to either atomic_compare_and_swap or __sync_* calls --- all of which are lock-free.

Right?

myhsu added a comment.Nov 3 2022, 3:40 PM

Thanks for all of your edifying comments. The path is getting clearer. Here is my understanding. Correct me if I'm wrong.

  1. __atomic_* is allowed to use a lock, whilst __sync_* is lock-free.
  2. A non-lock-free function must NOT be used simultaneously with a lock-free instruction or function; otherwise the lock may not be respected (i.e. a race condition may occur).
  3. atomic_compare_and_swap needs to be transformed to __atomic_* if we don't have CAS support.

I think we can also transform atomic_compare_and_swap to its __sync counterpart, __sync_val_compare_and_swap, if we don't have CAS (i.e. < M68020). And I think __sync is preferable here because it makes more sense to conditionally lower cmpxchg to a libcall based on the target CPU in the backend, compared to detecting the target CPU in an IR pass.

myhsu added inline comments.Nov 3 2022, 3:46 PM
llvm/lib/Target/M68k/M68kInstrAtomics.td
11

You can use let Predicates = [IsM68020] in { ... } to wrap both the instruction definitions and the patterns (the predicates won't be honored if you only wrap the instruction definitions, since we're using custom patterns). Though I just realized that the IsM680X0 predicates defined in M68kInstrInfo.td are wrong. I will fix that shortly.

Conclusion 1: according to 1, 2 & 3, if we don't have CAS support, we have to transform atomic_load / atomic_store into __atomic_* library calls, instead of normal loads / stores, to avoid mixing lock-free instructions with non-lock-free library calls.

Conclusion 2: according to 1, 2 & 3, if we have CAS support, we can simply lower atomic_load / atomic_store / atomic_compare_and_swap to native instructions, and expand atomic_rmw to either atomic_compare_and_swap or __sync_* calls --- all of which are lock-free.

Right?

Right. You want to call setMaxAtomicSizeInBitsSupported(32) on targets that have 32-bit CAS.

If __sync_* is available for a given bitwidth, it counts as "having" CAS support. setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, LibCall); will tell the legalizer to generate the call for you.

And I think __sync is preferable here because it makes more sense to conditionally lower cmpxchg to a libcall based on the target CPU in the backend, compared to detecting the target CPU in an IR pass.

We don't really make the distinction between "IR" passes vs. "CodeGen" passes. We draw the line between "optimization" passes (which are controlled by the frontend) and "backend" passes (which are directly controlled by the target in its implementation of TargetPassConfig). Some "backend" passes run on IR simply because it's convenient: LLVM IR is simpler to deal with (especially given the limitations of SelectionDAG).

0x59616e updated this revision to Diff 473237.EditedNov 4 2022, 7:29 AM

update diff:

  • Transform all atomic instructions to __atomic_* for sizes > 32.
  • Otherwise, lower to either a native instruction or a __sync_* function call.

The above two rules apply to all subtargets.

0x59616e marked 4 inline comments as done.
0x59616e added inline comments.
llvm/lib/Target/M68k/M68kInstrAtomics.td
11

Something like D137425?

0x59616e marked an inline comment as done.Nov 4 2022, 7:34 AM
myhsu added a comment.EditedNov 4 2022, 9:17 AM

I think the patch is in good shape now. I only have some minor comments. Thanks for the work :-)

llvm/lib/Target/M68k/M68kISelLowering.cpp
169

Just want to double check: were these lines formatted by clang-format?

llvm/lib/Target/M68k/M68kInstrAtomics.td
11

Something like D137425?

Yes thank you!

42

nit: align with !cast in the previous line. ditto for the following line.

45

nit: "// let Predicates = [AtLeastM68020]"

efriedma added inline comments.Nov 4 2022, 12:55 PM
llvm/lib/Target/M68k/M68kISelLowering.cpp
167

It probably also makes sense to make shouldExpandAtomicRMWInIR return AtomicExpansionKind::CmpXChg on M68020, to use inline loops for atomicrmw.
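
A minimal sketch of that hook (the enum values come from TargetLowering; the subtarget check is borrowed from earlier comments in this review, so treat it as an assumption rather than the committed code):

// Expand atomicrmw into an inline cmpxchg loop when native CAS is available;
// otherwise return None so the operation can be lowered to a libcall later.
TargetLowering::AtomicExpansionKind
M68kTargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *RMW) const {
  return Subtarget.atLeastM68020() ? AtomicExpansionKind::CmpXChg
                                   : AtomicExpansionKind::None;
}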

0x59616e marked an inline comment as done.Nov 6 2022, 11:19 PM
0x59616e added inline comments.
llvm/lib/Target/M68k/M68kISelLowering.cpp
169

Yes.

0x59616e marked an inline comment as done.Nov 6 2022, 11:19 PM
0x59616e updated this revision to Diff 473897.Nov 8 2022, 12:24 AM
0x59616e marked 4 inline comments as done.

update diff:

  • Expand atomic-rmw to atomic-compare-and-swap on targets >= M68020
  • Address feedback
myhsu accepted this revision.Nov 8 2022, 7:00 AM

LGTM thank you! Please also include cmpxchg and rmw in the title.

This revision is now accepted and ready to land.Nov 8 2022, 7:00 AM
efriedma accepted this revision.Nov 8 2022, 8:55 AM

LGTM

Thanks a lot for all of your kind help ;)

This revision was automatically updated to reflect the committed changes.