This is an archive of the discontinued LLVM Phabricator instance.

Differential D58348

[AArch64] Fix for bug 35094 atomicrmw on Armv8.1-A+lse
ClosedPublic

Authored by christof on Feb 18 2019, 5:50 AM.

Details

Reviewers

t.p.northover
john.brawn
olista01
ajasty-cavium

Commits

rG8cfd91dcc726: [AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse
rL356360: [AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse

Summary

Fix for https://bugs.llvm.org/show_bug.cgi?id=35094

The dead register definitions pass should leave the atomicrmw
instructions on AArch64 (LSE extension) alone, that is, it should not
replace their unused destination register with WZR or XZR. The reason is
the following statement in the Arm ARM:

The ST<OP> instructions, and LD<OP> instructions where the destination
register is WZR or XZR, are not regarded as doing a read for the purpose
of a DMB LD barrier.

A good example was given in the gcc thread by Will Deacon (linked in the
bugzilla ticket):

P0 (atomic_int* y,atomic_int* x) {
  atomic_store_explicit(x,1,memory_order_relaxed);
  atomic_thread_fence(memory_order_release);
  atomic_store_explicit(y,1,memory_order_relaxed);
}

P1 (atomic_int* y,atomic_int* x) {
  atomic_fetch_add_explicit(y,1,memory_order_relaxed);  // STADD
  atomic_thread_fence(memory_order_acquire);
  int r0 = atomic_load_explicit(x,memory_order_relaxed);
}

P2 (atomic_int* y) {
  int r1 = atomic_load_explicit(y,memory_order_relaxed);
}
My understanding is that it is forbidden for r0 == 0 and r1 == 2 after
this test has executed. However, if the relaxed add in P1 compiles to
STADD and the subsequent acquire fence is compiled as DMB LD, then we
don't have any ordering guarantees in P1 and the forbidden result could
be observed.
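
The argument in the quoted example can be checked with a small enumeration. The following is a minimal sketch, not part of the patch and not a model of the Arm memory ordering rules: it runs every interleaving of the three threads under sequential consistency, once with P1 in program order (the fence orders the add before the load) and once with P1's load of x performed before the add, which is the reordering that becomes possible if DMB LD does not treat STADD as a read. The names `State`, `Op`, `outcomes`, `orderedOutcomes` and `reorderedOutcomes` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Shared variables and the two observed values from the litmus test.
struct State {
  int x = 0, y = 0;
  int r0 = -1, r1 = -1;
};

// One memory operation per enumerator. Fences are modelled only through the
// order in which a thread's operations are listed (an assumption of this sketch).
enum class Op { StoreX, StoreY, AddY, LoadX, LoadY };

using Thread = std::vector<Op>;
using Outcomes = std::set<std::pair<int, int>>; // (r0, r1) pairs

static void apply(Op op, State &s) {
  switch (op) {
  case Op::StoreX: s.x = 1; break;
  case Op::StoreY: s.y = 1; break;
  case Op::AddY:   s.y += 1; break;
  case Op::LoadX:  s.r0 = s.x; break;
  case Op::LoadY:  s.r1 = s.y; break;
  }
}

// Try every interleaving that keeps each thread's listed order.
static void explore(const std::vector<Thread> &threads,
                    std::vector<std::size_t> &pc, State s, Outcomes &out) {
  bool finished = true;
  for (std::size_t t = 0; t < threads.size(); ++t) {
    if (pc[t] == threads[t].size())
      continue;
    finished = false;
    State next = s;
    apply(threads[t][pc[t]], next);
    ++pc[t];
    explore(threads, pc, next, out);
    --pc[t];
  }
  if (finished)
    out.insert(std::make_pair(s.r0, s.r1));
}

static Outcomes outcomes(const std::vector<Thread> &threads) {
  std::vector<std::size_t> pc(threads.size(), 0);
  Outcomes out;
  explore(threads, pc, State{}, out);
  return out;
}

// P1 as written: the add to y is ordered before the load of x.
static Outcomes orderedOutcomes() {
  return outcomes({Thread{Op::StoreX, Op::StoreY},
                   Thread{Op::AddY, Op::LoadX},
                   Thread{Op::LoadY}});
}

// P1 with the load of x performed before the add to y.
static Outcomes reorderedOutcomes() {
  return outcomes({Thread{Op::StoreX, Op::StoreY},
                   Thread{Op::LoadX, Op::AddY},
                   Thread{Op::LoadY}});
}
```

Under these assumptions `orderedOutcomes()` never contains the pair (r0, r1) = (0, 2), while `reorderedOutcomes()` does, which matches the reasoning in the quoted message.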

Event Timeline

christof created this revision. Feb 18 2019, 5:50 AM

Herald added a project: Restricted Project. Feb 18 2019, 5:50 AM

Herald added subscribers: llvm-commits, jdoerfert, jfb and 2 others.

Are you sure that Bugzilla link is right?

I agree this is a problem according to the formal model, although it would be kind of weird to run into it (you can't tell whether the happens-before edge formed without reading the value at some point, so the code would have to load the result of the atomic operation at some point).

ajasty-cavium added a subscriber: steleman. Feb 20 2019, 4:31 PM

christof edited the summary of this revision. Feb 21 2019, 3:21 AM

In D58348#1404995, @efriedma wrote:

Are you sure that Bugzilla link is right?

Fixed. The bug number in the subject was the correct one.

I agree this is a problem according to the formal model, although it would be kind of weird to run into it (you can't tell whether the happens-before edge formed without reading the value at some point, so the code would have to load the result of the atomic operation at some point).

It might be an edge case, I'm not sure, to be honest. I'll go see if I can get a better idea of that. I thought that it would be good to adhere to the formal model to prevent surprises, so if there is an example that breaks, we ought to prevent that. Is that up for debate in case this is an edge case that is unlikely to be hit?

lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp
71	s/load/lose/

john.brawn added inline comments. Feb 21 2019, 5:07 AM

lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp
73	It doesn't especially matter, as it'll already get caught by atomicBarrierDroppedOnZero, but the acquire variants of these instructions (LDADDAW etc.) also aren't counted as performing a read when the destination register is zero.

Is that up for debate in case this is an edge case that is unlikely to be hit?

When it comes to the atomics in general, it's important to allow people to write code with the same performance characteristics as hand-written assembly, or else we will force people to go around the compiler using inline asm. But the performance impact here should be basically zero for code that doesn't use an acquire fence, I think, so it shouldn't be an issue here.

christof updated this revision to Diff 188212. Feb 25 2019, 10:17 AM

christof marked 2 inline comments as done.

christof added inline comments.

lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp
73	True. I'll add a remark that this is incomplete and I've left all the acquire variants to `atomicBarrierDroppedOnZero()`. If people wish to have this function complete, I can copy all the cases from atomicBarrierDroppedOnZero that apply and duplicate them here instead. I was wondering whether I could do this type of grouping in TableGen by attaching some label to an instruction. That way I could just test that label, rather than listing the subsets in yet another place. I'm not aware of such a mechanism, unfortunately.
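
The grouping idea in the comment above could look roughly like this in plain C++. This is only a sketch of the concept: `InstrFlag`, `OpcodeFlags`, `hasFlag` and the reduced `Opcode` enum are hypothetical and do not correspond to any existing TableGen or LLVM facility. Which opcodes carry which bit is an assumption based on the review comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical property bits that an instruction definition could carry.
enum InstrFlag : std::uint8_t {
  ReadDroppedOnZero = 1 << 0,    // not a read for DMB LD once WZR/XZR is used
  BarrierDroppedOnZero = 1 << 1, // assumed meaning, taken from the function name
};

// Hypothetical, heavily reduced opcode list (the real one is generated).
enum Opcode { ADDWrr, LDADDW, LDADDLW, LDADDAW, LDADDALW, NumOpcodes };

// One entry per opcode, in enumerator order. Per the inline comment, the
// acquire variants are assumed to lose their read as well.
static const std::uint8_t OpcodeFlags[NumOpcodes] = {
    0,
    ReadDroppedOnZero,
    ReadDroppedOnZero,
    ReadDroppedOnZero | BarrierDroppedOnZero,
    ReadDroppedOnZero | BarrierDroppedOnZero,
};

// A single label test replaces one switch statement per instruction subset.
static bool hasFlag(Opcode Op, InstrFlag Flag) {
  return (OpcodeFlags[Op] & Flag) != 0;
}
```

With such a table, the pass would ask `hasFlag(Op, ReadDroppedOnZero)` instead of listing every opcode in a second switch.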

In D58348#1406435, @efriedma wrote:

Is that up for debate in case this is an edge case that is unlikely to be hit?

When it comes to the atomics in general, it's important to allow people to write code with the same performance characteristics as hand-written assembly, or else we will force people to go around the compiler using inline asm. But the performance impact here should be basically zero for code that doesn't use an acquire fence, I think, so it shouldn't be an issue here.

Fair enough. The loads increase the memory bus operations, and if you have high register pressure you are using one extra register (with zero lifetime, but still). The performance impact is low. However, the programmer has asked for a load, and the compiler has to prove that this cannot be observed before that load is removed, which is rather difficult when it can be used to order other memory operations.

LGTM.

This revision is now accepted and ready to land. Feb 26 2019, 7:58 AM

Committed as r356360

christof added a commit: rG8cfd91dcc726: [AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse. Mar 25 2019, 5:00 AM

Revision Contents

Path	Size
lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp	47 lines
test/CodeGen/AArch64/atomic-ops-lse.ll	72 lines

Diff 188212

lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp

[62 lines not shown]

 static bool usesFrameIndex(const MachineInstr &MI) {
   for (const MachineOperand &MO : MI.uses())
     if (MO.isFI())
       return true;
   return false;
 }

+// Instructions that lose their 'read' operation for a subsequent fence acquire
+// (DMB LD) once the zero register is used.
+//
+// WARNING: The acquire variants of the instructions are also affected, but they
+// are split out into `atomicBarrierDroppedOnZero()` to support annotations on
+// assembly.
+static bool atomicReadDroppedOnZero(unsigned Opcode) {
+  switch (Opcode) {
+    case AArch64::LDADDB: case AArch64::LDADDH:
+    case AArch64::LDADDW: case AArch64::LDADDX:
+    case AArch64::LDADDLB: case AArch64::LDADDLH:
+    case AArch64::LDADDLW: case AArch64::LDADDLX:
+    case AArch64::LDCLRB: case AArch64::LDCLRH:
+    case AArch64::LDCLRW: case AArch64::LDCLRX:
+    case AArch64::LDCLRLB: case AArch64::LDCLRLH:
+    case AArch64::LDCLRLW: case AArch64::LDCLRLX:
+    case AArch64::LDEORB: case AArch64::LDEORH:
+    case AArch64::LDEORW: case AArch64::LDEORX:
+    case AArch64::LDEORLB: case AArch64::LDEORLH:
+    case AArch64::LDEORLW: case AArch64::LDEORLX:
+    case AArch64::LDSETB: case AArch64::LDSETH:
+    case AArch64::LDSETW: case AArch64::LDSETX:
+    case AArch64::LDSETLB: case AArch64::LDSETLH:
+    case AArch64::LDSETLW: case AArch64::LDSETLX:
+    case AArch64::LDSMAXB: case AArch64::LDSMAXH:
+    case AArch64::LDSMAXW: case AArch64::LDSMAXX:
+    case AArch64::LDSMAXLB: case AArch64::LDSMAXLH:
+    case AArch64::LDSMAXLW: case AArch64::LDSMAXLX:
+    case AArch64::LDSMINB: case AArch64::LDSMINH:
+    case AArch64::LDSMINW: case AArch64::LDSMINX:
+    case AArch64::LDSMINLB: case AArch64::LDSMINLH:
+    case AArch64::LDSMINLW: case AArch64::LDSMINLX:
+    case AArch64::LDUMAXB: case AArch64::LDUMAXH:
+    case AArch64::LDUMAXW: case AArch64::LDUMAXX:
+    case AArch64::LDUMAXLB: case AArch64::LDUMAXLH:
+    case AArch64::LDUMAXLW: case AArch64::LDUMAXLX:
+    case AArch64::LDUMINB: case AArch64::LDUMINH:
+    case AArch64::LDUMINW: case AArch64::LDUMINX:
+    case AArch64::LDUMINLB: case AArch64::LDUMINLH:
+    case AArch64::LDUMINLW: case AArch64::LDUMINLX:
+      return true;
+  }
+  return false;
+}
+
 void AArch64DeadRegisterDefinitions::processMachineBasicBlock(
     MachineBasicBlock &MBB) {
   const MachineFunction &MF = *MBB.getParent();
   for (MachineInstr &MI : MBB) {
     if (usesFrameIndex(MI)) {
       // We need to skip this instruction because while it appears to have a
       // dead def it uses a frame index which might expand into a multi
       // instruction sequence during EPI.
       LLVM_DEBUG(dbgs() << " Ignoring, operand is frame index\n");
       continue;
     }
     if (MI.definesRegister(AArch64::XZR) || MI.definesRegister(AArch64::WZR)) {
       // It is not allowed to write to the same register (not even the zero
       // register) twice in a single instruction.
       LLVM_DEBUG(
           dbgs()
           << " Ignoring, XZR or WZR already used by the instruction\n");
       continue;
     }

-    if (atomicBarrierDroppedOnZero(MI.getOpcode())) {
+    if (atomicBarrierDroppedOnZero(MI.getOpcode()) || atomicReadDroppedOnZero(MI.getOpcode())) {
       LLVM_DEBUG(dbgs() << " Ignoring, semantics change with xzr/wzr.\n");
       continue;
     }

     const MCInstrDesc &Desc = MI.getDesc();
     for (int I = 0, E = Desc.getNumDefs(); I != E; ++I) {
       MachineOperand &MO = MI.getOperand(I);
       if (!MO.isReg() || !MO.isDef())

[59 lines not shown]
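
The early-exit structure of the loop above can be modelled in isolation. This is a minimal sketch under stated assumptions, not the pass itself: the `Opcode` enum is a hypothetical four-entry stand-in for the generated AArch64 opcodes, `Instr` is a made-up reduced view of a `MachineInstr`, and its `DefIsDead` field stands in for the per-operand checks in the part of the loop that is not shown. The two predicate names are reused from the diff, but their bodies here cover only the stand-in opcodes.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few opcodes.
enum class Opcode { ADDWrr, LDADDW, LDADDLW, LDADDAW };

// Acquire form: assumed to be covered by the pre-existing predicate.
static bool atomicBarrierDroppedOnZero(Opcode Op) {
  return Op == Opcode::LDADDAW;
}

// Relaxed and release forms: covered by the predicate this patch adds.
static bool atomicReadDroppedOnZero(Opcode Op) {
  switch (Op) {
  case Opcode::LDADDW:
  case Opcode::LDADDLW:
    return true;
  default:
    return false;
  }
}

// Made-up reduced view of one instruction.
struct Instr {
  Opcode Op;
  bool DefIsDead;          // the result register is never read
  bool UsesFrameIndex;     // first skip condition in the loop
  bool AlreadyDefinesZero; // second skip condition in the loop
};

// True if the destination may be replaced by WZR/XZR in this model.
static bool mayUseZeroRegister(const Instr &I) {
  if (I.UsesFrameIndex || I.AlreadyDefinesZero)
    return false;
  if (atomicBarrierDroppedOnZero(I.Op) || atomicReadDroppedOnZero(I.Op))
    return false; // semantics would change with xzr/wzr
  return I.DefIsDead;
}
```

In this model a dead result on `ADDWrr` may still be rewritten, while a dead result on any of the `LDADD` forms keeps its ordinary destination register.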

test/CodeGen/AArch64/atomic-ops-lse.ll

[1,305 lines not shown]

 define void @test_atomic_load_add_i32_noret_monotonic(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_add_i32_noret_monotonic:
   atomicrmw add i32* @var32, i32 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stadd w0, [x[[ADDR]]]
+; CHECK: ldadd w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_add_i64_noret_monotonic(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_add_i64_noret_monotonic:
   atomicrmw add i64* @var64, i64 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stadd x0, [x[[ADDR]]]
+; CHECK: ldadd x{{[0-9]}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_add_i8_release(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_add_i8_release:
   %old = atomicrmw add i8* @var8, i8 %offset release
 ; CHECK-NOT: dmb

[47 lines not shown]
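
What the updated 32-bit CHECK line accepts can be tried outside of FileCheck. The following is a rough sketch, assuming that `std::regex` behaves like FileCheck's regex engine for a pattern this simple, that operands are separated by single spaces, and that the `[[ADDR]]` variable can be replaced by a plain register-number pattern. The function name `matchesNewAddCheck` and the sample assembly lines are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rough std::regex rendering of the new CHECK line
//   ldadd w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
// FileCheck looks for the pattern anywhere in a line, so regex_search is
// used here rather than regex_match.
static bool matchesNewAddCheck(const std::string &Line) {
  static const std::regex Pattern(
      R"(ldadd w[0-9]+, w[1-9][0-9]*, \[x[0-9]+\])");
  return std::regex_search(Line, Pattern);
}
```

In this sketch a line such as `ldadd w0, w8, [x9]` matches, while `stadd w0, [x9]` and `ldadd w0, wzr, [x9]` do not, because the second operand has to be a numbered register.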

 define void @test_atomic_load_add_i32_noret_release(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_add_i32_noret_release:
   atomicrmw add i32* @var32, i32 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: staddl w0, [x[[ADDR]]]
+; CHECK: ldaddl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_add_i64_noret_release(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_add_i64_noret_release:
   atomicrmw add i64* @var64, i64 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: staddl x0, [x[[ADDR]]]
+; CHECK: ldaddl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_add_i8_seq_cst(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_add_i8_seq_cst:
   %old = atomicrmw add i8* @var8, i8 %offset seq_cst
 ; CHECK-NOT: dmb

[280 lines not shown]

 define void @test_atomic_load_and_i32_noret_monotonic(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_and_i32_noret_monotonic:
   atomicrmw and i32* @var32, i32 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: mvn w[[NOT:[0-9]+]], w[[OLD:[0-9]+]]
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stclr w[[NEW:[0-9]+]], [x[[ADDR]]]
+; CHECK: ldclr w{{[0-9]+}}, w[[NEW:[1-9][0-9]*]], [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_and_i64_noret_monotonic(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_and_i64_noret_monotonic:
   atomicrmw and i64* @var64, i64 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: mvn x[[NOT:[0-9]+]], x[[OLD:[0-9]+]]
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stclr x[[NEW:[0-9]+]], [x[[ADDR]]]
+; CHECK: ldclr x{{[0-9]+}}, x[[NEW:[1-9][0-9]*]], [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_and_i8_release(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_and_i8_release:
   %old = atomicrmw and i8* @var8, i8 %offset release
 ; CHECK-NOT: dmb

[48 lines not shown]

 define void @test_atomic_load_and_i32_noret_release(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_and_i32_noret_release:
   atomicrmw and i32* @var32, i32 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: mvn w[[NOT:[0-9]+]], w[[OLD:[0-9]+]]
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stclrl w[[NEW:[0-9]+]], [x[[ADDR]]]
+; CHECK: ldclrl w{{[0-9]}}, w[[NEW:[1-9][0-9]]], [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_and_i64_noret_release(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_and_i64_noret_release:
   atomicrmw and i64* @var64, i64 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: mvn x[[NOT:[0-9]+]], x[[OLD:[0-9]+]]
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stclrl x[[NEW:[0-9]+]], [x[[ADDR]]]
+; CHECK: ldclrl x{{[0-9]}}, x[[NEW:[1-9][0-9]]], [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_and_i8_seq_cst(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_and_i8_seq_cst:
   %old = atomicrmw and i8* @var8, i8 %offset seq_cst
 ; CHECK-NOT: dmb

[502 lines not shown]

 define void @test_atomic_load_max_i32_noret_monotonic(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_max_i32_noret_monotonic:
   atomicrmw max i32* @var32, i32 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stsmax w0, [x[[ADDR]]]
+; CHECK: ldsmax w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_max_i64_noret_monotonic(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_max_i64_noret_monotonic:
   atomicrmw max i64* @var64, i64 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stsmax x0, [x[[ADDR]]]
+; CHECK: ldsmax x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_max_i8_release(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_max_i8_release:
   %old = atomicrmw max i8* @var8, i8 %offset release
 ; CHECK-NOT: dmb

[47 lines not shown]

 define void @test_atomic_load_max_i32_noret_release(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_max_i32_noret_release:
   atomicrmw max i32* @var32, i32 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stsmaxl w0, [x[[ADDR]]]
+; CHECK: ldsmaxl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_max_i64_noret_release(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_max_i64_noret_release:
   atomicrmw max i64* @var64, i64 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stsmaxl x0, [x[[ADDR]]]
+; CHECK: ldsmaxl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_max_i8_seq_cst(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_max_i8_seq_cst:
   %old = atomicrmw max i8* @var8, i8 %offset seq_cst
 ; CHECK-NOT: dmb

[275 lines not shown]

 define void @test_atomic_load_min_i32_noret_monotonic(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_min_i32_noret_monotonic:
   atomicrmw min i32* @var32, i32 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stsmin w0, [x[[ADDR]]]
+; CHECK: ldsmin w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_min_i64_noret_monotonic(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_min_i64_noret_monotonic:
   atomicrmw min i64* @var64, i64 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stsmin x0, [x[[ADDR]]]
+; CHECK: ldsmin x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_min_i8_release(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_min_i8_release:
   %old = atomicrmw min i8* @var8, i8 %offset release
 ; CHECK-NOT: dmb

[47 lines not shown]

 define void @test_atomic_load_min_i32_noret_release(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_min_i32_noret_release:
   atomicrmw min i32* @var32, i32 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stsminl w0, [x[[ADDR]]]
+; CHECK: ldsminl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_min_i64_noret_release(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_min_i64_noret_release:
   atomicrmw min i64* @var64, i64 %offset release
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stsminl x0, [x[[ADDR]]]
+; CHECK: ldsminl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_min_i8_seq_cst(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_min_i8_seq_cst:
   %old = atomicrmw min i8* @var8, i8 %offset seq_cst
 ; CHECK-NOT: dmb

[275 lines not shown]

 define void @test_atomic_load_or_i32_noret_monotonic(i32 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_or_i32_noret_monotonic:
   atomicrmw or i32* @var32, i32 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

-; CHECK: stset w0, [x[[ADDR]]]
+; CHECK: ldset w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define void @test_atomic_load_or_i64_noret_monotonic(i64 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_or_i64_noret_monotonic:
   atomicrmw or i64* @var64, i64 %offset monotonic
 ; CHECK-NOT: dmb
 ; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
 ; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

-; CHECK: stset x0, [x[[ADDR]]]
+; CHECK: ldset x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
 ; CHECK-NOT: dmb
   ret void
 }

 define i8 @test_atomic_load_or_i8_release(i8 %offset) nounwind {
 ; CHECK-LABEL: test_atomic_load_or_i8_release:
   %old = atomicrmw or i8* @var8, i8 %offset release
 ; CHECK-NOT: dmb

[47 lines not shown]

	define void @test_atomic_load_or_i32_noret_release(i32 %offset) nounwind {			define void @test_atomic_load_or_i32_noret_release(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_or_i32_noret_release:			; CHECK-LABEL: test_atomic_load_or_i32_noret_release:
	atomicrmw or i32* @var32, i32 %offset release			atomicrmw or i32* @var32, i32 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: stsetl w0, [x[[ADDR]]]			; CHECK: ldsetl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_or_i64_noret_release(i64 %offset) nounwind {			define void @test_atomic_load_or_i64_noret_release(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_or_i64_noret_release:			; CHECK-LABEL: test_atomic_load_or_i64_noret_release:
	atomicrmw or i64* @var64, i64 %offset release			atomicrmw or i64* @var64, i64 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: stsetl x0, [x[[ADDR]]]			; CHECK: ldsetl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_or_i8_seq_cst(i8 %offset) nounwind {			define i8 @test_atomic_load_or_i8_seq_cst(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_or_i8_seq_cst:			; CHECK-LABEL: test_atomic_load_or_i8_seq_cst:
	%old = atomicrmw or i8* @var8, i8 %offset seq_cst			%old = atomicrmw or i8* @var8, i8 %offset seq_cst
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[296 lines not shown]

	define void @test_atomic_load_sub_i32_noret_monotonic(i32 %offset) nounwind {			define void @test_atomic_load_sub_i32_noret_monotonic(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_sub_i32_noret_monotonic:			; CHECK-LABEL: test_atomic_load_sub_i32_noret_monotonic:
	atomicrmw sub i32* @var32, i32 %offset monotonic			atomicrmw sub i32* @var32, i32 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: neg w[[NEG:[0-9]+]], w[[OLD:[0-9]+]]			; CHECK: neg w[[NEG:[0-9]+]], w[[OLD:[0-9]+]]
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: stadd w[[NEW:[0-9]+]], [x[[ADDR]]]			; CHECK: ldadd w{{[0-9]+}}, w[[NEW:[1-9][0-9]*]], [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb

	ret void			ret void
	}			}

	define void @test_atomic_load_sub_i64_noret_monotonic(i64 %offset) nounwind {			define void @test_atomic_load_sub_i64_noret_monotonic(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_sub_i64_noret_monotonic:			; CHECK-LABEL: test_atomic_load_sub_i64_noret_monotonic:
	atomicrmw sub i64* @var64, i64 %offset monotonic			atomicrmw sub i64* @var64, i64 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: neg x[[NEG:[0-9]+]], x[[OLD:[0-9]+]]			; CHECK: neg x[[NEG:[0-9]+]], x[[OLD:[0-9]+]]
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: stadd x[[NEW:[0-9]+]], [x[[ADDR]]]			; CHECK: ldadd x{{[0-9]+}}, x[[NEW:[1-9][0-9]*]], [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb

	ret void			ret void
	}			}

	define i8 @test_atomic_load_sub_i8_release(i8 %offset) nounwind {			define i8 @test_atomic_load_sub_i8_release(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_sub_i8_release:			; CHECK-LABEL: test_atomic_load_sub_i8_release:
	%old = atomicrmw sub i8* @var8, i8 %offset release			%old = atomicrmw sub i8* @var8, i8 %offset release
	[53 lines not shown]

	define void @test_atomic_load_sub_i32_noret_release(i32 %offset) nounwind {			define void @test_atomic_load_sub_i32_noret_release(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_sub_i32_noret_release:			; CHECK-LABEL: test_atomic_load_sub_i32_noret_release:
	atomicrmw sub i32* @var32, i32 %offset release			atomicrmw sub i32* @var32, i32 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: neg w[[NEG:[0-9]+]], w[[OLD:[0-9]+]]			; CHECK: neg w[[NEG:[0-9]+]], w[[OLD:[0-9]+]]
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: staddl w[[NEW:[0-9]+]], [x[[ADDR]]]			; CHECK: ldaddl w{{[0-9]+}}, w[[NEW:[1-9][0-9]*]], [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb

	ret void			ret void
	}			}

	define void @test_atomic_load_sub_i64_noret_release(i64 %offset) nounwind {			define void @test_atomic_load_sub_i64_noret_release(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_sub_i64_noret_release:			; CHECK-LABEL: test_atomic_load_sub_i64_noret_release:
	atomicrmw sub i64* @var64, i64 %offset release			atomicrmw sub i64* @var64, i64 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: neg x[[NEG:[0-9]+]], x[[OLD:[0-9]+]]			; CHECK: neg x[[NEG:[0-9]+]], x[[OLD:[0-9]+]]
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: staddl x[[NEW:[0-9]+]], [x[[ADDR]]]			; CHECK: ldaddl x{{[0-9]+}}, x[[NEW:[1-9][0-9]*]], [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb

	ret void			ret void
	}			}

	define i8 @test_atomic_load_sub_i8_seq_cst(i8 %offset) nounwind {			define i8 @test_atomic_load_sub_i8_seq_cst(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_sub_i8_seq_cst:			; CHECK-LABEL: test_atomic_load_sub_i8_seq_cst:
	%old = atomicrmw sub i8* @var8, i8 %offset seq_cst			%old = atomicrmw sub i8* @var8, i8 %offset seq_cst
	[674 lines not shown]

	define void @test_atomic_load_umax_i32_noret_monotonic(i32 %offset) nounwind {			define void @test_atomic_load_umax_i32_noret_monotonic(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umax_i32_noret_monotonic:			; CHECK-LABEL: test_atomic_load_umax_i32_noret_monotonic:
	atomicrmw umax i32* @var32, i32 %offset monotonic			atomicrmw umax i32* @var32, i32 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: stumax w0, [x[[ADDR]]]			; CHECK: ldumax w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_umax_i64_noret_monotonic(i64 %offset) nounwind {			define void @test_atomic_load_umax_i64_noret_monotonic(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umax_i64_noret_monotonic:			; CHECK-LABEL: test_atomic_load_umax_i64_noret_monotonic:
	atomicrmw umax i64* @var64, i64 %offset monotonic			atomicrmw umax i64* @var64, i64 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: stumax x0, [x[[ADDR]]]			; CHECK: ldumax x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_umax_i8_release(i8 %offset) nounwind {			define i8 @test_atomic_load_umax_i8_release(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umax_i8_release:			; CHECK-LABEL: test_atomic_load_umax_i8_release:
	%old = atomicrmw umax i8* @var8, i8 %offset release			%old = atomicrmw umax i8* @var8, i8 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[47 lines not shown]

	define void @test_atomic_load_umax_i32_noret_release(i32 %offset) nounwind {			define void @test_atomic_load_umax_i32_noret_release(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umax_i32_noret_release:			; CHECK-LABEL: test_atomic_load_umax_i32_noret_release:
	atomicrmw umax i32* @var32, i32 %offset release			atomicrmw umax i32* @var32, i32 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: stumaxl w0, [x[[ADDR]]]			; CHECK: ldumaxl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_umax_i64_noret_release(i64 %offset) nounwind {			define void @test_atomic_load_umax_i64_noret_release(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umax_i64_noret_release:			; CHECK-LABEL: test_atomic_load_umax_i64_noret_release:
	atomicrmw umax i64* @var64, i64 %offset release			atomicrmw umax i64* @var64, i64 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: stumaxl x0, [x[[ADDR]]]			; CHECK: ldumaxl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_umax_i8_seq_cst(i8 %offset) nounwind {			define i8 @test_atomic_load_umax_i8_seq_cst(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umax_i8_seq_cst:			; CHECK-LABEL: test_atomic_load_umax_i8_seq_cst:
	%old = atomicrmw umax i8* @var8, i8 %offset seq_cst			%old = atomicrmw umax i8* @var8, i8 %offset seq_cst
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[275 lines not shown]

	define void @test_atomic_load_umin_i32_noret_monotonic(i32 %offset) nounwind {			define void @test_atomic_load_umin_i32_noret_monotonic(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umin_i32_noret_monotonic:			; CHECK-LABEL: test_atomic_load_umin_i32_noret_monotonic:
	atomicrmw umin i32* @var32, i32 %offset monotonic			atomicrmw umin i32* @var32, i32 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: stumin w0, [x[[ADDR]]]			; CHECK: ldumin w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_umin_i64_noret_monotonic(i64 %offset) nounwind {			define void @test_atomic_load_umin_i64_noret_monotonic(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umin_i64_noret_monotonic:			; CHECK-LABEL: test_atomic_load_umin_i64_noret_monotonic:
	atomicrmw umin i64* @var64, i64 %offset monotonic			atomicrmw umin i64* @var64, i64 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: stumin x0, [x[[ADDR]]]			; CHECK: ldumin x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_umin_i8_release(i8 %offset) nounwind {			define i8 @test_atomic_load_umin_i8_release(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umin_i8_release:			; CHECK-LABEL: test_atomic_load_umin_i8_release:
	%old = atomicrmw umin i8* @var8, i8 %offset release			%old = atomicrmw umin i8* @var8, i8 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[47 lines not shown]

	define void @test_atomic_load_umin_i32_noret_release(i32 %offset) nounwind {			define void @test_atomic_load_umin_i32_noret_release(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umin_i32_noret_release:			; CHECK-LABEL: test_atomic_load_umin_i32_noret_release:
	atomicrmw umin i32* @var32, i32 %offset release			atomicrmw umin i32* @var32, i32 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: stuminl w0, [x[[ADDR]]]			; CHECK: lduminl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_umin_i64_noret_release(i64 %offset) nounwind {			define void @test_atomic_load_umin_i64_noret_release(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umin_i64_noret_release:			; CHECK-LABEL: test_atomic_load_umin_i64_noret_release:
	atomicrmw umin i64* @var64, i64 %offset release			atomicrmw umin i64* @var64, i64 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: stuminl x0, [x[[ADDR]]]			; CHECK: lduminl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_umin_i8_seq_cst(i8 %offset) nounwind {			define i8 @test_atomic_load_umin_i8_seq_cst(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_umin_i8_seq_cst:			; CHECK-LABEL: test_atomic_load_umin_i8_seq_cst:
	%old = atomicrmw umin i8* @var8, i8 %offset seq_cst			%old = atomicrmw umin i8* @var8, i8 %offset seq_cst
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[275 lines not shown]

	define void @test_atomic_load_xor_i32_noret_monotonic(i32 %offset) nounwind {			define void @test_atomic_load_xor_i32_noret_monotonic(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_xor_i32_noret_monotonic:			; CHECK-LABEL: test_atomic_load_xor_i32_noret_monotonic:
	atomicrmw xor i32* @var32, i32 %offset monotonic			atomicrmw xor i32* @var32, i32 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: steor w0, [x[[ADDR]]]			; CHECK: ldeor w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_xor_i64_noret_monotonic(i64 %offset) nounwind {			define void @test_atomic_load_xor_i64_noret_monotonic(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_xor_i64_noret_monotonic:			; CHECK-LABEL: test_atomic_load_xor_i64_noret_monotonic:
	atomicrmw xor i64* @var64, i64 %offset monotonic			atomicrmw xor i64* @var64, i64 %offset monotonic
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: steor x0, [x[[ADDR]]]			; CHECK: ldeor x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_xor_i8_release(i8 %offset) nounwind {			define i8 @test_atomic_load_xor_i8_release(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_xor_i8_release:			; CHECK-LABEL: test_atomic_load_xor_i8_release:
	%old = atomicrmw xor i8* @var8, i8 %offset release			%old = atomicrmw xor i8* @var8, i8 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[47 lines not shown]

	define void @test_atomic_load_xor_i32_noret_release(i32 %offset) nounwind {			define void @test_atomic_load_xor_i32_noret_release(i32 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_xor_i32_noret_release:			; CHECK-LABEL: test_atomic_load_xor_i32_noret_release:
	atomicrmw xor i32* @var32, i32 %offset release			atomicrmw xor i32* @var32, i32 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var32			; CHECK: adrp [[TMPADDR:x[0-9]+]], var32
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var32

	; CHECK: steorl w0, [x[[ADDR]]]			; CHECK: ldeorl w{{[0-9]+}}, w{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define void @test_atomic_load_xor_i64_noret_release(i64 %offset) nounwind {			define void @test_atomic_load_xor_i64_noret_release(i64 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_xor_i64_noret_release:			; CHECK-LABEL: test_atomic_load_xor_i64_noret_release:
	atomicrmw xor i64* @var64, i64 %offset release			atomicrmw xor i64* @var64, i64 %offset release
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	; CHECK: adrp [[TMPADDR:x[0-9]+]], var64			; CHECK: adrp [[TMPADDR:x[0-9]+]], var64
	; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64			; CHECK: add x[[ADDR:[0-9]+]], [[TMPADDR]], {{#?}}:lo12:var64

	; CHECK: steorl x0, [x[[ADDR]]]			; CHECK: ldeorl x{{[0-9]+}}, x{{[1-9][0-9]*}}, [x[[ADDR]]]
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	ret void			ret void
	}			}

	define i8 @test_atomic_load_xor_i8_seq_cst(i8 %offset) nounwind {			define i8 @test_atomic_load_xor_i8_seq_cst(i8 %offset) nounwind {
	; CHECK-LABEL: test_atomic_load_xor_i8_seq_cst:			; CHECK-LABEL: test_atomic_load_xor_i8_seq_cst:
	%old = atomicrmw xor i8* @var8, i8 %offset seq_cst			%old = atomicrmw xor i8* @var8, i8 %offset seq_cst
	; CHECK-NOT: dmb			; CHECK-NOT: dmb
	[73 lines not shown]