This is an archive of the discontinued LLVM Phabricator instance.

Differential D47589

[RISCV] Add codegen support for atomic load/stores with RV32A
Closed, Public

Authored by asb on May 31 2018, 6:59 AM.

Details

Reviewers

jyknight
theraven
jfb
eli.friedman

Commits

rG96f492d7df9e: [RISCV] Add codegen support for atomic load/stores with RV32A
rL334591: [RISCV] Add codegen support for atomic load/stores with RV32A

Summary

Fences are inserted according to Table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group.
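The fence placement described above can be sketched as a small standalone model. This is only an illustration, assuming the mapping visible in this patch's emitLeadingFence/emitTrailingFence and in the test expectations; the names `Ordering`, `Access` and `lowerAtomic` are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model types; these are not LLVM's AtomicOrdering or IRBuilder.
enum class Ordering { Unordered, Monotonic, Acquire, Release, SeqCst };
enum class Access { Load, Store };

// Returns the instruction sequence expected for an atomic load or store.
// `op` stands in for the plain lb/lh/lw or sb/sh/sw instruction.
std::vector<std::string> lowerAtomic(Access kind, Ordering ord,
                                     const std::string &op) {
  // LLVM IR does not allow release loads or acquire stores.
  assert(!(kind == Access::Load && ord == Ordering::Release));
  assert(!(kind == Access::Store && ord == Ordering::Acquire));

  std::vector<std::string> seq;
  // Leading fence: a seq_cst load gets a full fence, a release-or-stronger
  // store gets a release fence.
  if (kind == Access::Load && ord == Ordering::SeqCst)
    seq.push_back("fence rw, rw");
  if (kind == Access::Store &&
      (ord == Ordering::Release || ord == Ordering::SeqCst))
    seq.push_back("fence rw, w");

  seq.push_back(op);

  // Trailing fence: an acquire-or-stronger load gets an acquire fence.
  if (kind == Access::Load &&
      (ord == Ordering::Acquire || ord == Ordering::SeqCst))
    seq.push_back("fence r, rw");
  return seq;
}
```

Under this model a seq_cst i8 load becomes `fence rw, rw`, `lb`, `fence r, rw`, which is the sequence the RV32IA checks in atomic-load-store.ll expect.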

Instruction selection failures will now occur for 8/16/32-bit atomicrmw and cmpxchg operations when targeting RV32IA until lowering for these operations is added in a follow-on patch.

Event Timeline

asb created this revision. May 31 2018, 6:59 AM

Herald added subscribers: mgrang, edward-jones, zzheng and 8 others. May 31 2018, 6:59 AM

asb added parent revisions: D47553: Add TargetLowering::shouldExpandAtomicToLibCall and query it from AtomicExpandPass, D47587: [RISCV] Codegen support for atomic operations on RV32I. May 31 2018, 7:00 AM

When the A extension is supported, __atomic libcalls will be generated for any atomic that isn't the native word size or has less than natural alignment.

When do you expect non-natural alignment to occur? Is this purely for C++ support? If so you're guaranteed natural alignment. Otherwise (for intrinsics or for other languages) I'd like to understand what you expect, and whether you have the guarantee that the alignment information you have is correct. Without knowing that it's absolutely correct you're going to codegen bad code (a libcall in one place, and instructions in another).

In D47589#1117778, @jfb wrote:

When the A extension is supported, __atomic libcalls will be generated for any atomic that isn't the native word size or has less than natural alignment.

When do you expect non-natural alignment to occur? Is this purely for C++ support? If so you're guaranteed natural alignment. Otherwise (for intrinsics or for other languages) I'd like to understand what you expect, and whether you have the guarantee that the alignment information you have is correct. Without knowing that it's absolutely correct you're going to codegen bad code (a libcall in one place, and instructions in another).

In fact, let me go further and quote the langref:

'load' instruction

align must be explicitly specified on atomic loads, and the load has undefined behavior if the alignment is not set to a value which is at least the size in bytes of the pointee.

It sounds like you'll want to remove the UB from the IR before your work proceeds. I'm still not sure that you want what you said you did :-)

In D47589#1117778, @jfb wrote:

When the A extension is supported, __atomic libcalls will be generated for any atomic that isn't the native word size or has less than natural alignment.

When do you expect non-natural alignment to occur? Is this purely for C++ support? If so you're guaranteed natural alignment. Otherwise (for intrinsics or for other languages) I'd like to understand what you expect, and whether you have the guarantee that the alignment information you have is correct. Without knowing that it's absolutely correct you're going to codegen bad code (a libcall in one place, and instructions in another).

That comment simply reflects the status quo for behaviour of AtomicExpandPass, that I replicated. Do you think it would be worth doing report_fatal_error if Align < Size?

In D47589#1117978, @asb wrote:

In D47589#1117778, @jfb wrote:

When the A extension is supported, __atomic libcalls will be generated for any atomic that isn't the native word size or has less than natural alignment.

When do you expect non-natural alignment to occur? Is this purely for C++ support? If so you're guaranteed natural alignment. Otherwise (for intrinsics or for other languages) I'd like to understand what you expect, and whether you have the guarantee that the alignment information you have is correct. Without knowing that it's absolutely correct you're going to codegen bad code (a libcall in one place, and instructions in another).

That comment simply reflects the status quo for behaviour of AtomicExpandPass, that I replicated. Do you think it would be worth doing report_fatal_error if Align < Size?

Actually I'm not sure it's reasonable to quit at compile-time when encountering UB (@regehr - thoughts?). Emitting an __atomic_* libcall in this case seems like a reasonable choice, so perhaps the fix is simply to be clear that an instruction with Align < Size has UB and emitting the __atomic_* call is the behaviour we choose.

LangRef should be fixed... it doesn't make sense to have undefined behavior with respect to a static property of the instruction, and I think it's out-of-date with recent changes. (Support for unaligned atomic load/store was recently added as an extension for AVR.)

Making it the behavior we choose is fine with me. I just understood your comment to mean that you wanted to give this semantics, and that's more work than just stating the semantics you want.

In D47589#1118021, @efriedma wrote:

LangRef should be fixed... it doesn't make sense to have undefined behavior with respect to a static property of the instruction, and I think it's out-of-date with recent changes. (Support for unaligned atomic load/store was recently added as an extension for AVR.)

Say I have two TUs which share an atomic location, but have a different idea of its alignment. They both access that location, one with libcall and one with an instruction. That's clearly broken. Can this happen?

Say I have two TUs which share an atomic location, but have a different idea of its alignment. They both access that location, one with libcall and one with an instruction. That's clearly broken. Can this happen?

No, it's not broken (assuming the location is actually aligned at runtime). In the translation unit where the location is known aligned, it'll use the native lock-free instruction. In the other translation unit, the libatomic call will check the alignment of the address at runtime, see it's aligned, and use a compatible lock-free implementation.
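A minimal sketch of the runtime dispatch described in this reply, assuming a 4-byte native word and the GCC/Clang `__atomic_load_n` builtin. `genericAtomicLoad`, `Path` and `fallbackLock` are hypothetical names for illustration; this is not the GNU libatomic or compiler-rt implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a generic, size-taking atomic load.
// A real library would use a table of locks keyed by address.
static std::atomic_flag fallbackLock = ATOMIC_FLAG_INIT;

enum class Path { LockFree, Locked };

Path genericAtomicLoad(std::size_t size, const void *src, void *dst) {
  assert(src != nullptr && dst != nullptr);
  auto addr = reinterpret_cast<std::uintptr_t>(src);

  // Runtime alignment check: a naturally aligned word uses the same
  // lock-free access that inline code would use, so both interoperate.
  if (size == 4 && addr % 4 == 0) {
    std::uint32_t value = __atomic_load_n(
        static_cast<const std::uint32_t *>(src), __ATOMIC_SEQ_CST);
    std::memcpy(dst, &value, sizeof value);
    return Path::LockFree;
  }

  // Otherwise fall back to a lock; writers of such objects would have to
  // take the same lock for this to be atomic.
  while (fallbackLock.test_and_set(std::memory_order_acquire)) {
  }
  std::memcpy(dst, src, size);
  fallbackLock.clear(std::memory_order_release);
  return Path::Locked;
}
```

The point of the sketch is only that the decision is made from the address seen at run time, not from the alignment the caller's translation unit happened to know statically.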

In D47589#1118082, @efriedma wrote:

Say I have two TUs which share an atomic location, but have a different idea of its alignment. They both access that location, one with libcall and one with an instruction. That's clearly broken. Can this happen?

No, it's not broken (assuming the location is actually aligned at runtime). In the translation unit where the location is known aligned, it'll use the native lock-free instruction. In the other translation unit, the libatomic call will check the alignment of the address at runtime, see it's aligned, and use a compatible lock-free implementation.

Is there a guarantee that __atomic_* functions start off with an alignment check, and use a compatible instruction if suitably aligned? This code doesn't seem to do so.

The compiler-rt "__atomic_*" are known to be buggy; see https://reviews.llvm.org/D45321

In D47589#1118128, @efriedma wrote:

The compiler-rt "__atomic_*" are known to be buggy; see https://reviews.llvm.org/D45321

Seems you're advocating that this patch (and the LangRef) follow the reality you'd like to have, not the one we actually have :-)
So I'll re-iterate: is the alignment check a documented guarantee that __atomic_* functions must provide, in all of their implementations?

Seems you're advocating that this patch (and the LangRef) follow the reality you'd like to have, not the one we actually have :-)

My description matches the way GNU libatomic works in practice, as far as I know. (compiler-rt's implementation is incomplete and broken in other ways anyway; see https://reviews.llvm.org/D47606.)

So I'll re-iterate: is the alignment check a documented guarantee that __atomic_* functions must provide, in all of their implementations?

The specialized forms (__atomic_load_8 etc.) assume natural alignment, but the forms which take a size argument check alignment. This is why __atomic_is_lock_free has a pointer argument. See https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary .

In D47589#1118192, @efriedma wrote:

Seems you're advocating that this patch (and the LangRef) follow the reality you'd like to have, not the one we actually have :-)

My description matches the way GNU libatomic works in practice, as far as I know. (compiler-rt's implementation is incomplete and broken in other ways anyway; see https://reviews.llvm.org/D47606.)

I can't look at the implementation, but the docs are kinda unclear: https://gcc.gnu.org/wiki/Atomic/GCCMM/UnalignedPolicy
If we intend to make this guarantee official it would be nice for our GCC friends to do the same. Will send a ping.

So I'll re-iterate: is the alignment check a documented guarantee that __atomic_* functions must provide, in all of their implementations?

The specialized forms (__atomic_load_8 etc.) assume natural alignment, but the forms which take a size argument check alignment. This is why __atomic_is_lock_free has a pointer argument. See https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary .

That page says:

All the optimized routines expect that the object will be properly aligned for a data type of the specified size. Results are undefined for objects not properly aligned if they are invoked directly. (ie, may or may not work as expected). The compiler will not map the generic routine to an optimized routine unless the alignment is correct.

Is that not the case?

My understanding was that __atomic_is_lock_free took a pointer argument because of C's specification, but C later backtracked on alignment (because it was another of their changes which C++ didn't have) and now ignores it. From the docs it looks like __atomic_is_lock_free indeed guarantees it'll look at alignment. I'm not sure about the actual atomic functions though.

In D47589#1118174, @jfb wrote:

In D47589#1118128, @efriedma wrote:

The compiler-rt "__atomic_*" are known to be buggy; see https://reviews.llvm.org/D45321

Seems you're advocating that this patch (and the LangRef) follow the reality you'd like to have, not the one we actually have :-)
So I'll re-iterate: is the alignment check a documented guarantee that __atomic_* functions must provide, in all of their implementations?

The compiler must not emit a call to the __atomic_load_4 (and other names with sizes on the end) functions on unaligned memory -- that function can assume its argument is aligned.

But __atomic_load (size is given as an argument) must be safe to call on both aligned and unaligned memory, *and* must safely interoperate both with __atomic_load_4 and inline-emitted atomics if the current hardware supports them.
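The rule stated here can be sketched as a small decision function. This is an illustration under assumptions, not AtomicExpandPass's actual code: `chooseAtomicLoadLowering` and its parameters are hypothetical, and the set of sized variants (1, 2, 4, 8, 16 bytes) is assumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks how an atomic load of `sizeBytes` with known
// alignment `alignBytes` would be lowered, given the largest size the
// target handles inline (`maxInlineBytes`, 0 if none).
std::string chooseAtomicLoadLowering(unsigned sizeBytes, unsigned alignBytes,
                                     unsigned maxInlineBytes) {
  assert(sizeBytes > 0 && alignBytes > 0);
  bool naturallyAligned = alignBytes >= sizeBytes;
  bool sizedVariantExists = sizeBytes == 1 || sizeBytes == 2 ||
                            sizeBytes == 4 || sizeBytes == 8 ||
                            sizeBytes == 16;

  // Under-aligned or oddly sized objects may only use the generic routine,
  // which takes the size as an argument and must tolerate any alignment.
  if (!naturallyAligned || !sizedVariantExists)
    return "__atomic_load";

  // Naturally aligned and small enough: emit the instruction inline.
  if (sizeBytes <= maxInlineBytes)
    return "inline";

  // Naturally aligned but too large for the target: sized libcall.
  return "__atomic_load_" + std::to_string(sizeBytes);
}
```

With `maxInlineBytes` set to 0 this yields `__atomic_load_1` for an i8 load with align 1, as in the RV32I checks in this patch; with 4 it yields an inline load, as in the RV32IA checks.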

In D47589#1119145, @jyknight wrote:

In D47589#1118174, @jfb wrote:

In D47589#1118128, @efriedma wrote:

The compiler-rt "__atomic_*" are known to be buggy; see https://reviews.llvm.org/D45321

Seems you're advocating that this patch (and the LangRef) follow the reality you'd like to have, not the one we actually have :-)
So I'll re-iterate: is the alignment check a documented guarantee that __atomic_* functions must provide, in all of their implementations?

The compiler must not emit a call to the __atomic_load_4 (and other names with sizes on the end) functions on unaligned memory -- that function can assume its argument is aligned.

But __atomic_load (size is given as an argument) must be safe to call on both aligned and unaligned memory, *and* must safely interoperate both with __atomic_load_4 and inline-emitted atomics if the current hardware supports them.

Can you refer to documentation that says so? I'm happy if that's the case, but I want to be really sure it is. Not just for compiler-rt, but for GCC as well (I've ping'd someone on that side to get more info). If so, @t.p.northover's patch fixing some alignment stuff is also something we want, and we'll want to update some documentation.

In D47589#1119221, @jfb wrote:

Can you refer to documentation that says so? I'm happy if that's the case, but I want to be really sure it is. Not just for compiler-rt, but for GCC as well (I've ping'd someone on that side to get more info). If so, @t.p.northover's patch fixing some alignment stuff is also something we want, and we'll want to update some documentation.

That is what GCC's LIbrary docs you referred to before say -- although I agree it's not as clear as it could be.

Starting with "All the optimized routines expect that the object will be properly aligned for a data type of the specified size." In this doc, "optimized routines" means the function names ending with a size, so __atomic_load_4, but not __atomic_load. Then it says: "The compiler will not map the generic routine to an optimized routine unless the alignment is correct." The clear implication of both those sentences together is that __atomic_load does _NOT_ require the object will be properly aligned.

The next question perhaps is whether __atomic_load *MUST* use a lock-free implementation when given suitably-aligned data. The answer to that is the same as the answer to whether __atomic_load_4 must, because "A lazy compiler may simply call the generic routine, bypassing the optimized versions" -- so they must both use the same atomicity mechanism.

And the answer for both is effectively "Yes if the runtime-detected CPU supports it", because you have the requirement that code built for a CPU without atomic instructions for a given size (and thus calling into libatomic for everything), must interoperate properly with code built for a newer CPU model that does support lock-free atomics of that size. (e.g. if I build one .so with -march=386 and another .so with -march=486, they must work together). That implies that __atomic_load and __atomic_load_4 *MUST* use native atomic ops in at least as many situations as the compiler could have emitted inline atomics for.

asb removed a parent revision: D47553: Add TargetLowering::shouldExpandAtomicToLibCall and query it from AtomicExpandPass. Jun 7 2018, 7:03 AM

This patch now enables lowering of 8, 16, and 32-bit atomic load/stores. We no longer rely on D47553, as the follow-up patch to lower atomicrmw supports partword and native size atomics.

asb added a child revision: D47882: [RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32A. Jun 7 2018, 7:22 AM

jyknight accepted this revision. Jun 7 2018, 8:07 AM

This revision is now accepted and ready to land. Jun 7 2018, 8:07 AM

Closed by commit rL334591: [RISCV] Add codegen support for atomic load/stores with RV32A (authored by asb). Jun 13 2018, 5:09 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: rogfer01. Jun 13 2018, 5:09 AM

Revision Contents

Path	Size
lib/Target/RISCV/RISCVISelLowering.h	8 lines
lib/Target/RISCV/RISCVISelLowering.cpp	24 lines
lib/Target/RISCV/RISCVInstrInfo.td	6 lines
lib/Target/RISCV/RISCVInstrInfoA.td	20 lines
test/CodeGen/RISCV/atomic-fence.ll	2 lines
test/CodeGen/RISCV/atomic-load-store.ll	217 lines

Diff 150326

lib/Target/RISCV/RISCVISelLowering.h

[60 lines hidden]	public:

MachineBasicBlock *		MachineBasicBlock *
EmitInstrWithCustomInserter(MachineInstr &MI,		EmitInstrWithCustomInserter(MachineInstr &MI,
MachineBasicBlock *BB) const override;		MachineBasicBlock *BB) const override;

EVT getSetCCResultType(const DataLayout &DL, LLVMContext &Context,		EVT getSetCCResultType(const DataLayout &DL, LLVMContext &Context,
EVT VT) const override;		EVT VT) const override;

		bool shouldInsertFencesForAtomic(const Instruction *I) const override {
		return isa<LoadInst>(I) || isa<StoreInst>(I);
		}
		Instruction *emitLeadingFence(IRBuilder<> &Builder, Instruction *Inst,
		AtomicOrdering Ord) const override;
		Instruction *emitTrailingFence(IRBuilder<> &Builder, Instruction *Inst,
		AtomicOrdering Ord) const override;

private:		private:
void analyzeInputArgs(MachineFunction &MF, CCState &CCInfo,		void analyzeInputArgs(MachineFunction &MF, CCState &CCInfo,
const SmallVectorImpl<ISD::InputArg> &Ins,		const SmallVectorImpl<ISD::InputArg> &Ins,
bool IsRet) const;		bool IsRet) const;
void analyzeOutputArgs(MachineFunction &MF, CCState &CCInfo,		void analyzeOutputArgs(MachineFunction &MF, CCState &CCInfo,
const SmallVectorImpl<ISD::OutputArg> &Outs,		const SmallVectorImpl<ISD::OutputArg> &Outs,
bool IsRet, CallLoweringInfo *CLI) const;		bool IsRet, CallLoweringInfo *CLI) const;
// Lower incoming arguments, copy physregs into vregs		// Lower incoming arguments, copy physregs into vregs
[35 lines hidden]

lib/Target/RISCV/RISCVISelLowering.cpp

[136 lines hidden]	if (Subtarget.hasStdExtD()) {
setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);		setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f32, Expand);
setTruncStoreAction(MVT::f64, MVT::f32, Expand);		setTruncStoreAction(MVT::f64, MVT::f32, Expand);
}		}

setOperationAction(ISD::GlobalAddress, XLenVT, Custom);		setOperationAction(ISD::GlobalAddress, XLenVT, Custom);
setOperationAction(ISD::BlockAddress, XLenVT, Custom);		setOperationAction(ISD::BlockAddress, XLenVT, Custom);
setOperationAction(ISD::ConstantPool, XLenVT, Custom);		setOperationAction(ISD::ConstantPool, XLenVT, Custom);

// Atomic operations aren't suported in the base RV32I ISA.		if (Subtarget.hasStdExtA())
		setMaxAtomicSizeInBitsSupported(Subtarget.getXLen());
		else
setMaxAtomicSizeInBitsSupported(0);		setMaxAtomicSizeInBitsSupported(0);

setBooleanContents(ZeroOrOneBooleanContent);		setBooleanContents(ZeroOrOneBooleanContent);

// Function alignments (log2).		// Function alignments (log2).
unsigned FunctionAlignment = Subtarget.hasStdExtC() ? 1 : 2;		unsigned FunctionAlignment = Subtarget.hasStdExtC() ? 1 : 2;
setMinFunctionAlignment(FunctionAlignment);		setMinFunctionAlignment(FunctionAlignment);
setPrefFunctionAlignment(FunctionAlignment);		setPrefFunctionAlignment(FunctionAlignment);

[1,398 lines hidden]	case 'r':
return std::make_pair(0U, &RISCV::GPRRegClass);		return std::make_pair(0U, &RISCV::GPRRegClass);
default:		default:
break;		break;
}		}
}		}

return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);		return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
}		}

		Instruction *RISCVTargetLowering::emitLeadingFence(IRBuilder<> &Builder,
		Instruction *Inst,
		AtomicOrdering Ord) const {
		if (isa<LoadInst>(Inst) && Ord == AtomicOrdering::SequentiallyConsistent)
		return Builder.CreateFence(Ord);
		if (isa<StoreInst>(Inst) && isReleaseOrStronger(Ord))
		return Builder.CreateFence(AtomicOrdering::Release);
		return nullptr;
		}

		Instruction *RISCVTargetLowering::emitTrailingFence(IRBuilder<> &Builder,
		Instruction *Inst,
		AtomicOrdering Ord) const {
		if (isa<LoadInst>(Inst) && isAcquireOrStronger(Ord))
		return Builder.CreateFence(AtomicOrdering::Acquire);
		return nullptr;
		}

lib/Target/RISCV/RISCVInstrInfo.td

	[729 lines hidden]
	// fence release -> fence rw, w			// fence release -> fence rw, w
	def : Pat<(atomic_fence (i32 5), (imm)), (FENCE 3, 1)>;			def : Pat<(atomic_fence (i32 5), (imm)), (FENCE 3, 1)>;
	// fence acq_rel -> fence rw, rw (a fence.tso instruction has been proposed			// fence acq_rel -> fence rw, rw (a fence.tso instruction has been proposed
	// but hasn't been added to the specification yet)			// but hasn't been added to the specification yet)
	def : Pat<(atomic_fence (i32 6), (imm)), (FENCE 3, 3)>;			def : Pat<(atomic_fence (i32 6), (imm)), (FENCE 3, 3)>;
	// fence seq_cst -> fence rw, rw			// fence seq_cst -> fence rw, rw
	def : Pat<(atomic_fence (i32 7), (imm)), (FENCE 3, 3)>;			def : Pat<(atomic_fence (i32 7), (imm)), (FENCE 3, 3)>;

				// Lowering for atomic load and store is defined in RISCVInstrInfoA.td.
				// Although these are lowered to fence+load/store instructions defined in the
				// base RV32I/RV64I ISA, this lowering is only used when the A extension is
				// present. This is necessary as it isn't valid to mix __atomic_* libcalls
				// with inline atomic operations for the same object.

	/// Other pseudo-instructions			/// Other pseudo-instructions

	// Pessimistically assume the stack pointer will be clobbered			// Pessimistically assume the stack pointer will be clobbered
	let Defs = [X2], Uses = [X2] in {			let Defs = [X2], Uses = [X2] in {
	def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),			def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),
	[(CallSeqStart timm:$amt1, timm:$amt2)]>;			[(CallSeqStart timm:$amt1, timm:$amt2)]>;
	def ADJCALLSTACKUP : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),			def ADJCALLSTACKUP : Pseudo<(outs), (ins i32imm:$amt1, i32imm:$amt2),
	[(CallSeqEnd timm:$amt1, timm:$amt2)]>;			[(CallSeqEnd timm:$amt1, timm:$amt2)]>;
	[11 lines hidden]

lib/Target/RISCV/RISCVInstrInfoA.td

	[69 lines hidden]
	defm AMOXOR_D : AMO_rr_aq_rl<0b00100, 0b011, "amoxor.d">;			defm AMOXOR_D : AMO_rr_aq_rl<0b00100, 0b011, "amoxor.d">;
	defm AMOAND_D : AMO_rr_aq_rl<0b01100, 0b011, "amoand.d">;			defm AMOAND_D : AMO_rr_aq_rl<0b01100, 0b011, "amoand.d">;
	defm AMOOR_D : AMO_rr_aq_rl<0b01000, 0b011, "amoor.d">;			defm AMOOR_D : AMO_rr_aq_rl<0b01000, 0b011, "amoor.d">;
	defm AMOMIN_D : AMO_rr_aq_rl<0b10000, 0b011, "amomin.d">;			defm AMOMIN_D : AMO_rr_aq_rl<0b10000, 0b011, "amomin.d">;
	defm AMOMAX_D : AMO_rr_aq_rl<0b10100, 0b011, "amomax.d">;			defm AMOMAX_D : AMO_rr_aq_rl<0b10100, 0b011, "amomax.d">;
	defm AMOMINU_D : AMO_rr_aq_rl<0b11000, 0b011, "amominu.d">;			defm AMOMINU_D : AMO_rr_aq_rl<0b11000, 0b011, "amominu.d">;
	defm AMOMAXU_D : AMO_rr_aq_rl<0b11100, 0b011, "amomaxu.d">;			defm AMOMAXU_D : AMO_rr_aq_rl<0b11100, 0b011, "amomaxu.d">;
	} // Predicates = [HasStedExtA, IsRV64]			} // Predicates = [HasStedExtA, IsRV64]

				//===----------------------------------------------------------------------===//
				// Pseudo-instructions and codegen patterns
				//===----------------------------------------------------------------------===//

				let Predicates = [HasStdExtA] in {

				/// Atomic loads and stores

				// Fences will be inserted for atomic load/stores according to the logic in
				// RISCVTargetLowering::{emitLeadingFence,emitTrailingFence}.

				defm : LdPat<atomic_load_8, LB>;
				defm : LdPat<atomic_load_16, LH>;
				defm : LdPat<atomic_load_32, LW>;

				defm : StPat<atomic_store_8, SB, GPR>;
				defm : StPat<atomic_store_16, SH, GPR>;
				defm : StPat<atomic_store_32, SW, GPR>;
				} // Predicates = [HasStdExtF]

test/CodeGen/RISCV/atomic-fence.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \			; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
	; RUN: | FileCheck -check-prefix=RV32I %s			; RUN: | FileCheck -check-prefix=RV32I %s
				; RUN: llc -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
				; RUN: | FileCheck -check-prefix=RV32I %s

	define void @fence_acquire() nounwind {			define void @fence_acquire() nounwind {
	; RV32I-LABEL: fence_acquire:			; RV32I-LABEL: fence_acquire:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: fence r, rw			; RV32I-NEXT: fence r, rw
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	fence acquire			fence acquire
	ret void			ret void
	[30 lines hidden]

test/CodeGen/RISCV/atomic-load-store.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \			; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
	; RUN: | FileCheck -check-prefix=RV32I %s			; RUN: | FileCheck -check-prefix=RV32I %s
				; RUN: llc -mtriple=riscv32 -mattr=+a -verify-machineinstrs < %s \
				; RUN: | FileCheck -check-prefix=RV32IA %s

	define i8 @atomic_load_i8_unordered(i8 *%a) nounwind {			define i8 @atomic_load_i8_unordered(i8 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i8_unordered:			; RV32I-LABEL: atomic_load_i8_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_1			; RV32I-NEXT: call __atomic_load_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i8_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lb a0, 0(a0)
				; RV32IA-NEXT: ret
	%1 = load atomic i8, i8* %a unordered, align 1			%1 = load atomic i8, i8* %a unordered, align 1
	ret i8 %1			ret i8 %1
	}			}

	define i8 @atomic_load_i8_monotonic(i8 *%a) nounwind {			define i8 @atomic_load_i8_monotonic(i8 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i8_monotonic:			; RV32I-LABEL: atomic_load_i8_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_1			; RV32I-NEXT: call __atomic_load_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i8_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lb a0, 0(a0)
				; RV32IA-NEXT: ret
	%1 = load atomic i8, i8* %a monotonic, align 1			%1 = load atomic i8, i8* %a monotonic, align 1
	ret i8 %1			ret i8 %1
	}			}

	define i8 @atomic_load_i8_acquire(i8 *%a) nounwind {			define i8 @atomic_load_i8_acquire(i8 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i8_acquire:			; RV32I-LABEL: atomic_load_i8_acquire:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 2			; RV32I-NEXT: addi a1, zero, 2
	; RV32I-NEXT: call __atomic_load_1			; RV32I-NEXT: call __atomic_load_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i8_acquire:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lb a0, 0(a0)
				; RV32IA-NEXT: fence r, rw
				; RV32IA-NEXT: ret
	%1 = load atomic i8, i8* %a acquire, align 1			%1 = load atomic i8, i8* %a acquire, align 1
	ret i8 %1			ret i8 %1
	}			}

	define i8 @atomic_load_i8_seq_cst(i8 *%a) nounwind {			define i8 @atomic_load_i8_seq_cst(i8 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i8_seq_cst:			; RV32I-LABEL: atomic_load_i8_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 5			; RV32I-NEXT: addi a1, zero, 5
	; RV32I-NEXT: call __atomic_load_1			; RV32I-NEXT: call __atomic_load_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i8_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, rw
				; RV32IA-NEXT: lb a0, 0(a0)
				; RV32IA-NEXT: fence r, rw
				; RV32IA-NEXT: ret
	%1 = load atomic i8, i8* %a seq_cst, align 1			%1 = load atomic i8, i8* %a seq_cst, align 1
	ret i8 %1			ret i8 %1
	}			}

	define i16 @atomic_load_i16_unordered(i16 *%a) nounwind {			define i16 @atomic_load_i16_unordered(i16 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i16_unordered:			; RV32I-LABEL: atomic_load_i16_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_2			; RV32I-NEXT: call __atomic_load_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i16_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lh a0, 0(a0)
				; RV32IA-NEXT: ret
	%1 = load atomic i16, i16* %a unordered, align 2			%1 = load atomic i16, i16* %a unordered, align 2
	ret i16 %1			ret i16 %1
	}			}

	define i16 @atomic_load_i16_monotonic(i16 *%a) nounwind {			define i16 @atomic_load_i16_monotonic(i16 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i16_monotonic:			; RV32I-LABEL: atomic_load_i16_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_2			; RV32I-NEXT: call __atomic_load_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i16_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lh a0, 0(a0)
				; RV32IA-NEXT: ret
	%1 = load atomic i16, i16* %a monotonic, align 2			%1 = load atomic i16, i16* %a monotonic, align 2
	ret i16 %1			ret i16 %1
	}			}

	define i16 @atomic_load_i16_acquire(i16 *%a) nounwind {			define i16 @atomic_load_i16_acquire(i16 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i16_acquire:			; RV32I-LABEL: atomic_load_i16_acquire:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 2			; RV32I-NEXT: addi a1, zero, 2
	; RV32I-NEXT: call __atomic_load_2			; RV32I-NEXT: call __atomic_load_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i16_acquire:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lh a0, 0(a0)
				; RV32IA-NEXT: fence r, rw
				; RV32IA-NEXT: ret
	%1 = load atomic i16, i16* %a acquire, align 2			%1 = load atomic i16, i16* %a acquire, align 2
	ret i16 %1			ret i16 %1
	}			}

	define i16 @atomic_load_i16_seq_cst(i16 *%a) nounwind {			define i16 @atomic_load_i16_seq_cst(i16 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i16_seq_cst:			; RV32I-LABEL: atomic_load_i16_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 5			; RV32I-NEXT: addi a1, zero, 5
	; RV32I-NEXT: call __atomic_load_2			; RV32I-NEXT: call __atomic_load_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i16_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, rw
				; RV32IA-NEXT: lh a0, 0(a0)
				; RV32IA-NEXT: fence r, rw
				; RV32IA-NEXT: ret
	%1 = load atomic i16, i16* %a seq_cst, align 2			%1 = load atomic i16, i16* %a seq_cst, align 2
	ret i16 %1			ret i16 %1
	}			}

	define i32 @atomic_load_i32_unordered(i32 *%a) nounwind {			define i32 @atomic_load_i32_unordered(i32 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i32_unordered:			; RV32I-LABEL: atomic_load_i32_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_4			; RV32I-NEXT: call __atomic_load_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i32_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lw a0, 0(a0)
				; RV32IA-NEXT: ret
	%1 = load atomic i32, i32* %a unordered, align 4			%1 = load atomic i32, i32* %a unordered, align 4
	ret i32 %1			ret i32 %1
	}			}

	define i32 @atomic_load_i32_monotonic(i32 *%a) nounwind {			define i32 @atomic_load_i32_monotonic(i32 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i32_monotonic:			; RV32I-LABEL: atomic_load_i32_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_4			; RV32I-NEXT: call __atomic_load_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i32_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lw a0, 0(a0)
				; RV32IA-NEXT: ret
	%1 = load atomic i32, i32* %a monotonic, align 4			%1 = load atomic i32, i32* %a monotonic, align 4
	ret i32 %1			ret i32 %1
	}			}

	define i32 @atomic_load_i32_acquire(i32 *%a) nounwind {			define i32 @atomic_load_i32_acquire(i32 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i32_acquire:			; RV32I-LABEL: atomic_load_i32_acquire:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 2			; RV32I-NEXT: addi a1, zero, 2
	; RV32I-NEXT: call __atomic_load_4			; RV32I-NEXT: call __atomic_load_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i32_acquire:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: lw a0, 0(a0)
				; RV32IA-NEXT: fence r, rw
				; RV32IA-NEXT: ret
	%1 = load atomic i32, i32* %a acquire, align 4			%1 = load atomic i32, i32* %a acquire, align 4
	ret i32 %1			ret i32 %1
	}			}

	define i32 @atomic_load_i32_seq_cst(i32 *%a) nounwind {			define i32 @atomic_load_i32_seq_cst(i32 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i32_seq_cst:			; RV32I-LABEL: atomic_load_i32_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 5			; RV32I-NEXT: addi a1, zero, 5
	; RV32I-NEXT: call __atomic_load_4			; RV32I-NEXT: call __atomic_load_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i32_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, rw
				; RV32IA-NEXT: lw a0, 0(a0)
				; RV32IA-NEXT: fence r, rw
				; RV32IA-NEXT: ret
	%1 = load atomic i32, i32* %a seq_cst, align 4			%1 = load atomic i32, i32* %a seq_cst, align 4
	ret i32 %1			ret i32 %1
	}			}

	define i64 @atomic_load_i64_unordered(i64 *%a) nounwind {			define i64 @atomic_load_i64_unordered(i64 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i64_unordered:			; RV32I-LABEL: atomic_load_i64_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_8			; RV32I-NEXT: call __atomic_load_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i64_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: mv a1, zero
				; RV32IA-NEXT: call __atomic_load_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	%1 = load atomic i64, i64* %a unordered, align 8			%1 = load atomic i64, i64* %a unordered, align 8
	ret i64 %1			ret i64 %1
	}			}

	define i64 @atomic_load_i64_monotonic(i64 *%a) nounwind {			define i64 @atomic_load_i64_monotonic(i64 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i64_monotonic:			; RV32I-LABEL: atomic_load_i64_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a1, zero			; RV32I-NEXT: mv a1, zero
	; RV32I-NEXT: call __atomic_load_8			; RV32I-NEXT: call __atomic_load_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i64_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: mv a1, zero
				; RV32IA-NEXT: call __atomic_load_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	%1 = load atomic i64, i64* %a monotonic, align 8			%1 = load atomic i64, i64* %a monotonic, align 8
	ret i64 %1			ret i64 %1
	}			}

	define i64 @atomic_load_i64_acquire(i64 *%a) nounwind {			define i64 @atomic_load_i64_acquire(i64 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i64_acquire:			; RV32I-LABEL: atomic_load_i64_acquire:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 2			; RV32I-NEXT: addi a1, zero, 2
	; RV32I-NEXT: call __atomic_load_8			; RV32I-NEXT: call __atomic_load_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i64_acquire:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: addi a1, zero, 2
				; RV32IA-NEXT: call __atomic_load_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	%1 = load atomic i64, i64* %a acquire, align 8			%1 = load atomic i64, i64* %a acquire, align 8
	ret i64 %1			ret i64 %1
	}			}

	define i64 @atomic_load_i64_seq_cst(i64 *%a) nounwind {			define i64 @atomic_load_i64_seq_cst(i64 *%a) nounwind {
	; RV32I-LABEL: atomic_load_i64_seq_cst:			; RV32I-LABEL: atomic_load_i64_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a1, zero, 5			; RV32I-NEXT: addi a1, zero, 5
	; RV32I-NEXT: call __atomic_load_8			; RV32I-NEXT: call __atomic_load_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_load_i64_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: addi a1, zero, 5
				; RV32IA-NEXT: call __atomic_load_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	%1 = load atomic i64, i64* %a seq_cst, align 8			%1 = load atomic i64, i64* %a seq_cst, align 8
	ret i64 %1			ret i64 %1
	}			}

	define void @atomic_store_i8_unordered(i8 *%a, i8 %b) nounwind {			define void @atomic_store_i8_unordered(i8 *%a, i8 %b) nounwind {
	; RV32I-LABEL: atomic_store_i8_unordered:			; RV32I-LABEL: atomic_store_i8_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: call __atomic_store_1			; RV32I-NEXT: call __atomic_store_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i8_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sb a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i8 %b, i8* %a unordered, align 1			store atomic i8 %b, i8* %a unordered, align 1
	ret void			ret void
	}			}

	define void @atomic_store_i8_monotonic(i8 *%a, i8 %b) nounwind {			define void @atomic_store_i8_monotonic(i8 *%a, i8 %b) nounwind {
	; RV32I-LABEL: atomic_store_i8_monotonic:			; RV32I-LABEL: atomic_store_i8_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: call __atomic_store_1			; RV32I-NEXT: call __atomic_store_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i8_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sb a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i8 %b, i8* %a monotonic, align 1			store atomic i8 %b, i8* %a monotonic, align 1
	ret void			ret void
	}			}

	define void @atomic_store_i8_release(i8 *%a, i8 %b) nounwind {			define void @atomic_store_i8_release(i8 *%a, i8 %b) nounwind {
	; RV32I-LABEL: atomic_store_i8_release:			; RV32I-LABEL: atomic_store_i8_release:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a2, zero, 3			; RV32I-NEXT: addi a2, zero, 3
	; RV32I-NEXT: call __atomic_store_1			; RV32I-NEXT: call __atomic_store_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i8_release:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, w
				; RV32IA-NEXT: sb a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i8 %b, i8* %a release, align 1			store atomic i8 %b, i8* %a release, align 1
	ret void			ret void
	}			}

	define void @atomic_store_i8_seq_cst(i8 *%a, i8 %b) nounwind {			define void @atomic_store_i8_seq_cst(i8 *%a, i8 %b) nounwind {
	; RV32I-LABEL: atomic_store_i8_seq_cst:			; RV32I-LABEL: atomic_store_i8_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a2, zero, 5			; RV32I-NEXT: addi a2, zero, 5
	; RV32I-NEXT: call __atomic_store_1			; RV32I-NEXT: call __atomic_store_1
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i8_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, w
				; RV32IA-NEXT: sb a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i8 %b, i8* %a seq_cst, align 1			store atomic i8 %b, i8* %a seq_cst, align 1
	ret void			ret void
	}			}

	define void @atomic_store_i16_unordered(i16 *%a, i16 %b) nounwind {			define void @atomic_store_i16_unordered(i16 *%a, i16 %b) nounwind {
	; RV32I-LABEL: atomic_store_i16_unordered:			; RV32I-LABEL: atomic_store_i16_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: call __atomic_store_2			; RV32I-NEXT: call __atomic_store_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i16_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sh a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i16 %b, i16* %a unordered, align 2			store atomic i16 %b, i16* %a unordered, align 2
	ret void			ret void
	}			}

	define void @atomic_store_i16_monotonic(i16 *%a, i16 %b) nounwind {			define void @atomic_store_i16_monotonic(i16 *%a, i16 %b) nounwind {
	; RV32I-LABEL: atomic_store_i16_monotonic:			; RV32I-LABEL: atomic_store_i16_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: call __atomic_store_2			; RV32I-NEXT: call __atomic_store_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i16_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sh a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i16 %b, i16* %a monotonic, align 2			store atomic i16 %b, i16* %a monotonic, align 2
	ret void			ret void
	}			}

	define void @atomic_store_i16_release(i16 *%a, i16 %b) nounwind {			define void @atomic_store_i16_release(i16 *%a, i16 %b) nounwind {
	; RV32I-LABEL: atomic_store_i16_release:			; RV32I-LABEL: atomic_store_i16_release:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a2, zero, 3			; RV32I-NEXT: addi a2, zero, 3
	; RV32I-NEXT: call __atomic_store_2			; RV32I-NEXT: call __atomic_store_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i16_release:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, w
				; RV32IA-NEXT: sh a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i16 %b, i16* %a release, align 2			store atomic i16 %b, i16* %a release, align 2
	ret void			ret void
	}			}

	define void @atomic_store_i16_seq_cst(i16 *%a, i16 %b) nounwind {			define void @atomic_store_i16_seq_cst(i16 *%a, i16 %b) nounwind {
	; RV32I-LABEL: atomic_store_i16_seq_cst:			; RV32I-LABEL: atomic_store_i16_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a2, zero, 5			; RV32I-NEXT: addi a2, zero, 5
	; RV32I-NEXT: call __atomic_store_2			; RV32I-NEXT: call __atomic_store_2
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i16_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, w
				; RV32IA-NEXT: sh a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i16 %b, i16* %a seq_cst, align 2			store atomic i16 %b, i16* %a seq_cst, align 2
	ret void			ret void
	}			}

	define void @atomic_store_i32_unordered(i32 *%a, i32 %b) nounwind {			define void @atomic_store_i32_unordered(i32 *%a, i32 %b) nounwind {
	; RV32I-LABEL: atomic_store_i32_unordered:			; RV32I-LABEL: atomic_store_i32_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: call __atomic_store_4			; RV32I-NEXT: call __atomic_store_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i32_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sw a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i32 %b, i32* %a unordered, align 4			store atomic i32 %b, i32* %a unordered, align 4
	ret void			ret void
	}			}

	define void @atomic_store_i32_monotonic(i32 *%a, i32 %b) nounwind {			define void @atomic_store_i32_monotonic(i32 *%a, i32 %b) nounwind {
	; RV32I-LABEL: atomic_store_i32_monotonic:			; RV32I-LABEL: atomic_store_i32_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a2, zero			; RV32I-NEXT: mv a2, zero
	; RV32I-NEXT: call __atomic_store_4			; RV32I-NEXT: call __atomic_store_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i32_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: sw a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i32 %b, i32* %a monotonic, align 4			store atomic i32 %b, i32* %a monotonic, align 4
	ret void			ret void
	}			}

	define void @atomic_store_i32_release(i32 *%a, i32 %b) nounwind {			define void @atomic_store_i32_release(i32 *%a, i32 %b) nounwind {
	; RV32I-LABEL: atomic_store_i32_release:			; RV32I-LABEL: atomic_store_i32_release:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a2, zero, 3			; RV32I-NEXT: addi a2, zero, 3
	; RV32I-NEXT: call __atomic_store_4			; RV32I-NEXT: call __atomic_store_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i32_release:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, w
				; RV32IA-NEXT: sw a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i32 %b, i32* %a release, align 4			store atomic i32 %b, i32* %a release, align 4
	ret void			ret void
	}			}

	define void @atomic_store_i32_seq_cst(i32 *%a, i32 %b) nounwind {			define void @atomic_store_i32_seq_cst(i32 *%a, i32 %b) nounwind {
	; RV32I-LABEL: atomic_store_i32_seq_cst:			; RV32I-LABEL: atomic_store_i32_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a2, zero, 5			; RV32I-NEXT: addi a2, zero, 5
	; RV32I-NEXT: call __atomic_store_4			; RV32I-NEXT: call __atomic_store_4
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i32_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: fence rw, w
				; RV32IA-NEXT: sw a0, 0(a1)
				; RV32IA-NEXT: ret
	store atomic i32 %b, i32* %a seq_cst, align 4			store atomic i32 %b, i32* %a seq_cst, align 4
	ret void			ret void
	}			}

	define void @atomic_store_i64_unordered(i64 *%a, i64 %b) nounwind {			define void @atomic_store_i64_unordered(i64 *%a, i64 %b) nounwind {
	; RV32I-LABEL: atomic_store_i64_unordered:			; RV32I-LABEL: atomic_store_i64_unordered:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a3, zero			; RV32I-NEXT: mv a3, zero
	; RV32I-NEXT: call __atomic_store_8			; RV32I-NEXT: call __atomic_store_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i64_unordered:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: mv a3, zero
				; RV32IA-NEXT: call __atomic_store_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	store atomic i64 %b, i64* %a unordered, align 8			store atomic i64 %b, i64* %a unordered, align 8
	ret void			ret void
	}			}

	define void @atomic_store_i64_monotonic(i64 *%a, i64 %b) nounwind {			define void @atomic_store_i64_monotonic(i64 *%a, i64 %b) nounwind {
	; RV32I-LABEL: atomic_store_i64_monotonic:			; RV32I-LABEL: atomic_store_i64_monotonic:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: mv a3, zero			; RV32I-NEXT: mv a3, zero
	; RV32I-NEXT: call __atomic_store_8			; RV32I-NEXT: call __atomic_store_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i64_monotonic:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: mv a3, zero
				; RV32IA-NEXT: call __atomic_store_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	store atomic i64 %b, i64* %a monotonic, align 8			store atomic i64 %b, i64* %a monotonic, align 8
	ret void			ret void
	}			}

	define void @atomic_store_i64_release(i64 *%a, i64 %b) nounwind {			define void @atomic_store_i64_release(i64 *%a, i64 %b) nounwind {
	; RV32I-LABEL: atomic_store_i64_release:			; RV32I-LABEL: atomic_store_i64_release:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a3, zero, 3			; RV32I-NEXT: addi a3, zero, 3
	; RV32I-NEXT: call __atomic_store_8			; RV32I-NEXT: call __atomic_store_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i64_release:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: addi a3, zero, 3
				; RV32IA-NEXT: call __atomic_store_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	store atomic i64 %b, i64* %a release, align 8			store atomic i64 %b, i64* %a release, align 8
	ret void			ret void
	}			}

	define void @atomic_store_i64_seq_cst(i64 *%a, i64 %b) nounwind {			define void @atomic_store_i64_seq_cst(i64 *%a, i64 %b) nounwind {
	; RV32I-LABEL: atomic_store_i64_seq_cst:			; RV32I-LABEL: atomic_store_i64_seq_cst:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi sp, sp, -16			; RV32I-NEXT: addi sp, sp, -16
	; RV32I-NEXT: sw ra, 12(sp)			; RV32I-NEXT: sw ra, 12(sp)
	; RV32I-NEXT: addi a3, zero, 5			; RV32I-NEXT: addi a3, zero, 5
	; RV32I-NEXT: call __atomic_store_8			; RV32I-NEXT: call __atomic_store_8
	; RV32I-NEXT: lw ra, 12(sp)			; RV32I-NEXT: lw ra, 12(sp)
	; RV32I-NEXT: addi sp, sp, 16			; RV32I-NEXT: addi sp, sp, 16
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
				;
				; RV32IA-LABEL: atomic_store_i64_seq_cst:
				; RV32IA: # %bb.0:
				; RV32IA-NEXT: addi sp, sp, -16
				; RV32IA-NEXT: sw ra, 12(sp)
				; RV32IA-NEXT: addi a3, zero, 5
				; RV32IA-NEXT: call __atomic_store_8
				; RV32IA-NEXT: lw ra, 12(sp)
				; RV32IA-NEXT: addi sp, sp, 16
				; RV32IA-NEXT: ret
	store atomic i64 %b, i64* %a seq_cst, align 8			store atomic i64 %b, i64* %a seq_cst, align 8
	ret void			ret void
	}			}
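
The RV32IA check lines above follow a regular pattern. Loads and stores of 8, 16 and 32 bits become a plain `l{b|h|w}` or `s{b|h|w}` with fences chosen by the ordering, while 64-bit accesses still call `__atomic_load_8` / `__atomic_store_8`. A minimal sketch of that pattern, read off the expected output in this test only: the names `AtomicOp`, `Ordering`, `Fences`, `fencesFor` and `usesLibcall` are hypothetical and are not the functions added to RISCVISelLowering.cpp.

```cpp
#include <string>

// Hypothetical types for illustration; not LLVM's own enums.
enum class AtomicOp { Load, Store };
enum class Ordering { Unordered, Monotonic, Acquire, Release, SeqCst };

struct Fences {
  std::string leading;   // fence printed before the access; empty if none
  std::string trailing;  // fence printed after the access; empty if none
};

// Fence placement as it appears in the RV32IA check lines of this test.
Fences fencesFor(AtomicOp op, Ordering ord) {
  Fences f;
  if (op == AtomicOp::Load) {
    // seq_cst loads get a full fence before the load.
    if (ord == Ordering::SeqCst)
      f.leading = "fence rw, rw";
    // acquire and seq_cst loads get "fence r, rw" after the load.
    if (ord == Ordering::Acquire || ord == Ordering::SeqCst)
      f.trailing = "fence r, rw";
  } else {
    // release and seq_cst stores get "fence rw, w" before the store.
    if (ord == Ordering::Release || ord == Ordering::SeqCst)
      f.leading = "fence rw, w";
  }
  // unordered and monotonic accesses get no fence at all.
  return f;
}

// In this test, accesses wider than the 32-bit native word remain libcalls
// on RV32IA.
bool usesLibcall(unsigned bits) { return bits > 32; }
```

For example, `fencesFor(AtomicOp::Load, Ordering::SeqCst)` yields the `fence rw, rw` / `fence r, rw` pair seen around `lw a0, 0(a0)` in `atomic_load_i32_seq_cst`, and `usesLibcall(64)` is true, matching the i64 cases.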