
Details

Reviewers

dschuff
jfb
sunfish
tlively

Commits

rL361884: [WebAssembly] Support for atomic fences
rG551465859113: [WebAssembly] Support for atomic fences

Summary

This adds support for translating the LLVM IR fence instruction. We
convert a singlethread fence to a pseudo compiler barrier, which becomes
zero instructions in the final binary, and a thread fence to an idempotent
atomicrmw instruction on a memory address.
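
For orientation, the two kinds of fence mentioned in the summary can be produced from C++ roughly as follows. This is a minimal sketch and not part of the patch. It assumes a Clang-style frontend, where `std::atomic_signal_fence` typically becomes `fence syncscope("singlethread")` and `std::atomic_thread_fence` becomes a plain `fence`. The function names `signal_fence`, `thread_fence`, and `fence_demo` are hypothetical.

```cpp
#include <atomic>

// Hypothetical helper: a single-thread (signal) fence. It only constrains
// compiler reordering, so a backend may emit no instruction for it.
inline void signal_fence() {
  std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Hypothetical helper: a cross-thread fence. This is the kind the patch
// lowers to an idempotent atomic RMW on the Emscripten triple.
inline void thread_fence() {
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Fences do not change data values; they only order memory operations.
int fence_demo() {
  int value = 1;
  signal_fence();
  value += 1;
  thread_fence();
  return value;
}
```

Neither fence alters program data, so `fence_demo` simply returns the value it computed; the difference between the two is only in what the backend must emit.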

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aheejin created this revision. Aug 3 2018, 3:48 PM

Herald added subscribers: llvm-commits, sunfish, jgravelle-google, sbc100. Aug 3 2018, 3:48 PM

Harbormaster completed remote builds in B21069: Diff 159114. Aug 3 2018, 3:48 PM

aheejin edited the summary of this revision. Aug 3 2018, 3:48 PM

comment fix

What do you expect for relaxed, as well as for signal fences?
Could you also check in with Conrad, since he's been working on the WebAssembly memory model?

aheejin added a reviewer: sunfish. Aug 3 2018, 4:36 PM

What do you expect for relaxed, as well as for signal fences?

Here I actually translated both a signal fence and a thread fence to the same thing. I think this will be conservatively correct, but maybe we should treat them differently. What if we translate a signal fence to nothing, but still prevent reordering of instructions across it? We might do the same thing for asm volatile("" ::: "memory") too. That could be a pseudo instruction that prevents reordering across it in the backend passes and eventually becomes nothing. What do you think?

Oh, and for weaker fences, I added tests. They all translate to the same sequentially consistent `atomicrmw or` instruction.
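
The idea of using an idempotent read-modify-write as a stand-in for a fence can be illustrated at the C++ level. The sketch below is not the backend lowering itself. It only relies on the fact that `std::atomic<T>::fetch_or` with an operand of 0 leaves the stored value unchanged while still being a sequentially consistent RMW. The names `fence_word` and `rmw_fence` are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical scratch location. The patch instead uses the address held in
// the __stack_pointer wasm global.
static std::atomic<std::uint32_t> fence_word{0x1234u};

// OR-ing with 0 never changes the stored value, so this RMW is idempotent.
// It returns the value it observed, which callers can ignore.
inline std::uint32_t rmw_fence() {
  return fence_word.fetch_or(0u, std::memory_order_seq_cst);
}
```

Because the operation never modifies memory, it can in principle target any valid address, which is why the choice of location is a matter of convenience rather than correctness.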

Could you also check in with Conrad, since he's been working on the WebAssembly memory model?

I don't think Conrad is in the LLVM Phabricator here. Should we move the discussion to a github issue?

jbhateja added a subscriber: jbhateja. Aug 4 2018, 2:51 AM

aheejin planned changes to this revision. Aug 14 2018, 5:08 PM

Add support for singlethread fences
Fixed handling of MachineMemOperand according to changes in rL339740

Harbormaster completed remote builds in B21537: Diff 160909. Aug 15 2018, 1:31 PM

aheejin edited the summary of this revision. Aug 15 2018, 1:31 PM

Don't emit the pseudo compiler_fence instruction in .s file

Harbormaster completed remote builds in B21548: Diff 160941. Aug 15 2018, 4:33 PM

Use a target external symbol with MO_SYMBOL_GLOBAL

Harbormaster completed remote builds in B21642: Diff 161359. Aug 17 2018, 4:24 PM

We discussed how to implement fences here, and I think the general consensus was that even though it may not be strictly necessary to emit atomicrmw in some cases, it can be user-friendly and might also help support legacy builtins like __sync_synchronize(). And IMHO it is a single instruction anyway and fences are generally expected to be expensive, so no significant harm will be done. Does anyone have more concerns about this? If not, can we land this?
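
As a rough illustration of the kind of legacy code that relies on `__sync_synchronize()`, the following sketch publishes a plain value behind a flag. It assumes a GCC- or Clang-compatible compiler that provides the `__sync_synchronize` builtin. The names `payload`, `ready`, `publish`, and `try_consume` are hypothetical and do not come from the patch.

```cpp
#include <atomic>

static int payload = 0;                  // plain, non-atomic data
static std::atomic<bool> ready{false};   // flag guarding the payload

// Writer side: store the data, issue a full barrier, then raise the flag.
void publish(int v) {
  payload = v;
  __sync_synchronize();  // full memory barrier (GCC/Clang builtin)
  ready.store(true, std::memory_order_relaxed);
}

// Reader side: if the flag is up, issue a full barrier, then read the data.
bool try_consume(int &out) {
  if (!ready.load(std::memory_order_relaxed))
    return false;
  __sync_synchronize();
  out = payload;
  return true;
}
```

Code written in this style expects the barrier to order the surrounding non-atomic accesses, which is the user-facing expectation the comment above refers to.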

In D50277#1210237, @aheejin wrote:

We discussed how to implement fences here, and I think the general consensus was that even though it may not be strictly necessary to emit atomicrmw in some cases, it can be user-friendly and might also help support legacy builtins like __sync_synchronize(). And IMHO it is a single instruction anyway and fences are generally expected to be expensive, so no significant harm will be done. Does anyone have more concerns about this? If not, can we land this?

I have concerns which I expressed here: https://github.com/WebAssembly/tool-conventions/issues/59#issuecomment-413620669

aheejin mentioned this in D56520: [WebAssembly] Mask SIMD shift values. Jan 11 2019, 3:16 PM

rebase

Harbormaster completed remote builds in B27490: Diff 184242. Jan 29 2019, 7:40 PM

Fix bug: 0 should be i32.const 0 and not an immediate

Herald added a project: Restricted Project. Feb 10 2019, 7:43 PM

Harbormaster completed remote builds in B28000: Diff 186176. Feb 10 2019, 7:45 PM

Use ATOMIC_NRI for the fence + test fix

Herald added a subscriber: jdoerfert. Feb 17 2019, 5:45 PM

aheejin added a parent revision: D58338: [WebAssembly] Refactor atomic operation definitions (NFC). Feb 17 2019, 5:45 PM

Harbormaster completed remote builds in B28239: Diff 187186. Feb 17 2019, 5:46 PM

aheejin mentioned this in D58338: [WebAssembly] Refactor atomic operation definitions (NFC). Feb 18 2019, 4:18 PM

aheejin added a subscriber: steven-johnson. Feb 26 2019, 11:28 AM

What is the status of the discussion here? We've gotten to the point where I have to manually add this patch to my local checkout in order to debug unrelated issues and it looks like the conversation fizzled out a while ago, so it would be great if we could land something.

I've now made an updated post on https://github.com/WebAssembly/tool-conventions/issues/59.

Given the discussion on the github issues, we have scheduled a discussion at the in-person CG meeting that should resolve the outstanding memory model issues, possibly with a change to the spec.

However, the lack of fence lowering is now AFAIK the only thing that holds us back from declaring the wasm backend as the primary/only supported codegen path and starting the process of removing fastcomp.
Given that, I'm going to propose that we apply this patch as-is, to match the current behavior of the existing emscripten pipeline, with the understanding that we can consider the issue still open and change LLVM's behavior based on the outcome of the discussions (and we can in any case ensure that we don't declare any stable ABIs or allow this change to get branched). We can even put it behind the emscripten triple if people feel strongly about that.
Any objections?

Herald added a subscriber: dexonsmith. May 24 2019, 4:40 PM

I agree; this matches the current behavior, so it does not change anything from the users' standpoint, and if we ever decide to do something else in the CG meeting, we can always land another patch later.

If I follow the discussion correctly, there are subtle ABI issues either way, so making fence an RMW now doesn't give us a fully future-proof ABI anyway. Consequently, I would prefer to make this patch conditional on the Emscripten triple, as wasm32-wasi and wasm32-unknown-unknown don't require Emscripten compatibility.

Yeah I agree with your assessment that this doesn't give us a future-proof ABI; that's why I want to revisit it later. Mostly this is to unblock removing fastcomp.

OK I'll predicate this under emscripten triple. But anyway someone please approve this!

LGTM ship it! (behind emscripten flag)

This revision is now accepted and ready to land. May 24 2019, 5:52 PM

Predicate behind wasm32-unknown-emscripten flag.
Add tests for non-emscripten flags

test cosmetic change

Harbormaster completed remote builds in B32574: Diff 201723. May 28 2019, 11:11 AM

Harbormaster completed remote builds in B32576: Diff 201725.

tlively added inline comments. May 28 2019, 11:19 AM

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
113 ↗	(On Diff #201725)	This looks over-indented. Am I missing something?

LGTM modulo the indentation thing.

Fix a typo and indentations

Harbormaster completed remote builds in B32584: Diff 201762. May 28 2019, 2:08 PM

aheejin added inline comments. May 28 2019, 2:44 PM

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
113 ↗	(On Diff #201725)	Thanks! How did it happen...??

sunfish added inline comments. May 28 2019, 2:47 PM

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗	(On Diff #201762)	My understanding of https://github.com/WebAssembly/tool-conventions/issues/59 is that we can lower fence to zero instructions here, rather than aborting.

Closed by commit rG551465859113: [WebAssembly] Support for atomic fences (authored by aheejin). May 28 2019, 3:07 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: hiraditya. May 28 2019, 3:07 PM

aheejin marked an inline comment as done. May 28 2019, 3:29 PM

aheejin added inline comments.

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗	(On Diff #201762)	Sorry I didn't see this before committing. This is just to ensure the current crashing behavior for non-emscripten OSes. This does not change anything for them. Didn't we decide that we should commit this CL only for emscripten triple and wait for other OSes until we make a final decision on fences?

sunfish added inline comments. May 28 2019, 4:41 PM

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗	(On Diff #201762)	The behavior before this patch is to emit zero instructions, and not crash. So this patch does change the behavior. I'm asking to restore this previous behavior for non-Emscripten OS's.

aheejin marked an inline comment as done. May 28 2019, 4:57 PM

aheejin added inline comments.

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗	(On Diff #201762)	Really? Without this patch, it crashes for me. Of course, only when we have `-mattr=+atomics`; if we don't have atomics enabled all atomic instructions become normal instructions and fences become nothing for all triples. This behavior is still the same. This is when I enabled `-mattr=+atomics` with wasi triple, before this patch landed:
$ llc atomic-fence.ll -mtriple=wasm32-unknown-wasi -mattr=+atomics,+sign-ext
LLVM ERROR: Cannot select: t3: ch = AtomicFence t0, Constant:i32<7>, Constant:i32<1>
t1: i32 = Constant<7>
t2: i32 = Constant<1>
In function: multithread_fence

sunfish added inline comments. May 28 2019, 6:17 PM

lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗	(On Diff #201762)	Ah, you're right, I did a quick test without -mattr=+atomics, which I forget is now the thing which determines whether operations like fence are lowered by LowerAtomic.
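
The behavior the two reviewers converge on in this thread can be summarized as a small decision function. This is only a model of the outcomes described above and in the diff below, not LLVM code. The enum and function names are hypothetical, and the `has_atomics == false` case stands for the situation where fences are already removed before instruction selection.

```cpp
// Hypothetical model of the fence lowering outcomes discussed in this review.
enum class Scope { SingleThread, System };
enum class FenceLowering { Nothing, CompilerFence, AtomicRmwOr, FatalError };

FenceLowering choose_fence_lowering(bool has_atomics, Scope scope,
                                    bool is_emscripten) {
  // Without -mattr=+atomics, fences become nothing for all triples.
  if (!has_atomics)
    return FenceLowering::Nothing;
  // A singlethread fence becomes a pseudo barrier that is never emitted.
  if (scope == Scope::SingleThread)
    return FenceLowering::CompilerFence;
  // A cross-thread fence is only supported on the Emscripten triple for now.
  return is_emscripten ? FenceLowering::AtomicRmwOr
                       : FenceLowering::FatalError;
}
```

Under this model, the crash reported for the wasi triple corresponds to the `FatalError` outcome, and the quick test without `-mattr=+atomics` corresponds to `Nothing`.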

Diff 201771

llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp

@@ void WebAssemblyAsmPrinter::EmitInstruction(const MachineInstr *MI) {
   case WebAssembly::FALLTHROUGH_RETURN_VOID_S:
     // This instruction represents the implicit return at the end of a
     // function body with no return value.
     if (isVerbose()) {
       OutStreamer->AddComment("fallthrough-return-void");
       OutStreamer->AddBlankLine();
     }
     break;
+  case WebAssembly::COMPILER_FENCE:
+    // This is a compiler barrier that prevents instruction reordering during
+    // backend compilation, and should not be emitted.
+    break;
   case WebAssembly::EXTRACT_EXCEPTION_I32:
   case WebAssembly::EXTRACT_EXCEPTION_I32_S:
     // These are pseudo instructions that simulate popping values from stack.
     // We print these only when we have -wasm-keep-registers on for assembly
     // readability.
     if (!WasmKeepRegisters)
       break;
     LLVM_FALLTHROUGH;

llvm/lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp

 void WebAssemblyDAGToDAGISel::Select(SDNode *Node) {
   // If we have a custom node, we already have selected!
   if (Node->isMachineOpcode()) {
     LLVM_DEBUG(errs() << "== "; Node->dump(CurDAG); errs() << "\n");
     Node->setNodeId(-1);
     return;
   }
 
-  // Few custom selection stuff. If we need WebAssembly-specific selection,
-  // uncomment this block add corresponding case statements.
-  /*
+  // Few custom selection stuff.
+  SDLoc DL(Node);
+  MachineFunction &MF = CurDAG->getMachineFunction();
   switch (Node->getOpcode()) {
+  case ISD::ATOMIC_FENCE: {
+    if (!MF.getSubtarget<WebAssemblySubtarget>().hasAtomics())
+      break;
+
+    uint64_t SyncScopeID =
+        cast<ConstantSDNode>(Node->getOperand(2).getNode())->getZExtValue();
+    switch (SyncScopeID) {
+    case SyncScope::SingleThread: {
+      // We lower a single-thread fence to a pseudo compiler barrier instruction
+      // preventing instruction reordering. This will not be emitted in final
+      // binary.
+      MachineSDNode *Fence =
+          CurDAG->getMachineNode(WebAssembly::COMPILER_FENCE,
+                                 DL,                 // debug loc
+                                 MVT::Other,         // outchain type
+                                 Node->getOperand(0) // inchain
+          );
+      ReplaceNode(Node, Fence);
+      CurDAG->RemoveDeadNode(Node);
+      return;
+    }
+
+    case SyncScope::System: {
+      // For non-emscripten systems, we have not decided on what we should
+      // translate fences to yet.
+      if (!Subtarget->getTargetTriple().isOSEmscripten())
+        report_fatal_error(
+            "ATOMIC_FENCE is not yet supported in non-emscripten OSes");
+
+      // Wasm does not have a fence instruction, but because all atomic
+      // instructions in wasm are sequentially consistent, we translate a
+      // fence to an idempotent atomic RMW instruction to a linear memory
+      // address. All atomic instructions in wasm are sequentially consistent,
+      // but this is to ensure a fence also prevents reordering of non-atomic
+      // instructions in the VM. Even though LLVM IR's fence instruction does
+      // not say anything about its relationship with non-atomic instructions,
+      // we think this is more user-friendly.
+      //
+      // While any address can work, here we use a value stored in
+      // __stack_pointer wasm global because there's high chance that area is
+      // in cache.
+      //
+      // So the selected instructions will be in the form of:
+      //   %addr = get_global $__stack_pointer
+      //   %0 = i32.const 0
+      //   i32.atomic.rmw.or %addr, %0
+      SDValue StackPtrSym = CurDAG->getTargetExternalSymbol(
+          "__stack_pointer", TLI->getPointerTy(CurDAG->getDataLayout()));
+      MachineSDNode *GetGlobal =
+          CurDAG->getMachineNode(WebAssembly::GLOBAL_GET_I32, // opcode
+                                 DL,                          // debug loc
+                                 MVT::i32,                    // result type
+                                 StackPtrSym // __stack_pointer symbol
+          );
+
+      SDValue Zero = CurDAG->getTargetConstant(0, DL, MVT::i32);
+      auto *MMO = MF.getMachineMemOperand(
+          MachinePointerInfo::getUnknownStack(MF),
+          // FIXME Volatile isn't really correct, but currently all LLVM
+          // atomic instructions are treated as volatiles in the backend, so
+          // we should be consistent.
+          MachineMemOperand::MOVolatile | MachineMemOperand::MOLoad |
+              MachineMemOperand::MOStore,
+          4, 4, AAMDNodes(), nullptr, SyncScope::System,
+          AtomicOrdering::SequentiallyConsistent);
+      MachineSDNode *Const0 =
+          CurDAG->getMachineNode(WebAssembly::CONST_I32, DL, MVT::i32, Zero);
+      MachineSDNode *AtomicRMW = CurDAG->getMachineNode(
+          WebAssembly::ATOMIC_RMW_OR_I32, // opcode
+          DL,                             // debug loc
+          MVT::i32,                       // result type
+          MVT::Other,                     // outchain type
+          {
+              Zero,                  // alignment
+              Zero,                  // offset
+              SDValue(GetGlobal, 0), // __stack_pointer
+              SDValue(Const0, 0),    // OR with 0 to make it idempotent
+              Node->getOperand(0)    // inchain
+          });
+
+      CurDAG->setNodeMemRefs(AtomicRMW, {MMO});
+      ReplaceUses(SDValue(Node, 0), SDValue(AtomicRMW, 1));
+      CurDAG->RemoveDeadNode(Node);
+      return;
+    }
+    default:
+      llvm_unreachable("Unknown scope!");
+    }
+  }
+
   default:
     break;
   }
-  */
 
   // Select the default instruction.
   SelectCode(Node);
 }
 
 bool WebAssemblyDAGToDAGISel::SelectInlineAsmMemoryOperand(
     const SDValue &Op, unsigned ConstraintID, std::vector<SDValue> &OutOps) {
   switch (ConstraintID) {

llvm/lib/Target/WebAssembly/WebAssemblyInstrAtomics.td

 }
 
 let Predicates = [HasAtomics] in
 defm : TerRMWTruncExtPattern<
   atomic_cmp_swap_8, atomic_cmp_swap_16, atomic_cmp_swap_32, atomic_cmp_swap_64,
   ATOMIC_RMW8_U_CMPXCHG_I32, ATOMIC_RMW16_U_CMPXCHG_I32,
   ATOMIC_RMW8_U_CMPXCHG_I64, ATOMIC_RMW16_U_CMPXCHG_I64,
   ATOMIC_RMW32_U_CMPXCHG_I64>;
+
+//===----------------------------------------------------------------------===//
+// Atomic fences
+//===----------------------------------------------------------------------===//
+
+// A compiler fence instruction that prevents reordering of instructions.
+let Defs = [ARGUMENTS] in {
+let isPseudo = 1, hasSideEffects = 1 in
+defm COMPILER_FENCE : ATOMIC_NRI<(outs), (ins), [], "compiler_fence">;
+} // Defs = [ARGUMENTS]

llvm/test/CodeGen/WebAssembly/atomic-fence.ll

This file was added.

				; RUN: llc < %s | FileCheck %s --check-prefix NOATOMIC
				; RUN: not llc < %s -mtriple=wasm32-unknown-unknown -mattr=+atomics,+sign-ext 2>&1 | FileCheck %s --check-prefixes NOEMSCRIPTEN
				; RUN: not llc < %s -mtriple=wasm32-unknown-wasi -mattr=+atomics,+sign-ext 2>&1 | FileCheck %s --check-prefixes NOEMSCRIPTEN
				; RUN: llc < %s -mtriple=wasm32-unknown-emscripten -asm-verbose=false -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+atomics,+sign-ext | FileCheck %s

				target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
				target triple = "wasm32-unknown-unknown"

				; NOEMSCRIPTEN: LLVM ERROR: ATOMIC_FENCE is not yet supported in non-emscripten OSes

				; A multithread fence turns into 'global.get $__stack_pointer' followed by an
				; idempotent atomicrmw instruction.
				; CHECK-LABEL: multithread_fence:
				; CHECK: global.get $push[[SP:[0-9]+]]=, __stack_pointer
				; CHECK-NEXT: i32.const $push[[ZERO:[0-9]+]]=, 0
				; CHECK-NEXT: i32.atomic.rmw.or $drop=, 0($pop[[SP]]), $pop[[ZERO]]
				; NOATOMIC-NOT: i32.atomic.rmw.or
				define void @multithread_fence() {
				fence seq_cst
				ret void
				}

				; Fences with weaker memory orderings than seq_cst should be treated the same
				; because atomic memory access in wasm are sequentially consistent.
				; CHECK-LABEL: multithread_weak_fence:
				; CHECK: global.get $push{{.+}}=, __stack_pointer
				; CHECK: i32.atomic.rmw.or
				; CHECK: i32.atomic.rmw.or
				; CHECK: i32.atomic.rmw.or
				define void @multithread_weak_fence() {
				fence acquire
				fence release
				fence acq_rel
				ret void
				}

				; A singlethread fence becomes compiler_fence instruction, a pseudo instruction
				; that acts as a compiler barrier. The barrier should not be emitted to .s file.
				; CHECK-LABEL: singlethread_fence:
				; CHECK-NOT: compiler_fence
				define void @singlethread_fence() {
				fence syncscope("singlethread") seq_cst
				fence syncscope("singlethread") acquire
				fence syncscope("singlethread") release
				fence syncscope("singlethread") acq_rel
				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Support for atomic fences
ClosedPublic

