Page MenuHomePhabricator

[WebAssembly] Support for atomic fences
ClosedPublic

Authored by aheejin on Aug 3 2018, 3:48 PM.

Details

Summary

This adds support for translation of LLVM IR fence instruction. We
convert a singlethread fence to a pseudo compiler barrier which becomes
0 instructions in final binary, and a thread fence to an idempotent
atomicrmw instruction to a memory address.

Diff Detail

Event Timeline

aheejin created this revision.Aug 3 2018, 3:48 PM
aheejin edited the summary of this revision. (Show Details)Aug 3 2018, 3:48 PM
aheejin updated this revision to Diff 159115.Aug 3 2018, 3:52 PM

comment fix

jfb added a comment.Aug 3 2018, 4:01 PM

What do you expect for relaxed, as well as for signal fences?
Could you also check in with Conrad, since he's been working on the WebAssembly memory model?

aheejin added a comment.EditedAug 3 2018, 4:38 PM

What do you expect for relaxed, as well as for signal fences?

Here I actually translated both a signal fence and a thread fence to the same thing. I think this will be conservatively correct, but maybe we should treat them differently. What if we translate a signal fence to a nothing but prevent reordering of instructions across it, and we might do the same thing for asm volatile("" ::: "memory") too. That can be something like making a pseudo instruction that eventually becomes nothing and prevents reordering across the pseudo instruction in the backend passes. What do you think?

Oh and for weaker fences, I added tests for them. They all translate to the same sequentially consistent atomicrmw or instruction.

Could you also check in with Conrad, since he's been working on the WebAssembly memory model?

I don't think Conrad is in the LLVM Phabricator here. Should we move the discussion to a github issue?

aheejin planned changes to this revision.Aug 14 2018, 5:08 PM
aheejin updated this revision to Diff 160909.Aug 15 2018, 1:31 PM
  • Add support for singlethread fences
  • Fixed handling of MachineMemOperand according to changes in rL339740
aheejin edited the summary of this revision. (Show Details)Aug 15 2018, 1:31 PM
aheejin updated this revision to Diff 160941.Aug 15 2018, 4:33 PM

Don't emit the pseudo compiler_fence instruction in .s file

aheejin updated this revision to Diff 161359.Aug 17 2018, 4:24 PM

Use a target external symbol with MO_SYMBOL_GLOBAL

We discussed how to implement fences here, and I think the general consensus was even though it may not be strictly necessary to emit atomicrmw for some cases, it can be user friendly and also might help support legacy builtins like __sync_synchronize(). And IMHO it is a single instruction anyway and fences are generally expected to be expensive, so no significant harm will be done. Does anyone have more concerns on this? If not, can we land this?

We discussed how to implement fences here, and I think the general consensus was even though it may not be strictly necessary to emit atomicrmw for some cases, it can be user friendly and also might help support legacy builtins like __sync_synchronize(). And IMHO it is a single instruction anyway and fences are generally expected to be expensive, so no significant harm will be done. Does anyone have more concerns on this? If not, can we land this?

I have concerns which I expressed here: https://github.com/WebAssembly/tool-conventions/issues/59#issuecomment-413620669

aheejin updated this revision to Diff 186176.Feb 10 2019, 7:43 PM
  • Fix bug: 0 should be i32.const 0 and not an immediate
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2019, 7:43 PM
aheejin updated this revision to Diff 187186.Feb 17 2019, 5:45 PM
  • Use ATOMIC_NRI for the fence + test fix

What is the status of the discussion here? We've gotten to the point where I have to manually add this patch to my local checkout in order to debug unrelated issues and it looks like the conversation fizzled out a while ago, so it would be great if we could land something.

Given the discussion on the github issues, we have scheduled a discussion at the in-person CG meeting that should resolve the outstanding memory model issues, possibly with a change to the spec.

However, the lack of fence lowering is now AFAIK the only thing that holds us back from declaring the wasm backend as the primary/only supported codegen path and staring the process of removing fastcomp.
Given that, I'm going to propose that we apply this patch as-is, to match the current behavior of the existing emscripten pipeline, with the understanding that we can consider the issue still open and change LLVM's behavior based on the outcome of the discussions (and we can in any case ensure that we don't declare any stable ABIs or allow this change to get branched). We can even put it behind the emscripten triple if people feel strongly about that.
Any objections?

I agree; this matches the current behavior, so it does not change anything from users' standpoint, and we ever decide do something else in the CG meeting, we can always land another patch later.

If I follow the discussion correctly, there are subtle ABI issues either way, so making fence a RMW now doesn't give us fully a future-proof ABI anyway. Consequently, I would prefer to make this patch conditional on the Emscripten triple, as wasm32-wasi and wasm32-unknown-unknown don't require Emscripten compatibility.

Yeah I agree with your assessment that this doesn't give us a future-proof ABI; that's why I want to revisit it later. Mostly this is to unblock removing fastcomp.

aheejin added a comment.EditedMay 24 2019, 5:51 PM

OK I'll predicate this under emscripten triple. But anyway someone please approve this!

tlively accepted this revision.May 24 2019, 5:52 PM

LGTM ship it! (behind emscripten flag)

This revision is now accepted and ready to land.May 24 2019, 5:52 PM
aheejin updated this revision to Diff 201723.May 28 2019, 11:09 AM
  • Predicate behind wasm32-unknown-emscripten flag.
  • Add tests for non-emscripten flags
aheejin updated this revision to Diff 201725.May 28 2019, 11:11 AM
  • test cosmetic change
Harbormaster completed remote builds in B32576: Diff 201725.
tlively added inline comments.May 28 2019, 11:19 AM
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
113 ↗(On Diff #201725)

This looks over-indented. Am I missing something?

dschuff accepted this revision.May 28 2019, 2:04 PM

LGTM modulo the indentation thing.

aheejin updated this revision to Diff 201762.May 28 2019, 2:06 PM
aheejin marked an inline comment as done.
  • Fix a typo and indentations
aheejin added inline comments.May 28 2019, 2:44 PM
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
113 ↗(On Diff #201725)

Thanks! How did it happen...??

sunfish added inline comments.May 28 2019, 2:47 PM
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗(On Diff #201762)

My understanding of https://github.com/WebAssembly/tool-conventions/issues/59 is that we can lower fence to zero instructions here, rather than aborting.

This revision was automatically updated to reflect the committed changes.
aheejin marked an inline comment as done.May 28 2019, 3:29 PM
aheejin added inline comments.
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗(On Diff #201762)

Sorry I didn't see this before committing. This is just to ensure the current crashing behavior for non-emscripten OSes. This does not change anything for them. Didn't we decide that we should commit this CL only for emscripten triple and wait for other OSes until we make a final decision on fences?

sunfish added inline comments.May 28 2019, 4:41 PM
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗(On Diff #201762)

The behavior before this patch is to emit zero instructions, and not crash. So this patch does change the behavior. I'm asking to restore this previous behavior for non-Emscripten OS's.

aheejin marked an inline comment as done.May 28 2019, 4:57 PM
aheejin added inline comments.
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗(On Diff #201762)

Really? Without this patch, it crashes for me. Of course, only when we have -mattr=+atomics; if we don't have atomics enabled all atomic instructions become normal instructions and fences become nothing for all triples. This behavior is still the same.
This is when I enabled -mattr=+atomics with wasi triple, before this patch landed:

$ llc atomic-fence.ll -mtriple=wasm32-unknown-wasi -mattr=+atomics,+sign-ext
LLVM ERROR: Cannot select: t3: ch = AtomicFence t0, Constant:i32<7>, Constant:i32<1>
  t1: i32 = Constant<7>
  t2: i32 = Constant<1>
In function: multithread_fence
sunfish added inline comments.May 28 2019, 6:17 PM
lib/Target/WebAssembly/WebAssemblyISelDAGToDAG.cpp
111 ↗(On Diff #201762)

Ah, you're right, I did a quick test without -mattr=+atomics, which I forget is now the thing which determines whether operations like fence are lowered by LowerAtomic.