This adds support for translation of LLVM IR fence instruction. We
convert a singlethread fence to a pseudo compiler barrier which becomes
0 instructions in final binary, and a thread fence to an idempotent
atomicrmw instruction to a memory address.
What do you expect for relaxed, as well as for signal fences?
Here I actually translated both a signal fence and a thread fence to the same thing. I think this will be conservatively correct, but maybe we should treat them differently. What if we translate a signal fence to a nothing but prevent reordering of instructions across it, and we might do the same thing for asm volatile("" ::: "memory") too. That can be something like making a pseudo instruction that eventually becomes nothing and prevents reordering across the pseudo instruction in the backend passes. What do you think?
Oh and for weaker fences, I added tests for them. They all translate to the same sequentially consistent atomicrmw or instruction.
Could you also check in with Conrad, since he's been working on the WebAssembly memory model?
I don't think Conrad is in the LLVM Phabricator here. Should we move the discussion to a github issue?
We discussed how to implement fences here, and I think the general consensus was even though it may not be strictly necessary to emit atomicrmw for some cases, it can be user friendly and also might help support legacy builtins like __sync_synchronize(). And IMHO it is a single instruction anyway and fences are generally expected to be expensive, so no significant harm will be done. Does anyone have more concerns on this? If not, can we land this?
What is the status of the discussion here? We've gotten to the point where I have to manually add this patch to my local checkout in order to debug unrelated issues and it looks like the conversation fizzled out a while ago, so it would be great if we could land something.