Expose llvm fence instruction as clang builtin for AMDGPU target
__builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope)
The first argument of this builtin is one of the memory-ordering specifiers
ATOMIC_ACQUIRE, ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST
following C++11 memory model semantics. This is mapped to corresponding
LLVM atomic memory ordering for the fence instruction using LLVM atomic C
ABI. The second argument is an AMDGPU-specific synchronization scope
defined as string.
The proposed builtin does not claim to be an OpenCL builtin, so it's probably not correct to simply assume the OpenCL model. Should the model be chosen based on the source language specified?