Expose llvm fence instruction as clang builtin for AMDGPU target
__builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope)
The first argument of this builtin is one of the memory-ordering specifiers
ATOMIC_ACQUIRE, ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST
following C++11 memory model semantics. This is mapped to corresponding
LLVM atomic memory ordering for the fence instruction using LLVM atomic C
ABI. The second argument is an AMDGPU-specific synchronization scope
defined as string.
We probably don't need to document the new builtin here. Clearly, we have not documented any other AMDGPU builtin, and there is no need to start now. If necessary, we can take that up as a separate task covering all builtins.