diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2455,6 +2455,59 @@
 and ``__OPENCL_MEMORY_SCOPE_SUB_GROUP`` are provided, with values
 corresponding to the enumerators of OpenCL's ``memory_scope`` enumeration.)
 
+``__builtin_memory_fence``
+--------------------------
+
+``__builtin_memory_fence`` allows using `Fence instruction `_
+from clang. It takes a C++11-compatible memory ordering and a target-specific
+sync scope as arguments, and generates a fence instruction in the IR.
+
+**Syntax**:
+
+.. code-block:: c++
+
+  __builtin_memory_fence(unsigned int memory_ordering, String sync_scope)
+
+**Example of use**:
+
+.. code-block:: c++
+
+  void my_fence(int i) {
+    i++;
+    __builtin_memory_fence(__ATOMIC_ACQUIRE, "workgroup");
+    i--;
+    __builtin_memory_fence(__ATOMIC_SEQ_CST, "agent");
+  }
+
+**Description**:
+
+The first argument of the ``__builtin_memory_fence()`` builtin is one of the
+memory-ordering specifiers ``__ATOMIC_ACQUIRE``, ``__ATOMIC_RELEASE``,
+``__ATOMIC_ACQ_REL``, or ``__ATOMIC_SEQ_CST``, following C++11 memory model
+semantics. The equivalent enum values of these memory orderings can also be
+specified. The builtin maps these C++ memory orderings to the corresponding
+LLVM atomic memory orderings for the fence instruction using the LLVM atomic C
+ABI, as given in the table below. The second argument is a target-specific
+synchronization scope given as a string. The builtin passes the second
+argument to the fence instruction unchanged and relies on the target
+implementation to check its validity.
+
++------------------------------+--------------------------------+
+| Input in clang               | Output in IR                   |
+| (C++11 Memory-ordering)      | (LLVM Atomic Memory-ordering)  |
++======================+=======+========================+=======+
+| Enum                 | Value | Enum                   | Value |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_ACQUIRE`` | 2     | Acquire                | 4     |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_RELEASE`` | 3     | Release                | 5     |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_ACQ_REL`` | 4     | AcquireRelease         | 6     |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_SEQ_CST`` | 5     | SequentiallyConsistent | 7     |
++----------------------+-------+------------------------+-------+
+
+
 Low-level ARM exclusive memory builtins
 ---------------------------------------
 
diff --git a/clang/test/CodeGenCXX/builtin-memory-fence-failure.cpp b/clang/test/CodeGenCXX/builtin-memory-fence-failure.cpp
new file mode 100644
--- /dev/null
+++ b/clang/test/CodeGenCXX/builtin-memory-fence-failure.cpp
@@ -0,0 +1,9 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: not %clang_cc1 %s -S \
+// RUN:   -triple=amdgcn-amd-amdhsa 2>&1 | FileCheck %s
+
+void test_memory_fence_failure() {
+
+  // CHECK: error: Unsupported atomic synchronization scope
+  __builtin_memory_fence(__ATOMIC_SEQ_CST, "foobar");
+}
\ No newline at end of file
diff --git a/clang/test/CodeGenHIP/builtin_memory_fence.cpp b/clang/test/CodeGenCXX/builtin-memory-fence.cpp
rename from clang/test/CodeGenHIP/builtin_memory_fence.cpp
rename to clang/test/CodeGenCXX/builtin-memory-fence.cpp
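
Reviewer sketch (not part of the patch): the ordering table in the documentation hunk can be read as a plain value-to-value mapping. The standalone C++ below restates that table so the numbers are easy to check. The names `CABIOrdering`, `LLVMOrdering`, and `toLLVMFenceOrdering` are hypothetical and chosen only for illustration; this is not the code clang uses to lower the builtin, and the `-1` result for unlisted values is an assumption of this sketch, not documented behavior.

```cpp
// Values from the "Input in clang" column (C ABI memory orderings).
// Hypothetical names; the patch itself uses the __ATOMIC_* macros.
enum CABIOrdering : unsigned {
  kAcquire = 2,
  kRelease = 3,
  kAcqRel = 4,
  kSeqCst = 5,
};

// Values from the "Output in IR" column (LLVM atomic memory orderings).
enum class LLVMOrdering : int {
  Acquire = 4,
  Release = 5,
  AcquireRelease = 6,
  SequentiallyConsistent = 7,
};

// Hypothetical helper: returns the numeric LLVM ordering that the table
// pairs with a C ABI ordering, or -1 for a value the table does not list
// (an assumption made for this sketch only).
constexpr int toLLVMFenceOrdering(unsigned order) {
  return order == kAcquire   ? static_cast<int>(LLVMOrdering::Acquire)
         : order == kRelease ? static_cast<int>(LLVMOrdering::Release)
         : order == kAcqRel  ? static_cast<int>(LLVMOrdering::AcquireRelease)
         : order == kSeqCst
             ? static_cast<int>(LLVMOrdering::SequentiallyConsistent)
             : -1;
}
```

The helper is `constexpr` and written as a single return expression, so it compiles as C++11 and the table rows can be verified at compile time with `static_assert`.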