Index: docs/LangRef.rst
===================================================================
--- docs/LangRef.rst
+++ docs/LangRef.rst
@@ -2160,10 +2160,23 @@
 
 .. _singlethread:
 
-If an atomic operation is marked ``singlethread``, it only *synchronizes
-with* or participates in modification and seq\_cst total orderings with
-other operations running in the same thread (for example, in signal
-handlers).
+If an atomic operation is marked ``singlethread``, it only *synchronizes with*,
+and only participates in the seq\_cst total orderings of, other operations
+running in the same thread (for example, in signal handlers).
+
+.. _syncscope:
+
+If an atomic operation is marked ``syncscope([ss0])``, then it
+*synchronizes with*, and participates in the seq\_cst total orderings of, other
+atomic operations marked ``syncscope([ss0])``. It is target defined how it
+interacts with atomic operations marked ``singlethread``, marked
+``syncscope([ss1])`` where ``[ss0] != [ss1]``, or not marked ``singlethread`` or
+``syncscope([ss0])``.
+
+Otherwise, an atomic operation that is not marked ``singlethread`` or
+``syncscope([ss0])`` *synchronizes with*, and participates in the global
+seq\_cst total orderings of, other operations that are not marked
+``singlethread`` or ``syncscope([ss0])``.
 
 .. _fastmath:
 
@@ -7251,7 +7264,7 @@
 ::
 
       <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
-      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>]
+      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [singlethread|syncscope([ss])] <ordering>, align <alignment> [, !invariant.group !<index>]
      !<index> = !{ i32 1 }
      !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
      !<align_node> = !{ i64 <value_alignment> }
@@ -7272,14 +7285,14 @@
 :ref:`volatile operations <volatile>`.
 
 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
-<ordering>` and optional ``singlethread`` argument. The ``release`` and
-``acq_rel`` orderings are not valid on ``load`` instructions. Atomic loads
-produce :ref:`defined <memmodel>` results when they may see multiple atomic
-stores. The type of the pointee must be an integer, pointer, or floating-point
-type whose bit width is a power of two greater than or equal to eight and less
-than or equal to a target-specific size limit. ``align`` must be explicitly
-specified on atomic loads, and the load has undefined behavior if the alignment
-is not set to a value which is at least the size in bytes of the
+<ordering>` and optional ``singlethread`` or ``syncscope([ss])`` argument. The
+``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
+Atomic loads produce :ref:`defined <memmodel>` results when they may see
+multiple atomic stores. The type of the pointee must be an integer, pointer, or
+floating-point type whose bit width is a power of two greater than or equal to
+eight and less than or equal to a target-specific size limit. ``align`` must be
+explicitly specified on atomic loads, and the load has undefined behavior if the
+alignment is not set to a value which is at least the size in bytes of the
 pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
 
 The optional constant ``align`` argument specifies the alignment of the
@@ -7380,7 +7393,7 @@
 ::
 
       store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
-      store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void
+      store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread|syncscope([ss])] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void
 
 Overview:
 """""""""
@@ -7400,14 +7413,14 @@
 structural type <t_opaque>`) can be stored.
 
 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
-<ordering>` and optional ``singlethread`` argument. The ``acquire`` and
-``acq_rel`` orderings aren't valid on ``store`` instructions. Atomic loads
-produce :ref:`defined <memmodel>` results when they may see multiple atomic
-stores. The type of the pointee must be an integer, pointer, or floating-point
-type whose bit width is a power of two greater than or equal to eight and less
-than or equal to a target-specific size limit. ``align`` must be explicitly
-specified on atomic stores, and the store has undefined behavior if the
-alignment is not set to a value which is at least the size in bytes of the
+<ordering>` and optional ``singlethread`` or ``syncscope([ss])`` argument. The
+``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
+Atomic loads produce :ref:`defined <memmodel>` results when they may see
+multiple atomic stores. The type of the pointee must be an integer, pointer, or
+floating-point type whose bit width is a power of two greater than or equal to
+eight and less than or equal to a target-specific size limit. ``align`` must be
+explicitly specified on atomic stores, and the store has undefined behavior if
+the alignment is not set to a value which is at least the size in bytes of the
 pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
 
 The optional constant ``align`` argument specifies the alignment of the
@@ -7468,7 +7481,7 @@
 
 ::
 
-      fence [singlethread] <ordering>                        ; yields void
+      fence [singlethread|syncscope([ss])] <ordering>        ; yields void
 
 Overview:
 """""""""
@@ -7502,17 +7515,17 @@
 ``acquire`` and ``release`` semantics specified above, participates in the
 global program order of other ``seq_cst`` operations and/or fences.
 
-The optional ":ref:`singlethread <singlethread>`" argument specifies
-that the fence only synchronizes with other fences in the same thread.
-(This is useful for interacting with signal handlers.)
+A ``fence`` instruction can also take an optional
+":ref:`singlethread <singlethread>`" or ":ref:`syncscope <syncscope>`" argument.
 
 Example:
 """"""""
 
 .. code-block:: llvm
 
-      fence acquire                             ; yields void
-      fence singlethread seq_cst                ; yields void
+      fence acquire                             ; yields void
+      fence singlethread seq_cst                ; yields void
+      fence syncscope(amdgpu_agent) seq_cst     ; yields void
 
 .. _i_cmpxchg:
 
@@ -7524,7 +7537,7 @@
 
 ::
 
-      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields  { ty, i1 }
+      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread|syncscope([ss])] <success ordering> <failure ordering> ; yields  { ty, i1 }
 
 Overview:
 """""""""
@@ -7553,10 +7566,8 @@
 stronger than that on success, and the failure ordering cannot be either
 ``release`` or ``acq_rel``.
 
-The optional "``singlethread``" argument declares that the ``cmpxchg``
-is only atomic with respect to code (usually signal handlers) running in
-the same thread as the ``cmpxchg``. Otherwise the cmpxchg is atomic with
-respect to all other code in the system.
+A ``cmpxchg`` instruction can also take an optional
+":ref:`singlethread <singlethread>`" or ":ref:`syncscope <syncscope>`" argument.
 
 The pointer passed into cmpxchg must have alignment greater than or equal to
 the size in memory of the operand.
@@ -7610,7 +7621,7 @@
 
 ::
 
-      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering>                        ; yields ty
+      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread|syncscope([ss])] <ordering>        ; yields ty
 
 Overview:
 """""""""
@@ -7644,6 +7655,9 @@
 order of execution of this ``atomicrmw`` with other :ref:`volatile
 operations <volatile>`.
 
+An ``atomicrmw`` instruction can also take an optional
+":ref:`singlethread <singlethread>`" or ":ref:`syncscope <syncscope>`" argument.
+
 Semantics:
 """"""""""
 
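For illustration only (not part of the patch): a minimal C++ sketch of how a
frontend or pass could emit IR matching the documented syntax once this change
lands. The AMDGPU_AGENT value and the SynchronizationScope parameters are the
ones introduced in the headers below; everything else is existing IRBuilder
API, and the function name is hypothetical.

  // Sketch only -- assumes this patch is applied (SynchronizationScope lives in
  // llvm/IR/SyncScope.h and includes the AMDGPU_* scopes).
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/SyncScope.h"

  using namespace llvm;

  // Emits:
  //   fence syncscope(amdgpu_agent) seq_cst
  //   %v = load atomic i32, i32* %p syncscope(amdgpu_agent) acquire, align 4
  static Value *emitAgentScopedAcquire(IRBuilder<> &Builder, Value *Ptr) {
    Builder.CreateFence(AtomicOrdering::SequentiallyConsistent, AMDGPU_AGENT);
    LoadInst *LI = Builder.CreateLoad(Ptr, "v");
    LI->setAlignment(4);
    LI->setAtomic(AtomicOrdering::Acquire, AMDGPU_AGENT);
    return LI;
  }
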
Index: include/llvm/Bitcode/LLVMBitCodes.h
===================================================================
--- include/llvm/Bitcode/LLVMBitCodes.h
+++ include/llvm/Bitcode/LLVMBitCodes.h
@@ -387,9 +387,15 @@
 };
 
 /// Encoded SynchronizationScope values.
-enum AtomicSynchScopeCodes {
+enum AtomicSynchScopeCodes : uint8_t {
+  /// Encoded value for SingleThread synchronization scope.
   SYNCHSCOPE_SINGLETHREAD = 0,
-  SYNCHSCOPE_CROSSTHREAD = 1
+
+  /// Encoded value for CrossThread synchronization scope.
+  SYNCHSCOPE_CROSSTHREAD = 1,
+
+  /// First encoded value for target-specific synchronization scope.
+  SYNCHSCOPE_FIRSTTARGET = 2
 };
 
 /// Markers and flags for call instruction.
Index: include/llvm/CodeGen/SelectionDAGNodes.h
===================================================================
--- include/llvm/CodeGen/SelectionDAGNodes.h
+++ include/llvm/CodeGen/SelectionDAGNodes.h
@@ -1124,6 +1124,10 @@
   /// Memory reference information.
   MachineMemOperand *MMO;
 
+  /// The synchronization scope of this memory operation. Not quite enough room
+  /// in SubclassData for everything, so synch scope gets its own field.
+  SynchronizationScope SynchScope;
+
 public:
   MemSDNode(unsigned Opc, unsigned Order, const DebugLoc &dl, SDVTList VTs,
             EVT MemoryVT, MachineMemOperand *MMO);
Index: include/llvm/IR/Instructions.h
===================================================================
--- include/llvm/IR/Instructions.h
+++ include/llvm/IR/Instructions.h
@@ -30,6 +30,7 @@
 #include "llvm/IR/Function.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/OperandTraits.h"
+#include "llvm/IR/SyncScope.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/User.h"
@@ -47,11 +48,6 @@
 class DataLayout;
 class LLVMContext;
 
-enum SynchronizationScope {
-  SingleThread = 0,
-  CrossThread = 1
-};
-
 //===----------------------------------------------------------------------===//
 //                                AllocaInst Class
 //===----------------------------------------------------------------------===//
@@ -230,30 +226,30 @@
 
   void setAlignment(unsigned Align);
 
-  /// Returns the ordering effect of this fence.
+  /// Returns the ordering constraint of this load instruction.
   AtomicOrdering getOrdering() const {
     return AtomicOrdering((getSubclassDataFromInstruction() >> 7) & 7);
   }
 
-  /// Set the ordering constraint on this load. May not be Release or
-  /// AcquireRelease.
+  /// Sets the ordering constraint on this load instruction. May not be Release
+  /// or AcquireRelease.
   void setOrdering(AtomicOrdering Ordering) {
     setInstructionSubclassData((getSubclassDataFromInstruction() & ~(7 << 7)) |
                                ((unsigned)Ordering << 7));
   }
 
+  /// Returns the synchronization scope of this load instruction.
   SynchronizationScope getSynchScope() const {
-    return SynchronizationScope((getSubclassDataFromInstruction() >> 6) & 1);
+    return SynchScope;
   }
 
-  /// Specify whether this load is ordered with respect to all
-  /// concurrently executing threads, or only with respect to signal handlers
-  /// executing in the same thread.
-  void setSynchScope(SynchronizationScope xthread) {
-    setInstructionSubclassData((getSubclassDataFromInstruction() & ~(1 << 6)) |
-                               (xthread << 6));
+  /// Sets the synchronization scope on this load instruction.
+  void setSynchScope(SynchronizationScope SynchScope) {
+    this->SynchScope = SynchScope;
   }
 
+  /// Sets the ordering constraint and synchronization scope on this load
+  /// instruction.
   void setAtomic(AtomicOrdering Ordering,
                  SynchronizationScope SynchScope = CrossThread) {
     setOrdering(Ordering);
@@ -290,6 +286,11 @@
   void setInstructionSubclassData(unsigned short D) {
     Instruction::setInstructionSubclassData(D);
   }
+
+  /// The synchronization scope of this load instruction. Not quite enough room
+  /// in SubClassData for everything, so synchronization scope gets its own
+  /// field.
+  SynchronizationScope SynchScope;
 };
 
 //===----------------------------------------------------------------------===//
@@ -351,30 +352,30 @@
 
   void setAlignment(unsigned Align);
 
-  /// Returns the ordering effect of this store.
+  /// Returns the ordering constraint of this store instruction.
   AtomicOrdering getOrdering() const {
     return AtomicOrdering((getSubclassDataFromInstruction() >> 7) & 7);
   }
 
-  /// Set the ordering constraint on this store. May not be Acquire or
-  /// AcquireRelease.
+  /// Sets the ordering constraint on this store instruction. May not be Acquire
+  /// or AcquireRelease.
   void setOrdering(AtomicOrdering Ordering) {
     setInstructionSubclassData((getSubclassDataFromInstruction() & ~(7 << 7)) |
                                ((unsigned)Ordering << 7));
   }
 
+  /// Returns the synchronization scope of this store instruction.
   SynchronizationScope getSynchScope() const {
-    return SynchronizationScope((getSubclassDataFromInstruction() >> 6) & 1);
+    return SynchScope;
   }
 
-  /// Specify whether this store instruction is ordered with respect to all
-  /// concurrently executing threads, or only with respect to signal handlers
-  /// executing in the same thread.
-  void setSynchScope(SynchronizationScope xthread) {
-    setInstructionSubclassData((getSubclassDataFromInstruction() & ~(1 << 6)) |
-                               (xthread << 6));
+  /// Sets the synchronization scope on this store instruction.
+  void setSynchScope(SynchronizationScope SynchScope) {
+    this->SynchScope = SynchScope;
   }
 
+  /// Sets the ordering constraint and synchronization scope on this store
+  /// instruction.
   void setAtomic(AtomicOrdering Ordering,
                  SynchronizationScope SynchScope = CrossThread) {
     setOrdering(Ordering);
@@ -414,6 +415,11 @@
   void setInstructionSubclassData(unsigned short D) {
     Instruction::setInstructionSubclassData(D);
   }
+
+  /// The synchronization scope of this store instruction. Not quite enough room
+  /// in SubClassData for everything, so synchronization scope gets its own
+  /// field.
+  SynchronizationScope SynchScope;
 };
 
 template <>
@@ -453,28 +459,26 @@
 
   void *operator new(size_t, unsigned) = delete;
 
-  /// Returns the ordering effect of this fence.
+  /// Returns the ordering constraint of this fence instruction.
   AtomicOrdering getOrdering() const {
     return AtomicOrdering(getSubclassDataFromInstruction() >> 1);
   }
 
-  /// Set the ordering constraint on this fence. May only be Acquire, Release,
-  /// AcquireRelease, or SequentiallyConsistent.
+  /// Sets the ordering constraint on this fence instruction. May only be
+  /// Acquire, Release, AcquireRelease, or SequentiallyConsistent.
   void setOrdering(AtomicOrdering Ordering) {
     setInstructionSubclassData((getSubclassDataFromInstruction() & 1) |
                                ((unsigned)Ordering << 1));
   }
 
+  /// Returns the synchronization scope of this fence instruction.
   SynchronizationScope getSynchScope() const {
-    return SynchronizationScope(getSubclassDataFromInstruction() & 1);
+    return SynchScope;
   }
 
-  /// Specify whether this fence orders other operations with respect to all
-  /// concurrently executing threads, or only with respect to signal handlers
-  /// executing in the same thread.
-  void setSynchScope(SynchronizationScope xthread) {
-    setInstructionSubclassData((getSubclassDataFromInstruction() & ~1) |
-                               xthread);
+  /// Sets the synchronization scope on this fence instruction.
+  void setSynchScope(SynchronizationScope SynchScope) {
+    this->SynchScope = SynchScope;
  }
 
   // Methods for support type inquiry through isa, cast, and dyn_cast:
@@ -491,6 +495,11 @@
   void setInstructionSubclassData(unsigned short D) {
     Instruction::setInstructionSubclassData(D);
   }
+
+  /// The synchronization scope of this fence instruction. Not quite enough room
+  /// in SubClassData for everything, so synchronization scope gets its own
+  /// field.
+  SynchronizationScope SynchScope;
 };
 
 //===----------------------------------------------------------------------===//
@@ -558,7 +567,14 @@
   /// Transparently provide more efficient getOperand methods.
   DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value);
 
-  /// Set the ordering constraint on this cmpxchg.
+  /// Returns the ordering constraint of this cmpxchg instruction when store
+  /// occurs.
+  AtomicOrdering getSuccessOrdering() const {
+    return AtomicOrdering((getSubclassDataFromInstruction() >> 2) & 7);
+  }
+
+  /// Sets the ordering constraint on this cmpxchg instruction when store
+  /// occurs.
   void setSuccessOrdering(AtomicOrdering Ordering) {
     assert(Ordering != AtomicOrdering::NotAtomic &&
            "CmpXchg instructions can only be atomic.");
@@ -566,6 +582,14 @@
                                ((unsigned)Ordering << 2));
   }
 
+  /// Returns the ordering constraint of this cmpxchg instruction when store
+  /// does not occur.
+  AtomicOrdering getFailureOrdering() const {
+    return AtomicOrdering((getSubclassDataFromInstruction() >> 5) & 7);
+  }
+
+  /// Sets the ordering constraint on this cmpxchg instruction when store
+  /// does not occur.
   void setFailureOrdering(AtomicOrdering Ordering) {
     assert(Ordering != AtomicOrdering::NotAtomic &&
            "CmpXchg instructions can only be atomic.");
@@ -573,28 +597,14 @@
                                ((unsigned)Ordering << 5));
   }
 
-  /// Specify whether this cmpxchg is atomic and orders other operations with
-  /// respect to all concurrently executing threads, or only with respect to
-  /// signal handlers executing in the same thread.
-  void setSynchScope(SynchronizationScope SynchScope) {
-    setInstructionSubclassData((getSubclassDataFromInstruction() & ~2) |
-                               (SynchScope << 1));
-  }
-
-  /// Returns the ordering constraint on this cmpxchg.
-  AtomicOrdering getSuccessOrdering() const {
-    return AtomicOrdering((getSubclassDataFromInstruction() >> 2) & 7);
-  }
-
-  /// Returns the ordering constraint on this cmpxchg.
-  AtomicOrdering getFailureOrdering() const {
-    return AtomicOrdering((getSubclassDataFromInstruction() >> 5) & 7);
+  /// Returns the synchronization scope of this cmpxchg instruction.
+  SynchronizationScope getSynchScope() const {
+    return SynchScope;
   }
 
-  /// Returns whether this cmpxchg is atomic between threads or only within a
-  /// single thread.
-  SynchronizationScope getSynchScope() const {
-    return SynchronizationScope((getSubclassDataFromInstruction() & 2) >> 1);
+  /// Sets the synchronization scope on this cmpxchg instruction.
+  void setSynchScope(SynchronizationScope SynchScope) {
+    this->SynchScope = SynchScope;
   }
 
   Value *getPointerOperand() { return getOperand(0); }
@@ -649,6 +659,11 @@
   void setInstructionSubclassData(unsigned short D) {
     Instruction::setInstructionSubclassData(D);
   }
+
+  /// The synchronization scope of this cmpxchg instruction. Not quite enough
+  /// room in SubClassData for everything, so synchronization scope gets its own
+  /// field.
+  SynchronizationScope SynchScope;
 };
 
 template <>
@@ -747,7 +762,12 @@
   /// Transparently provide more efficient getOperand methods.
   DECLARE_TRANSPARENT_OPERAND_ACCESSORS(Value);
 
-  /// Set the ordering constraint on this RMW.
+  /// Returns the ordering constraint of this RMW instruction.
+  AtomicOrdering getOrdering() const {
+    return AtomicOrdering((getSubclassDataFromInstruction() >> 2) & 7);
+  }
+
+  /// Sets the ordering constraint on this RMW instruction.
   void setOrdering(AtomicOrdering Ordering) {
     assert(Ordering != AtomicOrdering::NotAtomic &&
            "atomicrmw instructions can only be atomic.");
@@ -755,25 +775,16 @@
                                ((unsigned)Ordering << 2));
   }
 
-  /// Specify whether this RMW orders other operations with respect to all
-  /// concurrently executing threads, or only with respect to signal handlers
-  /// executing in the same thread.
-  void setSynchScope(SynchronizationScope SynchScope) {
-    setInstructionSubclassData((getSubclassDataFromInstruction() & ~2) |
-                               (SynchScope << 1));
-  }
-
-  /// Returns the ordering constraint on this RMW.
-  AtomicOrdering getOrdering() const {
-    return AtomicOrdering((getSubclassDataFromInstruction() >> 2) & 7);
-  }
-
-  /// Returns whether this RMW is atomic between threads or only within a
-  /// single thread.
+  /// Returns the synchronization scope of this RMW instruction.
   SynchronizationScope getSynchScope() const {
-    return SynchronizationScope((getSubclassDataFromInstruction() & 2) >> 1);
+    return SynchScope;
   }
 
+  /// Sets the synchronization scope on this RMW instruction.
+  void setSynchScope(SynchronizationScope SynchScope) {
+    this->SynchScope = SynchScope;
+  }
+
   Value *getPointerOperand() { return getOperand(0); }
   const Value *getPointerOperand() const { return getOperand(0); }
   static unsigned getPointerOperandIndex() { return 0U; }
@@ -803,6 +814,11 @@
   void setInstructionSubclassData(unsigned short D) {
     Instruction::setInstructionSubclassData(D);
   }
+
+  /// The synchronization scope of this RMW instruction. Not quite enough room
+  /// in SubClassData for everything, so synchronization scope gets its own
+  /// field.
+  SynchronizationScope SynchScope;
 };
 
 template <>
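For illustration only (not part of the patch): with the accessors above the
scope checks in existing passes stay source-compatible; only the storage moved
out of SubclassData into a dedicated field. A hedged sketch of a small helper a
pass might write against the new enum (the helper name is hypothetical):

  // Sketch only -- assumes this patch; isTargetScoped is not an API added by
  // the change, just an example consumer of the new accessors.
  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/SyncScope.h"

  using namespace llvm;

  // Returns true if I is an atomic load or store whose synchronization scope is
  // one of the target-specific scopes (FirstTargetSS and above).
  static bool isTargetScoped(const Instruction &I) {
    if (auto *LI = dyn_cast<LoadInst>(&I))
      return LI->isAtomic() && LI->getSynchScope() >= FirstTargetSS;
    if (auto *SI = dyn_cast<StoreInst>(&I))
      return SI->isAtomic() && SI->getSynchScope() >= FirstTargetSS;
    return false;
  }
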
Index: include/llvm/IR/SyncScope.h
===================================================================
--- /dev/null
+++ include/llvm/IR/SyncScope.h
@@ -0,0 +1,56 @@
+//===-- llvm/SyncScope.h - LLVM Synchronization Scopes ----------*- C++ -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines LLVM's set of synchronization scopes.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_IR_SYNCSCOPE_H
+#define LLVM_IR_SYNCSCOPE_H
+
+#include <cstdint>
+
+namespace llvm {
+
+/// Predefined synchronization scopes.
+enum SynchronizationScope : uint8_t {
+  /// Synchronized with respect to signal handlers executing in the same thread.
+  SingleThread = 0,
+
+  /// Synchronized with respect to all concurrently executing threads.
+  CrossThread = 1,
+
+  /// First target-specific synchronization scope.
+  FirstTargetSS = 2,
+
+  /// AMDGPU_AGENT - Synchronized with respect to the agent, which includes all
+  /// work-items on the same agent executing kernel dispatches for the same
+  /// application process as the executing work-item. Only supported for the
+  /// global segment.
+  AMDGPU_AGENT = 2,
+
+  /// AMDGPU_WORKGROUP - Synchronized with respect to the work-group, which
+  /// includes all work-items in the same work-group as the executing work-item.
+  AMDGPU_WORKGROUP = 3,
+
+  /// AMDGPU_WAVEFRONT - Synchronized with respect to the wavefront, which
+  /// includes all work-items in the same wavefront as the executing work-item.
+  AMDGPU_WAVEFRONT = 4,
+
+  /// AMDGPU_IMAGE - Synchronized with respect to image fence instructions
+  /// executing in the same work-item.
+  AMDGPU_IMAGE = 5,
+
+  /// The highest possible synchronization scope ID.
+  MaxSS = 0xFF
+};
+
+} // end namespace llvm
+
+#endif // LLVM_IR_SYNCSCOPE_H
Index: lib/AsmParser/LLLexer.cpp
===================================================================
--- lib/AsmParser/LLLexer.cpp
+++ lib/AsmParser/LLLexer.cpp
@@ -542,7 +542,14 @@
   KEYWORD(release);
   KEYWORD(acq_rel);
   KEYWORD(seq_cst);
+
+  // Synchronization scopes:
   KEYWORD(singlethread);
+  KEYWORD(syncscope);
+  KEYWORD(amdgpu_agent);
+  KEYWORD(amdgpu_workgroup);
+  KEYWORD(amdgpu_wavefront);
+  KEYWORD(amdgpu_image);
 
   KEYWORD(nnan);
   KEYWORD(ninf);
Index: lib/AsmParser/LLParser.h
===================================================================
--- lib/AsmParser/LLParser.h
+++ lib/AsmParser/LLParser.h
@@ -243,6 +243,8 @@
     bool ParseOptionalDerefAttrBytes(lltok::Kind AttrKind, uint64_t &Bytes);
     bool ParseScopeAndOrdering(bool isAtomic, SynchronizationScope &Scope,
                                AtomicOrdering &Ordering);
+    bool ParseScope(SynchronizationScope &Scope);
+    bool ParseTargetScope(SynchronizationScope &Scope);
     bool ParseOrdering(AtomicOrdering &Ordering);
     bool ParseOptionalStackAlignment(unsigned &Alignment);
     bool ParseOptionalCommaAlign(unsigned &Alignment, bool &AteExtraComma);
Index: lib/AsmParser/LLParser.cpp
===================================================================
--- lib/AsmParser/LLParser.cpp
+++ lib/AsmParser/LLParser.cpp
@@ -17,6 +17,7 @@
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/AsmParser/SlotMapping.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/AutoUpgrade.h"
@@ -1882,8 +1883,11 @@
 }
 
 /// ParseScopeAndOrdering
-///   if isAtomic: ::= 'singlethread'? AtomicOrdering
-///   else: ::=
+///   if isAtomic:
+///     ::= 'singlethread'? AtomicOrdering
+///     ::= 'syncscope' '(' ')'? AtomicOrdering
+///   else
+///     ::=
 ///
 /// This sets Scope and Ordering to the parsed values.
 bool LLParser::ParseScopeAndOrdering(bool isAtomic, SynchronizationScope &Scope,
@@ -1891,11 +1895,62 @@
   if (!isAtomic)
     return false;
 
+  return ParseScope(Scope) || ParseOrdering(Ordering);
+}
+
+/// ParseScope
+///   ::= /* empty */
+///   ::= 'singlethread'
+///   ::= 'syncscope' '(' ')'
+///
+/// This sets Scope to the parsed value.
+bool LLParser::ParseScope(SynchronizationScope &Scope) {
+  if (EatIfPresent(lltok::kw_syncscope))
+    return ParseTargetScope(Scope);
+
   Scope = CrossThread;
   if (EatIfPresent(lltok::kw_singlethread))
     Scope = SingleThread;
 
-  return ParseOrdering(Ordering);
+  return false;
+}
+
+/// ParseTargetScope
+///   ::= 'amdgpu_agent'
+///   ::= 'amdgpu_workgroup'
+///   ::= 'amdgpu_wavefront'
+///   ::= 'amdgpu_image'
+///
+/// This sets Scope to the target-specific synchronization scope.
+bool LLParser::ParseTargetScope(SynchronizationScope &Scope) {
+  auto StartParenAt = Lex.getLoc();
+  if (!EatIfPresent(lltok::lparen))
+    return Error(StartParenAt, "Expected '(' in syncscope");
+
+  auto TargetScopeAt = Lex.getLoc();
+  switch (Lex.getKind()) {
+  case lltok::kw_amdgpu_agent:
+    Scope = AMDGPU_AGENT;
+    break;
+  case lltok::kw_amdgpu_workgroup:
+    Scope = AMDGPU_WORKGROUP;
+    break;
+  case lltok::kw_amdgpu_wavefront:
+    Scope = AMDGPU_WAVEFRONT;
+    break;
+  case lltok::kw_amdgpu_image:
+    Scope = AMDGPU_IMAGE;
+    break;
+  default:
+    return Error(TargetScopeAt, "Invalid target syncscope");
+  }
+  Lex.Lex();
+
+  auto EndParenAt = Lex.getLoc();
+  if (!EatIfPresent(lltok::rparen))
+    return Error(EndParenAt, "Expected ')' in syncscope");
+
+  return false;
 }
 
 /// ParseOrdering
Index: lib/AsmParser/LLToken.h
===================================================================
--- lib/AsmParser/LLToken.h
+++ lib/AsmParser/LLToken.h
@@ -93,7 +93,15 @@
   kw_release,
   kw_acq_rel,
   kw_seq_cst,
+
+  // Synchronization scopes:
   kw_singlethread,
+  kw_syncscope,
+  kw_amdgpu_agent,
+  kw_amdgpu_workgroup,
+  kw_amdgpu_wavefront,
+  kw_amdgpu_image,
+
   kw_nnan,
   kw_ninf,
   kw_nsz,
Index: lib/Bitcode/Reader/BitcodeReader.cpp
===================================================================
--- lib/Bitcode/Reader/BitcodeReader.cpp
+++ lib/Bitcode/Reader/BitcodeReader.cpp
@@ -936,9 +936,15 @@
 }
 
 static SynchronizationScope getDecodedSynchScope(unsigned Val) {
+  if (Val >= bitc::SYNCHSCOPE_FIRSTTARGET) {
+    assert(Val == uint8_t(Val) && "expected 8-bit integer (too large)");
+    return SynchronizationScope(
+        FirstTargetSS + (Val - bitc::SYNCHSCOPE_FIRSTTARGET));
+  }
+
   switch (Val) {
+  default: llvm_unreachable("Invalid syncscope");
   case bitc::SYNCHSCOPE_SINGLETHREAD: return SingleThread;
-  default: // Map unknown scopes to cross-thread.
   case bitc::SYNCHSCOPE_CROSSTHREAD: return CrossThread;
   }
 }
Index: lib/Bitcode/Writer/BitcodeWriter.cpp
===================================================================
--- lib/Bitcode/Writer/BitcodeWriter.cpp
+++ lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -580,11 +580,16 @@
 }
 
 static unsigned getEncodedSynchScope(SynchronizationScope SynchScope) {
+  if (SynchScope >= FirstTargetSS) {
+    return unsigned(
+        bitc::SYNCHSCOPE_FIRSTTARGET + (SynchScope - FirstTargetSS));
+  }
+
   switch (SynchScope) {
+  default: llvm_unreachable("Invalid syncscope");
   case SingleThread: return bitc::SYNCHSCOPE_SINGLETHREAD;
   case CrossThread: return bitc::SYNCHSCOPE_CROSSTHREAD;
   }
-  llvm_unreachable("Invalid synch scope");
 }
 
 static void writeStringRecord(BitstreamWriter &Stream, unsigned Code,
Index: lib/CodeGen/SelectionDAG/SelectionDAG.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -7241,7 +7241,8 @@
 
 MemSDNode::MemSDNode(unsigned Opc, unsigned Order, const DebugLoc &dl,
                      SDVTList VTs, EVT memvt, MachineMemOperand *mmo)
-    : SDNode(Opc, Order, dl, VTs), MemoryVT(memvt), MMO(mmo) {
+    : SDNode(Opc, Order, dl, VTs), MemoryVT(memvt), MMO(mmo),
+      SynchScope(CrossThread) {
   MemSDNodeBits.IsVolatile = MMO->isVolatile();
   MemSDNodeBits.IsNonTemporal = MMO->isNonTemporal();
   MemSDNodeBits.IsDereferenceable = MMO->isDereferenceable();
Index: lib/IR/AsmWriter.cpp
===================================================================
--- lib/IR/AsmWriter.cpp
+++ lib/IR/AsmWriter.cpp
@@ -2097,6 +2097,8 @@
   void writeAtomicCmpXchg(AtomicOrdering SuccessOrdering,
                           AtomicOrdering FailureOrdering,
                           SynchronizationScope SynchScope);
+  void writeSynchScope(SynchronizationScope SynchScope);
+  void writeTargetSynchScope(SynchronizationScope SynchScope);
 
   void writeAllMDNodes();
   void writeMDNode(unsigned Slot, const MDNode *Node);
@@ -2162,11 +2164,7 @@
   if (Ordering == AtomicOrdering::NotAtomic)
     return;
 
-  switch (SynchScope) {
-  case SingleThread: Out << " singlethread"; break;
-  case CrossThread: break;
-  }
-
+  writeSynchScope(SynchScope);
   Out << " " << toIRString(Ordering);
 }
 
@@ -2176,15 +2174,50 @@
   assert(SuccessOrdering != AtomicOrdering::NotAtomic &&
          FailureOrdering != AtomicOrdering::NotAtomic);
 
-  switch (SynchScope) {
-  case SingleThread: Out << " singlethread"; break;
-  case CrossThread: break;
-  }
-
+  writeSynchScope(SynchScope);
   Out << " " << toIRString(SuccessOrdering);
   Out << " " << toIRString(FailureOrdering);
 }
 
+void AssemblyWriter::writeSynchScope(SynchronizationScope SynchScope) {
+  if (SynchScope >= FirstTargetSS) {
+    writeTargetSynchScope(SynchScope);
+  } else {
+    switch (SynchScope) {
+    case SingleThread:
+      Out << " singlethread";
+      break;
+    case CrossThread:
+      break;
+    default:
+      llvm_unreachable("Invalid syncscope");
+    }
+  }
+}
+
+void AssemblyWriter::writeTargetSynchScope(SynchronizationScope SynchScope) {
+  assert(SynchScope >= FirstTargetSS);
+
+  Out << " syncscope(";
+  switch (SynchScope) {
+  case AMDGPU_AGENT:
+    Out << "amdgpu_agent";
+    break;
+  case AMDGPU_WORKGROUP:
+    Out << "amdgpu_workgroup";
+    break;
+  case AMDGPU_WAVEFRONT:
+    Out << "amdgpu_wavefront";
+    break;
+  case AMDGPU_IMAGE:
+    Out << "amdgpu_image";
+    break;
+  default:
+    llvm_unreachable("Invalid target syncscope");
+  }
+  Out << ")";
+}
+
 void AssemblyWriter::writeParamOperand(const Value *Operand,
                                        AttributeList Attrs, unsigned Idx) {
   if (!Operand) {
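For illustration only (not part of the patch): one way to sanity-check the
lexer/parser/printer changes above outside of lit is a small round trip through
parseAssemblyString. The module text and program name here are assumptions made
for the example.

  // Sketch only -- assumes the LLLexer/LLParser/AsmWriter changes in this patch.
  #include "llvm/AsmParser/Parser.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Module.h"
  #include "llvm/Support/SourceMgr.h"
  #include "llvm/Support/raw_ostream.h"
  #include <memory>

  using namespace llvm;

  int main() {
    LLVMContext Ctx;
    SMDiagnostic Err;
    // Hypothetical input exercising the new scope syntax.
    std::unique_ptr<Module> M = parseAssemblyString(
        "define void @f(i32* %p) {\n"
        "  fence syncscope(amdgpu_workgroup) seq_cst\n"
        "  store atomic i32 0, i32* %p syncscope(amdgpu_agent) release, align 4\n"
        "  ret void\n"
        "}\n",
        Err, Ctx);
    if (!M) {
      Err.print("syncscope-roundtrip", errs());
      return 1;
    }
    // The printed module should carry the same syncscope(...) annotations.
    M->print(outs(), nullptr);
    return 0;
  }
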
Index: lib/Target/SystemZ/SystemZISelLowering.cpp
===================================================================
--- lib/Target/SystemZ/SystemZISelLowering.cpp
+++ lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -3199,7 +3199,7 @@
   // The only fence that needs an instruction is a sequentially-consistent
   // cross-thread fence.
   if (FenceOrdering == AtomicOrdering::SequentiallyConsistent &&
-      FenceScope == CrossThread) {
+      FenceScope != SingleThread) {
     return SDValue(DAG.getMachineNode(SystemZ::Serialize, DL, MVT::Other,
                                       Op.getOperand(0)),
                    0);
Index: lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- lib/Target/X86/X86ISelLowering.cpp
+++ lib/Target/X86/X86ISelLowering.cpp
@@ -22761,7 +22761,7 @@
   // The only fence that needs an instruction is a sequentially-consistent
   // cross-thread fence.
   if (FenceOrdering == AtomicOrdering::SequentiallyConsistent &&
-      FenceScope == CrossThread) {
+      FenceScope != SingleThread) {
     if (Subtarget.hasMFence())
       return DAG.getNode(X86ISD::MFENCE, dl, MVT::Other, Op.getOperand(0));
 
Index: lib/Transforms/Instrumentation/ThreadSanitizer.cpp
===================================================================
--- lib/Transforms/Instrumentation/ThreadSanitizer.cpp
+++ lib/Transforms/Instrumentation/ThreadSanitizer.cpp
@@ -379,9 +379,9 @@
 
 static bool isAtomic(Instruction *I) {
   if (LoadInst *LI = dyn_cast<LoadInst>(I))
-    return LI->isAtomic() && LI->getSynchScope() == CrossThread;
+    return LI->isAtomic() && LI->getSynchScope() != SingleThread;
   if (StoreInst *SI = dyn_cast<StoreInst>(I))
-    return SI->isAtomic() && SI->getSynchScope() == CrossThread;
+    return SI->isAtomic() && SI->getSynchScope() != SingleThread;
   if (isa<AtomicRMWInst>(I))
     return true;
   if (isa<AtomicCmpXchgInst>(I))
Index: test/Assembler/atomic.ll
===================================================================
--- test/Assembler/atomic.ll
+++ test/Assembler/atomic.ll
@@ -7,10 +7,14 @@
   load atomic i32, i32* %x unordered, align 4
   ; CHECK: load atomic volatile i32, i32* %x singlethread acquire, align 4
   load atomic volatile i32, i32* %x singlethread acquire, align 4
+  ; CHECK: load atomic volatile i32, i32* %x syncscope(amdgpu_agent) acquire, align 4
+  load atomic volatile i32, i32* %x syncscope(amdgpu_agent) acquire, align 4
   ; CHECK: store atomic i32 3, i32* %x release, align 4
   store atomic i32 3, i32* %x release, align 4
   ; CHECK: store atomic volatile i32 3, i32* %x singlethread monotonic, align 4
   store atomic volatile i32 3, i32* %x singlethread monotonic, align 4
+  ; CHECK: store atomic volatile i32 3, i32* %x syncscope(amdgpu_workgroup) monotonic, align 4
+  store atomic volatile i32 3, i32* %x syncscope(amdgpu_workgroup) monotonic, align 4
   ; CHECK: cmpxchg i32* %x, i32 1, i32 0 singlethread monotonic monotonic
   cmpxchg i32* %x, i32 1, i32 0 singlethread monotonic monotonic
   ; CHECK: cmpxchg volatile i32* %x, i32 0, i32 1 acq_rel acquire
@@ -19,13 +23,19 @@
   cmpxchg i32* %x, i32 42, i32 0 acq_rel monotonic
   ; CHECK: cmpxchg weak i32* %x, i32 13, i32 0 seq_cst monotonic
   cmpxchg weak i32* %x, i32 13, i32 0 seq_cst monotonic
+  ; CHECK: cmpxchg weak i32* %x, i32 13, i32 0 syncscope(amdgpu_wavefront) seq_cst monotonic
+  cmpxchg weak i32* %x, i32 13, i32 0 syncscope(amdgpu_wavefront) seq_cst monotonic
   ; CHECK: atomicrmw add i32* %x, i32 10 seq_cst
   atomicrmw add i32* %x, i32 10 seq_cst
   ; CHECK: atomicrmw volatile xchg i32* %x, i32 10 monotonic
   atomicrmw volatile xchg i32* %x, i32 10 monotonic
+  ; CHECK: atomicrmw volatile xchg i32* %x, i32 10 syncscope(amdgpu_image) monotonic
+  atomicrmw volatile xchg i32* %x, i32 10 syncscope(amdgpu_image) monotonic
   ; CHECK: fence singlethread release
   fence singlethread release
   ; CHECK: fence seq_cst
   fence seq_cst
+  ; CHECK: fence syncscope(amdgpu_image) seq_cst
+  fence syncscope(amdgpu_image) seq_cst
   ret void
 }
Index: test/Bitcode/atomic-no-syncscope.ll
===================================================================
--- /dev/null
+++ test/Bitcode/atomic-no-syncscope.ll
@@ -0,0 +1,14 @@
+; RUN: llvm-dis -o - %s.bc | FileCheck %s
+
+; CHECK: load atomic i32, i32* %x unordered, align 4
+; CHECK: load atomic volatile i32, i32* %x singlethread acquire, align 4
+; CHECK: store atomic i32 3, i32* %x release, align 4
+; CHECK: store atomic volatile i32 3, i32* %x singlethread monotonic, align 4
+; CHECK: cmpxchg i32* %x, i32 1, i32 0 singlethread monotonic monotonic
+; CHECK: cmpxchg volatile i32* %x, i32 0, i32 1 acq_rel acquire
+; CHECK: cmpxchg i32* %x, i32 42, i32 0 acq_rel monotonic
+; CHECK: cmpxchg weak i32* %x, i32 13, i32 0 seq_cst monotonic
+; CHECK: atomicrmw add i32* %x, i32 10 seq_cst
+; CHECK: atomicrmw volatile xchg i32* %x, i32 10 monotonic
+; CHECK: fence singlethread release
+; CHECK: fence seq_cst
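For illustration only (not part of the patch): the SystemZ and X86 changes
above switch the lowering test from ``FenceScope == CrossThread`` to
``FenceScope != SingleThread``, so any target-specific scope conservatively
still gets a full hardware fence. A hedged sketch of that decision as a
standalone predicate (the helper name is hypothetical):

  // Sketch only -- mirrors the lowering checks changed in this patch; the
  // helper itself is hypothetical and not part of the change.
  #include "llvm/IR/SyncScope.h"
  #include "llvm/Support/AtomicOrdering.h"

  using namespace llvm;

  // A seq_cst fence needs a hardware fence unless it is provably confined to a
  // single thread; unknown or target-specific scopes are treated conservatively.
  static bool needsHardwareFence(AtomicOrdering Ordering,
                                 SynchronizationScope Scope) {
    return Ordering == AtomicOrdering::SequentiallyConsistent &&
           Scope != SingleThread;
  }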