diff --git a/llvm/docs/ConvergentOperations.rst b/llvm/docs/ConvergentOperations.rst new file mode 100644 --- /dev/null +++ b/llvm/docs/ConvergentOperations.rst @@ -0,0 +1,625 @@ +============================== +Convergent Operation Semantics +============================== + +.. contents:: + :local: + :depth: 4 + +Overview +======== + +Some parallel execution environments execute threads in groups that allow +efficient communication within each group. Notably this is the case for +whole-program vectorization environments such as GPUs, where threads are mapped +to lanes of a SIMD vector. Efficient communication among threads is possible in +this case by simply exchanging data between the lanes of a vector. However, the +semantics defined in this document are independent of such implementation +details. + +When control flow diverges, i.e. threads of the same group follow different +paths through the CFG, not all threads of the group may be available to +participate in this communication. This is the defining characteristic that +distinguishes convergent operations from other inter-thread communication and +which requires the use of the ``convergent`` attribute to indicate +*additional constraints* on program transforms: + +A convergent operation involves inter-thread communication outside of the +memory model, where the set of threads which participate in communication is +implicitly defined or at least affected by control flow. + +For example, in the following GPU compute kernel, communication during the +convergent operation is expected to occur precisely among those threads of an +implementation-defined execution scope (such as workgroup or subgroup) for +which ``condition`` is true: + +.. code-block:: c++ + + void example_kernel() { + ... + if (condition) + convergent_operation(); + ... + } + +In structured programming languages, there is often an intuitive and +unambiguous way of determining the threads that are expected to communicate. 
+However, this is not always the case even in structured programming languages, +and the intuition breaks down entirely in unstructured control flow. This +document describes the formal semantics in LLVM, i.e. how to determine the set +of communicating threads for convergent operations. + +The definitions in this document leave many details open, such as how groups of +threads are formed in the first place. It focuses on the questions that are +relevant for deciding the correctness of generic program transforms and +convergence-related analyses such as divergence analysis. + +Note: It is common among practitioners to think about convergent operations in +terms of *divergence* and *reconvergence*: sets of threads split at branch +instructions if threads follow different paths through the control flow graph +(divergence) and may later merge when they reach the same static point in the +program (reconvergence). This operational point of view is often convenient for +backend implementations and it can sometimes be useful for guiding intuition. +However, the semantics defined here are declarative, since experience has shown +that the operational view is too restrictive in practice for general compiler +transforms. It is up to each implementation to operationalize those declarative +semantics in a way that makes sense for the underlying hardware, which varies +wildly. + + +Motivating Examples of Convergent Operations +============================================ + +(This section is informative.) + +The following stylized pixel shader samples a texture at a given set of +coordinates. Texture sampling requires screen-space derivatives of the +coordinates to determine the level of detail (mipmap level) of the sample. +They are commonly approximated by taking the difference between neighboring +pixels, which are computed by different threads in the same group: + +.. code-block:: c++ + + void example_shader() { + ... 
+ color = textureSample(texture, coordinates); + if (condition) { + use(color); + } + ... + } + +From a purely single-threaded perspective, sinking the `textureSample` into +the if-statement appears legal. However, if the condition is false for some +neighboring pixels, then their corresponding threads will not execute together +in the group, making it impossible to take the difference of coordinates as an +approximation of the screen-space derivative. In practice, the outcome will be +an undefined value. + +That is, the `textureSample` operation fits our definition of a convergent +operation: + + 1. It communicates with a set of threads that implicitly depends on control + flow. + + 2. Correctness depends on this set of threads. + +The following example shows that merging common code of branches can be +incorrect in the face of convergent operations: + +.. code-block:: c++ + + void example_kernel() { + delta = ... + if (delta > 0) { + total_gains = subgroupAdd(delta); + ... + } else { + total_losses = subgroupAdd(delta); + ... + } + } + +The ``subgroupAdd`` computing the ``total_gains`` will be executed by the +subset of threads with positive ``delta`` in a subgroup (wave), and so will sum +up all the ``delta`` values of those threads; and similarly for the +``subgroupAdd`` that computes the ``total_losses``. + +If we were to hoist and merge the ``subgroupAdd`` above the if-statement, it +would sum up the ``delta`` across *all* threads instead. + +Finally, consider an example of how jump threading removes structure in a way +that can make semantics non-obvious: + +.. code-block:: llvm + + void example_original() { + entry: + ... + br i1 %cond1, label %then1, label %mid + + then1: + ... + %cond2 = ... + br label %mid + + mid: + %flag = phi i1 [ true, %entry ], [ %cond2, %then1 ] + br i1 %flag, label %then2, label %end + + then2: + ... + call void @subgroupControlBarrier() + ... 
+ br label %end + + end: + } + + void example_jumpthreaded() { + entry: + br i1 %cond1, label %then1, label %then2 + + then1: + ... + %cond2 = ... + br i1 %cond2, label %then2, label %end + + then2: + ... + call void @subgroupControlBarrier() + ... + br label %end + + end: + } + +Is the control barrier guaranteed to synchronize among the same set of threads +in both cases? Different implementations in the literature may give different +answers to this question: + +* In an implementation that reconverges at post-dominators, threads reconverge + at ``mid`` in the first version, so that all threads (within a subgroup/wave) + that execute the control barrier do so together. In the second version, + threads that reach the control barrier via different paths synchronize + separately. + +* An implementation that sorts basic blocks topologically and ensures maximal + reconvergence for each basic block would behave the same way in both + versions. + + +Dynamic Instances and Convergence Tokens +======================================== + +Every execution of an LLVM IR instruction occurs in a *dynamic instance* of +the instruction. Dynamic instances are the formal objects by which we talk +about communicating threads in convergent operations. They satisfy: + +1. Different executions of the same static instruction by a single thread + give rise to different dynamic instances of that instruction. + +2. Executions of different static instructions always occur in different + dynamic instances. + +3. Executions of the same static instruction by different threads may occur in + the same dynamic instance. + +4. When executing a convergent operation, the set of threads that execute the + same dynamic instance is the set of threads that communicate with each other + for that operation. + +*Convergence tokens* are values of ``token`` type, i.e. they cannot be used in +``phi`` or ``select`` instructions. 
A convergence token value represents the +dynamic instance of the instruction that produced it. + +Convergent operations typically have a ``convergencectrl`` operand bundle with +a convergence token operand to define the set of communicating threads relative +to some anchor. The details are described in the +:ref:`Formal Rules <convergence_formal_rules>` section. + +The convergence control intrinsics described in this document and convergent +operations that have a ``convergencectrl`` operand bundle are considered +*controlled* convergent operations. Other convergent operations are +*uncontrolled*. + +The use of uncontrolled convergent operations is deprecated. + + +Convergence Control Intrinsics +============================== + +This section describes target-independent intrinsics that can be used to +produce convergence tokens. + +.. _llvm.experimental.convergence.anchor: + +``llvm.experimental.convergence.anchor`` +---------------------------------------- + +.. code-block:: llvm + + token @llvm.experimental.convergence.anchor() convergent readnone + +This intrinsic is a marker that acts as an "anchor" producing an initial +convergence token. The set of threads executing the same dynamic instance of +this intrinsic is implementation-defined. + +The expectation is that all threads within a group that "happen to be active at +the same time" will execute the same dynamic instance, so that programs can +detect the maximal set of threads that can communicate efficiently within +some local region of the program. + + +.. _llvm.experimental.convergence.loop: + +``llvm.experimental.convergence.loop`` +-------------------------------------- + +.. code-block:: llvm + + token @llvm.experimental.convergence.loop() [ "convergencectrl"(token) ] convergent readnone + +This intrinsic defines the "heart" of a loop, i.e. the place where an imaginary +loop counter is incremented for the purpose of determining convergence +semantics. 
+ +The convergence control token operand is usually defined outside of the loop, +but this is not a requirement for the validity of a program (the resulting +behavior is quite different, though). + +The resulting convergence token *can* be used outside of the loop; see the +:ref:`Formal Rules <convergence_formal_rules>` section for details. + + +.. _llvm.experimental.convergence.entry: + +``llvm.experimental.convergence.entry`` +---------------------------------------- + +.. code-block:: llvm + + token @llvm.experimental.convergence.entry() convergent readnone + +This intrinsic returns the convergence token that was used in the +``convergencectrl`` operand bundle when the current function was called. + +Behavior is undefined if the containing function was called from IR without +such a bundle. + +The expectation is that for program "main" functions, such as kernel entry +functions, whose caller is not visible to LLVM, the implementation returns a +convergence token that represents uniform control flow, i.e. that is guaranteed +to refer to all threads within a (target- or environment-dependent) group. + +Behavior is undefined if this intrinsic appears in a function that isn't +``convergent``. + +Behavior is undefined if this intrinsic appears inside of another convergence +region or outside of a function's entry block. + +Function inlining substitutes this intrinsic with the token from the operand +bundle. For example: + +.. code-block:: c++ + + // Before inlining: + + void callee() convergent { + %tok = call token @llvm.experimental.convergence.entry() + convergent_operation(...) [ "convergencectrl"(token %tok) ] + } + + void main() { + %outer = call token @llvm.experimental.convergence.anchor() + for (...) { + %inner = call token @llvm.experimental.convergence.loop() [ "convergencectrl"(token %outer) ] + callee() [ "convergencectrl"(token %inner) ] + } + } + + // After inlining: + + void main() { + %outer = call token @llvm.experimental.convergence.anchor() + for (...) 
{ + %inner = call token @llvm.experimental.convergence.loop() [ "convergencectrl"(token %outer) ] + convergent_operation(...) [ "convergencectrl"(token %inner) ] + } + } + + +.. _convergence_formal_rules: + +Formal Rules +============ + +Rules on the execution of dynamic instances: + +1. Let U be a static controlled convergent operation other than + :ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>` + whose convergence token is produced by an instruction D. Two threads + executing U execute the same dynamic instance of U if and only if they + obtained the token value from the same dynamic instance of D. + +2. Two threads executing the same static call U of + :ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>` + execute the same dynamic instance of U if and only if + (1) they obtained the ``convergencectrl`` token operand value from the same + dynamic instance of the defining instruction, and + (2) there is an *n* such that both threads execute U for the *n*'th time + with that same token operand value. + +3. If two threads execute the same static call U of + :ref:`llvm.experimental.convergence.entry <llvm.experimental.convergence.entry>`, + and at least one of them executes the function F containing U because it was + called by a ``call``, ``invoke``, or ``callbr`` instruction, then they + execute the same dynamic instance of U if and only if both threads execute F + because it was called by the same dynamic instance of a ``call``, ``invoke``, + or ``callbr`` instruction. + + (Informational note: If the function is executed for some reason outside of + the scope of LLVM IR, e.g. because it is a kernel entry function, then this + rule does not apply. On the other hand, if a thread executes the function + due to a call from IR, then the thread cannot "spontaneously converge" with + threads that execute the function for some other reason.) + +4. Target-specific rules determine whether two threads execute the same + dynamic instance of an uncontrolled convergent operation. 
+ +Static rules that must be satisfied by valid programs: + +1. Every cycle in the CFG that contains a use of a convergence token other + than a use by + :ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>` + must also contain the definition of the token. + +2. Every cycle in the CFG that contains two or more static uses of a + convergence token by + :ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>` + must also contain the definition of the token. + +3. The *convergence region* corresponding to a convergence token T is the + region in which T is live (i.e., the subset of the dominance region of the + definition of T from which a use of T can be reached without leaving the + dominance region). + +4. If a convergence region contains a use of a convergence token, then it must + also contain its definition. + +The freedom of targets to define the target-specific rule about uncontrolled +convergent operations is limited by the following rule: A transform is correct +for uncontrolled convergent operations if it does not make such operations +control-dependent on additional values. + + +Memory Model Non-Interaction +============================ + +The fact that an operation is convergent has no effect on how it is treated for +memory model purposes. In particular, an operation that is ``convergent`` and +``readnone`` does not introduce additional ordering constraints as far as the +memory model is concerned. There is no implied barrier, neither in the memory +barrier sense nor in the control barrier sense of synchronizing the execution +of threads. + +Threads that execute the same dynamic instance do not necessarily do so at the +same time. + + +Other Interactions +================== + +``convergent`` vs. ``speculatable``. A function can be both ``convergent`` and +``speculatable``, indicating that the function does not have undefined +behavior and has no effects besides calculating its result, but is still +affected by the set of threads executing this function. 
This typically +prevents speculation of calls to the function unless the constraint imposed +by ``convergent`` is further relaxed by some other means. + + +Examples for the Correctness of Program Transforms +================================================== + +(This section is informative.) + +As implied by the rules in the previous sections, program transforms are correct +with respect to convergent operations if they preserve or refine their +semantics. This means that the set of communicating threads in the transformed +program must have been possible in the original program. + +Program transforms with a single-threaded focus are generally conservatively +correct if they do not sink or hoist convergent operations across a branch. +This applies even to program transforms that change the control flow graph. + +For example, unrolling a loop that does not contain convergent operations +cannot break any of the guarantees required for convergent operations outside +of the loop. + +An arbitrary loop that contains convergent operations *can* be unrolled if +all convergent operations refer back to an anchor inside the loop. +For example (in pseudo-code): + +.. code-block:: llvm + + while (counter > 0) { + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + counter--; + } + +This can be unrolled to: + +.. 
code-block:: llvm + + while (counter >= 2) { + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + counter -= 2; + } + while (counter > 0) { + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + counter--; + } + +This is likely to change the behavior of the convergent operation if there +are threads whose initial counter value is not a multiple of 2. That is allowed +because the anchor intrinsic has implementation-defined convergence behavior +and the loop unrolling transform is considered to be part of the +implementation. + +If the loop contains uncontrolled convergent operations, this unrolling is +forbidden. + +Unrolling a loop with convergent operations that refer to tokens produced +outside the loop is also forbidden in the general case. Consider: + +.. code-block:: llvm + + %outer = call tok @llvm.experimental.convergence.anchor() + while (counter > 0) { + %inner = call tok @llvm.experimental.convergence.loop() [ "convergencectrl"(token %outer) ] + call void @convergent.operation() [ "convergencectrl"(token %inner) ] + counter--; + } + +Threads whose counter is not a multiple of the unroll count will not communicate +with the expected set of threads during the remainder loop. On the other hand, +if the loop counter is known to be a multiple, then unrolling is allowed, +though care must be taken to correct the use of the loop intrinsic. +For example, unrolling by 2: + +.. 
code-block:: llvm + + %outer = call tok @llvm.experimental.convergence.anchor() + while (counter > 0) { + %inner = call tok @llvm.experimental.convergence.loop() [ "convergencectrl"(token %outer) ] + call void @convergent.operation() [ "convergencectrl"(token %inner) ] + call void @convergent.operation() [ "convergencectrl"(token %inner) ] + counter -= 2; + } + +Loops in which a +:ref:`llvm.experimental.convergence.loop <llvm.experimental.convergence.loop>` +intrinsic outside of the loop header uses a token defined outside of the loop +can generally not be unrolled. + +Even though +:ref:`llvm.experimental.convergence.anchor <llvm.experimental.convergence.anchor>` +is marked as ``convergent``, it can be sunk in some cases. For example, in +pseudo-code: + +.. code-block:: llvm + + %tok = call tok @llvm.experimental.convergence.anchor() + if (condition) { + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + } + +Assuming that ``%tok`` is only used inside the conditional block, the anchor can +be sunk. Again, the rationale is that the anchor has implementation-defined +behavior, and the sinking is part of the implementation. + +Anchors can be hoisted in acyclic control flow. For example: + +.. code-block:: llvm + + if (condition) { + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + } else { + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + } + +The anchors can be hoisted, resulting in: + +.. code-block:: llvm + + %tok = call tok @llvm.experimental.convergence.anchor() + if (condition) { + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + } else { + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + } + +The behavior is unchanged, since each of the static convergent operations only +ever communicates with threads that have the same ``condition`` value. +By contrast, hoisting the convergent operations themselves is forbidden. 
+ +Hoisting and sinking anchors out of and into loops is forbidden. For example: + +.. code-block:: llvm + + for (;;) { + %tok = call tok @llvm.experimental.convergence.anchor() + call void @convergent.operation() [ "convergencectrl"(token %tok) ] + } + +Hoisting the anchor would make the program invalid according to the static +validity rules. Conversely: + +.. code-block:: llvm + + %outer = call tok @llvm.experimental.convergence.anchor() + while (counter > 0) { + %inner = call tok @llvm.experimental.convergence.loop() [ "convergencectrl"(token %outer) ] + call void @convergent.operation() [ "convergencectrl"(token %inner) ] + counter--; + } + +The program would stay valid if the anchor was sunk into the loop, but its +behavior could end up being different. If the anchor is inside the loop, then +the grouping of threads during the execution of the anchor -- i.e., the sets of +threads executing the same dynamic instance of it -- can change in an arbitrary, +implementation-defined way in each iteration. + +Convergent operations can be sunk together with their anchor. Again in +pseudo-code: + +.. code-block:: llvm + + %tok = call tok @llvm.experimental.convergence.anchor() + %a = call T @pure.convergent.operation(...) [ "convergencectrl"(token %tok) ] + %b = call T @pure.convergent.operation(...) [ "convergencectrl"(token %tok) ] + if (condition) { + use(%a, %b) + } + +Assuming that ``%tok``, ``%a``, and ``%b`` are only used inside the conditional +block, all can be sunk together: + +.. code-block:: llvm + + if (condition) { + %tok = call tok @llvm.experimental.convergence.anchor() + %a = call T @pure.convergent.operation(...) [ "convergencectrl"(token %tok) ] + %b = call T @pure.convergent.operation(...) [ "convergencectrl"(token %tok) ] + use(%a, %b) + } + +The rationale is that the anchor intrinsic has implementation-defined behavior, +and the sinking transform is considered to be part of the implementation. 
+ +However, sinking *only* the convergent operation producing ``%b`` would be +incorrect. That would allow the (remainder of the) implementation to include +threads for which ``condition`` is false to participate in the same dynamic +instance of the anchor and therefore in the calculation of ``%a``, and so the +set of threads communicating for the calculations of ``%a`` and ``%b`` could be +different, which the original program doesn't allow. + +Note that the entry intrinsic behaves differently. Sinking the convergent +operations is forbidden in the following snippet: + +.. code-block:: llvm + + %tok = call tok @llvm.experimental.convergence.entry() + %a = call T @pure.convergent.operation(...) [ "convergencectrl"(token %tok) ] + %b = call T @pure.convergent.operation(...) [ "convergencectrl"(token %tok) ] + if (condition) { + use(%a, %b) + } + diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -1471,24 +1471,22 @@ function call are also considered to be cold; and, thus, given low weight. ``convergent`` - In some parallel execution models, there exist operations that cannot be - made control-dependent on any additional values. We call such operations - ``convergent``, and mark them with this attribute. - - The ``convergent`` attribute may appear on functions or call/invoke - instructions. When it appears on a function, it indicates that calls to - this function should not be made control-dependent on additional values. - For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so - calls to this intrinsic cannot be made control-dependent on additional - values. + Some parallel execution environments execute threads in groups that allow + efficient communication within the group, among a subset of threads that + is implicitly defined by control flow. We call such operations + ``convergent`` and mark them with this attribute. 
+ + The ``convergent`` attribute may appear on call/invoke instructions to + indicate that the instruction is a convergent operation, or on functions + to indicate that calls to this function are convergent operations. - When it appears on a call/invoke, the ``convergent`` attribute indicates - that we should treat the call as though we're calling a convergent - function. This is particularly useful on indirect calls; without this we - may treat such calls as though the target is non-convergent. + The presence of this attribute indicates that certain program transforms + involving control flow are forbidden. For a detailed description, see the + `Convergent Operations `_ document. The optimizer may remove the ``convergent`` attribute on functions when it - can prove that the function does not execute any convergent operations. + can prove that the function does not execute uncontrolled convergent + operations or ``llvm.experimental.convergence.entry``. Similarly, the optimizer may remove ``convergent`` on calls/invokes when it can prove that the call/invoke cannot call a convergent function. ``inaccessiblememonly`` @@ -1603,10 +1601,14 @@ (synchronize) with another thread through memory or other well-defined means. Synchronization is considered possible in the presence of `atomic` accesses that enforce an order, thus not "unordered" and "monotonic", `volatile` accesses, - as well as `convergent` function calls. Note that through `convergent` function calls - non-memory communication, e.g., cross-lane operations, are possible and are also - considered synchronization. However `convergent` does not contradict `nosync`. - If an annotated function does ever synchronize with another thread, + as well as `convergent` function calls. + + Note that `convergent` operations can involve communication that is + considered to be not through memory and does not necessarily imply an + ordering between threads for the purposes of the memory model. 
Therefore, + an operation can be both `convergent` and `nosync`. + + If a `nosync` function does ever synchronize with another thread, the behavior is undefined. ``nounwind`` This function attribute indicates that the function never raises an @@ -2283,6 +2285,14 @@ :ref:`stackmap entry `. See the intrinsic description for further details. +Convergence Control Operand Bundles +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A "convergencectrl" operand bundle is only valid on a ``convergent`` operation. +When present, the operand bundle must contain exactly one value of token type. +See the `Convergent Operations `_ document for +details. + .. _moduleasm: Module-Level Inline Assembly @@ -16089,6 +16099,14 @@ %a = load i16, i16* @x, align 2 %res = call float @llvm.convert.from.fp16(i16 %a) +Convergence Intrinsics +---------------------- + +The LLVM convergence intrinsics for controlling the semantics of ``convergent`` +operations, which all start with the ``llvm.experimental.convergence.`` +prefix, are described in the `Convergent Operations `_ +document. + .. _dbg_intrinsics: Debugger Intrinsics diff --git a/llvm/docs/Reference.rst b/llvm/docs/Reference.rst --- a/llvm/docs/Reference.rst +++ b/llvm/docs/Reference.rst @@ -15,6 +15,7 @@ BranchWeightMetadata Bugpoint CommandGuide/index + ConvergentOperations Coroutines DependenceGraphs/index ExceptionHandling @@ -130,6 +131,9 @@ :doc:`GlobalISel/index` This describes the prototype instruction selection replacement, GlobalISel. +:doc:`ConvergentOperations` + Description of ``convergent`` operation semantics and related intrinsics. 
+ ===================== Testing and Debugging ===================== diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td --- a/llvm/include/llvm/IR/Intrinsics.td +++ b/llvm/include/llvm/IR/Intrinsics.td @@ -1556,7 +1556,14 @@ //===---------- Intrinsics to query properties of scalable vectors --------===// def int_vscale : Intrinsic<[llvm_anyint_ty], [], [IntrNoMem]>; -//===----------------------------------------------------------------------===// +//===------- Convergence Intrinsics ---------------------------------------===// + +def int_experimental_convergence_entry + : Intrinsic<[llvm_token_ty], [], [IntrNoMem, IntrConvergent]>; +def int_experimental_convergence_anchor + : Intrinsic<[llvm_token_ty], [], [IntrNoMem, IntrConvergent]>; +def int_experimental_convergence_loop + : Intrinsic<[llvm_token_ty], [], [IntrNoMem, IntrConvergent]>; //===----------------------------------------------------------------------===// // Target-specific intrinsics diff --git a/llvm/include/llvm/IR/LLVMContext.h b/llvm/include/llvm/IR/LLVMContext.h --- a/llvm/include/llvm/IR/LLVMContext.h +++ b/llvm/include/llvm/IR/LLVMContext.h @@ -87,12 +87,13 @@ /// operand bundle tags without comparing strings. Keep this in sync with /// LLVMContext::LLVMContext(). enum : unsigned { - OB_deopt = 0, // "deopt" - OB_funclet = 1, // "funclet" - OB_gc_transition = 2, // "gc-transition" - OB_cfguardtarget = 3, // "cfguardtarget" - OB_preallocated = 4, // "preallocated" - OB_gc_live = 5, // "gc-live" + OB_deopt = 0, // "deopt" + OB_funclet = 1, // "funclet" + OB_gc_transition = 2, // "gc-transition" + OB_cfguardtarget = 3, // "cfguardtarget" + OB_preallocated = 4, // "preallocated" + OB_gc_live = 5, // "gc-live" + OB_convergencectrl = 6, // "convergencectrl" }; /// getMDKindID - Return a unique non-zero ID for the specified metadata kind. 
diff --git a/llvm/lib/IR/LLVMContext.cpp b/llvm/lib/IR/LLVMContext.cpp --- a/llvm/lib/IR/LLVMContext.cpp +++ b/llvm/lib/IR/LLVMContext.cpp @@ -78,6 +78,11 @@ "gc-transition operand bundle id drifted!"); (void)GCLiveEntry; + auto *ConvergenceCtrlEntry = pImpl->getOrInsertBundleTag("convergencectrl"); + assert(ConvergenceCtrlEntry->second == LLVMContext::OB_convergencectrl && + "convergencectrl operand bundle id drifted!"); + (void)ConvergenceCtrlEntry; + SyncScope::ID SingleThreadSSID = pImpl->getOrInsertSyncScopeID("singlethread"); assert(SingleThreadSSID == SyncScope::SingleThread && diff --git a/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll b/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll --- a/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll +++ b/llvm/test/Bitcode/operand-bundles-bc-analyzer.ll @@ -9,6 +9,7 @@ ; CHECK-NEXT: