Index: llvm/docs/LangRef.rst
===================================================================
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -16366,6 +16366,79 @@
       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
 
+.. _int_get_active_lane_mask:
+
+'``llvm.get.active.lane.mask.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
+      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
+      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
+      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
+
+
+Overview:
+"""""""""
+
+Create a mask representing active and inactive vector lanes.
+
+
+Arguments:
+""""""""""
+
+Both operands have the same scalar integer type. The result is a vector with
+the i1 element type. The arguments are scalars rather than vectors to
+accommodate scalable vector types, for which it is not known at compile time
+what type a step vector enumerating the lanes needs to be in order not to
+overflow.
+
+
+Semantics:
+""""""""""
+
+The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
+to:
+
+::
+
+      %m[i] = icmp ule (%base + i), %n
+
+where ``%m`` is a vector (mask) of active/inactive lanes with its elements
+indexed by ``i``, ``%base`` and ``%n`` are the two arguments to
+``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare, and ``ule``
+is the unsigned less-than-or-equal comparison operator. Overflow cannot occur
+in ``(%base + i)`` or in its comparison against ``%n``, because the
+calculation is performed over the mathematical integers and not in machine
+(modular) arithmetic. The above is equivalent to:
+
+::
+
+      %m = @llvm.get.active.lane.mask(%base, %n)
+
+This can, for example, be emitted by the loop vectorizer. In that case,
+``%base`` is the first element of the vector induction variable (VIV), and
+``%n`` is the Back-edge Taken Count (BTC). Thus, these intrinsics perform an
+element-wise less-than-or-equal comparison of VIV with BTC, producing a mask
+of true/false values representing active/inactive vector lanes, except when
+the VIV overflows, in which case the overflowing lanes are false. This mask
+can, for example, be used in the masked load/store instructions. These
+intrinsics provide a hint to the backend: for a vector loop, the back-edge
+taken count of the original scalar loop is made explicit as the second
+argument.
+
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
+      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)
+
+
 .. _int_mload_mstore:
 
 Masked Vector Load and Store Intrinsics
Index: llvm/include/llvm/IR/Intrinsics.td
===================================================================
--- llvm/include/llvm/IR/Intrinsics.td
+++ llvm/include/llvm/IR/Intrinsics.td
@@ -1294,6 +1294,10 @@
 }
 
+def int_get_active_lane_mask:
+  Intrinsic<[llvm_anyvector_ty],
+            [llvm_anyint_ty, LLVMMatchType<1>],
+            [IntrNoMem, IntrNoSync, IntrWillReturn]>;
 
 //===-------------------------- Masked Intrinsics -------------------------===//
 //
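
As an illustration of the semantics described in the patch, the following is
a minimal sketch of what the ``v4i1.i32`` overload computes, written as
straight-line IR using the standard splat idiom. It assumes ``%base + 3``
does not wrap in ``i32``; the intrinsic itself carries no such restriction,
since its comparison is defined over the mathematical integers. The value
names are illustrative only, and a real lowering is target-dependent.

.. code-block:: llvm

      ; Sketch of:  %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
      ; Splat %base into all four lanes.
      %base.ins   = insertelement <4 x i32> undef, i32 %base, i32 0
      %base.splat = shufflevector <4 x i32> %base.ins, <4 x i32> undef, <4 x i32> zeroinitializer
      ; Enumerate the lanes: %viv[i] = %base + i.
      %viv        = add <4 x i32> %base.splat, <i32 0, i32 1, i32 2, i32 3>
      ; Splat %n and compare: %m[i] = icmp ule (%base + i), %n.
      %n.ins      = insertelement <4 x i32> undef, i32 %n, i32 0
      %n.splat    = shufflevector <4 x i32> %n.ins, <4 x i32> undef, <4 x i32> zeroinitializer
      %m          = icmp ule <4 x i32> %viv, %n.splat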