Index: llvm/docs/LangRef.rst
===================================================================
--- llvm/docs/LangRef.rst
+++ llvm/docs/LangRef.rst
@@ -16366,6 +16366,79 @@
       %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
 
+.. _int_get_active_lane_mask:
+
+'``llvm.get.active.lane.mask.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+This is an overloaded intrinsic.
+
+::
+
+      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
+      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
+      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
+      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
+
+
+Overview:
+"""""""""
+
+Create a mask representing active and inactive vector lanes.
+
+
+Arguments:
+""""""""""
+
+Both operands have the same scalar integer type. The first operand is the first
+element of the Vector Induction Variable (VIV), denoted by ``%base``. The
+second operand is the scalar loop Back-edge Taken Count (BTC), denoted by
+``%n``. The result is a vector with the same number of elements as the VIV, but
+with the i1 element value type.
+
+The arguments are scalar types to accommodate scalable vector types, for which
+it is not known at compile time what the type of a step vector that enumerates
+the lanes without overflow would need to be.
+
+
+Semantics:
+""""""""""
+
+The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
+to:
+
+::
+
+      %m[i] = icmp ule (%base + i), %n
+
+where ``%m`` is the mask of active/inactive lanes with its elements indexed by
+``i``, ``%base`` is the first element of the vector induction variable (VIV),
+and ``%n`` is the back-edge taken count. Lanes in which ``%base + i`` overflows
+are set to false. This is equivalent to:
+
+::
+
+      %m = @llvm.get.active.lane.mask(%base, %n)
+
+Thus, these intrinsics perform an element-wise unsigned less-than-or-equal
+comparison of the VIV with the BTC, producing a mask of true/false values that
+represents the active/inactive vector lanes. This mask can, for example, be
+used by the masked load/store instructions. These intrinsics also provide a
+hint to the backend: for a vector loop, the back-edge taken count of the
+original scalar loop is made explicit as the second argument.
+
+
+Examples:
+"""""""""
+
+.. code-block:: llvm
+
+      %get.active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
+      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %get.active.lane.mask, <4 x i32> undef)
+
+
 .. _int_mload_mstore:
 
 Masked Vector Load and Store Intrinsics
Index: llvm/include/llvm/IR/Intrinsics.td
===================================================================
--- llvm/include/llvm/IR/Intrinsics.td
+++ llvm/include/llvm/IR/Intrinsics.td
@@ -1294,6 +1294,10 @@
 }
 
+def int_get_active_lane_mask:
+  Intrinsic<[llvm_anyvector_ty],
+            [llvm_anyint_ty, LLVMMatchType<1>],
+            [IntrNoMem, IntrNoSync, IntrWillReturn]>;
+
 //===-------------------------- Masked Intrinsics -------------------------===//
 //
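As a cross-check of the semantics documented above, the lane-mask computation can be emulated in plain Python. This is a sketch and not part of the patch; the function name, the ``num_elems`` parameter, and the explicit ``bits`` width are illustrative only (for a scalable vector, ``num_elems`` would depend on ``vscale`` at run time):

```python
def get_active_lane_mask(base, n, num_elems, bits=64):
    """Emulate @llvm.get.active.lane.mask: lane i is active iff
    (base + i) <= n under an unsigned comparison (icmp ule), and
    is inactive when base + i overflows the element type."""
    unsigned_max = (1 << bits) - 1
    mask = []
    for i in range(num_elems):
        idx = base + i
        # A lane whose index computation overflows is set to false.
        mask.append(idx <= n if idx <= unsigned_max else False)
    return mask

# With %base = 0 and back-edge taken count %n = 2, a 4-lane mask
# enables lanes 0..2 and disables lane 3:
print(get_active_lane_mask(0, 2, 4))  # [True, True, True, False]
```

The overflow clause matters near the top of the index range: with ``base = 2**64 - 2`` and ``n = 2**64 - 1`` the last two of four lanes overflow an i64 index and therefore come back false, rather than wrapping around and comparing as active.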