The vectoriser sometimes generates predicated vector loops using
the llvm.get.active.lane.mask intrinsic, so it's important that we
are able to calculate a valid cost for the call instruction. When
SVE is enabled we are able to use a single whilelo instruction
for some vector types - in such cases I've marked the cost as 1.
For all other cases I've set the cost to some reasonably high
value.
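
For readers less familiar with the intrinsic, here is a minimal reference model of what it computes, written as plain C++ rather than anything taken from this patch. Lane i of the mask is active when base + i is less than the trip count n, which lets the final vector iteration be predicated instead of needing a scalar tail loop. The names activeLaneMask and predicatedSum are hypothetical, and the fixed lane count is a simplification (SVE vectors are scalable).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference model of llvm.get.active.lane.mask for a fixed lane count:
// lane i is active iff (base + i) < n, evaluated without wrap-around.
// (Hypothetical helper for illustration, not an LLVM API.)
std::vector<bool> activeLaneMask(uint64_t base, uint64_t n, unsigned lanes) {
  std::vector<bool> mask(lanes, false);
  for (unsigned i = 0; i < lanes; ++i) {
    // Equivalent to base + i < n, rewritten so the unsigned add cannot wrap.
    mask[i] = base < n && i < n - base;
  }
  return mask;
}

// Sums data in chunks of `lanes` elements, the way a predicated vector loop
// would: lanes past the end are masked off rather than handled by a tail loop.
uint64_t predicatedSum(const std::vector<uint64_t> &data, unsigned lanes) {
  assert(lanes > 0 && "need at least one lane");
  const uint64_t n = data.size();
  uint64_t sum = 0;
  for (uint64_t base = 0; base < n; base += lanes) {
    const std::vector<bool> mask = activeLaneMask(base, n, lanes);
    for (unsigned i = 0; i < lanes; ++i)
      if (mask[i])
        sum += data[base + i];
  }
  return sum;
}
```

With 10 elements and 4 lanes, the third iteration starts at base 8 and only the first two lanes are active. On SVE that mask is what a single whilelo produces for the supported types.
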
Tests added here:

  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Analysis/CostModel/ARM/active_lane_mask.ll
  Analysis/CostModel/RISCV/active_lane_mask.ll
Can the default case (i.e. when shouldExpandGetActiveLaneMask returns true) be added to BasicTTIImpl?
Then the target's TTI only needs to handle the case where the intrinsic is not expanded.
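
To make the suggestion concrete, here is a rough sketch of the split using toy classes. None of this is the real LLVM interface: MaskQuery, BaseCostModel, SveLikeCostModel, the simplified shouldExpandGetActiveLaneMask signature, the set of natively supported lane counts and the placeholder expanded cost are all assumptions for illustration. The only number taken from the summary above is the cost of 1 for the single-whilelo case.

```cpp
#include <cassert>

// Toy stand-in for the mask type being queried (hypothetical, not an LLVM type).
struct MaskQuery {
  unsigned lanes; // minimum number of mask lanes
};

// Plays the role of the shared base implementation: it owns the cost of the
// generic expansion so that targets do not have to repeat it.
class BaseCostModel {
public:
  virtual ~BaseCostModel() = default;

  // Simplified hook: true means the intrinsic is expanded generically.
  virtual bool shouldExpandGetActiveLaneMask(const MaskQuery &) const {
    return true;
  }

  int getActiveLaneMaskCost(const MaskQuery &Q) const {
    if (shouldExpandGetActiveLaneMask(Q))
      return expandedCost(Q);
    return nativeCost(Q);
  }

protected:
  // Placeholder: the expansion is modelled as one vector add plus one vector
  // compare at unit cost each. The real figures would come from the target.
  virtual int expandedCost(const MaskQuery &) const { return 2; }

  // Only reached when a target reports a native lowering.
  virtual int nativeCost(const MaskQuery &Q) const {
    assert(false && "target claimed a native lowering but gave no cost");
    return expandedCost(Q);
  }
};

// Plays the role of the target's TTI: it only describes the unexpanded case.
class SveLikeCostModel : public BaseCostModel {
public:
  bool shouldExpandGetActiveLaneMask(const MaskQuery &Q) const override {
    // Assumed set of lane counts that map onto a single whilelo.
    const bool Native =
        Q.lanes == 2 || Q.lanes == 4 || Q.lanes == 8 || Q.lanes == 16;
    return !Native;
  }

protected:
  int nativeCost(const MaskQuery &) const override { return 1; }
};
```

In this shape, a target with no native lowering inherits the expanded cost without writing any code, and the AArch64-like model only overrides the hook and the native cost.
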