This allows the scalar epilogue loop (the tail) to be folded into the main
vectorised loop body when the loop is annotated with the "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
so that only the lanes covering the remainder iterations are enabled in the
final vector iteration.
This depends on D64744, which introduces the llvm.loop.vectorize.predicate.enable
pragma and metadata node, and on D64916, which refactors tail folding into a
more general concept.
I think the nuance here is that this is really ScalarEpilogueNotNeededPredicatePragma. In other words, if a scalar epilogue is needed for some other reason (while it is still okay to skip executing it when the vector code runs), the scalar epilogue can still be emitted and used. Runtime vectorization legality checks of all kinds fit that profile. We shouldn't overload the "predicated vector code" pragma with a "don't emit scalar epilogue" meaning.
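
A rough sketch of the separation being suggested, under the assumption that "is a scalar loop emitted" and "do remainder iterations run in it" are tracked as two independent decisions. All names here (`ScalarEpilogueStatus`, `LoopPlan`, `planLoop`) are hypothetical and are not taken from the patch or from LoopVectorize:

```cpp
// Hypothetical names throughout; this is not the LoopVectorize API.
enum class ScalarEpilogueStatus {
  Allowed,                  // default: remainder iterations run in a scalar tail
  NotNeededPredicatePragma  // pragma requested predication; the tail is folded
};

struct LoopPlan {
  bool emitScalarLoop;   // is a scalar copy of the loop emitted at all?
  bool runTailInScalar;  // do remainder iterations execute in that scalar loop?
};

LoopPlan planLoop(ScalarEpilogueStatus status, bool needsRuntimeChecks) {
  if (status == ScalarEpilogueStatus::Allowed)
    return {true, true};
  // Predicated vector code handles the tail itself, so the scalar loop is
  // skipped whenever the vector code executes. It may still be emitted as
  // the fallback path for failed runtime legality checks.
  return {needsRuntimeChecks, false};
}
```

With this split, the pragma only says that the tail does not need scalar execution; whether a scalar loop exists is decided separately.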