This allows the scalar epilogue loop (the tail) to be folded into the main
vectorised loop body when the loop is annotated with the "vector predicate"
metadata hint. To fold the tail, the instructions in the loop body need to be
predicated (masked), so that individual lanes can be enabled or disabled for
the remainder iterations.
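
As a rough illustration of the idea (not the code in this patch), the sketch
below emulates both loop shapes in plain scalar C++. The vectorisation factor
VF = 4 and the function names add_with_epilogue and add_tail_folded are
hypothetical and chosen only for this example; a real target would use
predicated vector instructions rather than a per-lane branch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vectorisation factor, chosen only for illustration.
constexpr std::size_t VF = 4;

// Conventional shape: a "vector" body that handles full groups of VF
// elements, followed by a scalar epilogue for the remaining iterations.
std::vector<int> add_with_epilogue(const std::vector<int>& a,
                                   const std::vector<int>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<int> out(n);
    std::size_t i = 0;
    for (; i + VF <= n; i += VF)                    // main vector body
        for (std::size_t lane = 0; lane < VF; ++lane)
            out[i + lane] = a[i + lane] + b[i + lane];
    for (; i < n; ++i)                              // scalar epilogue (tail)
        out[i] = a[i] + b[i];
    return out;
}

// Tail-folded shape: a single loop in which every lane is guarded by a
// mask, so the final, partial group runs in the same body with the
// out-of-range lanes disabled.
std::vector<int> add_tail_folded(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<int> out(n);
    for (std::size_t i = 0; i < n; i += VF) {
        std::array<bool, VF> mask{};
        for (std::size_t lane = 0; lane < VF; ++lane)
            mask[lane] = (i + lane) < n;            // lane active iff in range
        for (std::size_t lane = 0; lane < VF; ++lane)
            if (mask[lane])                         // predicated operation
                out[i + lane] = a[i + lane] + b[i + lane];
    }
    return out;
}
```

Both functions should produce the same result for any length, including
lengths that are not a multiple of VF; only the tail-folded version does so
without a separate epilogue loop.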
This depends on D64744, which introduces the llvm.loop.vectorize.predicate.enable
pragma and metadata node, and on D64916, which is a refactoring that makes tail
folding a more general concept.
nit: