This is a follow-up to e4df6a. As noted in my last comment on the review, supporting tail folding of multiple-exit loops turned out to be much more straightforward than I first thought. The hard part is forming the predicate masks, and the code already knows how to do that.
The only slightly tricky bit here is getting the conditions right, e.g. using the proper form (requiresScalarEpilogue() vs. foldTailByMasking) in each place, since this is effectively a three-way decision.
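To make the three-way decision concrete, here is a toy model, not the LoopVectorize code. All names in it (TailStrategy, classify, vectorIterations) are hypothetical. It assumes the usual reading of the two conditions: a required scalar epilogue leaves at least one iteration for the scalar loop, tail folding rounds the vector trip count up and masks off the extra lanes, and otherwise the scalar loop only picks up whatever remainder is left.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum: the three outcomes the two conditions select between.
enum class TailStrategy {
  ScalarEpilogueRequired, // at least one iteration must run in the scalar loop
  FoldTailByMasking,      // no scalar loop; the last vector iteration is masked
  ScalarEpilogueOptional  // scalar loop runs only if iterations are left over
};

// Hypothetical helper: map the two boolean conditions onto the three cases.
// Assumption: both conditions are never true at once.
TailStrategy classify(bool RequiresScalarEpilogue, bool FoldTailByMasking) {
  assert(!(RequiresScalarEpilogue && FoldTailByMasking) &&
         "the two conditions are assumed to be mutually exclusive");
  if (RequiresScalarEpilogue)
    return TailStrategy::ScalarEpilogueRequired;
  if (FoldTailByMasking)
    return TailStrategy::FoldTailByMasking;
  return TailStrategy::ScalarEpilogueOptional;
}

// Number of scalar iterations covered by the vector body for trip count TC
// and vectorization factor VF (VF > 0) under each strategy.
std::size_t vectorIterations(std::size_t TC, std::size_t VF, TailStrategy S) {
  switch (S) {
  case TailStrategy::FoldTailByMasking:
    // Round up to a multiple of VF; lanes at or beyond TC are masked off.
    return (TC + VF - 1) / VF * VF;
  case TailStrategy::ScalarEpilogueRequired:
    // Round down, but always leave at least one iteration for the scalar loop.
    return TC == 0 ? 0 : (TC - 1) / VF * VF;
  case TailStrategy::ScalarEpilogueOptional:
    // Round down; the scalar loop handles the remainder, if any.
    return TC / VF * VF;
  }
  return 0; // not reached; keeps compilers quiet about the missing return
}
```

With TC = 8 and VF = 4, the three cases differ exactly where the conditions matter: the optional case covers all 8 iterations in the vector body, the required case covers only 4 and leaves 4 for the scalar loop, and tail folding covers 8 with no masked lanes.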
This should be entirely orthogonal to D93725 and can land in either order.