For loops with a small trip count, the cost of out-of-loop reductions
becomes significant. The loop vectorizer does not currently account for
the cost of out-of-loop reductions (in-loop reductions are already
handled by the cost model).
This patch extends the logic that the cost model already uses for runtime checks, which computes the minimum trip count at which the runtime checks are profitable. We reuse the same idea for out-of-loop reductions.
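As a rough illustration of the break-even idea, the sketch below computes the smallest trip count at which a one-time cost (such as an out-of-loop reduction) is paid back by the per-iteration savings of the vector loop. `minProfitableTripCount` is a hypothetical helper written for this description, not an LLVM API, and the patch may compute the bound differently. The per-iteration costs used in the usage note are made up. Only the one-time cost of 280 (two reductions at 140 each) comes from the numbers discussed in this description.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of LLVM): smallest trip count TC for which
//   TC * ScalarCost >= (TC / VF) * VecCost + OneTimeCost
// which rearranges to
//   TC >= OneTimeCost * VF / (ScalarCost * VF - VecCost).
// ScalarCost:  cost of one scalar iteration.
// VecCost:     cost of one vector iteration (covers VF scalar iterations).
// OneTimeCost: cost paid once outside the loop, e.g. out-of-loop reductions.
uint64_t minProfitableTripCount(uint64_t ScalarCost, uint64_t VecCost,
                                uint64_t VF, uint64_t OneTimeCost) {
  assert(VF > 0 && "vectorization factor must be positive");

  // If the vector body is not cheaper than VF scalar iterations, the
  // one-time cost is never amortized.
  if (ScalarCost * VF <= VecCost)
    return std::numeric_limits<uint64_t>::max();

  uint64_t SavingPerVectorIter = ScalarCost * VF - VecCost;
  uint64_t Numerator = OneTimeCost * VF;

  // Round up so that the returned trip count is at least break-even.
  return (Numerator + SavingPerVectorIter - 1) / SavingPerVectorIter;
}
```

For example, with an assumed scalar cost of 6, an assumed vector cost of 4, and VF = 4, `minProfitableTripCount(6, 4, 4, 280)` evaluates to 56. These per-iteration costs were picked only for illustration and are not taken from the patch.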
The minimum trip count of 56 comes about because we compute the reduction cost as 140, and there are two such reductions, for a total of 280.
This is a fairly high cost, and it is computed in https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/CodeGen/BasicTTIImpl.h#L2372.
This is correct for a triple with mattr such as:
However, I'm not sure if this is a correct cost for other triples.