Currently the loop vectorizer has a single parameter in the CostModel that controls FoldTailByMasking. It is set fairly early and cannot be changed later, meaning we have to choose between tail folding and non-tail folding before much cost modelling has been done. This patch aims to change that so that there is no longer a single parameter, eventually moving it into the VPlan, so that we can have plans both with and without FoldTailByMasking that can be costed against one another, with the best one picked for vectorization.
A lot of the changes are fairly mechanical and are intended to be non-disruptive to keep the patch simpler, but there are still a fair number of them. The important parts are:
- FoldTailByMasking is removed from CostModel. It now has a MayFoldTailByMasking variable to hold whether FoldTailByMasking VPlans can be created.
- A number of maps in the cost model like InstsToScalarize/Uniforms/Scalars/ForcedScalars were made conditional on the pair of (FoldTailByMasking, VF), so that they still contain the information required in both FoldTailByMasking=true and FoldTailByMasking=false cases. The semi-random name VectorizationScheme was used to describe the pair of (FoldTailByMasking, VF). Alternative suggestions welcome.
- We then create VPlans with both FoldTailByMasking=false and FoldTailByMasking=true if we MayFoldTailByMasking. Each VPlan from then on stores whether it uses FoldTailByMasking. For VF=1 only non-predicated plans are created.
- Tail-folded and non-tail-folded VPlans then need to be costed against one another. This patch makes FoldTailByMasking win if the costs are equal, which is more consistent with how vectorization behaved before this patch.
- Epilogue loops for the moment always use FoldTailByMasking=false. This can hopefully be changed in the near future to allow predicated epilogues for unpredicated loop bodies.
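As a rough illustration of the (FoldTailByMasking, VF) keying described in the list above, here is a minimal standalone sketch. It is not the code from the patch: `ToyCostModel`, `addUniform`, `isUniform` and `candidateSchemes` are hypothetical names, VF is a plain `unsigned` rather than LLVM's `ElementCount`, and instructions are represented by strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (FoldTailByMasking, VF) pair.
// Assumption: VF is a plain unsigned here, not an ElementCount.
using VectorizationScheme = std::pair<bool, unsigned>;

// Toy cost model whose per-VF table is keyed on the scheme instead of VF alone,
// so folded and non-folded decisions for the same VF can coexist.
struct ToyCostModel {
  bool MayFoldTailByMasking = false;
  std::map<VectorizationScheme, std::set<std::string>> Uniforms;

  void addUniform(bool FoldTail, unsigned VF, const std::string &Inst) {
    // Folded-tail entries only make sense when folding is permitted at all.
    assert((!FoldTail || MayFoldTailByMasking) && "tail folding not allowed");
    VectorizationScheme Key(FoldTail, VF);
    Uniforms[Key].insert(Inst);
  }

  bool isUniform(bool FoldTail, unsigned VF, const std::string &Inst) const {
    VectorizationScheme Key(FoldTail, VF);
    auto It = Uniforms.find(Key);
    return It != Uniforms.end() && It->second.count(Inst) != 0;
  }
};

// Enumerate the schemes to build plans for: every VF gets a non-folded scheme,
// and a folded one is added only if folding is allowed and VF is not 1.
std::vector<VectorizationScheme>
candidateSchemes(bool MayFoldTailByMasking, const std::vector<unsigned> &VFs) {
  std::vector<VectorizationScheme> Schemes;
  for (unsigned VF : VFs) {
    Schemes.emplace_back(false, VF);
    if (MayFoldTailByMasking && VF != 1)
      Schemes.emplace_back(true, VF);
  }
  return Schemes;
}
```

With folding allowed and VFs {1, 4}, this yields three schemes: (false, 1), (false, 4) and (true, 4).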
Overall this allows us to model and cost tail-folded and non-tail-folded VPlans against one another, and should in the future allow us to generate predicated epilogues for unpredicated vector loops.
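The tie-breaking rule between tail-folded and non-tail-folded plans can be sketched as follows. This is a simplified model under stated assumptions, not the comparison used in the patch: `ToyPlan`, `isMoreProfitable` and `pickBest` are hypothetical names, and each plan's cost is reduced to a single already-normalised integer, whereas the real cost model compares costs in a VF-aware way.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified plan record. Assumption: Cost is already
// normalised so that plans with different VFs are directly comparable.
struct ToyPlan {
  bool FoldTailByMasking;
  unsigned VF;
  unsigned Cost;
};

// Returns true if A should be preferred over B.
// A strictly lower cost wins; on equal cost the tail-folded plan wins.
bool isMoreProfitable(const ToyPlan &A, const ToyPlan &B) {
  if (A.Cost != B.Cost)
    return A.Cost < B.Cost;
  return A.FoldTailByMasking && !B.FoldTailByMasking;
}

// Linear scan over all candidate plans, folded and non-folded alike.
const ToyPlan &pickBest(const std::vector<ToyPlan> &Plans) {
  assert(!Plans.empty() && "need at least one candidate plan");
  const ToyPlan *Best = &Plans.front();
  for (const ToyPlan &P : Plans)
    if (isMoreProfitable(P, *Best))
      Best = &P;
  return *Best;
}
```

Keeping the tie-break inside the comparison means the result does not depend on the order in which folded and non-folded plans were created.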
This comment needs to be moved to line no. 224.