Now that we store the ScalarCost in the VectorizationFactor, it is possible to use it to get a slightly more accurate cost in isMoreProfitable when comparing two vector factors. This extends the logic added in D101726 to non-tail-folded cases, using the cost VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF) to compare VFs where the TripCount is known.
This shouldn't alter very much, as loops with small trip counts are usually not vectorized, but it does seem to help in the testcase, where 4 * VF4 is chosen as profitable compared to 2 * VF8 + 4 * scalar.
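For illustration, the non-tail-folded formula above can be sketched as a standalone function. This is only a sketch and not the code in the patch: costForTripCount is a hypothetical name, and plain integers stand in for the cost and VF types used in the vectorizer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the LLVM implementation.
// Estimates the total cost of a loop with a known trip count when it is
// vectorized at factor VF without tail folding: the vector body runs
// TripCount / VF times and the scalar remainder runs TripCount % VF times.
uint64_t costForTripCount(uint64_t VecCost, uint64_t ScalarCost,
                          uint64_t TripCount, uint64_t VF) {
  assert(VF != 0 && "VF must be non-zero to avoid dividing by zero");
  return VecCost * (TripCount / VF) + ScalarCost * (TripCount % VF);
}
```

With made-up costs and a trip count of 20, costForTripCount(10, 3, 20, 8) gives 10 * 2 + 3 * 4 = 32, while costForTripCount(6, 3, 20, 4) gives 6 * 5 + 3 * 0 = 30, so the narrower VF comes out cheaper once the scalar remainder is counted. These numbers are assumptions chosen for the example and are not taken from the testcase.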
Hi @dmgreen, perhaps I've missed something here but it looks like there is a divide-by-zero in the code?
If A.Width.getFixedValue() is non-zero then we calculate (CostA * divideCeil(MaxTripCount, A.Width.getFixedValue())), and if it is zero then we do (CostA * (MaxTripCount / A.Width.getFixedValue()) + ...