
[LV] Don't apply "TinyTripCountVectorThreshold" for loops with compile time known TC.
Needs Review, Public

Authored by ebrevnov on Dec 14 2021, 1:37 AM.

Details

Summary

When the trip count is not known at compile time, there is additional overhead to make sure it is safe to execute the next VF iterations. Thus, if the vector loop is skipped at runtime, such vectorization is unprofitable. When the trip count is known to be small enough, there is a high chance of ending up in this situation. Currently, LV is not able to cost-model this case properly since it doesn't account for the cost of the epilogue loop. Instead, a "short trip count" heuristic is employed.
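To make the overhead concrete, here is a minimal scalar sketch of the shape a vectorized loop takes when the trip count is only known at runtime. It is an illustration, not LV's actual output: the function name `sumWithRuntimeTripCount` and the constant `SketchVF` are hypothetical, and the unrolled body only stands in for one vector iteration.

```cpp
#include <cstddef>

// Hypothetical vectorization factor used only for this sketch.
constexpr std::size_t SketchVF = 4;

// Scalar emulation of a vectorized reduction with a runtime trip count N.
int sumWithRuntimeTripCount(const int *A, std::size_t N) {
  int Sum = 0;
  std::size_t I = 0;

  // Runtime check: is there at least one full group of SketchVF iterations?
  // If not, the vector loop is skipped and only the check's cost was paid.
  if (N >= SketchVF) {
    std::size_t VecEnd = N - N % SketchVF;
    for (; I < VecEnd; I += SketchVF) {
      // Stands in for one vector iteration covering SketchVF elements.
      Sum += A[I] + A[I + 1] + A[I + 2] + A[I + 3];
    }
  }

  // Epilogue loop: handles the remaining N % SketchVF iterations, or all
  // of them when the vector loop was skipped.
  for (; I < N; ++I)
    Sum += A[I];

  return Sum;
}
```

With `N` smaller than `SketchVF`, all work is done by the epilogue loop, which is the unprofitable case the heuristic tries to avoid.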

While the "short trip count" heuristic makes sense in general (at least in the current state), it can be relaxed slightly when the trip count is a compile-time constant. In this case it is known at compile time how many vector iterations will be executed, and there is no implied overhead from trip count checks. Cost modeling is simple as well: if one vector iteration costs less than one scalar iteration multiplied by VF, then vectorization is profitable.
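The comparison described above can be sketched as follows. This is a simplified illustration under the assumption that costs are plain integers. The names `ConstTripCountPlan`, `splitTripCount` and `isVectorBodyProfitable` are hypothetical and do not correspond to functions in LoopVectorize.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of how a constant trip count splits for a given VF.
struct ConstTripCountPlan {
  uint64_t VectorIterations; // Number of full vector iterations: TC / VF.
  uint64_t ScalarRemainder;  // Iterations left for the scalar tail: TC % VF.
};

// With a compile-time constant TC, both numbers are known statically,
// so no runtime trip count check is required to choose between them.
ConstTripCountPlan splitTripCount(uint64_t TC, uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return {TC / VF, TC % VF};
}

// Profitability rule from the summary: one vector iteration must cost
// less than VF scalar iterations.
bool isVectorBodyProfitable(uint64_t ScalarIterCost, uint64_t VectorIterCost,
                            uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return VectorIterCost < ScalarIterCost * VF;
}
```

For example, with TC = 10 and VF = 4 the split is two vector iterations plus two scalar iterations, and a vector iteration costing 10 against a scalar iteration costing 4 passes the check since 10 < 16.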

Note: One may say that the "short trip count" heuristic is needed to reduce code size, on the assumption that short trip count loops can't be performance critical. That assumption turns out to be false in many cases (for example, nested loops) and should not be the driving factor.

Diff Detail