This is an archive of the discontinued LLVM Phabricator instance.

[LV] Refactor Cost Model's selectVectorizationFactor(), driven by a LoopVectorizationPlanner
ClosedPublic

Authored by Ayal on Mar 6 2017, 7:46 AM.

Details

Summary

Refactoring Cost Model's selectVectorizationFactor() so that it handles only the selection of the best VF from a pre-computed range of candidate VF's, extracting early-exit criteria and the computation of a MaxVF upper-bound to other methods, all driven by a LoopVectorizationPlanner.

Follows https://reviews.llvm.org/D28975 and its tentative breakdown starting with the 1st item "refactor Cost-Model to provide MaxVF and early-exit methods.". The refactoring and Planner proposed in this patch are independent of VPlan though.

No change in output intended.

Join work with Gil.

Diff Detail

Repository
rL LLVM

Event Timeline

Ayal created this revision.Mar 6 2017, 7:46 AM
mkuper added inline comments.Mar 6 2017, 1:12 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6163 ↗(On Diff #90687)

Any reason not to use Optional<> instead?

7385 ↗(On Diff #90687)

Maybe:

DEBUG(dbgs() << "LV: Using user VF " << UserVF << ".\n");
assert(isPowerOf2_32(UserVF) && "VF needs to be a power of two");
if (UserVF != 1)
  CM.selectUserVectorizationFactor(UserVF);
return {UserVF, 0}

?

(I think isPowerOf2_32(1) is true.)

Ayal added inline comments.Mar 6 2017, 2:28 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6163 ↗(On Diff #90687)

Ahh, sure, done.

Any suggestion for a better method name, avoiding having two computeMaxVF()'s?

7385 ↗(On Diff #90687)

Right, isPowerOf2_32(1) is true.

Will actually simplify this further by dropping the "if (UserVF != 1)", since selectUserVectorizationFactor() knows how to handle a UserVF of 1 (namely, by doing nothing).

mkuper added inline comments.Mar 6 2017, 4:18 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6163 ↗(On Diff #90687)

Not really, since we're sort of mixing VF selection with legality in this function. :-\

6201 ↗(On Diff #90687)

I understand this is the way it has always been, and isn't changing in this patch, but now I realize it's fairly odd.
Why are we bailing on "TC % MaxVF != 0" instead of trying to reduce MaxVF so that it actually is 0?

Am I missing something here? If not, could you add a FIXME, and/or fix it in a follow-up commit?

Ayal/Gil,

I have no suggestions for this patch beyond what Michael has already mentioned. But I really like the way you're splitting up the larger patch into smaller pieces. This will make it much easier to review. Thanks!

Ayal added inline comments.Mar 7 2017, 2:02 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6163 ↗(On Diff #90687)

These checks could potentially move into Legality, but in some sense they are "early pruning" due to excessive cost, rather than "legal" obstacles that cannot be handled. Plus only (Max)VF is considered here, not (Max)UF.

6201 ↗(On Diff #90687)

It would indeed be better to search for a smaller MaxVF that does divide TC, instead of giving up. Added a FIXME.

We should also check if loop requiresScalarEpilog(), which in turn should be determined more accurately per-VF. Added another FIXME.

Ayal updated this revision to Diff 90933.Mar 7 2017, 2:12 PM

Addressing review suggestions. Thanks!

mkuper accepted this revision.Mar 7 2017, 2:29 PM

LGTM

This revision is now accepted and ready to land.Mar 7 2017, 2:29 PM
This revision was automatically updated to reflect the committed changes.