This is an archive of the discontinued LLVM Phabricator instance.

[LV] Let selectVectorizationFactor reason directly on VectorizationFactor.
ClosedPublic

Authored by sdesmalen on Apr 8 2021, 9:05 AM.

Details

Summary

Rather than maintaining two separate values, a float for the per-lane
cost and a Width for the VF, maintain a single VectorizationFactor which
comprises the two and also removes the need for converting an integer value
to float.

This simplifies the query when asking if one VF is more profitable than
another when we want to extend this for scalable vectors (which may
require additional options to determine if e.g. a scalable VF of the
some cost, is more profitable than a fixed VF of the same cost).

The patch isn't entirely NFC because it also fixes an issue in
selectEpilogueVectorizationFactor, where the cost passed to ProfitableVFs
no longer truncates the floating-point cost from float to unsigned to
then perform the calculation on the truncated cost. It now does
a cost comparison with the correct precision.

Diff Detail

Event Timeline

sdesmalen created this revision.Apr 8 2021, 9:05 AM
sdesmalen requested review of this revision.Apr 8 2021, 9:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2021, 9:05 AM

The patch isn't entirely NFC because it also fixes an issue in selectEpilogueVectorizationFactor, where the cost passed to ProfitableVFs is no longer the total cost, but rather the cost per lane which is also what is used to determine the VF for the vector body loop.

I'm a bit confused by this...before we saved the per-lane cost in ProfitableVFs and compared that with the Result variable in selectEpilogueVectorizationFactor. The Result variable gets initialized as Disabled and then gets refined as we go through the ProfitableVFs, getting assigned a value from ProfitableVFs. That means that the Result variable could only get assigned per-lane costs. Since we are now comparing per-lane costs as well, I don't see this change being non-NFC.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1607

I think this should become a member of the VectorizationFactor struct....then code like A.isMoreProfitableThan(B) reads better.

sdesmalen edited the summary of this revision. (Show Details)Apr 8 2021, 10:52 AM

You're right, that comment wasn't correct. I wrote the code last week and incorrectly remembered the change when writing up the comment today.

The issue has to do with the truncation. It stores the floating-point cost in the ProfitableVFs list as unsigned.

For llvm/test/Transforms/LoopVectorize/X86/intrinsiccost.ll, it then compares the costs as follows:

1 < 2 ? true
0 < 1 ? true
0 < 0 ? false

Where it now compares the costs properly as:

5/4 < 5/2 ? true
5/8 < 5/4 ? true
5/16 < 5/8 ? true

You're right, that comment wasn't correct. I wrote the code last week and incorrectly remembered the change when writing up the comment today.

The issue has to do with the truncation. It stores the floating-point cost in the ProfitableVFs list as unsigned.

For llvm/test/Transforms/LoopVectorize/X86/intrinsiccost.ll, it then compares the costs as follows:

1 < 2 ? true
0 < 1 ? true
0 < 0 ? false

Where it now compares the costs properly as:

5/4 < 5/2 ? true
5/8 < 5/4 ? true
5/16 < 5/8 ? true

I see. That makes sense.

sdesmalen added inline comments.Apr 8 2021, 1:04 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1607

I'm happy to implement your suggestion, although I think in the future a LoopVectorizationCostModel::isMoreProfitable method will be needed anyway, so that the method can query the LoopHints to understand how to interpret the costs of a scalable VF, and possibly favour it over fixed-width VFs of similar cost. But perhaps that's just something to worry about in a future patch?

sdesmalen updated this revision to Diff 336209.Apr 8 2021, 1:35 PM

Moved LoopVectorizationCostModel::isMoreProfitable -> VectorizationFactor::isMoreProfitableThan.

dmgreen added inline comments.Apr 12 2021, 6:28 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
203

rhs -> RHS

If the backend knew that the SVE vector length was 256, as opposed 128, how would it best communicate that information to here?

sdesmalen added inline comments.Apr 12 2021, 8:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
203

Doing it here without context would be a bit difficult, as it needs information from CostModel/TargetTransformInfo in order to know which one to favour. Hence my reason to initially create this method in the LV CostModel (see the previous revision https://reviews.llvm.org/D100121?id=336138), so it would mean either passing that information separately, or moving it to the other class again.

dmgreen added inline comments.Apr 15 2021, 7:53 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
203

Yeah, I think that makes sense. If we have a good reason to put it into the cost model (and @bmahjour doesn't object), then keeping it in the cost model so it could have access to TTI etc would make sense to me.

bmahjour added inline comments.Apr 15 2021, 7:59 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
203

I don't have any objections to moving it back, if it makes information more accessible for future extensions.

Changed back to having isMoreProfitable in LV CostModel.

dmgreen accepted this revision.Apr 16 2021, 1:07 AM

Thanks. Seems like a useful step forward. LGTM.

This revision is now accepted and ready to land.Apr 16 2021, 1:07 AM

Thanks for the review @bmahjour and @dmgreen!

rogfer01 added inline comments.Apr 23 2021, 6:25 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
191

Hi Sander, a minor question here: for scalable vectorization do you plan to change the definition of Disabled? Perhaps this may not be needed?

We were considering something like return {ElementCount:getNull(), 0}; so this VectorizationFactor value is effectively not a valid vectorization factor at all. But maybe this is not the intent of Disabled?