This is an archive of the discontinued LLVM Phabricator instance.

[LV] Update type in cost model for scalarization
ClosedPublic

Authored by mssimpso on May 23 2017, 12:30 PM.

Details

Reviewers
delena
mkuper
Ayal
Summary

For non-uniform instructions marked for scalarization, we should update VectorTy when computing instruction costs to reflect the scalar type. In addition to determining instruction costs, this type is also used to signal that all instructions in the loop will be scalarized. This currently affects memory instructions and non-pointer induction variables and their updates. (We also mark GEPs scalar after vectorization, but their cost is computed together with memory instructions.) For scalarized induction updates, this patch also scales the scalar cost by the vectorization factor, corresponding to each induction step.

Event Timeline

mssimpso created this revision.May 23 2017, 12:30 PM
Ayal added inline comments.May 23 2017, 2:18 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
7160

isVectorTy() implies !isVoidTy(), so checking the latter becomes redundant. Does the VF > 1 check also become redundant?

mssimpso added inline comments.May 23 2017, 2:36 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
7160

Right, I thought there may be some redundancy here. So we should replace "!VectorTy->isVoidTy()" with "VectorTy->isVectorTy()". Checking that the type is actually a vector is what we really want.

I'm not quite sure about "VF > 1". We only care about scalarization if we are vectorizing. I think it might be needed for the "TTI.getNumberOfParts(VectorTy) < VF" check, which wouldn't make sense if VF == 1.

mssimpso updated this revision to Diff 100000.EditedMay 23 2017, 2:48 PM
mssimpso marked an inline comment as done.

Addressed Ayal's comment.

  • Removed !isVoidTy() check since this is implied by isVectorTy().
Ayal accepted this revision.May 23 2017, 3:12 PM

Curious if you've found benchmarks that are affected by this change?

This revision is now accepted and ready to land.May 23 2017, 3:12 PM

Thanks, Ayal! The change is mostly noise on the hardware (AArch64 Kryo/Falkor) and benchmarks (spec, test-suite) I tested.

For some context, I had been experimenting with a patch that changes the cost model's "tie-break" rule when I noticed the issue. Right now we select the smallest VF with the best score (so scalar wins out if there's no benefit from vectorization). But I was testing a patch that changed this. If any VF > 1 was better than VF == 1, it selected the largest VF having the best score. So we would prefer wider vectors in a tie if any vectorization was better than scalar.

For some loops, this interacted badly with the MaxVF determination based on types. Currently, if we know a load or store will be scalarized, we don't consider it when collecting type sizes, which determine the MaxVF. So with the above experimental patch, I was seeing loops where nothing was vectorized being effectively unrolled by the MaxVF, because VectorTy wasn't being set properly.

Sorry for the long explanation!

mssimpso closed this revision.May 24 2017, 8:27 AM

Committed in rL303763.