This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Move recipe based VPlan generation to separate function.
ClosedPublic

Authored by fhahn on May 29 2018, 7:09 AM.

Details

Summary

This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.

Diff Detail

Repository
rL LLVM

Event Timeline

fhahn created this revision.May 29 2018, 7:09 AM

Thanks for the patch, Florian! Ok, now I understand what you mean exactly. Some comments inline.

lib/Transforms/Vectorize/LoopVectorizationPlanner.h
352 ↗(On Diff #148902)

I really like this change! This is decoupling the VPlan construction with VPInstructions from the legacy VPlan construction using recipes. I understand that you will refactor even more the code in buildVPRecipes to implement the temporary VPInstruction-to-VPRecipe step in the native path.

Maybe we should name this buildVPlanWithRecipes or something like that to clearly state that it's creating a new VPlan?

lib/Transforms/Vectorize/LoopVectorize.cpp
6342 ↗(On Diff #148902)

Why BestVF.Width + 1? Shouldn't be BestVF.Width?

6347 ↗(On Diff #148902)

The rationale of building VPlans for all the VFs and then invoke CM to choose the best VF (without using those VPlans) was to eventually port CM to work on these VPlans. This hasn't happened yet but @hsaito is working on this VPlan based version of the CM. We think that this step is very important in the converge of both vectorization paths since it will allow us to move the VPlan construction in the inner loop path to earlier stages of the path.

Since this is a small change, my suggestion would be: 1) if this change is needed by your subsequent patches, let's go with it. Hideki could move the VPlan construction again after CM when his code is ready; 2) otherwise, let's keep the original version. Please, note that we are mostly sharing a single VPlan for all the VFs so we shouldn't be using unnecessary memory most of the times.

WDYT?

Thanks Diego!

lib/Transforms/Vectorize/LoopVectorizationPlanner.h
352 ↗(On Diff #148902)

Yep there is some more refactoring I plan on doing. Ideally I think we should try to move as much of the VPlan related implementation out of LoopVectorize.cpp

lib/Transforms/Vectorize/LoopVectorize.cpp
6342 ↗(On Diff #148902)

yep, buildVPRecipes does not really need a range , so this could be simplified.

6347 ↗(On Diff #148902)

I understand, but do you think we will be able to replace the legacy cost model anytime soon? I expect implementing VPlan based inner loop vectorization that does not introduce regressions over the legacy cost model will be a big task. I thought the plan was to develop this cost model in the VPlan native path (with support for inner loops). That will allow us to get an initial cost model in and iterate on that, until we match the performance of the legacy one. At that point, we can retire the legacy inner loop vectorizer. What do you think?

lib/Transforms/Vectorize/LoopVectorize.cpp
6347 ↗(On Diff #148902)

I think we should be doing both, and that's probably the only way to gain enough credibility with the rest of the community. Certainly not an easy task. From my perspective of CM refactoring, if VPlan is built before CM runs, incremental work of introducing VPlan-ness is easier. If you move it to after CM computing VF/UF, the first thing I need to do, for incremental introduction of VPlan-ness, is essentially undoing your change.

First CM refactoring I plan to do is to change "BB->Instruction" traversal into "VPBB -> Recipe -> Instruction" traversal, making sure that the change is NFC. We can then proceed to change "per instruction" modeling to "per Recipe" modeling one by one and/or to "per VPInstruction" modeling, making sure cost modeling methodology and computed cost are comparable or better (we'll probably have to establish some metric there) each step. Along the way, we'll need to reflect many "behind the scenes smarts" into VPlan. All that expects VPlan to be built before CM computation.

Does this make sense to you?

fhahn added inline comments.May 29 2018, 2:41 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6347 ↗(On Diff #148902)

Please, note that we are mostly sharing a single VPlan for all the VFs so we shouldn't be using unnecessary memory most of the times.

Right Diego, thanks!

I think we should be doing both, and that's probably the only way to gain enough credibility with the rest of the community. Certainly not an easy task. From my perspective of CM refactoring, if VPlan is built before CM runs, incremental work of introducing VPlan-ness is easier. If you move it to after CM computing VF/UF, the first thing I need to do, for incremental introduction of VPlan-ness, is essentially undoing your change.

First CM refactoring I plan to do is to change "BB->Instruction" traversal into "VPBB -> Recipe -> Instruction" traversal, making sure that the change is NFC. We can then proceed to change "per instruction" modeling to "per Recipe" modeling one by one and/or to "per VPInstruction" modeling, making sure cost modeling methodology and computed cost are comparable or better (we'll probably have to establish some metric there) each step. Along the way, we'll need to reflect many "behind the scenes smarts" into VPlan. All that expects VPlan to be built before CM computation.

Does this make sense to you?

Yep thanks! My understanding from one of Diego's responses at D46827 was that we want to try to keep changes to the Vplan native path, to allow us to iterate quickly without introducing regressions.

Anyways, I think it makes sense to move building the Vplans before cost modelling again, as it sounds like the VPlan based cost modelling is coming soon :) Do you think we should get the rest of this patch in regardless?

hsaito added inline comments.May 29 2018, 3:28 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
6347 ↗(On Diff #148902)

There's a fast path for velocity of implementation Diego is working on and a gradual path for credibility building that I'm working on. Eventually, we'll have enough sharing of code between the two paths and that's the time we can declare to conclude the transition.

I'm neutral to the rest of the changes, minus the fact that NeedDef computation is needed only once, and we'll end up in more than once after the change. If you think this is a step forward for constructing HCFG first and then populate Recipes later for the innermost loop vectorization path, I'd say go for it. Otherwise, I'll stay neutral.

Decoupling is the opposite of what we are trying to do. Increased sharing is the direction we are heading.

dcaballe added inline comments.May 29 2018, 3:35 PM
lib/Transforms/Vectorize/LoopVectorizationPlanner.h
352 ↗(On Diff #148902)

+1

lib/Transforms/Vectorize/LoopVectorize.cpp
6347 ↗(On Diff #148902)

Yep thanks! My understanding from one of Diego's responses at D46827 was that we want to try to keep changes to the Vplan native path, to allow us to iterate quickly without introducing regressions.

Sorry if my comment was confusing. We have to evaluate case by case an make individual decisions. We think the CM refactoring is necessary for convergence: 1) to be able to move the VPlan construction to an earlier stage in the inner loop path, and, 2) to be able to decouple and move some of the code in CM outside of CM. We still have to evaluate the actual consequences of this approach, though. However, what I think we shouldn't do in the VPlan native path is to build unnecessary recipes (since we are trying to get rid of them) or move code related to them without doing the proper porting of that code to VPInstructions. Otherwise we would end up having a VPlan native path with the same problems as the inner loop path. Does it make sense?

Do you think we should get the rest of this patch in regardless?

I think so! I like the other changes!

fhahn updated this revision to Diff 149309.May 31 2018, 10:00 AM
fhahn retitled this revision from [VPlan] Avoid building VPlans for each VF in the ILV (NFC). to [VPlan] Move recipe based VPlan generation to separate function..
fhahn edited the summary of this revision. (Show Details)

This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.

Thanks for revisiting this, Florian!
No major issues. Just one question below.

Thanks,
Diego

lib/Transforms/Vectorize/LoopVectorizationPlanner.h
357 ↗(On Diff #149309)

ijf -> if

lib/Transforms/Vectorize/LoopVectorize.cpp
6341 ↗(On Diff #149309)

Just wondering if buildVPlans happened before the if condition for some reason. Is it possible that we still need the VF=1 VPlan for interleaving? Not sure if there is a test covering this case. Could you please have a look? If should be easy to test using #pragma clang loop interleave.

fhahn updated this revision to Diff 149485.Jun 1 2018, 8:42 AM
fhahn marked an inline comment as done.

Thanks for having a look Diego!

lib/Transforms/Vectorize/LoopVectorize.cpp
6341 ↗(On Diff #149309)

Thanks for spotting this. I think at the moment there are some assertions making sure we only vectorize with VF > 1, but I moved the check to the original position, as for loop aware SLP it might be profitable to vectorize even if the computed VF is 1.

fhahn updated this revision to Diff 149493.Jun 1 2018, 8:58 AM

Add missing context.

dcaballe accepted this revision.Jun 1 2018, 9:20 AM

Thanks, Florian! Again, wait for the pending review to commit this one.

Diego

This revision is now accepted and ready to land.Jun 1 2018, 9:20 AM
fhahn added a comment.Jun 4 2018, 4:42 AM

Will do, thanks for having a look Diego!

fhahn added a comment.Jun 6 2018, 10:29 AM

Thanks, Florian! Again, wait for the pending review to commit this one.

as I updated D46827, do you think there is anything left that's pending for this patch?

Thanks, Florian! Again, wait for the pending review to commit this one.

as I updated D46827, do you think there is anything left that's pending for this patch?

I don't think so! Please, go ahead and thanks for your patience! :)

This revision was automatically updated to reflect the committed changes.