This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Thanks for the patch, Florian! Ok, now I understand what you mean exactly. Some comments inline.
lib/Transforms/Vectorize/LoopVectorizationPlanner.h | ||
---|---|---|
352 ↗ | (On Diff #148902) | I really like this change! This is decoupling the VPlan construction with VPInstructions from the legacy VPlan construction using recipes. I understand that you will refactor even more the code in buildVPRecipes to implement the temporary VPInstruction-to-VPRecipe step in the native path. Maybe we should name this buildVPlanWithRecipes or something like that to clearly state that it's creating a new VPlan? |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
6342 ↗ | (On Diff #148902) | Why BestVF.Width + 1? Shouldn't be BestVF.Width? |
6347 ↗ | (On Diff #148902) | The rationale of building VPlans for all the VFs and then invoke CM to choose the best VF (without using those VPlans) was to eventually port CM to work on these VPlans. This hasn't happened yet but @hsaito is working on this VPlan based version of the CM. We think that this step is very important in the converge of both vectorization paths since it will allow us to move the VPlan construction in the inner loop path to earlier stages of the path. Since this is a small change, my suggestion would be: 1) if this change is needed by your subsequent patches, let's go with it. Hideki could move the VPlan construction again after CM when his code is ready; 2) otherwise, let's keep the original version. Please, note that we are mostly sharing a single VPlan for all the VFs so we shouldn't be using unnecessary memory most of the times. WDYT? |
Thanks Diego!
lib/Transforms/Vectorize/LoopVectorizationPlanner.h | ||
---|---|---|
352 ↗ | (On Diff #148902) | Yep there is some more refactoring I plan on doing. Ideally I think we should try to move as much of the VPlan related implementation out of LoopVectorize.cpp |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
6342 ↗ | (On Diff #148902) | yep, buildVPRecipes does not really need a range , so this could be simplified. |
6347 ↗ | (On Diff #148902) | I understand, but do you think we will be able to replace the legacy cost model anytime soon? I expect implementing VPlan based inner loop vectorization that does not introduce regressions over the legacy cost model will be a big task. I thought the plan was to develop this cost model in the VPlan native path (with support for inner loops). That will allow us to get an initial cost model in and iterate on that, until we match the performance of the legacy one. At that point, we can retire the legacy inner loop vectorizer. What do you think? |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6347 ↗ | (On Diff #148902) | I think we should be doing both, and that's probably the only way to gain enough credibility with the rest of the community. Certainly not an easy task. From my perspective of CM refactoring, if VPlan is built before CM runs, incremental work of introducing VPlan-ness is easier. If you move it to after CM computing VF/UF, the first thing I need to do, for incremental introduction of VPlan-ness, is essentially undoing your change. First CM refactoring I plan to do is to change "BB->Instruction" traversal into "VPBB -> Recipe -> Instruction" traversal, making sure that the change is NFC. We can then proceed to change "per instruction" modeling to "per Recipe" modeling one by one and/or to "per VPInstruction" modeling, making sure cost modeling methodology and computed cost are comparable or better (we'll probably have to establish some metric there) each step. Along the way, we'll need to reflect many "behind the scenes smarts" into VPlan. All that expects VPlan to be built before CM computation. Does this make sense to you? |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6347 ↗ | (On Diff #148902) |
Right Diego, thanks!
Yep thanks! My understanding from one of Diego's responses at D46827 was that we want to try to keep changes to the Vplan native path, to allow us to iterate quickly without introducing regressions. Anyways, I think it makes sense to move building the Vplans before cost modelling again, as it sounds like the VPlan based cost modelling is coming soon :) Do you think we should get the rest of this patch in regardless? |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6347 ↗ | (On Diff #148902) | There's a fast path for velocity of implementation Diego is working on and a gradual path for credibility building that I'm working on. Eventually, we'll have enough sharing of code between the two paths and that's the time we can declare to conclude the transition. I'm neutral to the rest of the changes, minus the fact that NeedDef computation is needed only once, and we'll end up in more than once after the change. If you think this is a step forward for constructing HCFG first and then populate Recipes later for the innermost loop vectorization path, I'd say go for it. Otherwise, I'll stay neutral. Decoupling is the opposite of what we are trying to do. Increased sharing is the direction we are heading. |
lib/Transforms/Vectorize/LoopVectorizationPlanner.h | ||
---|---|---|
352 ↗ | (On Diff #148902) | +1 |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
6347 ↗ | (On Diff #148902) |
Sorry if my comment was confusing. We have to evaluate case by case an make individual decisions. We think the CM refactoring is necessary for convergence: 1) to be able to move the VPlan construction to an earlier stage in the inner loop path, and, 2) to be able to decouple and move some of the code in CM outside of CM. We still have to evaluate the actual consequences of this approach, though. However, what I think we shouldn't do in the VPlan native path is to build unnecessary recipes (since we are trying to get rid of them) or move code related to them without doing the proper porting of that code to VPInstructions. Otherwise we would end up having a VPlan native path with the same problems as the inner loop path. Does it make sense?
I think so! I like the other changes! |
This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.
Thanks for revisiting this, Florian!
No major issues. Just one question below.
Thanks,
Diego
lib/Transforms/Vectorize/LoopVectorizationPlanner.h | ||
---|---|---|
357 ↗ | (On Diff #149309) | ijf -> if |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
6341 ↗ | (On Diff #149309) | Just wondering if buildVPlans happened before the if condition for some reason. Is it possible that we still need the VF=1 VPlan for interleaving? Not sure if there is a test covering this case. Could you please have a look? If should be easy to test using #pragma clang loop interleave. |
Thanks for having a look Diego!
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
6341 ↗ | (On Diff #149309) | Thanks for spotting this. I think at the moment there are some assertions making sure we only vectorize with VF > 1, but I moved the check to the original position, as for loop aware SLP it might be profitable to vectorize even if the computed VF is 1. |
as I updated D46827, do you think there is anything left that's pending for this patch?