This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Compute costs for plans directly after construction.
Needs ReviewPublic

Authored by fhahn on Feb 13 2023, 12:10 PM.

Details

Summary

Directly compute the cost of a VPlan after construction and track it
together with a plan. This allows moving selecting the best VF to the
planner. This seems to be a good fit anyways, and removes code from the
cost-model that is not directly related to assigning costs to a specific
plan/VF. Later this can be swapped out with computing the cost for a
plan directly.

This may help to simplify D142015.

Diff Detail

Event Timeline

fhahn created this revision.Feb 13 2023, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2023, 12:10 PM
fhahn requested review of this revision.Feb 13 2023, 12:10 PM
tschuett added inline comments.Feb 13 2023, 12:33 PM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
393

Could you add a comment /// Returns true when Factor A is more profitable than Factor B ? I was unclear about the direction A v.s B.

arcbbb added a subscriber: arcbbb.Feb 16 2023, 7:23 AM
fhahn updated this revision to Diff 510133.Mar 31 2023, 2:30 PM

Rebase, remove commented out code, improve structure.

fhahn updated this revision to Diff 510832.Apr 4 2023, 9:05 AM

Remove unnecessary code, add comment as suggested.

fhahn marked an inline comment as done.

Ping. I think the latest version should be cleaned up and ready for review.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
393

Thanks, added the comment!

fhahn retitled this revision from [VPlan] Compute costs for plans directly after construction (WIP). to [VPlan] Compute costs for plans directly after construction..Apr 4 2023, 9:06 AM
Matt added a subscriber: Matt.Apr 5 2023, 12:54 PM
fhahn updated this revision to Diff 512238.Apr 10 2023, 1:30 PM
fhahn marked an inline comment as done.

Rebase on top of current main

fhahn updated this revision to Diff 514014.Apr 16 2023, 8:02 AM

Rebase on top of current main.

fhahn updated this revision to Diff 516305.Apr 24 2023, 1:37 AM

rebase and ping :)

Ayal added a comment.Apr 27 2023, 3:44 PM

Good move towards getting Planning and eventually VPlans involved with their cost.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Perhaps the following would be better, at-least when first moving VectorizationFactors from CM to LVP: keep VPlans as a vector of all VPlanPtrs, as its name suggests; hold all VectorizationFactors in a separate vector or set indexed by their ElementCount Width (as in LVP::plan()'s CandidateVFs)? Can also hold a map associating each VectorizationFactor with its VPlan, if needed; is the reverse mapping useful?

(Independent of this patch): Hold the boolean IsVec as a field inside VectorizationFactor, rename VectorizationFactor more accurately.

287

/// ...

328

nit (independent of this patch): "all the vectorization factors in question"?

347–350

nit: comment deserves update - CandidateVFs is no longer a parameter. (Independent of this patch:) UserVF is irrelevant, behavior of ForceVectorization should be explained.

352

/// ... (independent of this patch)

390

/// ... (independent of this patch)

404

nit (independent of this patch): above needs L but below needs not?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5453

nit (independent of this patch): cache RTCost in VectorizationFactor if TripCount is known and Width isn't scalable?

5475

nit (independent of this patch): comment for multiplying instead of dividing should appear above, before favoring scalable A.

5513

This is independent of Plan, right?

5523

Thought (independent of patch): would a Plan/VectorizationFactor of Width VF with !HasVec be similar to a Plan of Width 1 and UF=VF? If so, is HasVec redundant (when !ForceVectorization)?

5659

nit: can assert there's a Plan that has NextVF.Width?

6362

nit: change independent of this patch?

6814

ditto

8752

nit: also prevents unrolling, hence no need to retain even the scalar VPlan of VF=1 for unrolling UF>1.

fhahn updated this revision to Diff 518239.Apr 29 2023, 2:45 PM
fhahn marked 6 inline comments as done.

Address comments directly for the patch, will address the independent nits separately, thanks!

fhahn marked an inline comment as done.Apr 29 2023, 2:46 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Thanks, I updated the code to use a separate mapping from VPlan->vector of pair<VectorizationFactor, bool>. This ensures that the costs are associated with a specific VPlan. WDYT?

As for the IsVec field, I am going to investigate if this is actually needed or if we can get the information from the existing VPlan.

287

Not needed in the latest version, I removed the code.

347–350

Thanks, updated the comment!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5513

Yes. I think actually filtering out the unprofitable VFs is not really needed, as the loop below should handle all cases already. Removed the code. Could arguably also be split off.

5659

Added, thanks!

8752

Will update this independently.

fhahn updated this revision to Diff 518259.Apr 29 2023, 4:06 PM
fhahn marked an inline comment as done.

Remove need for IsVec boolean flag in LVP.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

As for the IsVec field, I am going to investigate if this is actually needed or if we can get the information from the existing VPlan.

We don't need it at this stage any longer, we can instead just avoid adding such VFs to CandidateVFs. Removed the boolean flag from LVP.

Ayal added inline comments.Apr 30 2023, 8:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
281–283

nit: retain initial size of 4?

283

hold all VectorizationFactors in a separate vector or set indexed by their ElementCount Width (as in LVP::plan()'s CandidateVFs)? Can also hold a map associating each VectorizationFactor with its VPlan, if needed; is the reverse mapping useful?

Thanks, I updated the code to use a separate mapping from VPlan->vector of pair<VectorizationFactor, bool>. This ensures that the costs are associated with a specific VPlan. WDYT?

A single cost is currently computed and associated with a specific VF as indicated by expectedCost(), select[Epilogue]VectorizationFactor() and LVP::plan()'s CandidateVFs; a single Plan is expected for each VF including 1 and BestVF. So LVP can hold all VectorizationFactors in a vector for now, or in a set indexed by Width, regardless of their Plans.
Can revisit this if/when the Plan associated with each VectorizationFactor and/or the set of VectorizationFactors corresponding to a Plan's VFs are needed. Sounds reasonable?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5513

Very well. Comment was to clarify that the association with Plans is not needed and if so to remove the named variable Plan.

5530

Retain condition that there is more than a (i.e., the scalar) single VFCandidate?
Always initialize ChosenFactor to getMax() and then skip considering scalar VF only if ForceVectorization? (Or else compute ScalarCost and set ChosenFactor to it.)

5532

Simply traverse all VFCandidates in a single loop as before?

5533

nit: retrain comment that scalar cost was already computed?

5535

Worth a comment: all VectorizationFactors cache the same ScalarCost, so pick the first one? It need not represent the scalar VF=1.
Can retain the current call to expectedCost(ScalarFactor) and replace it independent of this patch.

5539–5540
5655

Simpler to retain single loop over all VectorizationFactors? VPlan used only in assert.

7594

Worth continuing to pass VFCandidates to selectVectorizationFactor()? The former is still initialized to span range of all possible VF's, before trying to build VPlans, which will possibly exclude VF's. Passing VFCandidates to buildVPlansWithVPRecipes() would allow the latter to filter out non candidates (Fixed/Scalable), and then have selectVectorizationFactor() compute costs and select VF from within the remaining candidates? Can also introduce a separate LVP::computeCosts(VFCandidates).

7639

nit: change to auto independent of this patch.

8093–8094

nit: change from push_back to emplace_back independent of this patch?

8094

Where/Are CandidateVFs filled when building w/o recipes?

8719

Could SubRange be excluded from VFCandidates here, for the latter to be traversed subsequently to compute costs independently, rather than fusing it with the construction of each Plan and its SubRange?

8731

Place this dbgs dump and continue after the #ifndef NDEBUG dump below, as done now?

llvm/lib/Transforms/Vectorize/VPlan.h
2323 ↗(On Diff #518259)

This is redundant?

llvm/test/Transforms/LoopVectorize/AArch64/no_vector_instructions.ll
10

Could the error message "Not considering vector loop of width 2 because it will not generate any vector instructions" be retained, even when filtering these options?

fhahn updated this revision to Diff 518349.Apr 30 2023, 1:19 PM
fhahn marked 21 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
281–283

Retained, thanks!

283

Follow-up patches will break the 1-1 mapping from VFs->Cost e.g. when planning with multiple cost models there may be different plans for the same VFs but with different costs. For that, the VPlan->costs mapping seems to be required; happy to have VF->Cost mapping as intermediate step, but it might lead to less churn to use the Plan->costs mapping in this patch already?

328
347–350
352

added comment in 6fa07a87abc96c3b924be07ce446e79691a7f186 and moved it here in the patch.

390

add comment, thanks! I also removed the definition in the cost model

404
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5530

Retained the condition for now

5532

There's now no single vector condition all candidate VFs and costs. We either need to iterate over the entries in the map or over the plans, the latter seeming slightly preferable, as it will make it easy to return the associated plan as well (instead of later looking it up)

5533

retained, thanks!

5535

Thanks, for now the version retained expectedCost(ScalarFactor), to be replaced in a follow-on patch.

5539–5540

Updated, thanks!

5655

Same as above, there's now no single vector condition all candidate VFs and costs.

7594

I am not sure; the main motivation of the patch is to compute the cost on construction and delaying cost computation until selectVectorizationFactor would still mostly retain the original behavior? AFAICT the original selectVectorizationFactor doesn't filter out candidates before computing the cost; Scalable VFs are only added to VFCandidates below depending on MaxFactors.ScalableVF. So if the target doesn't support scalable VFs, no VPlans with scalable VFs should be generated.

7639

Changed it back to VPlanPtr to retain type info.

8093–8094

Changed back to push_back, emplace_back as only beneficial in earlier versions that created a pair here.

8094

Do you mean for the native path? In that case selectVectorizationFactor isn't used yet, so CandidateVFs isn't used.

Once the change is in we could build on top and try to converge the native and ILV paths at this point and create plans for inner and outer loops and associate costs with the outer loop versions as well.

8719

Updated to retain the original loop and then compute the costs by iterating over all plans and computing costs for all VFs associated with them

8731

moved, thanks!

llvm/lib/Transforms/Vectorize/VPlan.h
2323 ↗(On Diff #518259)

The latest version uses this to iterate over the VFs of a plan. I changed it to return ArrayRef<> though.

llvm/test/Transforms/LoopVectorize/AArch64/no_vector_instructions.ll
10

Yes, this works as expected with the latest version, I undid the change.

Ayal added inline comments.May 2 2023, 4:14 PM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

This may indeed lead to less churn, but it would be good to reason about which mapping is needed along with a usage that needs it, such as planning with multiple cost models. Given that CM currently holds ProfitableVFs as a vector of VectorizationFactors, suffice for this patch to focus on moving such a vector to LVP?

352

Thanks!

Replace ::Disabled with null/optional? (Independent of this patch. Along with finding a better name for VectorizationFactor)

396

nit (independent of this patch): \p VF is not needed either.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5532

There's now no single vector condition all candidate VFs and costs. We either need to iterate over the entries in the map or over the plans, the latter seeming slightly preferable, as it will make it easy to return the associated plan as well (instead of later looking it up)

OK, but we currently do look it up later.

7594

Agreed, cost should in the long term be computed as part of VPlan2VPlan transformations, which currently take place inside (the excessive) tryToBuildVPlanWithVPRecipes().

8094

Worth a comment when defining VFCandidates that they serve only 'legacy' path.

8719

OK, then worth keeping the original simpler if (auto Plan = ... version?

Or perhaps something like

if (auto Plan = tryToBuildVPlanWithVPRecipes(SubRange, DeadInstructions)) {
  VPlans.push_back(std::move(*Plan));
  computePlanCosts(Plan, SubRange, Costs);
}

where the latter may some day turn into Costs.push_back(Plan->computeCost());

8779

Bailing out of vectorizing a loop due to disallowed conditional stores should better be done by buildVPlansWithVPRecipes() "before" computing their costs / independent of CM? (Independent of this patch).

fhahn marked an inline comment as done.May 9 2023, 6:55 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Good point, I split off the parts of only moving the selectVectorizationFactor logic to LVP to D150197.

I'll update this patch once that is in, which should also drastically reduce the diff here.