This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Compute costs for plans directly after construction.
Needs ReviewPublic

Authored by fhahn on Feb 13 2023, 12:10 PM.

Details

Summary

Directly compute the cost of a VPlan after construction and track it
together with a plan. This allows moving selecting the best VF to the
planner. This seems to be a good fit anyways, and removes code from the
cost-model that is not directly related to assigning costs to a specific
plan/VF. Later this can be swapped out with computing the cost for a
plan directly.

This may help to simplify D142015.

Diff Detail

Event Timeline

fhahn created this revision.Feb 13 2023, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2023, 12:10 PM
fhahn requested review of this revision.Feb 13 2023, 12:10 PM
tschuett added inline comments.Feb 13 2023, 12:33 PM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
394

Could you add a comment /// Returns true when Factor A is more profitable than Factor B ? I was unclear about the direction A v.s B.

arcbbb added a subscriber: arcbbb.Feb 16 2023, 7:23 AM
fhahn updated this revision to Diff 510133.Mar 31 2023, 2:30 PM

Rebase, remove commented out code, improve structure.

fhahn updated this revision to Diff 510832.Apr 4 2023, 9:05 AM

Remove unnecessary code, add comment as suggested.

fhahn marked an inline comment as done.

Ping. I think the latest version should be cleaned up and ready for review.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
394

Thanks, added the comment!

fhahn retitled this revision from [VPlan] Compute costs for plans directly after construction (WIP). to [VPlan] Compute costs for plans directly after construction..Apr 4 2023, 9:06 AM
Matt added a subscriber: Matt.Apr 5 2023, 12:54 PM
fhahn updated this revision to Diff 512238.Apr 10 2023, 1:30 PM
fhahn marked an inline comment as done.

Rebase on top of current main

fhahn updated this revision to Diff 514014.Apr 16 2023, 8:02 AM

Rebase on top of current main.

fhahn updated this revision to Diff 516305.Apr 24 2023, 1:37 AM

rebase and ping :)

Ayal added a comment.Apr 27 2023, 3:44 PM

Good move towards getting Planning and eventually VPlans involved with their cost.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Perhaps the following would be better, at-least when first moving VectorizationFactors from CM to LVP: keep VPlans as a vector of all VPlanPtrs, as its name suggests; hold all VectorizationFactors in a separate vector or set indexed by their ElementCount Width (as in LVP::plan()'s CandidateVFs)? Can also hold a map associating each VectorizationFactor with its VPlan, if needed; is the reverse mapping useful?

(Independent of this patch): Hold the boolean IsVec as a field inside VectorizationFactor, rename VectorizationFactor more accurately.

288

/// ...

325–326

nit (independent of this patch): "all the vectorization factors in question"?

342–345

nit: comment deserves update - CandidateVFs is no longer a parameter. (Independent of this patch:) UserVF is irrelevant, behavior of ForceVectorization should be explained.

347

/// ... (independent of this patch)

391

/// ... (independent of this patch)

405

nit (independent of this patch): above needs L but below needs not?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5468

nit (independent of this patch): cache RTCost in VectorizationFactor if TripCount is known and Width isn't scalable?

5476

This is independent of Plan, right?

5477

Thought (independent of patch): would a Plan/VectorizationFactor of Width VF with !HasVec be similar to a Plan of Width 1 and UF=VF? If so, is HasVec redundant (when !ForceVectorization)?

5490

nit (independent of this patch): comment for multiplying instead of dividing should appear above, before favoring scalable A.

5610

nit: can assert there's a Plan that has NextVF.Width?

6314

nit: change independent of this patch?

6766

ditto

8685

nit: also prevents unrolling, hence no need to retain even the scalar VPlan of VF=1 for unrolling UF>1.

fhahn updated this revision to Diff 518239.Apr 29 2023, 2:45 PM
fhahn marked 6 inline comments as done.

Address comments directly for the patch, will address the independent nits separately, thanks!

fhahn marked an inline comment as done.Apr 29 2023, 2:46 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Thanks, I updated the code to use a separate mapping from VPlan->vector of pair<VectorizationFactor, bool>. This ensures that the costs are associated with a specific VPlan. WDYT?

As for the IsVec field, I am going to investigate if this is actually needed or if we can get the information from the existing VPlan.

288

Not needed in the latest version, I removed the code.

342–345

Thanks, updated the comment!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5476

Yes. I think actually filtering out the unprofitable VFs is not really needed, as the loop below should handle all cases already. Removed the code. Could arguably also be split off.

5610

Added, thanks!

8685

Will update this independently.

fhahn updated this revision to Diff 518259.Apr 29 2023, 4:06 PM
fhahn marked an inline comment as done.

Remove need for IsVec boolean flag in LVP.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

As for the IsVec field, I am going to investigate if this is actually needed or if we can get the information from the existing VPlan.

We don't need it at this stage any longer, we can instead just avoid adding such VFs to CandidateVFs. Removed the boolean flag from LVP.

Ayal added inline comments.Apr 30 2023, 8:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
281

nit: retain initial size of 4?

283

hold all VectorizationFactors in a separate vector or set indexed by their ElementCount Width (as in LVP::plan()'s CandidateVFs)? Can also hold a map associating each VectorizationFactor with its VPlan, if needed; is the reverse mapping useful?

Thanks, I updated the code to use a separate mapping from VPlan->vector of pair<VectorizationFactor, bool>. This ensures that the costs are associated with a specific VPlan. WDYT?

A single cost is currently computed and associated with a specific VF as indicated by expectedCost(), select[Epilogue]VectorizationFactor() and LVP::plan()'s CandidateVFs; a single Plan is expected for each VF including 1 and BestVF. So LVP can hold all VectorizationFactors in a vector for now, or in a set indexed by Width, regardless of their Plans.
Can revisit this if/when the Plan associated with each VectorizationFactor and/or the set of VectorizationFactors corresponding to a Plan's VFs are needed. Sounds reasonable?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5476

Very well. Comment was to clarify that the association with Plans is not needed and if so to remove the named variable Plan.

5482

Simply traverse all VFCandidates in a single loop as before?

5482

Retain condition that there is more than a (i.e., the scalar) single VFCandidate?
Always initialize ChosenFactor to getMax() and then skip considering scalar VF only if ForceVectorization? (Or else compute ScalarCost and set ChosenFactor to it.)

5483

nit: retrain comment that scalar cost was already computed?

5487

Worth a comment: all VectorizationFactors cache the same ScalarCost, so pick the first one? It need not represent the scalar VF=1.
Can retain the current call to expectedCost(ScalarFactor) and replace it independent of this patch.

5491–5492
5606

Simpler to retain single loop over all VectorizationFactors? VPlan used only in assert.

7546

Worth continuing to pass VFCandidates to selectVectorizationFactor()? The former is still initialized to span range of all possible VF's, before trying to build VPlans, which will possibly exclude VF's. Passing VFCandidates to buildVPlansWithVPRecipes() would allow the latter to filter out non candidates (Fixed/Scalable), and then have selectVectorizationFactor() compute costs and select VF from within the remaining candidates? Can also introduce a separate LVP::computeCosts(VFCandidates).

7589

nit: change to auto independent of this patch.

8045–8046

nit: change from push_back to emplace_back independent of this patch?

8046

Where/Are CandidateVFs filled when building w/o recipes?

8673

Could SubRange be excluded from VFCandidates here, for the latter to be traversed subsequently to compute costs independently, rather than fusing it with the construction of each Plan and its SubRange?

8685

Place this dbgs dump and continue after the #ifndef NDEBUG dump below, as done now?

llvm/lib/Transforms/Vectorize/VPlan.h
2323

This is redundant?

llvm/test/Transforms/LoopVectorize/AArch64/no_vector_instructions.ll
10 ↗(On Diff #518259)

Could the error message "Not considering vector loop of width 2 because it will not generate any vector instructions" be retained, even when filtering these options?

fhahn updated this revision to Diff 518349.Apr 30 2023, 1:19 PM
fhahn marked 21 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
281

Retained, thanks!

283

Follow-up patches will break the 1-1 mapping from VFs->Cost e.g. when planning with multiple cost models there may be different plans for the same VFs but with different costs. For that, the VPlan->costs mapping seems to be required; happy to have VF->Cost mapping as intermediate step, but it might lead to less churn to use the Plan->costs mapping in this patch already?

325–326
342–345
347

added comment in 6fa07a87abc96c3b924be07ce446e79691a7f186 and moved it here in the patch.

391

add comment, thanks! I also removed the definition in the cost model

405
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5482

Retained the condition for now

5482

There's now no single vector condition all candidate VFs and costs. We either need to iterate over the entries in the map or over the plans, the latter seeming slightly preferable, as it will make it easy to return the associated plan as well (instead of later looking it up)

5483

retained, thanks!

5487

Thanks, for now the version retained expectedCost(ScalarFactor), to be replaced in a follow-on patch.

5491–5492

Updated, thanks!

5606

Same as above, there's now no single vector condition all candidate VFs and costs.

7546

I am not sure; the main motivation of the patch is to compute the cost on construction and delaying cost computation until selectVectorizationFactor would still mostly retain the original behavior? AFAICT the original selectVectorizationFactor doesn't filter out candidates before computing the cost; Scalable VFs are only added to VFCandidates below depending on MaxFactors.ScalableVF. So if the target doesn't support scalable VFs, no VPlans with scalable VFs should be generated.

7589

Changed it back to VPlanPtr to retain type info.

8045–8046

Changed back to push_back, emplace_back as only beneficial in earlier versions that created a pair here.

8046

Do you mean for the native path? In that case selectVectorizationFactor isn't used yet, so CandidateVFs isn't used.

Once the change is in we could build on top and try to converge the native and ILV paths at this point and create plans for inner and outer loops and associate costs with the outer loop versions as well.

8673

Updated to retain the original loop and then compute the costs by iterating over all plans and computing costs for all VFs associated with them

8685

moved, thanks!

llvm/lib/Transforms/Vectorize/VPlan.h
2323

The latest version uses this to iterate over the VFs of a plan. I changed it to return ArrayRef<> though.

llvm/test/Transforms/LoopVectorize/AArch64/no_vector_instructions.ll
10 ↗(On Diff #518259)

Yes, this works as expected with the latest version, I undid the change.

Ayal added inline comments.May 2 2023, 4:14 PM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

This may indeed lead to less churn, but it would be good to reason about which mapping is needed along with a usage that needs it, such as planning with multiple cost models. Given that CM currently holds ProfitableVFs as a vector of VectorizationFactors, suffice for this patch to focus on moving such a vector to LVP?

347

Thanks!

Replace ::Disabled with null/optional? (Independent of this patch. Along with finding a better name for VectorizationFactor)

397

nit (independent of this patch): \p VF is not needed either.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5482

There's now no single vector condition all candidate VFs and costs. We either need to iterate over the entries in the map or over the plans, the latter seeming slightly preferable, as it will make it easy to return the associated plan as well (instead of later looking it up)

OK, but we currently do look it up later.

7546

Agreed, cost should in the long term be computed as part of VPlan2VPlan transformations, which currently take place inside (the excessive) tryToBuildVPlanWithVPRecipes().

8046

Worth a comment when defining VFCandidates that they serve only 'legacy' path.

8673

OK, then worth keeping the original simpler if (auto Plan = ... version?

Or perhaps something like

if (auto Plan = tryToBuildVPlanWithVPRecipes(SubRange, DeadInstructions)) {
  VPlans.push_back(std::move(*Plan));
  computePlanCosts(Plan, SubRange, Costs);
}

where the latter may some day turn into Costs.push_back(Plan->computeCost());

8712

Bailing out of vectorizing a loop due to disallowed conditional stores should better be done by buildVPlansWithVPRecipes() "before" computing their costs / independent of CM? (Independent of this patch).

fhahn marked an inline comment as done.May 9 2023, 6:55 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Good point, I split off the parts of only moving the selectVectorizationFactor logic to LVP to D150197.

I'll update this patch once that is in, which should also drastically reduce the diff here.