This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Compute costs for plans directly after construction.
Needs ReviewPublic

Authored by fhahn on Feb 13 2023, 12:10 PM.

Details

Summary

Directly compute the cost of a VPlan after construction and track it
together with a plan. This allows moving selecting the best VF to the
planner. This seems to be a good fit anyways, and removes code from the
cost-model that is not directly related to assigning costs to a specific
plan/VF. Later this can be swapped out with computing the cost for a
plan directly.

This may help to simplify D142015.

Diff Detail

Event Timeline

fhahn created this revision.Feb 13 2023, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2023, 12:10 PM
fhahn requested review of this revision.Feb 13 2023, 12:10 PM
tschuett added inline comments.Feb 13 2023, 12:33 PM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
386

Could you add a comment /// Returns true when Factor A is more profitable than Factor B ? I was unclear about the direction A v.s B.

arcbbb added a subscriber: arcbbb.Feb 16 2023, 7:23 AM
fhahn updated this revision to Diff 510133.Mar 31 2023, 2:30 PM

Rebase, remove commented out code, improve structure.

fhahn updated this revision to Diff 510832.Apr 4 2023, 9:05 AM

Remove unnecessary code, add comment as suggested.

fhahn marked an inline comment as done.

Ping. I think the latest version should be cleaned up and ready for review.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
386

Thanks, added the comment!

fhahn retitled this revision from [VPlan] Compute costs for plans directly after construction (WIP). to [VPlan] Compute costs for plans directly after construction..Apr 4 2023, 9:06 AM
Matt added a subscriber: Matt.Apr 5 2023, 12:54 PM
fhahn updated this revision to Diff 512238.Apr 10 2023, 1:30 PM
fhahn marked an inline comment as done.

Rebase on top of current main

fhahn updated this revision to Diff 514014.Apr 16 2023, 8:02 AM

Rebase on top of current main.

fhahn updated this revision to Diff 516305.Apr 24 2023, 1:37 AM

rebase and ping :)

Ayal added a comment.Apr 27 2023, 3:44 PM

Good move towards getting Planning and eventually VPlans involved with their cost.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Perhaps the following would be better, at-least when first moving VectorizationFactors from CM to LVP: keep VPlans as a vector of all VPlanPtrs, as its name suggests; hold all VectorizationFactors in a separate vector or set indexed by their ElementCount Width (as in LVP::plan()'s CandidateVFs)? Can also hold a map associating each VectorizationFactor with its VPlan, if needed; is the reverse mapping useful?

(Independent of this patch): Hold the boolean IsVec as a field inside VectorizationFactor, rename VectorizationFactor more accurately.

288

/// ...

326

nit (independent of this patch): "all the vectorization factors in question"?

342–345

nit: comment deserves update - CandidateVFs is no longer a parameter. (Independent of this patch:) UserVF is irrelevant, behavior of ForceVectorization should be explained.

347

/// ... (independent of this patch)

383

/// ... (independent of this patch)

397

nit (independent of this patch): above needs L but below needs not?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5423

nit (independent of this patch): cache RTCost in VectorizationFactor if TripCount is known and Width isn't scalable?

5441

This is independent of Plan, right?

5445

nit (independent of this patch): comment for multiplying instead of dividing should appear above, before favoring scalable A.

5457

Thought (independent of patch): would a Plan/VectorizationFactor of Width VF with !HasVec be similar to a Plan of Width 1 and UF=VF? If so, is HasVec redundant (when !ForceVectorization)?

5625

nit: can assert there's a Plan that has NextVF.Width?

6329

nit: change independent of this patch?

6781

ditto

8729

nit: also prevents unrolling, hence no need to retain even the scalar VPlan of VF=1 for unrolling UF>1.

fhahn updated this revision to Diff 518239.Apr 29 2023, 2:45 PM
fhahn marked 6 inline comments as done.

Address comments directly for the patch, will address the independent nits separately, thanks!

fhahn marked an inline comment as done.Apr 29 2023, 2:46 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Thanks, I updated the code to use a separate mapping from VPlan->vector of pair<VectorizationFactor, bool>. This ensures that the costs are associated with a specific VPlan. WDYT?

As for the IsVec field, I am going to investigate if this is actually needed or if we can get the information from the existing VPlan.

288

Not needed in the latest version, I removed the code.

342–345

Thanks, updated the comment!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5441

Yes. I think actually filtering out the unprofitable VFs is not really needed, as the loop below should handle all cases already. Removed the code. Could arguably also be split off.

5625

Added, thanks!

8729

Will update this independently.

fhahn updated this revision to Diff 518259.Apr 29 2023, 4:06 PM
fhahn marked an inline comment as done.

Remove need for IsVec boolean flag in LVP.

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

As for the IsVec field, I am going to investigate if this is actually needed or if we can get the information from the existing VPlan.

We don't need it at this stage any longer, we can instead just avoid adding such VFs to CandidateVFs. Removed the boolean flag from LVP.

Ayal added inline comments.Apr 30 2023, 8:15 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
281

nit: retain initial size of 4?

283

hold all VectorizationFactors in a separate vector or set indexed by their ElementCount Width (as in LVP::plan()'s CandidateVFs)? Can also hold a map associating each VectorizationFactor with its VPlan, if needed; is the reverse mapping useful?

Thanks, I updated the code to use a separate mapping from VPlan->vector of pair<VectorizationFactor, bool>. This ensures that the costs are associated with a specific VPlan. WDYT?

A single cost is currently computed and associated with a specific VF as indicated by expectedCost(), select[Epilogue]VectorizationFactor() and LVP::plan()'s CandidateVFs; a single Plan is expected for each VF including 1 and BestVF. So LVP can hold all VectorizationFactors in a vector for now, or in a set indexed by Width, regardless of their Plans.
Can revisit this if/when the Plan associated with each VectorizationFactor and/or the set of VectorizationFactors corresponding to a Plan's VFs are needed. Sounds reasonable?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5441

Very well. Comment was to clarify that the association with Plans is not needed and if so to remove the named variable Plan.

5475

Worth a comment: all VectorizationFactors cache the same ScalarCost, so pick the first one? It need not represent the scalar VF=1.
Can retain the current call to expectedCost(ScalarFactor) and replace it independent of this patch.

5479–5480
5483

Retain condition that there is more than a (i.e., the scalar) single VFCandidate?
Always initialize ChosenFactor to getMax() and then skip considering scalar VF only if ForceVectorization? (Or else compute ScalarCost and set ChosenFactor to it.)

5491

Simply traverse all VFCandidates in a single loop as before?

5492

nit: retrain comment that scalar cost was already computed?

5621

Simpler to retain single loop over all VectorizationFactors? VPlan used only in assert.

7561

Worth continuing to pass VFCandidates to selectVectorizationFactor()? The former is still initialized to span range of all possible VF's, before trying to build VPlans, which will possibly exclude VF's. Passing VFCandidates to buildVPlansWithVPRecipes() would allow the latter to filter out non candidates (Fixed/Scalable), and then have selectVectorizationFactor() compute costs and select VF from within the remaining candidates? Can also introduce a separate LVP::computeCosts(VFCandidates).

7604

nit: change to auto independent of this patch.

8060

nit: change from push_back to emplace_back independent of this patch?

8061

Where/Are CandidateVFs filled when building w/o recipes?

8688

Could SubRange be excluded from VFCandidates here, for the latter to be traversed subsequently to compute costs independently, rather than fusing it with the construction of each Plan and its SubRange?

8700

Place this dbgs dump and continue after the #ifndef NDEBUG dump below, as done now?

llvm/lib/Transforms/Vectorize/VPlan.h
2323

This is redundant?

llvm/test/Transforms/LoopVectorize/AArch64/no_vector_instructions.ll
10

Could the error message "Not considering vector loop of width 2 because it will not generate any vector instructions" be retained, even when filtering these options?

fhahn updated this revision to Diff 518349.Apr 30 2023, 1:19 PM
fhahn marked 21 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
281

Retained, thanks!

283

Follow-up patches will break the 1-1 mapping from VFs->Cost e.g. when planning with multiple cost models there may be different plans for the same VFs but with different costs. For that, the VPlan->costs mapping seems to be required; happy to have VF->Cost mapping as intermediate step, but it might lead to less churn to use the Plan->costs mapping in this patch already?

326
342–345
347

added comment in 6fa07a87abc96c3b924be07ce446e79691a7f186 and moved it here in the patch.

383

add comment, thanks! I also removed the definition in the cost model

397
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5475

Thanks, for now the version retained expectedCost(ScalarFactor), to be replaced in a follow-on patch.

5479–5480

Updated, thanks!

5483

Retained the condition for now

5491

There's now no single vector condition all candidate VFs and costs. We either need to iterate over the entries in the map or over the plans, the latter seeming slightly preferable, as it will make it easy to return the associated plan as well (instead of later looking it up)

5492

retained, thanks!

5621

Same as above, there's now no single vector condition all candidate VFs and costs.

7561

I am not sure; the main motivation of the patch is to compute the cost on construction and delaying cost computation until selectVectorizationFactor would still mostly retain the original behavior? AFAICT the original selectVectorizationFactor doesn't filter out candidates before computing the cost; Scalable VFs are only added to VFCandidates below depending on MaxFactors.ScalableVF. So if the target doesn't support scalable VFs, no VPlans with scalable VFs should be generated.

7604

Changed it back to VPlanPtr to retain type info.

8060

Changed back to push_back, emplace_back as only beneficial in earlier versions that created a pair here.

8061

Do you mean for the native path? In that case selectVectorizationFactor isn't used yet, so CandidateVFs isn't used.

Once the change is in we could build on top and try to converge the native and ILV paths at this point and create plans for inner and outer loops and associate costs with the outer loop versions as well.

8688

Updated to retain the original loop and then compute the costs by iterating over all plans and computing costs for all VFs associated with them

8700

moved, thanks!

llvm/lib/Transforms/Vectorize/VPlan.h
2323

The latest version uses this to iterate over the VFs of a plan. I changed it to return ArrayRef<> though.

llvm/test/Transforms/LoopVectorize/AArch64/no_vector_instructions.ll
10

Yes, this works as expected with the latest version, I undid the change.

Ayal added inline comments.May 2 2023, 4:14 PM
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

This may indeed lead to less churn, but it would be good to reason about which mapping is needed along with a usage that needs it, such as planning with multiple cost models. Given that CM currently holds ProfitableVFs as a vector of VectorizationFactors, suffice for this patch to focus on moving such a vector to LVP?

347

Thanks!

Replace ::Disabled with null/optional? (Independent of this patch. Along with finding a better name for VectorizationFactor)

389

nit (independent of this patch): \p VF is not needed either.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5491

There's now no single vector condition all candidate VFs and costs. We either need to iterate over the entries in the map or over the plans, the latter seeming slightly preferable, as it will make it easy to return the associated plan as well (instead of later looking it up)

OK, but we currently do look it up later.

7561

Agreed, cost should in the long term be computed as part of VPlan2VPlan transformations, which currently take place inside (the excessive) tryToBuildVPlanWithVPRecipes().

8061

Worth a comment when defining VFCandidates that they serve only 'legacy' path.

8688

OK, then worth keeping the original simpler if (auto Plan = ... version?

Or perhaps something like

if (auto Plan = tryToBuildVPlanWithVPRecipes(SubRange, DeadInstructions)) {
  VPlans.push_back(std::move(*Plan));
  computePlanCosts(Plan, SubRange, Costs);
}

where the latter may some day turn into Costs.push_back(Plan->computeCost());

8756

Bailing out of vectorizing a loop due to disallowed conditional stores should better be done by buildVPlansWithVPRecipes() "before" computing their costs / independent of CM? (Independent of this patch).

fhahn marked an inline comment as done.May 9 2023, 6:55 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
283

Good point, I split off the parts of only moving the selectVectorizationFactor logic to LVP to D150197.

I'll update this patch once that is in, which should also drastically reduce the diff here.