This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Initial modeling of runtime VF * UF as VPValue.
AbandonedPublic

Authored by fhahn on Aug 7 2023, 12:52 PM.

Details

Summary

This patch starts initial modeling of runtime VF * UF in VPlan.
Initially, introduce a dedicated RuntimeVFxUF VPValue, which is then
populated during VPlan::prepareToExecute. Initially, the runtime
VF * UF applies only to the main vector loop region. Once we extend the
scope of VPlan in the future, we may want to associate different VFxUFs
with different vector loop regions (e.g. the epilogue vector loop)

This allows explicitly parameterizing recipes that rely on the runtime
VF * UF, like the canonical induction increment. At the moment, this
mainly helps to avoid generating some duplicated calls to vscale with
scalable vectors. It should also allow using EVL as induction increments
explicitly in D99750. Referring to VF * UF is also needed in other
places that we plan to migrate to VPlan, like the minimum trip count
check during skeleton creation.

The first version creates the value for VF * UF directly in
prepareToExecute to limit the scope of the patch. A follow-on patch will
model VF * UF computation explicitly in VPlan using recipes.

Note: Not all tests have been updated yet. I will update the remaining
tests once we are happy with the overall structure of the patch.

Diff Detail

Event Timeline

fhahn created this revision.Aug 7 2023, 12:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2023, 12:52 PM
fhahn requested review of this revision.Aug 7 2023, 12:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2023, 12:52 PM
ABataev added inline comments.Aug 7 2023, 12:58 PM
llvm/lib/Transforms/Vectorize/VPlan.h
2560

const member function

Matt added a subscriber: Matt.Aug 7 2023, 3:44 PM
fhahn updated this revision to Diff 548207.Aug 8 2023, 7:09 AM

Update printing of CanonicalIVIncrement to print both operands.

fhahn marked an inline comment as done.Aug 8 2023, 7:10 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlan.h
2560

Unfortunately this causes problems with various users which expect non-const return value.

ABataev added inline comments.Aug 8 2023, 7:19 AM
llvm/lib/Transforms/Vectorize/VPlan.h
2560

But can you return non-const return value but using const function?
Like

VPValue &getRuntimeVFxUF() const {return RuntimeVFxUF;}
llvm/lib/Transforms/Vectorize/VPlan.h
2479

might be better to clarify that it's loop invariant VPValue.
With @ABataev patch in, we may followup to unify EVL and RuntimeVFxUF. Does it sound reasonable ?

ssinad added a subscriber: ssinad.Aug 11 2023, 3:07 PM
fhahn marked 3 inline comments as done.Aug 25 2023, 1:01 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlan.h
2479

might be better to clarify that it's loop invariant VPValue.

Updated, thanks!

With @ABataev patch in, we may followup to unify EVL and RuntimeVFxUF. Does it sound reasonable ?

Sounds good. I think another suggestion/option that @Ayal had was to keep the canonical induction increment fixed (incrementing by a variable value would make it non-canonical) to control the loop and use EVL to only control the lanes accessed by (vector) recipes.

2560

Unfortunately this yields the following error as this would return a mutable reference to a member of a const pointer.

llvm/lib/Transforms/Vectorize/VPlan.h:2560:45: error: binding reference of type 'VPValue' to value of type 'const VPValue' drops 'const' qualifier
  VPValue &getRuntimeVFxUF() const { return RuntimeVFxUF; }
                                            ^~~~~~~~~~~~
fhahn updated this revision to Diff 553584.Aug 25 2023, 1:02 PM
fhahn marked 2 inline comments as done.

Update comment to state that RuntimeVFxUF is loop-invariant.

llvm/lib/Transforms/Vectorize/VPlan.h
2479

Can you please elaborate needs of having canonical loop form out of the vectorizer ?

Ayal added a comment.Thu, Nov 30, 6:26 AM

Overall approach and structure of the patch look good to me!

Perhaps instead of "RuntimeVFxUF" the VPValue should be called "VectorStep", complementing "VectorTripCount", essentially referring to the loop's strip-mine factor. This step is invariant and may be either a Compile-time constant or Runtime constant, depending on whether VF is Fixed or Scalable, respectively. Down the road, when a VPlan is restricted to a single Fixed VF (either having a singleton range to begin with, or once a best VF is chosen), constant-propagation could replace VectorStep with its immediate value.

Now that CanonicalIVIncrement is simply adding two VPValues together, could it be replaced by a VPInstruction having Instruction::Add as its opcode?

It may be helpful to initially try and keep VPlan printing in tact and also continue to introduce calls to llvm.vscale in-loop, in order to reduce initial changes to tests, to be followed by VPlanPrinting update and LICM'ing of these calls.

The VPValue should essentially belong to its Region rather than to the VPlan, but this probably deserves to be addressed as a follow-up: it holds for VectorTripCount as well, and also relates to non-loop Regions - whose Trip-count is essentially the Step of their enclosing loop Region, both resulting from the same strip-mining.

llvm/lib/Transforms/Vectorize/VPlan.cpp
867

Could end up w/o any users, if constant-propagated?

llvm/test/Transforms/LoopVectorize/reduction-inloop-cond.ll
507

Is this change substantial, replacing the original "CHECK-NOT: @llvm.vector.reduce.and" with series of CHECK-NEXT's?
Here and below.