This is an archive of the discontinued LLVM Phabricator instance.

[LV] Move reduction PHI node fixup to VPlan::execute (NFC).
ClosedPublic

Authored by fhahn on Apr 8 2021, 8:03 AM.

Details

Summary

All information to fix-up the reduction phi nodes in the vectorized loop
is available in VPlan now. This patch moves the code to do so, to make
this clearer. Fixing up the loop exit value still relies on other
information and remains outside of VPlan for now.

Diff Detail

Unit TestsFailed

Event Timeline

fhahn created this revision.Apr 8 2021, 8:03 AM
fhahn requested review of this revision.Apr 8 2021, 8:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2021, 8:03 AM
Herald added a subscriber: vkmr. · View Herald Transcript
Ayal added inline comments.May 3 2021, 12:47 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9625

better turn on an IsOrdered bit inside VPReductionRecipe during its creation or planning, than during its execute? Admittedly goes back to D98435.

There are 3 types of reductions: (1) parallel lanes: each lane computes an independent partial-sum; (2) parallel parts: each vector computes an independent partial sum, in-loop within vectors but not across parts, and (3) serial: in-loop across vectors and parts, the header phi operates on a single scalar. Would be good to explicitly indicate which type of reduction is chosen, in both ReductionRecipe(s) and the header phi; it effects their code-gen.

llvm/lib/Transforms/Vectorize/VPlan.cpp
63

Defining a method of LVP inside VPlan.cpp seems odd, at best.

The use of a gating EnableStrictReductions knob (in contrast to a CM cost-based decision), belongs to Legal rather than CM or Planner? I.e., Legal should bail out from vectorizing a loop having an ordered reduction, when !EnableStrictReductions? Admittedly goes back to D98435.

883

Moving this to VPlan::execute is great!
Can use getSingleVPValue(), getBackedgeValue().
May be worth explaining that Val is first set per-part supporting parallel partial sums, either vector/per-lane (1) or scalar/per-part (2), and then set to last-part supporting serial reduction (3).
Admittedly goes back to D98435.

fhahn updated this revision to Diff 343863.May 8 2021, 1:54 PM

Rebased and updated the code to check whether to use ordered reductions during codegen to get the VPReductionRecipe and check isOrdered on the reduction descriptr. This builds on the code to determine whether a VPWidenPHI is part of an in-loop reduction.

fhahn added inline comments.May 8 2021, 1:58 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9625

better turn on an IsOrdered bit inside VPReductionRecipe during its creation or planning, than during its execute? Admittedly goes back to D98435.

I think we already have all the required information available with the existing pieces. I updated the patch to check isOrdered on the recurrence descriptor of the VPReductionRecipe and a helper to get the VPReductionRecipe for a VPWidenPHIRecipe, if there's any.

llvm/lib/Transforms/Vectorize/VPlan.cpp
63

Agreed! I updated the code to check isOrdered on the reduction recurrence descriptor directly during code generation. To do so, I evolved the isInLoopReduction helper to an getInLoopReduction member that returns the VPReductionRecipe, if there's one.

883

I rebased the code, and it now uses getVPSingleValue/getBackedgeValue. I can add the additional comments separately or in this change.

fhahn updated this revision to Diff 363313.Jul 31 2021, 12:30 PM

Rebased after recent changes, should be much simpler now :)

Ayal added inline comments.Aug 1 2021, 11:01 PM
llvm/lib/Transforms/Vectorize/VPlan.cpp
871

Place this next to hooking up incoming of FOR header phi's above, or even fuse both into a common loop over header phi's? The isOrdered case does behave quite similar to FOR phi's.

fhahn updated this revision to Diff 363573.Aug 2 2021, 2:15 PM

Merge FOR and reduction handling loops as suggested, thanks!

fhahn marked an inline comment as done.Aug 2 2021, 2:16 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlan.cpp
871

done, thanks!

Ayal added inline comments.Aug 3 2021, 4:20 PM
llvm/lib/Transforms/Vectorize/VPlan.cpp
824

The isOrdered case does behave quite similar to FOR phi's.

How about having a common treatment, e.g.,

for (VPRecipeBase &R : Header->phis()) {
  auto *PhiR = dyn_cast<VPWidenPHIRecipe>(&R);
  if (!PhiR)
    continue;
  // For first-order recurrences and in-order reduction phis, only a single part
  // is generated, which provides the last part from the previous iteration.
  // Otherwise all UF parts are generated.
  bool IsOrdered = (isa<VPFirstOrderRecurrencePHIRecipe>(&R)
                    || cast<VPReductionPHIRecipe>(&R)->isOrdered());
  unsigned LastPartForNewPhi = IsOrdered ? 1 : State->UF;
  for (unsigned Part = 0; Part < LastPartForNewPhi; ++Part) {
    Value *VecPhi = State->get(PhiR->getVPSingleValue(), Part);
      // Or suffice State->get(PhiR, Part)?
    Value *Val = State->get(PhiR->getBackedgeValue(),
                            IsOrdered ? State->UF - 1 : Part);
    cast<PHINode>(VecPhi)->addIncoming(Val, VectorLatchBB);
  }

?

fhahn updated this revision to Diff 364552.Aug 5 2021, 11:02 AM
fhahn marked an inline comment as done.

Merge FOR & reduction phi handling as suggested, thanks!

fhahn added inline comments.Aug 5 2021, 11:03 AM
llvm/lib/Transforms/Vectorize/VPlan.cpp
824

Looks good, updated with a small re-naming. of IsOrdered -> SinglePartNeeded and a check to limit to FOR and reduction PHIs only.

Ayal accepted this revision.Aug 5 2021, 1:18 PM

Thanks! This looks good to me ;-)

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4309

nit: place above comment about start value where it belongs?

This revision is now accepted and ready to land.Aug 5 2021, 1:18 PM
fhahn marked an inline comment as done.Aug 6 2021, 12:17 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4309

I'll move it to VPReductionPHIRecipe::execute, which creates & sets the start value.

This revision was automatically updated to reflect the committed changes.
fhahn marked an inline comment as done.