This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Fix in-loop reduction chains using VPlan def-use chains (NFCI)
ClosedPublic

Authored by fhahn on Jul 20 2023, 7:41 AM.

Details

Summary

Update adjustRecipesForReductions to directly use the VPlan def-use
chains for in-loop reductions to collect the reduction operations that
need adjusting.

This allows the removal of

  • ReductionChainMap
  • recording of recipes for instruction in the reduction chain
  • removes late uses of getVPValue
  • removes to need for removeVPValueFor.

Diff Detail

Event Timeline

fhahn created this revision.Jul 20 2023, 7:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2023, 7:41 AM
fhahn requested review of this revision.Jul 20 2023, 7:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2023, 7:41 AM
Herald added subscribers: wangpc, vkmr. · View Herald Transcript
Ayal added a comment.Jul 22 2023, 3:46 PM

Thanks for following-up on this nice cleanup!
Looking over the code raises various initial thoughts, which could be pursued in several steps/patches.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1611
1775–1777
1782–1783
9152

?
Can also filter out non-isOrdered phi's when VF=1, to then iterate only over phi's of interest.

9156

Bypass this for loop if MinVF.isScalar()?

9166–9167

Would one SetVector work?

This represents the Chain - should it hold Recipes as Links, each residing inside the loop and having a single user (except last with live-out), rather than VPValues?

9170

Worth a comment explaining what's going to happen next.

9183–9186
9189

Above initialization of Worklist tolerates recipes with multiple defined values, but here only single valued recipes are expected?

9192

Hoist Kind and its assert for being a select-compare kind next to setting RdxDesc above?

9195

IndexOfFirstOperand?
Initialize? Comment what this variable holds? (Admittedly missing from current code base.)

9205–9206

Dead assert.

Worth a comment why we continue here - to skip over the compare of the select?
What about Fminimum/Fmaximum reductions which are isMinMaxRecurrenceKind but have a call to an intrinsic as their WidenRecipe, rather than a compare-select?

9211

comment appears out of date.

9219
9225
9228

Is this setting of Chain needed?

9251

Can this setting of Chain continue to be placed at the end of the loop below?

Ayal added inline comments.Jul 22 2023, 3:46 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9187
9188

Could this for loop be replaced by a simpler for (auto Link : Chain) { loop?

9189
9190
9193
9197

Comment deserves update - FMulAdd is also supported.
Potentially renamed isAnyOfRecurrenceKind() in D155786.

9216

It's somewhat alarming that we're indifferent to which operand is VecOp, in non commutative cases such as compare-select. The semantics of the compare/select are presumably captured by the min/max kind, rather than the condition itself (or its negation). Worth a comment, and ensuring test coverage of having VecOp in both positions.

9219

Or rather BasicBlock *CurrentLinkBB = CurrentLinkI->getParent();

Can also define and reuse: VPBasicBlock *CurrentLinkVPBB = CurrentLink->getParent();

9232

Should this breaking of fmuladd into an fadd recurrence fed by an fmul take place separately, as a preparatory step?
Should compare-select pairs be replaced by Min/Max recipes?
In order to convert an (initially widened) reduction recurrence into an in-loop one, it seems that isolating the reduction operation into a single, dedicated operation would be useful. It could also help canonicalize where to find the reduced value operands, aka vecOps.

9241
9262

Why/Is this needed?

Ayal added inline comments.Jul 22 2023, 11:07 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9156

Bypass this for loop if MinVF.isScalar()?

Ah, isOrdered reductions are considered inLoop even for VF=1 (but only them).

fhahn updated this revision to Diff 544541.Jul 26 2023, 3:30 PM
fhahn marked 27 inline comments as done.

Address latest comments, thanks! I tried to include most adjustments directly in this patch to keep things together.

fhahn added inline comments.Jul 26 2023, 3:32 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1611

Adjusted, thanks!

1775–1777

Updated, thanks!

1782–1783

Updated, thanks!

9152

Adjusted and collect only VPReductionPHIRecipe

9166–9167

Updated to use SetVector and use VPRecipeBase as element type, thanks!

9170

added a comment, thanks!

9183–9186

Updated, Thanks!

9188

Updated to use iterator based loop which skips the first element (aka PreviousLink)

9189

Updated, thanks!

Above initialization of Worklist tolerates recipes with multiple defined values, but here only single valued recipes are expected?

Added an assert in the collection loop

9190

Updated, thanks!

9192

Hoisted out of loop, thanks!

9193

Updated , thanks!

9195

Renamed & added comment. It should be guaranteed to be initialized before use.

9197

Updated the message; any reduction should be allowed, besides select/cmp ones.

9205–9206

Will update to have a check for the cmp only. In-loop reductions won't be used for those intrinsics at the moment. Will make sure there's a test. Added tests with min-max intrinsics in cc3986643696, for which in-loop reductions aren't applied yet

9211

Updated to also include FMulAdd

9216

Yep, that looks right, added a comment, thanks!

Added test coverage with flipped operands in cc3986643696.

9219

Updated, thanks!

9225

Updated, thanks!

9228

No, removed ,thanks!

9232

Might be good as follow up?

9251

Yep, moved, thanks!

9262

Not needed in the latest version, removed!

Ayal added a comment.Jul 31 2023, 3:32 PM

Added some more comments to help clean this up more, can be applied now or later if preferred to keep the diff closer to a VPlan refactoring alone.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9164

nit: Should this filtering be applied above when inserting into InLoopReductionPhis?
Should unordered reductions be considered non-isInLoop in VPlans of VF=1?

9206

note: may assert that UserRecipe resides inside the vector loop, as the outside user is a LiveOut, at-least for now.

9208
9211

assert CurrentLink is a select recipe (whose first operand index is 1)?

9213

Treat FMulAdd separately from the binary operators - the operand along the chain is in index 2, go ahead and generate the FMul of indices 0 and 1 to produce VecOp? (Keeping the assert message intact ;-)

9215

nit: assert better placed above, asap, independent of filling Worklist.

9217

assert that VecOp != PreviousLinkV and that the other candidate does equal PreviousLinkV?
(CurrentLink->getOperand(IndexOfFirstOperand+IndexOfFirstOperand+1-VecOpId))

9221

Wonder if this information may already be recorded in VPlan - perhaps BlockInMask already exists?

9228

nit: would always considering the last two indices be (correct and) easier than considering either first two indices or indices 1 and 2? Excluding FMulAdd which deserves a separate treatment.

9231
9232

Sure, if not two.

9233–9236

assert VecOpId == 0?

9244

Could we use PreviousLink serve as/instead of CompareRecipe?

fhahn updated this revision to Diff 546239.Aug 1 2023, 3:11 PM
fhahn marked 11 inline comments as done.

Address latest comments, thanks!

fhahn added inline comments.Aug 1 2023, 3:12 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9164

Moved the filtering. Left scalar handling as is for now to keep with original code.

9206

Updated, thanks!

9208

Updated, thanks!

9211

added assert, thanks!

9213

Moved, thanks!

9215

Moved up, thanks!

9217

Added assert, thanks!

9221

Not easily yet, but I think as follow-up this could be handled by checking the predecessors of the block to find the predecessor with branch-on-mask.

9228

For the initial version, I think it would be good retain the original logic as much as possible. This also makes it easier to retain the asserts.

9233–9236

Not needed after code movement.

9244

Unfortunately I don't think so, but we can let dead-recipe removal take care of this removal. I removed the code.

Ayal added inline comments.Aug 1 2023, 4:17 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9170–9177
9186

?

9192
9194
9210

If VF=1 CurrentLink should be a Replicate instead of a Widen(Call) recipe?

Better place the above assert(RecurrenceDescriptor::isFMulAddIntrinsic(CurrentLinkI)) here?

9244

If PreviousLink is set to Cmp before early-continuing above?

OTOH, dead-recipe removal could also reclaim CurrentLink.

fhahn updated this revision to Diff 546406.Aug 2 2023, 4:26 AM
fhahn marked 9 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9170–9177

Reordered, thanks!

9186

Removed, thanks!

9192

Fixed, thanks!

9194

added, thanks!

9210

Moved assert and refined condition as suggested, thanks!

9231

Adjusted, thanks!

9244

Yep, updated, thanks!

Ayal accepted this revision.Aug 2 2023, 7:23 AM

This looks good to me, thanks!
Adding last minor nits.

Would be good to see how adjustRecipesForReductions() may become a VPlan2VPlan pass, and apply fmuladd decomposition / minmax composition as separate, preparatory steps.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9216

Note: in this case PreviousLink stays the recipe before the compare, feeding one of the operands of the select.

9237
9244

nit: may be worth noting somewhere that this transformation may leave dead recipes for a subsequent pass to clean up.

This revision is now accepted and ready to land.Aug 2 2023, 7:23 AM
This revision was landed with ongoing or failed builds.Aug 2 2023, 9:05 AM
This revision was automatically updated to reflect the committed changes.
fhahn marked 3 inline comments as done.
fhahn added inline comments.Aug 2 2023, 9:25 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9237

Sounds better, thanks!

9244

Added a note in the committed version, thanks!