This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Fix in-loop reduction chains using VPlan def-use chains (NFCI)
ClosedPublic

Authored by fhahn on Jul 20 2023, 7:41 AM.

Details

Summary

Update adjustRecipesForReductions to directly use the VPlan def-use
chains for in-loop reductions to collect the reduction operations that
need adjusting.

This allows the removal of

  • ReductionChainMap
  • recording of recipes for instruction in the reduction chain
  • removes late uses of getVPValue
  • removes to need for removeVPValueFor.

Diff Detail

Event Timeline

fhahn created this revision.Jul 20 2023, 7:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2023, 7:41 AM
fhahn requested review of this revision.Jul 20 2023, 7:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2023, 7:41 AM
Herald added subscribers: wangpc, vkmr. · View Herald Transcript
Ayal added a comment.Jul 22 2023, 3:46 PM

Thanks for following-up on this nice cleanup!
Looking over the code raises various initial thoughts, which could be pursued in several steps/patches.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1607
1770–1771
1775–1776
9134–9135

?
Can also filter out non-isOrdered phi's when VF=1, to then iterate only over phi's of interest.

9138

Bypass this for loop if MinVF.isScalar()?

9173–9174
9173–9174

Hoist Kind and its assert for being a select-compare kind next to setting RdxDesc above?

9173–9174

IndexOfFirstOperand?
Initialize? Comment what this variable holds? (Admittedly missing from current code base.)

9174–9175

Would one SetVector work?

This represents the Chain - should it hold Recipes as Links, each residing inside the loop and having a single user (except last with live-out), rather than VPValues?

9175

Above initialization of Worklist tolerates recipes with multiple defined values, but here only single valued recipes are expected?

9178

Worth a comment explaining what's going to happen next.

9179–9180

Dead assert.

Worth a comment why we continue here - to skip over the compare of the select?
What about Fminimum/Fmaximum reductions which are isMinMaxRecurrenceKind but have a call to an intrinsic as their WidenRecipe, rather than a compare-select?

9201–9202

comment appears out of date.

9230
9234–9235
9238

Is this setting of Chain needed?

9242

Can this setting of Chain continue to be placed at the end of the loop below?

Ayal added inline comments.Jul 22 2023, 3:46 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9173–9174
9174
9174

Could this for loop be replaced by a simpler for (auto Link : Chain) { loop?

9174–9177

Comment deserves update - FMulAdd is also supported.
Potentially renamed isAnyOfRecurrenceKind() in D155786.

9175
9176
9230

It's somewhat alarming that we're indifferent to which operand is VecOp, in non commutative cases such as compare-select. The semantics of the compare/select are presumably captured by the min/max kind, rather than the condition itself (or its negation). Worth a comment, and ensuring test coverage of having VecOp in both positions.

9230

Or rather BasicBlock *CurrentLinkBB = CurrentLinkI->getParent();

Can also define and reuse: VPBasicBlock *CurrentLinkVPBB = CurrentLink->getParent();

9237–9238

Should this breaking of fmuladd into an fadd recurrence fed by an fmul take place separately, as a preparatory step?
Should compare-select pairs be replaced by Min/Max recipes?
In order to convert an (initially widened) reduction recurrence into an in-loop one, it seems that isolating the reduction operation into a single, dedicated operation would be useful. It could also help canonicalize where to find the reduced value operands, aka vecOps.

9238–9239
9244

Why/Is this needed?

Ayal added inline comments.Jul 22 2023, 11:07 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9138

Bypass this for loop if MinVF.isScalar()?

Ah, isOrdered reductions are considered inLoop even for VF=1 (but only them).

fhahn updated this revision to Diff 544541.Jul 26 2023, 3:30 PM
fhahn marked 27 inline comments as done.

Address latest comments, thanks! I tried to include most adjustments directly in this patch to keep things together.

fhahn added inline comments.Jul 26 2023, 3:32 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1607

Adjusted, thanks!

1770–1771

Updated, thanks!

1775–1776

Updated, thanks!

9134–9135

Adjusted and collect only VPReductionPHIRecipe

9173–9174

Updated, Thanks!

9173–9174

Hoisted out of loop, thanks!

9173–9174

Renamed & added comment. It should be guaranteed to be initialized before use.

9174

Updated to use iterator based loop which skips the first element (aka PreviousLink)

9174

Updated , thanks!

9174–9175

Updated to use SetVector and use VPRecipeBase as element type, thanks!

9174–9177

Updated the message; any reduction should be allowed, besides select/cmp ones.

9175

Updated, thanks!

Above initialization of Worklist tolerates recipes with multiple defined values, but here only single valued recipes are expected?

Added an assert in the collection loop

9176

Updated, thanks!

9178

added a comment, thanks!

9179–9180

Will update to have a check for the cmp only. In-loop reductions won't be used for those intrinsics at the moment. Will make sure there's a test. Added tests with min-max intrinsics in cc3986643696, for which in-loop reductions aren't applied yet

9201–9202

Updated to also include FMulAdd

9230

Updated, thanks!

9230

Yep, that looks right, added a comment, thanks!

Added test coverage with flipped operands in cc3986643696.

9234–9235

Updated, thanks!

9237–9238

Might be good as follow up?

9238

No, removed ,thanks!

9242

Yep, moved, thanks!

9244

Not needed in the latest version, removed!

Ayal added a comment.Jul 31 2023, 3:32 PM

Added some more comments to help clean this up more, can be applied now or later if preferred to keep the diff closer to a VPlan refactoring alone.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9140–9172

nit: Should this filtering be applied above when inserting into InLoopReductionPhis?
Should unordered reductions be considered non-isInLoop in VPlans of VF=1?

9181
9183

note: may assert that UserRecipe resides inside the vector loop, as the outside user is a LiveOut, at-least for now.

9184

assert CurrentLink is a select recipe (whose first operand index is 1)?

9192

nit: assert better placed above, asap, independent of filling Worklist.

9203

Treat FMulAdd separately from the binary operators - the operand along the chain is in index 2, go ahead and generate the FMul of indices 0 and 1 to produce VecOp? (Keeping the assert message intact ;-)

9205

nit: would always considering the last two indices be (correct and) easier than considering either first two indices or indices 1 and 2? Excluding FMulAdd which deserves a separate treatment.

9231

assert that VecOp != PreviousLinkV and that the other candidate does equal PreviousLinkV?
(CurrentLink->getOperand(IndexOfFirstOperand+IndexOfFirstOperand+1-VecOpId))

9232

Wonder if this information may already be recorded in VPlan - perhaps BlockInMask already exists?

9237–9238

Sure, if not two.

9237–9238
9237–9238

assert VecOpId == 0?

9241

Could we use PreviousLink serve as/instead of CompareRecipe?

fhahn updated this revision to Diff 546239.Aug 1 2023, 3:11 PM
fhahn marked 11 inline comments as done.

Address latest comments, thanks!

fhahn added inline comments.Aug 1 2023, 3:12 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9140–9172

Moved the filtering. Left scalar handling as is for now to keep with original code.

9181

Updated, thanks!

9183

Updated, thanks!

9184

added assert, thanks!

9192

Moved up, thanks!

9203

Moved, thanks!

9205

For the initial version, I think it would be good retain the original logic as much as possible. This also makes it easier to retain the asserts.

9231

Added assert, thanks!

9232

Not easily yet, but I think as follow-up this could be handled by checking the predecessors of the block to find the predecessor with branch-on-mask.

9237–9238

Not needed after code movement.

9241

Unfortunately I don't think so, but we can let dead-recipe removal take care of this removal. I removed the code.

Ayal added inline comments.Aug 1 2023, 4:17 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9146–9153
9162

?

9168
9170
9183

If VF=1 CurrentLink should be a Replicate instead of a Widen(Call) recipe?

Better place the above assert(RecurrenceDescriptor::isFMulAddIntrinsic(CurrentLinkI)) here?

9241

If PreviousLink is set to Cmp before early-continuing above?

OTOH, dead-recipe removal could also reclaim CurrentLink.

fhahn updated this revision to Diff 546406.Aug 2 2023, 4:26 AM
fhahn marked 9 inline comments as done.

Address latest comments, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9146–9153

Reordered, thanks!

9162

Removed, thanks!

9168

Fixed, thanks!

9170

added, thanks!

9183

Moved assert and refined condition as suggested, thanks!

9237–9238

Adjusted, thanks!

9241

Yep, updated, thanks!

Ayal accepted this revision.Aug 2 2023, 7:23 AM

This looks good to me, thanks!
Adding last minor nits.

Would be good to see how adjustRecipesForReductions() may become a VPlan2VPlan pass, and apply fmuladd decomposition / minmax composition as separate, preparatory steps.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9206

Note: in this case PreviousLink stays the recipe before the compare, feeding one of the operands of the select.

9227
9241

nit: may be worth noting somewhere that this transformation may leave dead recipes for a subsequent pass to clean up.

This revision is now accepted and ready to land.Aug 2 2023, 7:23 AM
This revision was landed with ongoing or failed builds.Aug 2 2023, 9:05 AM
This revision was automatically updated to reflect the committed changes.
fhahn marked 3 inline comments as done.
fhahn added inline comments.Aug 2 2023, 9:25 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9227

Sounds better, thanks!

9241

Added a note in the committed version, thanks!