This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Remove redundant blocks by merging them into predecessors.
ClosedPublic

Authored by fhahn on Dec 13 2022, 2:45 AM.

Details

Summary

Add and run VPlan transform to fold blocks with a single predecessor
into the predecessor. This remove redundant blocks and addresses a TODO
to replace special handling for the vector latch VPBB.

Diff Detail

Event Timeline

fhahn created this revision.Dec 13 2022, 2:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2022, 2:45 AM
fhahn requested review of this revision.Dec 13 2022, 2:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2022, 2:45 AM
Ayal added inline comments.Dec 20 2022, 6:05 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9158

nit: not sure what is meant here by "earlier" or improper maintenance of VPlan transforms?

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
313

Is RPOT needed (due to destructive / make_early_inc_range) or would depth_first suffice?

Suffice to iterate over blocksOnly<VPBasicBlock> and tryToMerge[Basic]BlockIntoPredecessor(VPBasicBlock)?

315

Is tryToMergeBlockIntoPredecessor() still needed (by buildVPlanWithVPRecipes()), or can this pass absorb it?

llvm/lib/Transforms/Vectorize/VPlanTransforms.h
43

Missing documentation. (Here are in few other transforms, independently of this patch)

fhahn updated this revision to Diff 485234.Dec 25 2022, 11:31 AM

Address latest comments, thanks!

fhahn marked 2 inline comments as done.Dec 25 2022, 12:04 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
313

Unforunately depth_first doesn't support make_early_inc_range at the moment, I am not sure if that's because of VPBlockRecursiveTraversalWrapper or something else.

Suffice to iterate over blocksOnly<VPBasicBlock> and tryToMerge[Basic]BlockIntoPredecessor(VPBasicBlock)?

Done, thanks!

315

Done, it's not needed any longer.

llvm/lib/Transforms/Vectorize/VPlanTransforms.h
43

Added, thanks!

fhahn marked 3 inline comments as done.Dec 25 2022, 12:34 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9158

I think this mainly referred to doing mergeBlocksIntoPredecessors on 2 individual blocks here and immediately after constructing the initial recipes. The latter has been removed as well now

barannikov88 added inline comments.
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
328

Q1: Can VPBB be a successor of itself? If it can, this is probably fine, as we (1) shouldn't reach here (unless VPBB is unreachable) and (2) reconnecting it to itself should do no harm as it is going to be deleted anyway. Worth asserting?
Q2: Can VPBB have references other than control flow edges? I'm asking because I see VPBasicBlock::dropAllReferences interface.

Ayal added inline comments.Dec 26 2022, 6:46 AM
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
313

Another option may be to scan depth_first and push candidates VPBB's onto a stack that is later traversed (succ before pred) to merge each candidate into its predecessor and delete it. This could potentially minimize RPOT copying overhead, although VPlan sizes are currently small(?)

fhahn updated this revision to Diff 485315.Dec 26 2022, 12:11 PM
fhahn marked an inline comment as done.

Adjust code to keep nodes to process in vector.

fhahn marked 2 inline comments as done.Dec 26 2022, 12:16 PM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
313

RPOT is not needed, but if we want to use existing graph iterators I think we need to do the traversal first and then process it.

I updated the code to cache depth-first traversal, to make clear RPOT is not needed. Alternatively we could do the traversal manually, but this would need more work if we want to traverse through regions.

328

. Q1: Can VPBB be a successor of itself? If it can, this is probably fine, as we (1) shouldn't reach here (unless VPBB is unreachable) and (2) reconnecting it to itself should do no harm as it is going to be deleted anyway. Worth asserting?

In theory there's nothing preventing that, but in practice all cycles should be represented as regions without explicit back-edge at the moment.

Q2: Can VPBB have references other than control flow edges? I'm asking because I see VPBasicBlock::dropAllReferences interface.

VPBB previously could have other users, but those have been removed AFAICT.

VPBasicBlock::dropAllReferences removes references to all recipes in the block.

Ayal added inline comments.Dec 26 2022, 1:09 PM
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
313

Ahh, right, candidates can also be visited in pred-before-succ order, or any other.

Original thought was to collect all VPBB's having a single BB predecessor which in turn has a single successor, as this collection is expected to be smaller than all blocks/basic-blocks, if not empty; plus one could return !empty() instead of Changed. But this is mostly a matter of style, i.e., separating the logic into inspection and execution. Current approach is also fine, what would you prefer?

fhahn updated this revision to Diff 485317.Dec 26 2022, 1:42 PM
fhahn marked 3 inline comments as done.

Update to only collect merge-able blocks.

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
313

Oh right, updated to only collect merge-able blocks, thanks!

Ayal accepted this revision.Dec 26 2022, 2:02 PM

Thanks!

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
318

nit: seems clearer w/o early-exit

if (PredVPBB && PredVPBB->getNumSuccessors() == 1)
  WorkList.push_back(VPBB);

?

This revision is now accepted and ready to land.Dec 26 2022, 2:02 PM