This is an archive of the discontinued LLVM Phabricator instance.

Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
ClosedPublic

Authored by asbirlea on Jun 14 2018, 5:09 PM.

Details

Summary

Two utils methods have essentially the same functionality. This is an attempt to merge them into one.

  1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
  2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor

Prior to the patch:

  1. MergeBasicBlockIntoOnlyPred

Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)

  1. MergeBlockIntoPredecessor

Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:

  1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp

Updated in this patch. No challenges.

  1. lib/CodeGen/CodeGenPrepare.cpp

Updated in this patch.

i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks

Some interesting aspects:

  • Since Pred is not deleted (BB is), the entry block does not need updating.
  • The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
  • isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
  • eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
  • adding some test owner as subscribers for the interesting tests modified: test/CodeGen/X86/avx-cmp.ll test/CodeGen/AMDGPU/nested-loop-conditions.ll test/CodeGen/AMDGPU/si-annotate-cf.ll test/CodeGen/X86/hoist-spill.ll test/CodeGen/X86/2006-11-17-IllegalMove.ll
  1. lib/Transforms/Scalar/JumpThreading.cpp

Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.

Diff Detail

Event Timeline

asbirlea created this revision.Jun 14 2018, 5:09 PM
dberris added inline comments.
lib/CodeGen/CodeGenPrepare.cpp
526

This could just be 'auto *BB =' since you mention BasicBlock already in the cast.

527–528

You could reverse this, so you end up with:

for (auto &Block : Blocks) {
  if (auto *BB = cast_or_null<BasicBlock>(Block)) {
    ...
  }
}
lib/Transforms/Utils/BasicBlockUtils.cpp
162

Consider Updates.emplace_back(DominatorTree::Delete, PredBB, BB) instead? Or does that not work because DominatorTree::UpdateType is an aggregate?

dmgreen added inline comments.Jun 16 2018, 3:08 PM
lib/Transforms/Utils/BasicBlockUtils.cpp
137–139

This and the previous block is just (PredBB->getUniqueSuccessor() == BB)?

166

I see this comes from the other function, but PredBB should only have one successor (BB), if I am reading this correctly. So this should always be true.

I think this makes sense. Except Jumpthreading needs to be modified to make sure iterator is valid after merging. I can work on the jump threading part once this lands and if you do not mind @brzycki.

kuhar added a subscriber: kuhar.Jun 17 2018, 1:24 PM
asbirlea updated this revision to Diff 151758.Jun 18 2018, 11:51 AM
asbirlea marked 5 inline comments as done.

Address comments.

Thank you for the comments!

lib/CodeGen/CodeGenPrepare.cpp
527–528

Agreed, but I favored reducing nesting here for easier readability.

lib/Transforms/Utils/BasicBlockUtils.cpp
137–139

This method is expected to also merge blocks when there are multiple edges from PredBB to BB.
Example: test/Transforms/SimplifyCFG/2003-08-17-FoldSwitch.ll: test3

Start:          ; preds = %0
  switch i32 3, label %TheDest [
  i32 0, label %TheDest
  i32 1, label %TheDest
  i32 2, label %TheDest
  i32 5, label %TheDest
  ]
TheDest:
  ret i32 1234
162

Being an aggregate should still work, but it doesn't link because it can't find Delete, declared as static constexpr in DominatorTreeBase. The options I got are:

  • Instead of DominatorTree::Delete use the "original" enum which is DomTreeBuilder::UpdateKind::Delete, but that's icky to expose.
  • Use a temporary local variable auto tmp = DominatorTree::Delete, and use that in the emplace_back.
  • Use push_back.

I'm inclined to just keep the push_back here, but let me know if you think otherwise.

166

Agreed, updated.

You may need to update Transforms/LoopSimplifyCFG/scev.ll now too (which is hopefully simple). Otherwise this looks sensible to me.

lib/Transforms/Utils/BasicBlockUtils.cpp
137–139

I believe this is what getUniqueSuccessor does (as opposed to getSingleSuccessor). You are not making things any worse by leaving it as it is though.

asbirlea updated this revision to Diff 151947.Jun 19 2018, 11:03 AM

Address comments and rebase.

asbirlea marked 4 inline comments as done.Jun 19 2018, 11:08 AM

Updated scev.ll to be fully parametric (vs updating one label). All other tests should use variables as well, but that can be a separate cleanup.

lib/Transforms/Utils/BasicBlockUtils.cpp
137–139

Ah, I missed the difference from getSingleSuccessor. Thank you for following up!

dmgreen accepted this revision.EditedJun 20 2018, 1:20 AM

LGTM, thanks.

(Providing no one has any problems with the rest of these test changes)

This revision is now accepted and ready to land.Jun 20 2018, 1:20 AM
chandlerc requested changes to this revision.Jun 20 2018, 11:23 AM

Still a few things left, but really close.

lib/CodeGen/CodeGenPrepare.cpp
522

You can use a range based loop here now that you're putting them into a temporary array.

598–605

Due to building a WeakTrackingVH, may not work.

But why not use a range based loop below? should be safe now.

778–779

This seems like a bug then -- we can't print the SinglePred after it is deleted...

This revision now requires changes to proceed.Jun 20 2018, 11:23 AM
asbirlea updated this revision to Diff 152150.Jun 20 2018, 1:40 PM
asbirlea marked an inline comment as done.

Address comments.

asbirlea marked 4 inline comments as done.Jun 20 2018, 1:41 PM

Thank you for the review!

lib/CodeGen/CodeGenPrepare.cpp
778–779

Updated comment, SinglePred is not deleted.

kuhar added inline comments.Jun 20 2018, 2:07 PM
lib/Transforms/Utils/BasicBlockUtils.cpp
162

FYI: Updates.push_back({DominatorTree::Delete, PredBB, BB}); is used in almost all places that do DT updates. I'm not sure what benefit emplace_back would have over push_back in this case.

asbirlea marked 3 inline comments as done.Jun 20 2018, 2:13 PM
asbirlea added inline comments.
lib/Transforms/Utils/BasicBlockUtils.cpp
162

Ack, thank you for the input! Leaving as is.

This revision is now accepted and ready to land.Jun 20 2018, 2:19 PM
This revision was automatically updated to reflect the committed changes.
asbirlea marked an inline comment as done.

My apologies for missing this review, for some reason I wasn't notified of being added. I will look into the potential replacement of MergeBlockIntoPredecessor for JumpThreading.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2019, 8:51 AM

@brzycki: Great, thank you!

Am I right that there is yet another method doing the same thing?
lib/Transforms/Utils/LoopUnroll.cpp:foldBlockIntoPredecessor

/// Folds a basic block into its predecessor if it only has one predecessor, and
/// that predecessor only has one successor.
/// The LoopInfo Analysis that is passed will be kept consistent.
BasicBlock *llvm::foldBlockIntoPredecessor(BasicBlock *BB, LoopInfo *LI,
                                           ScalarEvolution *SE,
                                           DominatorTree *DT) {