This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics
ClosedPublic

Authored by Josh on Apr 18 2016, 11:51 PM.

Details

Summary

Make it possible that TryToSimplifyUncondBranchFromEmptyBlock merges empty basic block including lifetime intrinsics as well as phi nodes and unconditional branch into its successor or predecessor(s).

If successor of empty block has single predecessor, all contents including lifetime intrinsics are sinked into the successor. Otherwise, they are hoisted into its predecessor(s) and then merged into the predecessor(s) like below example codes.

Input IR:

define void @foo(i1 %x, i1 %y) {
entry:

%a = alloca i32, align 4
br label %while.cond

while.cond: ; preds = %if.end, %entry

br i1 %y, label %while.body, label %bb

while.body: ; preds = %while.cond

%b = bitcast i32* %a to i8*
call void @llvm.lifetime.start(i64 4, i8* %b) #1
%d = load i32, i32* %a, align 4
br i1 %x, label %if.then, label %if.else

if.then: ; preds = %while.body

%e = add i32 %d, 1
store i32 %e, i32* %a, align 4
br label %if.end

if.else: ; preds = %while.body

%g = sub i32 %d, 1
store i32 %g, i32* %a, align 4
br label %if.end

if.end: ; preds = %if.else, %if.then

%c = bitcast i32* %a to i8*
call void @llvm.lifetime.end(i64 4, i8* %c) #1
br label %while.cond

bb: ; preds = %while.cond

ret void

}

Optimized output IR by SimplifyCFG:
define void @foo(i1 %x, i1 %y) {
entry:

%a = alloca i32, align 4
br label %while.cond

while.cond: ; preds = %if.then, %if.else, %entry

br i1 %y, label %while.body, label %bb

while.body: ; preds = %while.cond

%b = bitcast i32* %a to i8*
call void @llvm.lifetime.start(i64 4, i8* %b)
%d = load i32, i32* %a, align 4
br i1 %x, label %if.then, label %if.else

if.then: ; preds = %while.body

%e = add i32 %d, 1
store i32 %e, i32* %a, align 4
%c = bitcast i32* %a to i8*
call void @llvm.lifetime.end(i64 4, i8* %c)
br label %while.cond

if.else: ; preds = %while.body

%g = sub i32 %d, 1
store i32 %g, i32* %a, align 4
%0 = bitcast i32* %a to i8*
call void @llvm.lifetime.end(i64 4, i8* %0)
br label %while.cond

bb: ; preds = %while.cond

ret void

}

Diff Detail

Repository
rL LLVM

Event Timeline

Josh updated this revision to Diff 54165.Apr 18 2016, 11:51 PM
Josh retitled this revision from to [SimplifyCFG] Extend TryToSimplifyUncondBranchFromEmptyBlock for empty block including lifetime intrinsics.
Josh updated this object.
Josh added a reviewer: bkramer.
Josh added a subscriber: llvm-commits.
hans edited edge metadata.Apr 22 2016, 9:44 AM

I've put some comments in the code, but my main concern is that this patch needs a lot of test cases.

include/llvm/IR/BasicBlock.h
159 ↗(On Diff #54165)

nit: The formatting of this comment doesn't look right.

lib/IR/BasicBlock.cpp
216 ↗(On Diff #54165)

The getFirstNonPHIOrDbgOrLifetime() function also considers Intrinsic::lifetime_start. Should this function do that too?

lib/Transforms/Utils/Local.cpp
799 ↗(On Diff #54165)

No need to repeat the name of the function in the comment. This applies to the other functions as well.

803 ↗(On Diff #54165)

How about writing this with a range-based for-loop instead?

816 ↗(On Diff #54165)

No need for the the ?: operator here.
In fact, how about just:

return isa<PHINode>(BB->begin());

And that means maybe we don't need a separate function for this? For example, on line 977 we already do the same kind of check with "if (isa<PHINode>(Succ->begin())) {"

825 ↗(On Diff #54165)

No need to declare these before they're used.

830 ↗(On Diff #54165)

You could use a range-based for-loop over predecessors(BB) instead

834 ↗(On Diff #54165)

Would it be sufficient to just check "BI->getSingleSuccessor()"?

839 ↗(On Diff #54165)

Some of these "else" and "return false" lines seem unnecessary.. how about something like:

for (each predecessor P) {
  if (not P has single successor)
    return false;
}
846 ↗(On Diff #54165)

Again, range-based for-loop would be nice.

849 ↗(On Diff #54165)

I'm not sure this extra logic to splice instead of clone the instruction into the predecessor is worth it.

855 ↗(On Diff #54165)

I suppose this is because we might have moved I into a predecessor? This makes the loop hard to follow. If you used cloned the instruction into the predecessor, there would be no need for this, right?

873 ↗(On Diff #54165)

This all looks very complicated.

881 ↗(On Diff #54165)

Didn't we check earlier the contents of the BB so we know it can be hoisted? In that case there shouldn't be a need for all this checking and "return false" statements. Can you just add asserts that the instructions are what we expect, and then clone them to the predecessors?

905 ↗(On Diff #54165)

You could use a range-based for-loop over predecessors(BB) instead, and successors(BB) in the loop below.

917 ↗(On Diff #54165)

Didn't we check above that BB has no phi instructions?

Josh updated this revision to Diff 55142.Apr 26 2016, 5:59 PM
Josh removed a reviewer: flyingforyou.
Josh marked 16 inline comments as done.
Josh added a subscriber: flyingforyou.

According to Hans's comments, I updated this patch as followings.

  • Add test cases
  • Consider Intrisic:lifetime_start
  • Make codes more clear

Thanks.

hans added a comment.Apr 28 2016, 10:15 AM

Thanks! This is looking much better :-)

Some comments below.

lib/IR/BasicBlock.cpp
211 ↗(On Diff #55142)

No need to declare this until its use, further below.

But I also think maybe we can do without it; see below.

221 ↗(On Diff #55142)

I think we can do away with the TI variable by just doing something like:

if (auto *BCI = dyn_cast<BitCastInst>(&I)) {
  if (auto *II = dyn_cast<IntrinsicInst>(++I.getIterator())) {
    if ((II->getIntrinsicID() == Intrinsic::lifetime_start ||
         II->getIntrinsicID() == Intrinsic::lifetime_end) &&
        II->getOperand(1) == BCI) {
      continue;
    }
  }
}
lib/Transforms/Utils/Local.cpp
799 ↗(On Diff #55142)

Still no need to repeat the name in the comment. Just write this as

/// Return true if BB has lifetime.end intrinsic.

This applies to the functions you're declaring below as well.

I know older functions do repeat the name in their comment, but it's unnecessary and we should stop doing that.

802 ↗(On Diff #55142)

I would suggest putting braces around the for-loop body. Even though it's technically one statement, it's several lines long so braces will help make it easier to read.

826 ↗(On Diff #55142)

clone() is a member of Instruction, so there's no need to go via the iterator here. Just

I.clone()->insertBefore(Pred->getTerminator());

should do it.

835 ↗(On Diff #55142)

I'm not super happy with this code..

First, there's no need to iterate backwards to look for the IntrinsicInst we just inserted. We could do something like:

Instruction *NewII = I->clone();
NewII->insertBefore(pred->getTerminator());

and then use NewII below.

Also, even if the instruction before NewII is a BitCast, can we be sure it's *the right* BitCast, i.e. the one we cloned earlier? I wish we could do this in a more principled way..

I think it would be better if we could clone the bitcast and intrinsic together. How about we ignore the "if (isa<BitCastInst>)" case above, and then do something like this here:

for (auto Pred : predecessors(BB)) {
  Instruction *NewII = I->clone();
  NewII->insertBefore(Pred->getTerminator());

  if (I != BB->begin()) {
    if (auto BC = dyn_cast<BitCastInst>(--I->getIterator())) {
      assert(BC == I->getOperand(1));
      auto NewBC = BC->clone();
      NewBC->insertBefore(NewII);
      NewII->setOperand(1, NewBC);
    }
  }
}
test/Transforms/SimplifyCFG/lifetime.ll
31 ↗(On Diff #55142)

This is great, thanks!

It would be good to have a few tests that cover the situations when the block *cannot* be removed as well.

Josh updated this revision to Diff 55517.Apr 28 2016, 6:23 PM
Josh marked 7 inline comments as done.

Hans, first of all, thanks for your kind comments.

Your suggested codes are very efficient and highly readable.
I followed your guide for all commented codes.

Thanks

hans accepted this revision.Apr 29 2016, 4:33 PM
hans edited edge metadata.

lgtm

Do you have commit access, or do you need someone to commit it for you?

This revision is now accepted and ready to land.Apr 29 2016, 4:33 PM
Josh updated this revision to Diff 55768.May 1 2016, 8:41 PM
Josh edited edge metadata.

Hans, thank you for your quick and kind review.

I do not have commit access. So, if you don't mind, would you please commit this code for me?
After I get the commit access, I'll do by myself.

(This diff revision is for fix formatting of comment)

Thanks.

This revision was automatically updated to reflect the committed changes.