This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
ClosedPublic

Authored by jmolloy on Aug 26 2016, 12:46 AM.

Details

Reviewers
spatel
hans
Summary

This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

if (a)
  x(1);
else if (b)
  x(2);

This produces the following CFG:

   [if]
  /    \
[x(1)] [if]
  |     | \
  |     |  \
  |  [x(2)] |
   \    |  /
    [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

    [if]
   /    \
 [x(1)] [if]
   |     | \
   |     |  \
   |  [x(2)] |
    \   /    |
[sink.split] |
      \     /
      [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy updated this revision to Diff 69325.Aug 26 2016, 12:46 AM
jmolloy retitled this revision from to [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches.
jmolloy updated this object.
jmolloy added reviewers: spatel, hans.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a subscriber: llvm-commits.
hans added inline comments.Aug 26 2016, 4:28 PM
lib/Transforms/Utils/SimplifyCFG.cpp
1463

(Ultra nit: I think 'f' is a better name for a random function than 'x' which is more commonly used for variables.)

1483

Would this scale to larger if-else chains where some subset of them have similar instructions? I assume not because the order of the if-statements makes this hard even if it doesn't really matter for semantics.

1496

Maybe call these Predecessors and UnconditionalPredecessors?

jmolloy added inline comments.Aug 28 2016, 10:11 AM
lib/Transforms/Utils/SimplifyCFG.cpp
1483

In general this is a very hard problem as you say.

I've been thinking that it should be possible to use a suffix tree or levenshtein distance to find the longest common substring in all incoming blocks. It should be possible to support the case where some blocks have a common subsequence and others don't...

Bit of a longer term blue sky thing though.

jmolloy updated this revision to Diff 69547.Aug 29 2016, 2:41 AM
jmolloy marked 2 inline comments as done.

Hi Hans,

Thanks for the comments! New version attached.

James

hans added inline comments.Aug 29 2016, 2:43 PM
lib/Transforms/Utils/SimplifyCFG.cpp
1497

Actually, is Preds even used after it's been populated?

If not, maybe this is all we need:

SmallVector<BasicBlock *, 4> UnconditionalPreds;
Instruction *ConditionalPred = nullptr;

and the logic in the for loop could be simplified a little:

if (T is unconditional)
  push to UnconditionalPreds
else if (T is conditional && !ConditionalPred)
  set ConditionalPred
else
  bail out
jmolloy updated this revision to Diff 69708.Aug 30 2016, 9:01 AM

Thanks Hans. I should have addressed all of your comments now.

hans accepted this revision.Aug 30 2016, 9:41 AM
hans edited edge metadata.

lgtm

This revision is now accepted and ready to land.Aug 30 2016, 9:41 AM
Eugene.Zelenko closed this revision.Oct 4 2016, 3:46 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL280364. Please specify "Differential revision: <URL>" in commit message.