# Improving the efficiency of the Loop Unswitching passClosedPublicActions

Authored by Abhilash on Nov 4 2016, 2:52 AM.

# Details

Summary

{F2564183}The existing Loop Unswitching Pass generates an exponential number of loops with respect to the number of unswitched branches.
For any loop, the number of generated loops are n^2 where n is the number of unswitched branches.
The exponential nature of the transformation is an obstacle in generating better unswitched code.

Consider the following testcase:

1: for(int i=0;i<100;i++){
2: if (m)
3: a[i]*=2;
4: else if (n)
5: a[i]*=3;
6:
7: a[i]*=4;
8: }

m and n are loop invariants.

The existing algorithm generates 4 unswitched loops for the following cases:

1. Condition at line 2 is true and Condition at line 4 is true.
2. Condition at line 2 is true and Condition at line 4 is false.
3. Condition at line 2 is false and Condition at line 4 is true.
4. Condition at line 2 is false and Condition at line 4 is false.

However, due to the very nature of the else-if chain, there exists a control dependency between Conditions at line 2 and 4. When the Condition at line 2 is true, the Condition at line 4 is never taken and hence the outcome of the Condition at line 4 is irrelevant (don't care scenario), in this case.
This has not been accounted for, in the existing infrastructure. Currently, two similar loops are generated for cases 1 and 2 leading to redundancy and inefficiency.

It is sufficient, if we handle the following three cases:

1. Condition at line 2 is true.
2. Condition at line 2 is false and Condition at line 4 is true.
3. Condition at line 2 is false and Condition at line 4 is false.

One of the possible solutions, is to find the unreachable branches and eliminate them before unswitching.

The attached patch has the solution implemented in the Loop Unswitching Pass. The choice of this solution (among other possible solutions) is for the following reasons:

1. Eliminating unreachable branches so that many other branches are unswitched out of their enclosing loops (probably, better code).
2. Difficult to handle in the existing CFG Simplification Pass.

# Diff Detail

Repository
rL LLVM

### Event Timeline

Abhilash updated this revision to Diff 76914.Nov 4 2016, 2:52 AM
Abhilash retitled this revision from to Improving the efficiency of the Loop Unswitching pass.
Abhilash updated this object.
Abhilash added reviewers: hfinkel, lattner, atrick.
Abhilash set the repository for this revision to rL LLVM.
Abhilash added a subscriber: llvm-commits.
mzolotukhin edited edge metadata.Nov 4 2016, 2:57 PM

Hi,

Some initial comments, unrelated to the patch idea:

1. Please upload the patch with the full context (see http://llvm.org/docs/Phabricator.html).
2. Please run clang-format on it: some lines are too long.

Thanks,
Michael

lib/Transforms/Scalar/LoopUnswitch.cpp
624 ↗(On Diff #76914)

already break in line 618?

lattner resigned from this revision.Nov 5 2016, 7:10 PM
lattner removed a reviewer: lattner.
Abhilash updated this revision to Diff 76988.Nov 6 2016, 9:32 AM
Abhilash updated this object.
Abhilash edited edge metadata.

The following changes have been incorporated in the updated diff:
As suggested by Michael Zolothukin:

2. Clang-format has been applied.

As suggested by Weiming Zhao:

1. Removed the redundant conditional break on 'isUnreachable'.
This comment was removed by Abhilash.
weimingz added inline comments.Nov 8 2016, 10:40 AM
lib/Transforms/Scalar/LoopUnswitch.cpp
611 ↗(On Diff #76988)

Does the True destination and False destination get swapped?

Hi,

Please find some comments inline. Also, you still need to add a test for this change.

Michael

lib/Transforms/Scalar/LoopUnswitch.cpp
596–600 ↗(On Diff #76988)

Please use the same style for comments as in the rest of the file:

```// Some branches ...
// ...
// ...```
607 ↗(On Diff #76988)

Maybe rewrite it as

```if (!BI || !BI->isConditional())
continue;
...```

?

Then we can also continue if Cond is not a constant.

609 ↗(On Diff #76988)

s/NULL/nullptr/

611 ↗(On Diff #76988)

AFAIU, Succ is the unreachable successor. Probably, another name would work better here, because indeed, it's confusing.

630–631 ↗(On Diff #76988)

I think this message would be confusing. We either need to provide more details (like why is it unreachable and under which conditions), or drop it completely.

Abhilash updated this revision to Diff 77726.EditedNov 12 2016, 9:20 AM
Abhilash marked an inline comment as done.

Hi Michael Zolotukhin,

I have made the suggested changes and have added the testcase (earlier, I mistook about the testcase and attached my testcase with this revision).

Thanks,
Abhilash Bhandari

sanjoy added a subscriber: sanjoy.Nov 13 2016, 4:44 PM

Hi Abhilash,

Thanks for working on this, please find some comments below.

Michael

lib/Transforms/Scalar/LoopUnswitch.cpp
603 ↗(On Diff #77726)

Nit: we have a BI variable, and here we create BInst. Can we use a more descriptive name for one of them to help distinguishing them?

test/Transforms/LoopUnswitch/elseif-non-exponential-behavior.ll
2 ↗(On Diff #77726)

Please don't run entire -O3 pipeline in the test. It's slow and makes the test fragile (because passes in the pipeline change the input).

You can get IR just before loop-unswitching and use it as the test (and in the test run opt -loop-unswitch without -O3). To get full list of passes in O3 pipeline, you can run opt -O3 -debug-pass=Arguments.

Also, it's better to check what exactly what we get after loop-unswitching, not just look for a debug message. We use FileCheck tool for this, and annotate tests with special CHECK statements. The documentation is available here: http://llvm.org/docs/CommandGuide/FileCheck.html, and you can take a look at existing loop-unswitching tests if you need an example.

anna added a subscriber: anna.Nov 14 2016, 2:31 PM

Hi Abhilash,

I've added some comments mainly related to stylistic changes. Please have a look.

Thanks,
Anna

lib/Transforms/Scalar/LoopUnswitch.cpp
620–621 ↗(On Diff #77726)

Isn't it enough to check if the basic block UnreachableSucc dominates I->getParent()?

627 ↗(On Diff #77726)

I think this looks a bit confusing. isUnreachable is only set to true once inside the while loop, where we break from it. I think it maybe better to separate the while loop into another function/lambda such as bool isBranchUnreachable() and change all the breaks into return false. This way it is easier to read when you have more cases for unreachable branches.

635–636 ↗(On Diff #77726)

There's only formatting changes here right? Usually, in diff's it's better to avoid formatting changes not related to your code. I think this case is okay (since it's just a single one).

If you're using git, there's some script to clang-format *just your changes*. See git clang-format.

test/Transforms/LoopUnswitch/elseif-non-exponential-behavior.ll
3 ↗(On Diff #77726)

Also, if there are any test cases that the current code will not catch, it will be good to add those. This will help future developers on what would be good cases to work on.

Abhilash updated this revision to Diff 78466.Nov 17 2016, 9:46 PM
Abhilash marked 2 inline comments as done.

Hi Michael and Anna,

All the suggested changes have been incorporated.
The testcase now includes CHECK patterns.
The patterns to be checked are kept as minimal as possible.

Thanks,
Abhilash

anna added inline comments.Nov 18 2016, 2:24 PM
lib/Transforms/Scalar/LoopUnswitch.cpp
510 ↗(On Diff #78466)

Prior diff question:
Isn't it enough to check if the basic block UnreachableSucc dominates I->getParent()?
i.e. `DT->dominates(UnreachableSucc, BB).
Is there something else subtle here, that I'm missing?
IIUC, the algorithm keeps traversing backwards to an immediate dominator of the current DomBB, and checks if it's an unreachable block.

512 ↗(On Diff #78466)

Is this clang-formatted? I think single statements within if-conditions dont need braces.

Hi Abhilash,

Please find some stylistic comments from me as well.

Thanks,
Michael

lib/Transforms/Scalar/LoopUnswitch.cpp
488 ↗(On Diff #78466)

Please rename this function, as the name is to generic now, and LLVM already uses term unreachable with another meaning. Here the block is rendered unreachable because we unswitched some conditional branches - we should somehow reflect that in the name (I don't have good suggestions though :) ).

512 ↗(On Diff #78466)

I think clang-format doesn't remove such braces, but yep, we don't need them in this case.

test/Transforms/LoopUnswitch/elseif-non-exponential-behavior.ll
2–3 ↗(On Diff #78466)

Why do we need -sroa? Why can't we use its output?

Also, if we do check the actual contents of the function, there is no need in checking debug messages, and thus we can remove first RUN invocation and REQUIRES: asserts.

8–12 ↗(On Diff #78466)

Some more recommendations about tests:

1. Run opt -instnamer on the test so that names like %7 are replaced with symbolic names. It makes it much easier to edit the test later if necessary (we need to preserve numeration if we don't use symbolic names).
2. Please add CHECK-LABEL: @b. That'll help to distinguish function bodies in case someone adds another function into this test.
3. It might be an overkill to check every instruction. It's usually enough to check key instructions in key blocks. E.g.:
```; CHECK: bb1:
; CHECK-NOT br i1      ; Check that the branch is not conditional any more
; CHECK: br label %succ ; Check that we're unconditionally jumping to correct successor```

And we don't need to check what's in between label bb1 and branch from the block. This way it's easier to understand what the test checks, and also the test becomes more robust to random change that can somehow change the bodies of the blocks.

Abhilash updated this revision to Diff 78716.Nov 21 2016, 6:45 AM
Abhilash marked 4 inline comments as done.

Hi Anna and Michael Zolothukin,

I have renamed the function and reduced the CHECK patterns in the testcase.
I have tried to keep the CHECK patterns generic.
I have retained the CHECK patterns for only 3 loops (the number of loops that would be generated with this revision) and the corresponding phi nodes at the exit (join) nodes.

One help: Is there a way to make the phi-nodes (in the CHECK pattern) more generic?

Thanks,
Abhilash

mzolotukhin accepted this revision.Nov 21 2016, 7:46 AM
mzolotukhin edited edge metadata.

Hi Abhilash,

The change looks good to me now, thank you! Regarding phis - I would probably just check how many operands the phi-instruction has, without explicit specifying the operands (if it doesn't make the regex much easier, then it's ok to keep it as you have it now I think). Thanks for your patience in going through so many iterations of review! :)

Michael

This revision is now accepted and ready to land.Nov 21 2016, 7:46 AM
anna added a comment.Nov 21 2016, 10:05 AM

Hi Abhilash, Michael,

One small suggestion inline from my side.

Thanks,
Anna

lib/Transforms/Scalar/LoopUnswitch.cpp
509 ↗(On Diff #78716)

I was thinking of this signature in DominatorTree:
dominates(BasicBlock*, BasicBlock*).
So, the above changes to: DT->dominates(UnreachableSucc, BB).

For other such usages, take a look in LoopUnswitch::RewriteLoopBodyWithConditionConstant function.

`if (Latch && DT->dominates(SISucc, Latch))`

SISucc and Latch are both basic blocks.

For more clarification, check out Dominator class:
class DominatorTree : public DominatorTreeBase<BasicBlock>. The templatized NodeT is a basic block.

Abhilash updated this revision to Diff 78827.Nov 21 2016, 9:07 PM
Abhilash edited edge metadata.
Abhilash marked 4 inline comments as done.

Hi Anna,

All this time, I have misunderstood your comment. Please find the changes incorporated.

Thanks,
Abhilash

Abhilash marked an inline comment as done.Nov 21 2016, 9:10 PM

Hi Abhilash,

The change looks good to me now, thank you! Regarding phis - I would probably just check how many operands the phi-instruction has, without explicit specifying the operands (if it doesn't make the regex much easier, then it's ok to keep it as you have it now I think). Thanks for your patience in going through so many iterations of review! :)

Michael

Thank you Michael. It's a learning phase for me.

Abhilash

lib/Transforms/Scalar/LoopUnswitch.cpp
609 ↗(On Diff #76988)

The nullptr initialization is no longer required and hence has been removed.

611 ↗(On Diff #76988)

No. The code just identifies the unreachable successors.
After unswitching, some of the branch conditions may just be constants (True/false). If the condition is true, then the else (false) path is unreachable and otherwise.
All the branches that are unreachable (because of previous unswitching) should not be considered for further unswitching. The code filters out such unreachable conditions.

620–621 ↗(On Diff #77726)

The change has been incorporated.

510 ↗(On Diff #78466)

I have rectified this in the latest revision.

anna added a comment.Nov 22 2016, 5:05 AM

great Abhilash, the changes look good to me. Thanks for taking this to completion :)

This revision was automatically updated to reflect the committed changes.