Page MenuHomePhabricator

Loop rotation
Needs RevisionPublic

Authored by hiraditya on Jul 21 2016, 8:40 AM.

Details

Summary

This patch implements Loop Rotation Pass:

  1. Canonicalize loop latch to have only one successor.
  2. Clone all the BBs which are exiting the loop (There is a cost function to prevent code size increase)
  3. Adjust phi of cloned BBs
  4. Add phi to the new loop header
  5. Update DOM

With this patch all the basic blocks which are exiting the loop can be copied during rotation. This is helpful for passes like gvn/licm which can now remove loop invariant code
because some computations (which might not dominate all the basic blocks of loop) can now be PRE'd as they are available outside of the loop (due to loop rotation).

Several test cases have been added to exercise loop rotation for different loop bodies.

Relevant test case to demonstrate loop rotation can help gvn.
llvm/test/Transforms/LoopVectorize/vect.omp.persistence.ll

This patch depends on https://reviews.llvm.org/D22558, which is used to clone SEME.

Worked in collaboration with Sebastian Pop

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
mzolotukhin added inline comments.Dec 28 2016, 1:44 PM
llvm/lib/IR/Metadata.cpp
1109–1112

Can we remove this function and use SmallVector::append instead?

llvm/lib/Transforms/Scalar/LoopRotation.cpp
18–21

Nitpicks: 1) sentences should end with a dot. 2) I'd prefer to have some bullets in this list (I mean formatting), like:

- property1
- property2
192

Maybe I forgot something, but why exactly do we still need this check? I thought that one of the main benefits of this patch was to remove this limitation.

240

Why is it called 'TODO'?

This revision now requires changes to proceed.Dec 28 2016, 1:44 PM
hiraditya marked an inline comment as done.Jan 11 2017, 8:22 AM
hiraditya added inline comments.
llvm/lib/IR/Metadata.cpp
1109–1112

We can but then there should be a way to get the Attachments from MDAttachmentMap. Currently there is none.

llvm/lib/Transforms/Scalar/LoopRotation.cpp
192

If the loop-header has invoke instruction, or some other TerminatorInst, I'm not sure how to deal with them.

240

Because, even if the loop's size is greater than RotationMaxSize, what we can do is peel fewer basic blocks than needed to completely rotate the loop. That way, at least, some redundancies will be exposed.

llvm/test/Analysis/ScalarEvolution/2012-03-26-LoadConstant.ll
0

The test fails if I remove simplifycfg.

llvm/test/Transforms/LoopRotate/preserve-loop-simplify.ll
22–28

Originally, only the basic block inner.body was cloned outside the loop which became the new loop header. In the new implementation inner.header and inner.body both are cloned outside the loop such that inner.latch becomes the new loop header.

hiraditya updated this revision to Diff 83981.Jan 11 2017, 8:24 AM
hiraditya edited edge metadata.

Addressed Michael's comments.

mzolotukhin added inline comments.Jan 11 2017, 4:09 PM
llvm/lib/Transforms/Scalar/LoopRotation.cpp
240

I see, then I guess it should be expressed clearer in the comment. Currently it's not obvious what is there that needs to be done.

llvm/test/Analysis/ScalarEvolution/2012-03-26-LoadConstant.ll
0

It doesn't test what it was supposed to test. We either need to make sure it's useful, or remove it. I tried to revert the change the test was commited with and the test still passed, which means it no longer checks what it checked before. If you remove -simplifycfg it starts to fail, sure (I assume because there is for.body in the output) - but can you explain why there shouldn't be for.body? Can you explain what this test does? Juggling flags just to make it passes doesn't help us and only slows down everyone (mostly you, as you have to deal with it:) ).

llvm/test/Transforms/LoopRotate/indirectbr-1.ll
6 ↗(On Diff #83981)

We have a test checking this in loop-rotate.ll (@foo3). How are they different? Can we combine them (and place in one file)?

llvm/test/Transforms/LoopRotate/loop-rotate.ll
211

Will the following routine test the same?

define i32 @bar5() {

entry:
  br label %loop.header

loop.header:
  %a = phi i32 [ %b, %latch ], [ 0, %entry ]
  switch i8 undef, label %loop.body [
    i8 0, label %unreach
    i8 1, label %unreach
  ]

loop.body:
  br i1 undef, label %latch, label %return

latch:
  %b = phi i32 [ %a, %loop.body ]
  br label %loop.header

unreach:
  unreachable

return:
  ret i32 0
}

If it's sufficient, can you minimize other test in the similar manner? It should be much easier to understand what's going on, and hopefully they'll be much less fragile.

llvm/test/Transforms/LoopRotate/multiple-exits-merge-phi.ll
2–3 ↗(On Diff #83981)

Please remove this.

8 ↗(On Diff #83981)

How is it different from the test in multiple-exits.ll? What phis are merged (the name of the test hints that they should be)?

llvm/test/Transforms/LoopRotate/vect.omp.persistence.ll
2–3

We don't need to run the entire pipeline to check that loop-rotate doesn't drop metadata. Please keep only the loop-rotate part here (but please make sure we actually check the metadata - that's what this test was about).

llvm/test/Transforms/LoopVectorize/vect.omp.persistence.ll
19

Please update the comments, as the test has nothing to do with loop-rotate now. I'd recommend committing it separately.

hiraditya marked 3 inline comments as done.Jan 13 2017, 8:48 AM
hiraditya added inline comments.
llvm/test/Analysis/ScalarEvolution/2012-03-26-LoadConstant.ll
0

The for.body gets deleted because, once the loop is rotated everything in the loop becomes redundant/invariant. Previously, loop-rotate used to call simplifycfg, which I removed. So, I added the -simplifycfg in this test to make sure that the behavior is preserved.

I agree, it does not seem to test what it is supposed to. I can just remove this test (maybe in a separate patch together with changes related to vect.omp.persistence.ll)

llvm/test/Transforms/LoopRotate/indirectbr-1.ll
6 ↗(On Diff #83981)

This test checks that loop-latch having indirect branch is not rotated. This is different from all the tests in loop-rotate.ll. I'll merge this in loop-rotate.ll.

llvm/test/Transforms/LoopRotate/multiple-exits-merge-phi.ll
8 ↗(On Diff #83981)

It seems both are same. merge-phi is the phi in return statement, I was just trying to distinguish this from loop-header phi. I'll remove this test as this is redundant.

llvm/test/Transforms/LoopVectorize/vect.omp.persistence.ll
19

Will do

hiraditya updated this revision to Diff 84315.Jan 13 2017, 8:49 AM
hiraditya edited edge metadata.

Addressed Michael's comments.

Removed redundant tests
Merged indirectbr-1.ll into loop-rotate.ll

Please address my previous remarks regarding the tests.

hiraditya updated this revision to Diff 87288.Feb 6 2017, 1:07 PM

Fixed other test cases as per the review.

llvm/test/Transforms/LoopRotate/loop-rotate.ll
211

I have minimized other tests as well.

llvm/test/Transforms/LoopRotate/nosimplifylatch.ll
6–9

After for-in there is a phi now

mzolotukhin requested changes to this revision.Feb 28 2017, 7:00 PM

Hi,

Please find some comments inline. Also, please reinspect the tests - in many of them the comments don't correspond to test bodies, they are not minimal, and some tests duplicate others.

Thanks,
Michael

llvm/lib/Transforms/Scalar/LoopRotation.cpp
19

Why do we need this requirement?

24

Please end sentences with dots.

108–110

This function seems to be unused.

151–153

Formatting is slightly off here.

182–185

I don't understand how this comment corresponds to the code.

193–195

Why do we need this check?

249–272

Please add a comment about what this function is doing and what it expects.

317

Why is UserInst guaranteed to be a PHINode?

356–357

I find it a bit confusing that you're saying about blocks being copied while the function in fact just populates the list.

375–399

I think this check is the reason you kept that OrigH->isExiting() check. I think that what we rather should do here is to copy all blocks up to an exiting block that we're going to make a new latch. Does it make sense?

In other words, I'd like the following case to be rotated:
Entry -> {A}
A -> {B}
B -> {Exit, C}
C -> {A}

I see that current implementation is generic enough to handle such cases, and the only reasons why it won't do it is that (IMHO unnecessary) limitations.

424

...a constant, a function argument, or a global.

462–463

Why don't we need to update LI if there is no parent loop?

504–513

You have the same typedef inside LoopRotate.

505–508

If it's not used anywhere, please remove it.

584

Can we avoid quadratic complexity here?

llvm/test/Transforms/LoopRotate/loop-rotate.ll
74

This function doesn't have any indirect branches.

150–154

Why do we need two switches in this test?

This revision now requires changes to proceed.Feb 28 2017, 7:00 PM
hiraditya marked 7 inline comments as done.Mar 3 2017, 9:43 AM

Hi,

Please find some comments inline. Also, please reinspect the tests - in many of them the comments don't correspond to test bodies, they are not minimal, and some tests duplicate others.

Thanks,
Michael

Thank you for the detailed review. I did not try to remove so much from tests from file other than loop-rotate.ll because they were there from the past.
For loop-rotate.ll I think the tests are a combination of multiple scenarios, and it will be good to have them.

llvm/lib/Transforms/Scalar/LoopRotation.cpp
317

Because Inst does not dominate UserInst it must be the loop-closed-phi as the loop is in loop-closed-ssa form.

356–357

I've updated the comment.

375–399

We can certainly do that. From other reviewers (including you) there were comments in the past that the loop-rotation should have a way to reach idempotence. If we keep rotating when the header is not exiting, there will be no way to stabilize loop rotation.

462–463

For the current loop under rotation, there are no new basic blocks added. Only when the loop-header is changed around line 539 L->moveToHeader is called.

584

Actually this while loop is bounded by the number of exiting basicblocks from the loop which were cloned.

llvm/test/Transforms/LoopRotate/loop-rotate.ll
74

I'll update the comment.

150–154

This switch statement is a loop preheader as it is branching to the loop header. So I think, it would exercise if the dominators are updated properly.

hiraditya updated this revision to Diff 90501.Mar 3 2017, 9:45 AM
hiraditya edited edge metadata.

Addressed comments from Michael.

mzolotukhin requested changes to this revision.Mar 9 2017, 3:04 PM

Please fix the tests as requested.

We need to cover multiple scenarios by a minimal set of tests. There is no need in combining scenarios unless it triggers some corner case (in which case we should explicitly state which corner case we're testing). For instance, using a custom type %fp is completely irrelevant from what loop-rotation is doing, so it should be removed from the test. Many tests contain unreachable blocks (I realize you get them from bugpoint) - what's the purpose of keeping them in the test? Why can't we remove them? And so forth...

Michael

llvm/lib/Transforms/Scalar/LoopRotation.cpp
584

Can you elaborate? I don't see any exits from the loop, but I see how we start from the beginning.

llvm/test/Transforms/LoopRotate/loop-rotate.ll
12

Why can't we use i64 instead of %fp?

150–154

Loop simplify splits the edge to the loop anyway. Thus, this test, if checking anything, checks only loop simplify. For testing loop-rotation the first switch is absolutely unnecessary.

This revision now requires changes to proceed.Mar 9 2017, 3:04 PM
hiraditya updated this revision to Diff 91422.Mar 10 2017, 2:51 PM
hiraditya edited edge metadata.

Removed unreachable from testcases.
Replaced fp* with i64*

llvm/lib/Transforms/Scalar/LoopRotation.cpp
584

This part of code executes only for the basic blocks which are not in the loop, i.e., exit basic blocks. For the exit basic blocks dominator tree may not be up to date and hence is being updated here. Once the exit basic blocks have their dominators up-to-date, and since their sub-trees are already up-to-date => NewIDom == StaleIDom (line:581) for all successors of exit basic blocks.

llvm/test/Transforms/LoopRotate/loop-rotate.ll
150–154

There is no loop-simplify in the new loop rotation.

This revision now requires changes to proceed.Apr 24 2017, 8:40 AM
bjope removed a subscriber: bjope.May 10 2017, 12:11 AM

@mzolotukhin Please let me know if you have any further comments.

Gentle ping and a few comments. There are a few proprietary benchmarks that perform better with this pass and I'd like to see this (hopefully) moving towards a commit.

llvm/lib/Analysis/LoopInfo.cpp
225

What does getLoopID() return today in the event of multiple latches? @hiraditya maybe it makes sense, for now at least, to return { nullptr, nullptr }; when size of GetLoopLatches() > 1?

llvm/lib/IR/Metadata.cpp
1311

I'd prefer a message more like "No Instruction Metadata found".

llvm/lib/Transforms/Scalar/LoopRotation.cpp
78–80

The bug is marked as fixed by a duplicate. Could you please remove and verify it is working?

FYI: https://bugs.llvm.org/show_bug.cgi?id=28104

192

It'd be helpful to have a comment mentioning that you're not handling the unconventional exits like invoke.

317

This also would be good to have as a comment. And possibly an assert() that validates the claim.

bjope added a subscriber: bjope.Oct 25 2018, 3:48 AM