This is an archive of the discontinued LLVM Phabricator instance.

[PM] Rewrite the loop pass manager to use a worklist and augmented run arguments much like the CGSCC pass manager.
ClosedPublic

Authored by chandlerc on Jan 4 2017, 6:06 AM.

Details

Summary

This is a major redesign following the pattern established for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

The unittests are somewhat a placeholder. I wrote them using gmock because it
was substantially easier to make progress and as a demonstration of the
improvements that the gmock infrsatructure can bring to writing precise and
clear test passes to plug into the pass manager. I'm starting an llvm-dev
discussion about whether this is enough to convince folks that we should start
leveraging gmock within LLVM. If we don't go that route, I'll rewrite the tests
without gmock (but likely lose some precision and coverage in the process).

Event Timeline

chandlerc updated this revision to Diff 83044.Jan 4 2017, 6:06 AM
chandlerc retitled this revision from to [PM] Rewrite the loop pass manager to use a worklist and augmented run arguments much like the CGSCC pass manager..
chandlerc updated this object.
chandlerc added reviewers: jlebar, bogner.
chandlerc added a subscriber: llvm-commits.
davide added a subscriber: davide.Jan 4 2017, 6:18 AM
davide added a comment.Jan 4 2017, 6:24 AM

Couple of high-level comments:

  1. What's the price paid (in compile time) for computing the analyses upfront in a function pass and making them available to the loop passes?
  2. What's specifically your plan for LCSSA?

Couple of high-level comments:

  1. What's the price paid (in compile time) for computing the analyses upfront in a function pass and making them available to the loop passes?

Very low. We are already computing all of these in the main loop pipeline today. This just makes it explicit.

Many of these are "almost always cached" analyses as well such as TLI, TTI, etc.

That said, I don't have numbers. It's way too early to even get numbers. We can change the particular set of analyses we do this with later as needed though. I just picked all the analyses that had multiple loop pass users and where those passes seemed like actively developed and used things and the analyses didn't seem weird or overly specialized.

  1. What's specifically your plan for LCSSA?

The current thinking based on a discussion between myself and Michael Z is to teach the pass manager itself to form loops into LCSSA before starting the inner pipeline (when it is a function pass and can do this) and verify that it is preserved throughout the run. No ambiguity or implicit handshake.

As to the *exact* implementation, I want to play with the code to see what works cleanly before I speculate.

davide added a comment.Jan 4 2017, 7:05 AM

This is a large patch, but here's a first round of comments (inline).
I personally like a lot the idea of having the analyses always available (with my pass writer hat on).

Also:

  1. You propose as FIXME not using RPO. Do you have data/evidence that a different visitation order actually matters? I'm not entirely convinced about it (but I haven't thought about the problem very carefully so).
  2. You removed a call to getContext().yield(). Do you happen to know why it was introduced in the first place?
include/llvm/Analysis/LoopPassManager.h
56–64

Nit: infrastructure.

86–90

ehm, is this comment correct?
Shouldn't it say The Loop pass manager ?

88–90

Also doesn't it run over a sequency of Loop passes?

156–157

invalidation of the Module or the Function?

196–197

I think "correct" is unneeded. I assume every algorithm implemented here is supposed to be correct.

200–202

It could be just me, but I never saw RPO on a tree (only on graphs). Therefore, maybe this function can be called appendLoopsInPreorder as it's what the function is doing (and move the comment explaining why the two forms are equivalent on the top of the function).

203

traveral -> traversal.

316–324

Style nit: all these newlines between member variables are inconsistent with what you use elsewhere (I don't care which style you use, but pick one and stick with it).

lib/Analysis/LoopPassManager.cpp
71

Invaliadtion -> Invalidation.

chandlerc marked 8 inline comments as done.Jan 4 2017, 6:23 PM
  1. You propose as FIXME not using RPO. Do you have data/evidence that a different visitation order actually matters? I'm not entirely convinced about it (but I haven't thought about the problem very carefully so).

I do? Are you referring to the order of loops in LoopInfo?

The only thing I found surprising is that the natural order of loops in LoopInfo is from the bottom of the function to the top. I haven't checked if it is exactly in postorder of the CFG but it looks close.

I wrote this patch to make us visit loops "forward" program order (top to bottom), and left a FIXME that we might want to reverse them in LoopInfo itself.

As for which order of visit is better, I don't have data at all. It turns out the current loop pass manager processes loops in reverse. However, very nicely, they wrote a long comment about how there wasn't any particular reason. There is some hand waving in it about deleting uses before processing defs, but that actually seems unlikely to matter to me.

The reason I had wanted to process in forward order are three fold:

First and foremost: I find it much less surprising. I wrote a lot of tests assuming it would be forward and was surprised when it wasn't. This is just that I find forward program order (in the absence of something like the bottom-up walk) less surprising in general.

Second: it seems useful to enable SSA propagation of constants and things from unrolling one loop into its sibling. That typically argues for processing defs before uses.

I think my second argument is a bit more compelling than the idea of deleting uses before processing defs. But not by much. Hence the primary rationale was a principle of least surprise.

  1. You removed a call to getContext().yield(). Do you happen to know why it was introduced in the first place?

I removed it? I thought I just added a commented out bit below a FIXME that hopefully explains what needs to happen here. Happy to dig into this though, and maybe I'm staring at the diff badly...

include/llvm/Analysis/LoopPassManager.h
156–157

This comment and surrounding comments were all wrong. Updated.

200–202

I'm not sure why "preorder" is more or less confusing for a tree than "reverse postorder"?

Anyways, I agree RPO isn't that useful in the API. I've gone for a more high-level 'appendLoopsToWorklist' and made the adjustment to the comments. Let me know if this works?

chandlerc updated this revision to Diff 83173.Jan 4 2017, 6:31 PM
chandlerc marked 2 inline comments as done.

Fixes from Davide's review feedback, thanks!!!!

jlebar updated this object.Jan 5 2017, 10:13 AM
jlebar edited edge metadata.Jan 5 2017, 11:41 AM

I've gotten through everything except the test bodies. Overall this seems pretty straightforward -- or at least, it's a pretty straightforward application of what you've developed previously.

include/llvm/Analysis/LoopInfo.h
856–857

Nit, "function"?

include/llvm/Analysis/LoopPassManager.h
16

comma after "possible"

25

My usage dictionary says "innermost" and "outermost".

58

sp

93

Ah, I wasn't sold on typedef PassManager<Function> FunctionPassManager, but now I am.

95

Can we add a bit of verbiage about what we're trying to accomplish here? It's not obvious to me.

128

Nit, Do you really want explicit on this two-arg constructor?

171

Here's that InnerAM again. :)

I think what I'm supposed to get from this name is that this is the AM corresponding to the inner AM in the outer-to-inner proxy that is LoopAnalysisManagerFunctionProxy. Is that right?

Because "LoopAnalysisManagerFunctionProxy" does not contain the words "inner" or "outer", understanding "InnerAM" in this way requires a lot of effort. It's also not clear to me what I gain from understanding it in this way? I already know it's the loop AM from its type.

Suggest LoopAM, or just AM.

199

, so

199

Perhaps simplify "the observed order as it is processed" and put our goal first:

We want to process loops in postorder, but the worklist is a LIFO data structure, so we append to it in *reverse* postorder.

207

Not sure this comment is necessary.

211

s/between unrelated loop nests//

224

s/a/the/? Or remove the comment, I think it's pretty clear.

247

for updating

250

s/ensure they//

253

I mean, someone has to construct it. Maybe it's sufficient to combine with the above para:

An instance of this class is passed as an argument to each loop pass, and loop passes should use it to update the LPM infrastructure if they modify the loop nest.

255

This is kind of a weird name -- at least as described above, it doesn't sound like a "result". LPMUpdater?

263

reflow comment

263

, as

263

Do you mean

If this returns true, the loop object may have been deleted, so passes should take care not to touch the object.

?

287

s/The//

291

, as

297

s/,/ --/ or s/, i/: I/

305

s/The//

308

Should we assert that the loops are actually siblings of the current loop? Similarly in addChildLoops, should we check that the loops are actually children?

309

Do you want to have an assertion in appendLoopsToWorklist (or here if you can't put it there) that none of the loops are nested within one another, to catch the invariant described above?

311

, as

319

This changes if you call addSiblingLoops or addChildLoops, so it doesn't seem so fixed. Maybe we can just say what it is instead?

387

Update this comment if we change the name of LPMUpdateResult.

458

Perhaps simply

Pass for printing a loop's contents as textual IR.

lib/Analysis/LoopPassManager.cpp
31

s/manager/manager's/

32

It's not call graph updates we care about, right? It's the addition / removal of loops?

41

For my information, why are we using a boolean rather than the DEBUG infrastructure? For one thing, DEBUG compiled away in the builds we release, whereas llvm::dbgs() does not, right?

72

What do SCCs have to do with this? (Stale comment?)

95

s/ so/. So/

138

Can you rephrase the "Because" sentence?

139

its being

144

s/New/Now/? Otherwise I don't know what you mean.

unittests/Analysis/LoopPassManagerTest.cpp
40

Maybe just MockAnalysisHandleBase?

78

s/The base provides/We provide/

Or perhaps just

Derived classes should call this in their constructor to set up default mock actions. (We can't do this in our constructor because this has to run after the DerivedT is constructed.)

100–112

Comment what is I

159

ibid

191

I think some judicious comments are called for above, especially given that your readers may be totally unfamiliar with gmock.

jlebar added a comment.Jan 5 2017, 1:51 PM

I <3 gmock. Strong approve; this would be really hard to write without.

unittests/Analysis/LoopPassManagerTest.cpp
1084

the last inner loop

1141

outermost

1141

s/iteration/loop/?

1174

Same re this comment, not sure it's necessary by this point.

1310

ITYM ", which", but not sure. Do you mean you delete the last loop period, or that you delete the last loop that happens to contain another loop?

1311

reuse

1334

reuse

1345

, reusing

1356

just-deleted

1359

Can we rephrase?

To respect our inner-to-outer traversal order, we must visit the newly-inserted sibling of the loop we just deleted before we visit the outer loop. When we do so, this must compute a fresh analysis result, even though our new loop has the same pointer value as the loop we deleted.

1371

, including

1372

, and

1417

Same wrt this comment.

jlebar added inline comments.Jan 5 2017, 1:51 PM
unittests/Analysis/LoopPassManagerTest.cpp
257–260

FWIW -- I think we talked about this elsewhere, but I forget the conclusion -- I would find a raw string far easier to read (and modify).

258

Nit, it would be really nice if we'd adopt the boolean arg-naming convention, foo(/*some_arg=*/false) -- see also the now five-year-old blog post that keeps on giving, http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html

259

Maybe

Register MLAHandle's analysis.

or even

Register our mock analysis.

? I like the latter because it says what we're doing without just repeating what's in the code.

273–279

s/Cross register/Cross-register/

286

analysis results for them

or, probably better

get their analysis results

?

288

All or almost all of the expectations on run() are of the form

run(HasName("foo"), _, _, _)

It would be kind of nice to simplify this to

run(HasName("foo"))

which we could do by changing the MOCK_METHOD4 to MOCK_METHOD1, per https://github.com/google/googletest/blob/master/googlemock/docs/CookBook.md#simplifying-the-interface-without-breaking-existing-code

I'm on the fence about it. WTYT?

290
306

I pondered over this for a good five minutes before figuring it out.

I think part of my problem was that based on the comments and the scoping, it seems that this runs some passes somehow.

It's clearer to me in the other tests because we don't have an explicit scope.

Maybe just a brief comment on this block would fix my confusion.

398

, causing

413

upp

438

s/themselves//?

538

triggers

538

s/,/; it/

663

"Set up" :)

664

"just need to get"

665

gets invalidated

665

s/our 'A' analysis/AnalysisA/, and likewise for AnalysisB?

701

, and

702

s/"/'/

729

, with

761

Set up

762

s/by default//?

789

preserve

790

New sentence, "This should trigger"

790

", despite their being preserved." or ", despite the fact that they were preserved."

801

required?

885

, so

926

Same here about chaining WillOnce, and same below.

933–1419

In the second run

984

At this point in the file this comment is probably unnecessary.

1066

, as

chandlerc updated this revision to Diff 83522.Jan 7 2017, 12:59 AM
chandlerc marked 68 inline comments as done.
chandlerc edited edge metadata.

Updates based on review from Justin.

Thanks so much for the review! Patch updated and responses below.

include/llvm/Analysis/LoopPassManager.h
95

Let me know if this isn't enough.

128

Well, I left it there by accident. But yea, I probably don't want to allow {}s to calls this constructor without a type name. Sadly, I want most constructors to be explicit in this day and age... =[ But its less important for multiple argument constructors. Happy to remove or leave as you see fit.

171

So the first thing to realize is that I'm specifically avoiding AM because that's this is *also* a member of the containing analysis class, and that class has a method that gets an analysis manager argument typically spelled 'AM', and this is importantly *not* the same...

Sadly, changing this to LoopAM ends up confusing as well because we don't specialize the Analysis class, and so its member continues to be spelled InnerAM. And we even see this, because we do specialize a *method* of the Analysis, specifically where we pass that InnerAM into the result. I somewhat like the names matching... But if necessary, I guess they could be mismatched... Any other ideas?

199

OMG, yes, thank you, so much better. Here and elsewhere, I just want to say thank you for helping me wrap the text of these comments back around to sanity. Half the time, the comments evolve over *many* rounds of edits and time, and end up with really convoluted wording. And at that point i've been staring at it far too long. I appreciate your patience rewording this stuff.

207

I'm trying to explain why we have a second worklist?

255

So, for both this and the analysis results, I left in somewhat intentionally bad names because i wanted to get help naming them.

I generally like ...Updater here, and ...AnalysisResults above.

My biggest problem is -- what prefix? LPM is actually kinda misleading as they're much more about loop passes and not about loop pass management. I could go with LP, but it seems a pretty lame prefix. I could go with LoopPass to at least avoid the inscrutability of LP at the cost of longer names.

Halp...

263

Yes, yes I do.

297

I have such a complicated relationship with commas....

308

Yes. I meant to add those asserts and then didn't. Thanks for catching that.

309

The assert seems easier to do here while we're checking that they're siblings.

319

What it refers to doesn't change is what I meant.... But sure, better comments added instead.

387

I can make the comment less redundant and avoid the need to update...

lib/Analysis/LoopPassManager.cpp
32

I WOULD NEVER CUT AND PASTE CODE BLINDLY FROM ONE FILE TO ANOTHER AND FAIL TO UPDATE ALL THE COMMENTS!

Except of course for all of this code where I did exactly that. Sorry. ;]

41

Because I haven't gone back and undone this. :: sigh ::

I do have an idea for how to fix this to be a little less painful. We want to keep a small amount of this stuff even in no-asserts builds, but *WAY* less than currently, so most of this goes into the lovely DEBUG macro. ANyways, planned for a future patch that does this across all the PMs. I can expedite that if useful.

72

Yep.

However, this code doesn't work as written it occurs to me. I've left a FIXME for now, I need to think about how to handle this.

138

Tried rewriting this whole thing. Lemme know if this is better.

144

Also added more detail.

unittests/Analysis/LoopPassManagerTest.cpp
257–260

Yea, I just haven't gotten around to cleaning all of these up... Will try to actually do that, but would rather not in this patch.

258

Yep. But what would be even more nice is to actually accept a more useful argument like the raw_ostream to use. ;] Anyways, I had planned to fix this when lessenging the use of it at all and using DEBUG() more heavily.

288

The problem is that even though we don't really need to *match* on these arguments, we do want to have access to them in the Invoke'd actions.

306

Yep, good suggestion. Let me know if my commets are helping?

438

Tried a better word. Trying to differentiate from failing to preserve the LoopAnalysis.

538

It's not an it though.

I think this should be "the rest don't invalidate; they only trigger ..." which then agrees?

(But I so don't trust myself here...)

801

no, this is a RequiresAnalysis pass thingy....

926

Same what about chaining WillOnce?

984

I was being consistent and trying to help the reader that just came into this file at the bottom...

1141

Yes, but i dunno why I wrote this. the key thing here is that this is testing *top-level* insertion. Commented accordingly now.

1174

Same as above. I can nuke them all if yo uwant, but i'd as soon have all of them as one of them...

1310

the which is an aside, you were correct to suggest a comma.

1372

You shoul count the commas and fine me or something....

jlebar accepted this revision.Jan 8 2017, 11:42 AM
jlebar edited edge metadata.

\o/

include/llvm/Analysis/LoopPassManager.h
97

Ah, got it. lgtm, but suggest s/correctly// s/necessary//.

128

Sadly, I want most constructors to be explicit in this day and age... =[

Yeah, understood. So long as it's intentional, that's cool.

171

So the first thing to realize is that I'm specifically avoiding AM because that's this is *also* a member of the containing analysis class

Sorry if this is obvious, who exactly has an AM member variable? Not LoopAnalysisManagerFunctionProxy == InnerAnalysisManagerFunctionProxy afaict? The outer proxy does, but the inner one's member is named InnerAM, like you say. I looked through all of PassManager.h and don't see anyone else that seems relevant with a member called AM.

Sadly, changing this to LoopAM ends up confusing as well because we don't specialize the Analysis class, and so its member continues to be spelled InnerAM.

OK, I'm sold on that basis.

207

Hm, you're saying we can't use Worklist for the preorder traversal? That seems obvious to me because our goal here is to add some stuff to Worklist, but maybe it's my ignorance that's making me think that's obvious. You could say

Worklist is our output param; we need a second worklist for the preorder traversal.

?

255

LPMUpdateResult seems to be more about updating the LPM, not updating an LP. It's the class that LPs use to update the LPM, as I read it. So LPMUpdater or LoopPassManagerUpdater seems fine.

I agree that the analysis results one is less about the manager. LPAnalysisResults or LoopPassAnalysisResults? I kind of want to say that they're the analysis results you get "for free". LPStandardAnalysisResults?

I also agree you should be consistent in whether you abbreviate "LP/M".

431

It would be cool if this set of analyses had a name, then you could say

We also preserve the standard loop analyses [or whatever].

lib/Analysis/LoopPassManager.cpp
41

I can expedite that if useful.

No need.

91

Suggest

we can do this by building a preorder sequence and walking it in reverse

112

here too it would be good to have a standard name for these analyses.

133

s/the result object/this object/?

138

It's a lot better, but now that I understand what you're saying, I think this should be combined into the comment above. AIUI the loop above is taking the place of a call to InnerAM->clear() in the destructor, so the two comments are related.

Walk the loops for this function and clear the results associated with this function. Then set InnerAM to null so that we don't call InnerAM->clear() (which nukes everything from orbit) when this object is destroyed as invalid.

Note that although the LoopInfo may be stale at this point, the loop objects themselves remain the only viable keys that could be in the analysis manager's cache, so we can just walk the keys and forcibly clear those results. The order doesn't matter here as this will directly destroy the results without calling methods on them.

FIXME ...

unittests/Analysis/LoopPassManagerTest.cpp
107

Suggest changing to active voice and then providing a brief example:

If you want to register multiple loop analysis passes, you'll need to instantiate this type with different values for I. For example:

MockLoopAnalysisHandleTemplate<0> h0;
MockLoopAnalysisHandleTemplate<1> h1;
typedef decltype(h0)::Analysis Analysis0;
typedef decltype(h1)::Analysis Analysis1;

In fact I could see getting rid of the example, too.

(I got rid of the enum because you only ever use it once, so who cares if you hardcode constants?)

257–260

np, sgtm.

258

sgtm

288

Ah, scratch that plan then.

306

Yes, looks good.

540

There's a typo here, but other than that it's fine. Because phab is awful I am not sure if this is the line the above comments were talking about or what, but I'm sure it's all fine here.

926

Sorry, that was related to a comment I deleted.

984

Eh, copy-pasted comments are almost as bad as copy-pasted code. But up to you, not a big deal.

This revision is now accepted and ready to land.Jan 8 2017, 11:42 AM
davide accepted this revision.Jan 8 2017, 1:27 PM
davide added a reviewer: davide.
  1. You propose as FIXME not using RPO. Do you have data/evidence that a different visitation order actually matters? I'm not entirely convinced about it (but I haven't thought about the problem very carefully so).

I do? Are you referring to the order of loops in LoopInfo?

The only thing I found surprising is that the natural order of loops in LoopInfo is from the bottom of the function to the top. I haven't checked if it is exactly in postorder of the CFG but it looks close.

I wrote this patch to make us visit loops "forward" program order (top to bottom), and left a FIXME that we might want to reverse them in LoopInfo itself.

As for which order of visit is better, I don't have data at all. It turns out the current loop pass manager processes loops in reverse. However, very nicely, they wrote a long comment about how there wasn't any particular reason. There is some hand waving in it about deleting uses before processing defs, but that actually seems unlikely to matter to me.

The reason I had wanted to process in forward order are three fold:

First and foremost: I find it much less surprising. I wrote a lot of tests assuming it would be forward and was surprised when it wasn't. This is just that I find forward program order (in the absence of something like the bottom-up walk) less surprising in general.

Second: it seems useful to enable SSA propagation of constants and things from unrolling one loop into its sibling. That typically argues for processing defs before uses.

I think my second argument is a bit more compelling than the idea of deleting uses before processing defs. But not by much. Hence the primary rationale was a principle of least surprise.

  1. You removed a call to getContext().yield(). Do you happen to know why it was introduced in the first place?

I removed it? I thought I just added a commented out bit below a FIXME that hopefully explains what needs to happen here. Happy to dig into this though, and maybe I'm staring at the diff badly...

Both comments make sense. Thanks for explaining. LGTM if/when others are happy.

chandlerc updated this revision to Diff 83786.Jan 10 2017, 2:44 AM
chandlerc marked 8 inline comments as done.
chandlerc edited edge metadata.

Thanks for all the reviews! I've done the suggested renaming from our discussion on IRC:

LPMAnalysisResults -> LoopStandardAnalysisResults

LPMUpdateResult -> LPMUpdater

I'll land this once i get gmock and co. landed so the tests work.

This revision was automatically updated to reflect the committed changes.

Landed, thanks so much for all the reviews!!!