Page MenuHomePhabricator

[PM] Rewrite the loop pass manager to use a worklist and augmented run arguments much like the CGSCC pass manager.
ClosedPublic

Authored by chandlerc on Jan 4 2017, 6:06 AM.

Details

Summary

This is a major redesign following the pattern established for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

The unittests are somewhat a placeholder. I wrote them using gmock because it
was substantially easier to make progress and as a demonstration of the
improvements that the gmock infrsatructure can bring to writing precise and
clear test passes to plug into the pass manager. I'm starting an llvm-dev
discussion about whether this is enough to convince folks that we should start
leveraging gmock within LLVM. If we don't go that route, I'll rewrite the tests
without gmock (but likely lose some precision and coverage in the process).

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jlebar added inline comments.Jan 5 2017, 1:51 PM
unittests/Analysis/LoopPassManagerTest.cpp
235 ↗(On Diff #83173)

FWIW -- I think we talked about this elsewhere, but I forget the conclusion -- I would find a raw string far easier to read (and modify).

236 ↗(On Diff #83173)

Nit, it would be really nice if we'd adopt the boolean arg-naming convention, foo(/*some_arg=*/false) -- see also the now five-year-old blog post that keeps on giving, http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html

237 ↗(On Diff #83173)

Maybe

Register MLAHandle's analysis.

or even

Register our mock analysis.

? I like the latter because it says what we're doing without just repeating what's in the code.

251 ↗(On Diff #83173)

s/Cross register/Cross-register/

264 ↗(On Diff #83173)

analysis results for them

or, probably better

get their analysis results

?

266 ↗(On Diff #83173)

All or almost all of the expectations on run() are of the form

run(HasName("foo"), _, _, _)

It would be kind of nice to simplify this to

run(HasName("foo"))

which we could do by changing the MOCK_METHOD4 to MOCK_METHOD1, per https://github.com/google/googletest/blob/master/googlemock/docs/CookBook.md#simplifying-the-interface-without-breaking-existing-code

I'm on the fence about it. WTYT?

268 ↗(On Diff #83173)
283 ↗(On Diff #83173)

I pondered over this for a good five minutes before figuring it out.

I think part of my problem was that based on the comments and the scoping, it seems that this runs some passes somehow.

It's clearer to me in the other tests because we don't have an explicit scope.

Maybe just a brief comment on this block would fix my confusion.

374 ↗(On Diff #83173)

, causing

389 ↗(On Diff #83173)

upp

414 ↗(On Diff #83173)

s/themselves//?

514 ↗(On Diff #83173)

triggers

514 ↗(On Diff #83173)

s/,/; it/

639 ↗(On Diff #83173)

"Set up" :)

640 ↗(On Diff #83173)

"just need to get"

641 ↗(On Diff #83173)

gets invalidated

641 ↗(On Diff #83173)

s/our 'A' analysis/AnalysisA/, and likewise for AnalysisB?

677 ↗(On Diff #83173)

, and

678 ↗(On Diff #83173)

s/"/'/

705 ↗(On Diff #83173)

, with

737 ↗(On Diff #83173)

Set up

738 ↗(On Diff #83173)

s/by default//?

765 ↗(On Diff #83173)

preserve

766 ↗(On Diff #83173)

New sentence, "This should trigger"

766 ↗(On Diff #83173)

", despite their being preserved." or ", despite the fact that they were preserved."

777 ↗(On Diff #83173)

required?

859 ↗(On Diff #83173)

, so

900 ↗(On Diff #83173)

Same here about chaining WillOnce, and same below.

907 ↗(On Diff #83173)

In the second run

958 ↗(On Diff #83173)

At this point in the file this comment is probably unnecessary.

1040 ↗(On Diff #83173)

, as

chandlerc updated this revision to Diff 83522.Jan 7 2017, 12:59 AM
chandlerc marked 68 inline comments as done.
chandlerc edited edge metadata.

Updates based on review from Justin.

Thanks so much for the review! Patch updated and responses below.

include/llvm/Analysis/LoopPassManager.h
95 ↗(On Diff #83173)

Let me know if this isn't enough.

126 ↗(On Diff #83173)

Well, I left it there by accident. But yea, I probably don't want to allow {}s to calls this constructor without a type name. Sadly, I want most constructors to be explicit in this day and age... =[ But its less important for multiple argument constructors. Happy to remove or leave as you see fit.

169 ↗(On Diff #83173)

So the first thing to realize is that I'm specifically avoiding AM because that's this is *also* a member of the containing analysis class, and that class has a method that gets an analysis manager argument typically spelled 'AM', and this is importantly *not* the same...

Sadly, changing this to LoopAM ends up confusing as well because we don't specialize the Analysis class, and so its member continues to be spelled InnerAM. And we even see this, because we do specialize a *method* of the Analysis, specifically where we pass that InnerAM into the result. I somewhat like the names matching... But if necessary, I guess they could be mismatched... Any other ideas?

197 ↗(On Diff #83173)

OMG, yes, thank you, so much better. Here and elsewhere, I just want to say thank you for helping me wrap the text of these comments back around to sanity. Half the time, the comments evolve over *many* rounds of edits and time, and end up with really convoluted wording. And at that point i've been staring at it far too long. I appreciate your patience rewording this stuff.

205 ↗(On Diff #83173)

I'm trying to explain why we have a second worklist?

253 ↗(On Diff #83173)

So, for both this and the analysis results, I left in somewhat intentionally bad names because i wanted to get help naming them.

I generally like ...Updater here, and ...AnalysisResults above.

My biggest problem is -- what prefix? LPM is actually kinda misleading as they're much more about loop passes and not about loop pass management. I could go with LP, but it seems a pretty lame prefix. I could go with LoopPass to at least avoid the inscrutability of LP at the cost of longer names.

Halp...

261 ↗(On Diff #83173)

Yes, yes I do.

295 ↗(On Diff #83173)

I have such a complicated relationship with commas....

306 ↗(On Diff #83173)

Yes. I meant to add those asserts and then didn't. Thanks for catching that.

307 ↗(On Diff #83173)

The assert seems easier to do here while we're checking that they're siblings.

317 ↗(On Diff #83173)

What it refers to doesn't change is what I meant.... But sure, better comments added instead.

370 ↗(On Diff #83173)

I can make the comment less redundant and avoid the need to update...

lib/Analysis/LoopPassManager.cpp
32 ↗(On Diff #83173)

I WOULD NEVER CUT AND PASTE CODE BLINDLY FROM ONE FILE TO ANOTHER AND FAIL TO UPDATE ALL THE COMMENTS!

Except of course for all of this code where I did exactly that. Sorry. ;]

41 ↗(On Diff #83173)

Because I haven't gone back and undone this. :: sigh ::

I do have an idea for how to fix this to be a little less painful. We want to keep a small amount of this stuff even in no-asserts builds, but *WAY* less than currently, so most of this goes into the lovely DEBUG macro. ANyways, planned for a future patch that does this across all the PMs. I can expedite that if useful.

72 ↗(On Diff #83173)

Yep.

However, this code doesn't work as written it occurs to me. I've left a FIXME for now, I need to think about how to handle this.

135 ↗(On Diff #83173)

Tried rewriting this whole thing. Lemme know if this is better.

141 ↗(On Diff #83173)

Also added more detail.

unittests/Analysis/LoopPassManagerTest.cpp
235 ↗(On Diff #83173)

Yea, I just haven't gotten around to cleaning all of these up... Will try to actually do that, but would rather not in this patch.

236 ↗(On Diff #83173)

Yep. But what would be even more nice is to actually accept a more useful argument like the raw_ostream to use. ;] Anyways, I had planned to fix this when lessenging the use of it at all and using DEBUG() more heavily.

266 ↗(On Diff #83173)

The problem is that even though we don't really need to *match* on these arguments, we do want to have access to them in the Invoke'd actions.

283 ↗(On Diff #83173)

Yep, good suggestion. Let me know if my commets are helping?

414 ↗(On Diff #83173)

Tried a better word. Trying to differentiate from failing to preserve the LoopAnalysis.

514 ↗(On Diff #83173)

It's not an it though.

I think this should be "the rest don't invalidate; they only trigger ..." which then agrees?

(But I so don't trust myself here...)

777 ↗(On Diff #83173)

no, this is a RequiresAnalysis pass thingy....

900 ↗(On Diff #83173)

Same what about chaining WillOnce?

958 ↗(On Diff #83173)

I was being consistent and trying to help the reader that just came into this file at the bottom...

1115 ↗(On Diff #83173)

Yes, but i dunno why I wrote this. the key thing here is that this is testing *top-level* insertion. Commented accordingly now.

1148 ↗(On Diff #83173)

Same as above. I can nuke them all if yo uwant, but i'd as soon have all of them as one of them...

1284 ↗(On Diff #83173)

the which is an aside, you were correct to suggest a comma.

1346 ↗(On Diff #83173)

You shoul count the commas and fine me or something....

jlebar accepted this revision.Jan 8 2017, 11:42 AM
jlebar edited edge metadata.

\o/

include/llvm/Analysis/LoopPassManager.h
126 ↗(On Diff #83173)

Sadly, I want most constructors to be explicit in this day and age... =[

Yeah, understood. So long as it's intentional, that's cool.

169 ↗(On Diff #83173)

So the first thing to realize is that I'm specifically avoiding AM because that's this is *also* a member of the containing analysis class

Sorry if this is obvious, who exactly has an AM member variable? Not LoopAnalysisManagerFunctionProxy == InnerAnalysisManagerFunctionProxy afaict? The outer proxy does, but the inner one's member is named InnerAM, like you say. I looked through all of PassManager.h and don't see anyone else that seems relevant with a member called AM.

Sadly, changing this to LoopAM ends up confusing as well because we don't specialize the Analysis class, and so its member continues to be spelled InnerAM.

OK, I'm sold on that basis.

205 ↗(On Diff #83173)

Hm, you're saying we can't use Worklist for the preorder traversal? That seems obvious to me because our goal here is to add some stuff to Worklist, but maybe it's my ignorance that's making me think that's obvious. You could say

Worklist is our output param; we need a second worklist for the preorder traversal.

?

253 ↗(On Diff #83173)

LPMUpdateResult seems to be more about updating the LPM, not updating an LP. It's the class that LPs use to update the LPM, as I read it. So LPMUpdater or LoopPassManagerUpdater seems fine.

I agree that the analysis results one is less about the manager. LPAnalysisResults or LoopPassAnalysisResults? I kind of want to say that they're the analysis results you get "for free". LPStandardAnalysisResults?

I also agree you should be consistent in whether you abbreviate "LP/M".

97 ↗(On Diff #83522)

Ah, got it. lgtm, but suggest s/correctly// s/necessary//.

431 ↗(On Diff #83522)

It would be cool if this set of analyses had a name, then you could say

We also preserve the standard loop analyses [or whatever].

lib/Analysis/LoopPassManager.cpp
41 ↗(On Diff #83173)

I can expedite that if useful.

No need.

135 ↗(On Diff #83173)

It's a lot better, but now that I understand what you're saying, I think this should be combined into the comment above. AIUI the loop above is taking the place of a call to InnerAM->clear() in the destructor, so the two comments are related.

Walk the loops for this function and clear the results associated with this function. Then set InnerAM to null so that we don't call InnerAM->clear() (which nukes everything from orbit) when this object is destroyed as invalid.

Note that although the LoopInfo may be stale at this point, the loop objects themselves remain the only viable keys that could be in the analysis manager's cache, so we can just walk the keys and forcibly clear those results. The order doesn't matter here as this will directly destroy the results without calling methods on them.

FIXME ...

91 ↗(On Diff #83522)

Suggest

we can do this by building a preorder sequence and walking it in reverse

112 ↗(On Diff #83522)

here too it would be good to have a standard name for these analyses.

133 ↗(On Diff #83522)

s/the result object/this object/?

unittests/Analysis/LoopPassManagerTest.cpp
107 ↗(On Diff #83522)

Suggest changing to active voice and then providing a brief example:

If you want to register multiple loop analysis passes, you'll need to instantiate this type with different values for I. For example:

MockLoopAnalysisHandleTemplate<0> h0;
MockLoopAnalysisHandleTemplate<1> h1;
typedef decltype(h0)::Analysis Analysis0;
typedef decltype(h1)::Analysis Analysis1;

In fact I could see getting rid of the example, too.

(I got rid of the enum because you only ever use it once, so who cares if you hardcode constants?)

540 ↗(On Diff #83522)

There's a typo here, but other than that it's fine. Because phab is awful I am not sure if this is the line the above comments were talking about or what, but I'm sure it's all fine here.

235 ↗(On Diff #83173)

np, sgtm.

236 ↗(On Diff #83173)

sgtm

266 ↗(On Diff #83173)

Ah, scratch that plan then.

283 ↗(On Diff #83173)

Yes, looks good.

900 ↗(On Diff #83173)

Sorry, that was related to a comment I deleted.

958 ↗(On Diff #83173)

Eh, copy-pasted comments are almost as bad as copy-pasted code. But up to you, not a big deal.

This revision is now accepted and ready to land.Jan 8 2017, 11:42 AM
davide accepted this revision.Jan 8 2017, 1:27 PM
davide added a reviewer: davide.
  1. You propose as FIXME not using RPO. Do you have data/evidence that a different visitation order actually matters? I'm not entirely convinced about it (but I haven't thought about the problem very carefully so).

I do? Are you referring to the order of loops in LoopInfo?

The only thing I found surprising is that the natural order of loops in LoopInfo is from the bottom of the function to the top. I haven't checked if it is exactly in postorder of the CFG but it looks close.

I wrote this patch to make us visit loops "forward" program order (top to bottom), and left a FIXME that we might want to reverse them in LoopInfo itself.

As for which order of visit is better, I don't have data at all. It turns out the current loop pass manager processes loops in reverse. However, very nicely, they wrote a long comment about how there wasn't any particular reason. There is some hand waving in it about deleting uses before processing defs, but that actually seems unlikely to matter to me.

The reason I had wanted to process in forward order are three fold:

First and foremost: I find it much less surprising. I wrote a lot of tests assuming it would be forward and was surprised when it wasn't. This is just that I find forward program order (in the absence of something like the bottom-up walk) less surprising in general.

Second: it seems useful to enable SSA propagation of constants and things from unrolling one loop into its sibling. That typically argues for processing defs before uses.

I think my second argument is a bit more compelling than the idea of deleting uses before processing defs. But not by much. Hence the primary rationale was a principle of least surprise.

  1. You removed a call to getContext().yield(). Do you happen to know why it was introduced in the first place?

I removed it? I thought I just added a commented out bit below a FIXME that hopefully explains what needs to happen here. Happy to dig into this though, and maybe I'm staring at the diff badly...

Both comments make sense. Thanks for explaining. LGTM if/when others are happy.

chandlerc updated this revision to Diff 83786.Jan 10 2017, 2:44 AM
chandlerc marked 8 inline comments as done.
chandlerc edited edge metadata.

Thanks for all the reviews! I've done the suggested renaming from our discussion on IRC:

LPMAnalysisResults -> LoopStandardAnalysisResults

LPMUpdateResult -> LPMUpdater

I'll land this once i get gmock and co. landed so the tests work.

This revision was automatically updated to reflect the committed changes.

Landed, thanks so much for all the reviews!!!