This is an archive of the discontinued LLVM Phabricator instance.

[Polly][PM][WIP] Polly pass registration
ClosedPublic

Authored by philip.pfaffe on Jul 15 2017, 3:20 PM.

Details

Summary

This patch is a first attempt at registering Polly passes with the LLVM tools. Tool plugins are still unsupported, but this registration is usable from the tools if Polly is linked into them (albeit requiring minimal patches to those tools). Registration requires a small amount of machinery (the owning analysis proxies), necessary for injecting ScopAnalysisManager objects into the calling tools.

This patch is marked WIP because the registration is incomplete. Parsing manual pipelines is fully supported, but default pass injection into the O3 pipeline is lacking, mostly because I believe there is opportunity for some redesign here. The first point of order would be insertion points. I think it makes sense to run before the vectorizers. Running Polly early, however, is weird, mostly because it actually is the default (which to me is unexpected), and because Polly runs its own O1 pipeline. Why not instead insert it at an appropriate place somewhere after simplification has happened? Running after the loop optimizers seems intuitive, but it also seems wasteful, since multiple consecutive loops might well be a single scop, and we don't need to run for all of them.
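As a rough illustration of what "parsing manual pipelines" through a registration hook amounts to, here is a toy model in plain C++. It does not use the LLVM API: ToyPassBuilder, registerToyPollyPasses and the pass name "polly-toy-detect" are hypothetical stand-ins, and only "polly-prepare" is a name that appears in this review.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a pass pipeline: just the ordered pass names.
using Pipeline = std::vector<std::string>;

// A parsing callback returns true if it recognized the name and added a pass.
using ParseCallback = std::function<bool(const std::string &, Pipeline &)>;

// Toy stand-in for the tool-side builder that a project registers with.
class ToyPassBuilder {
  std::vector<ParseCallback> Callbacks;

public:
  void registerParsingCallback(ParseCallback CB) {
    assert(CB && "callback must be callable");
    Callbacks.push_back(std::move(CB));
  }

  // Parse a comma-separated pipeline; fail on the first unknown name.
  bool parse(const std::string &Text, Pipeline &Out) const {
    std::stringstream SS(Text);
    std::string Name;
    while (std::getline(SS, Name, ',')) {
      bool Known = false;
      for (const auto &CB : Callbacks) {
        if (CB(Name, Out)) {
          Known = true;
          break;
        }
      }
      if (!Known)
        return false;
    }
    return true;
  }
};

// Hypothetical registration hook that a tool would call once at startup.
void registerToyPollyPasses(ToyPassBuilder &PB) {
  PB.registerParsingCallback([](const std::string &Name, Pipeline &P) {
    if (Name != "polly-prepare" && Name != "polly-toy-detect")
      return false;
    P.push_back(Name);
    return true;
  });
}
```

In this sketch the tool only needs one call to registerToyPollyPasses before parsing; any name the callback does not claim makes parse() return false.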

My second request for comments concerns all those smallish helper passes we have, like PollyViewer, PollyPrinter, PollyImportJScop. Right now these are controlled by command line options, which decide whether they should be part of the Polly pipeline. What is your opinion on treating them like real passes, and having the user write an appropriate pipeline if they want to use any of them?

Diff Detail

Repository
rL LLVM

Event Timeline

philip.pfaffe created this revision. Jul 15 2017, 3:20 PM
grosser edited edge metadata. Jul 16 2017, 12:41 AM

Hi Philip,

http://polly.llvm.org/docs/Architecture.html#polly-in-the-llvm-pass-pipeline discusses exactly why Polly is run at which position in the pass pipeline. We all seem to agree that the best position is right before the vectorizer. We are just waiting for Michael's DeLICM to be finally enabled and upstreamed. We are very close, I believe. Maybe @Meinersbur can give an update on the timeline. Would it be possible to just take over the -polly-position command line option as we have it for the old pass manager? This will allow us to play with the different options.

Regarding the other small passes: the idea is that users can enable certain optimizations from the clang command line. E.g. our buildbots use commands such as 'clang -O3 -mllvm -polly -mllvm -polly-enable-simplify'. The easiest way for us to add such options was to add/remove passes as needed. However, maybe there is a better design we could evolve towards?
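The "add/remove passes as needed" design can be pictured with a toy model like the following. It is only a sketch: ToyPollyOptions, buildToyPollyPipeline and all pass names are hypothetical, and the real options are parsed by LLVM's command line machinery rather than passed in a struct.

```cpp
#include <string>
#include <vector>

// Hypothetical option set; each flag stands in for one -mllvm switch.
struct ToyPollyOptions {
  bool EnableSimplify = false; // models -polly-enable-simplify
  bool EnableViewer = false;   // models a viewer/printer helper pass
};

// Build the pipeline by adding optional passes only when their flag is set.
std::vector<std::string> buildToyPollyPipeline(const ToyPollyOptions &O) {
  std::vector<std::string> P;
  P.push_back("toy-detect");
  if (O.EnableViewer)
    P.push_back("toy-viewer");
  if (O.EnableSimplify)
    P.push_back("toy-simplify");
  P.push_back("toy-codegen");
  return P;
}
```

The alternative raised in the summary would drop the flags and let the user spell out the same list as an explicit pipeline.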

Best,
Tobias

grosser requested changes to this revision. Jul 16 2017, 6:56 AM

Marking this as "request changes" to move it out of my "to-review" queue.

This revision now requires changes to proceed. Jul 16 2017, 6:56 AM

Hi Tobias,

I concur with your assessment regarding adding the passes via command line options. Will update the patch accordingly. Since none of the respective passes have been ported yet, that update might just be a TODO for the moment. Incremental progress and all :)

Some more thoughts regarding the appropriate extension points for Polly: The Early EP doesn't exist anymore. While I have an idea about getting something comparable back, we still should explore other options. I think the CGSCCOptimizerLateEPCallback could be worth consideration. It runs CGSCC passes right at the end of the module simplification pipeline. Another option could be ScalarLate. While that isn't exactly a full O1 (because that is a subset of O3, but not a true prefix) it could be close enough. Most importantly, it does include an InlinerPass. To get a good idea about the O pipelines, check out https://github.com/llvm-mirror/llvm/blob/master/test/Other/new-pm-defaults.ll. This test shows the complete order of passes in the O pipelines, as well as the positions of the individual extension points.
LoopLate seems like a bad idea to me. I'd probably just skip it for now and add it should the need arise.
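To make the notion of an extension point concrete, here is a toy model, not the LLVM PassBuilder: a default pipeline with two named hooks where registered callbacks may add passes. The stage names and their order are hypothetical and are not meant to mirror the real O3 pipeline; new-pm-defaults.ll remains the reference for that.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Pipeline = std::vector<std::string>;
using EPCallback = std::function<void(Pipeline &)>;

// Toy default pipeline with two extension points (hypothetical layout).
class ToyDefaultPipeline {
  std::vector<EPCallback> ScalarLateEPs;
  std::vector<EPCallback> BeforeVectorizerEPs;

public:
  void registerScalarLateEP(EPCallback CB) {
    ScalarLateEPs.push_back(std::move(CB));
  }
  void registerBeforeVectorizerEP(EPCallback CB) {
    BeforeVectorizerEPs.push_back(std::move(CB));
  }

  // Assemble the fixed stages and run the callbacks at their hooks.
  Pipeline build() const {
    Pipeline P;
    P.push_back("simplify");
    P.push_back("inline");
    for (const auto &CB : ScalarLateEPs)
      CB(P);
    P.push_back("loop-opts");
    for (const auto &CB : BeforeVectorizerEPs)
      CB(P);
    P.push_back("vectorize");
    return P;
  }
};
```

Registering a callback at the before-vectorizer hook places its passes directly ahead of "vectorize" without the tool knowing anything about them.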

Here is what I meant with minimal changes required to the LLVM tools: https://github.com/llvm-mirror/clang/blob/master/tools/driver/cc1_main.cpp#L68
Things like clang, opt, and bugpoint need to call RegisterPollyPasses from this patch on the new-PM path just like they did initializePollyPasses for the legacy PM. Will follow up with patches.

Hi Philip,

I don't think CGSCCOptimizerLateEPCallback corresponds to what we have nowadays with Polly early. I think for experiments it would be helpful to have an EP that corresponds to the original 'early' point. For the actual uses in the future, I believe the before-vectorizer EP is indeed what we want.

Regarding the small calls to register Polly: I think they are very uncontroversial and can probably be committed quickly. Having pass registration working for external modules would be great, as this is what our buildbots are using. Looking forward to D35258 becoming available.

The Early and Late mechanisms aren't available. If we really (like, _really_) want to use them, I can submit another patch to the PassBuilder to add an equivalent mechanism back. The idea is having another callback that gets to wrap the entire pipeline. That's just one callback, but it's a lot more powerful than the EPs. In my eyes the urgency of this depends on the status of BeforeVectorizer usability.
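A callback that wraps the entire pipeline could look roughly like this toy model. It is a sketch of the idea only; WrapCallback, buildWrapped and runPollyEarly are hypothetical names, and no such hook is claimed to exist in the PassBuilder.

```cpp
#include <functional>
#include <string>
#include <vector>

using Pipeline = std::vector<std::string>;

// The wrapper receives the finished default pipeline and returns a new one.
using WrapCallback = std::function<Pipeline(Pipeline)>;

// Apply the wrapper if one was registered; otherwise keep the default.
Pipeline buildWrapped(const Pipeline &Default, const WrapCallback &Wrap) {
  if (!Wrap)
    return Default;
  return Wrap(Default);
}

// Example wrapper: emulate an "early" position by prepending a pass.
Pipeline runPollyEarly(Pipeline Default) {
  Default.insert(Default.begin(), "polly");
  return Default;
}
```

Because the wrapper sees the whole pipeline, it can prepend, append, or replace passes, which is why a single callback of this kind is more powerful than fixed extension points.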

grosser accepted this revision. Jul 16 2017, 2:48 PM

It seems adding an "early" EP is not so easy. In this case, let's just leave it and start testing with DeLICM and co enabled. If they are not yet up to what we need, we should report bugs anyhow. Feel free to commit this as soon as the prerequisites are in.

This revision is now accepted and ready to land. Jul 16 2017, 2:48 PM

Skeleton implementation for building the Polly pipeline. It's currently lacking the utilities available from the command line options. These should be added incrementally once someone takes the time to port them over.

I've also ported the CodePreparation pass that takes care of splitting out the entry alloca block for the function.

That looks great so far.

bollu edited edge metadata. Jul 17 2017, 3:22 PM

A question from someone following this as an "outsider" with no context -
Does this allow others to obtain, say, Scop information (outside of polly) the way one can access DomTree / SCEV information?

If so, that would be really cool. If not, is that something that can be made possible? I'm just curious if this brings more flexible ways to use Polly.

> A question from someone following this as an "outsider" with no context -
> Does this allow others to obtain, say, Scop information (outside of polly) the way one can access DomTree / SCEV information?

Yes, this is absolutely possible. This works for Function-level analyses of course (e.g. ScopInfo), but also for Scop-level analyses. If the querying pass is itself a ScopPass, this Just Works. Any other pass type can access Scop-analyses through the ScopAnalysisManagerFunctionProxy, like this:

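/* ... */ run(Function &F, FunctionAnalysisManager &FAM) {
  Scop &S = /*...*/;
  // Reach the Scop-level analysis manager through the function-level proxy.
  auto &SAM = FAM.getResult<ScopAnalysisManagerFunctionProxy>(F).getManager();
  // Query a Scop-level analysis for S.
  auto &IA = SAM.getResult<IslAstAnalysis>(S);
}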
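The proxy pattern used in that snippet can be modeled without LLVM as follows. This is a toy sketch under stated assumptions: all Toy* classes are hypothetical, the "analysis result" is just an integer, and the real managers key results on IR units and analysis types rather than strings.

```cpp
#include <map>
#include <string>

// Toy inner manager: caches one integer "result" per scop name.
class ToyScopAnalysisManager {
  std::map<std::string, int> Cache;
  int Computed = 0;

public:
  int getResult(const std::string &Scop) {
    auto It = Cache.find(Scop);
    if (It == Cache.end()) {
      ++Computed; // count how often the "analysis" actually runs
      It = Cache.emplace(Scop, static_cast<int>(Scop.size())).first;
    }
    return It->second;
  }
  int numComputed() const { return Computed; }
};

// Toy owning proxy result: it holds the inner manager itself, so the
// calling tool never has to create or register one.
class ToyOwningProxyResult {
  ToyScopAnalysisManager SAM;

public:
  ToyScopAnalysisManager &getManager() { return SAM; }
};

// Toy outer manager: hands out one proxy result per function name.
class ToyFunctionAnalysisManager {
  std::map<std::string, ToyOwningProxyResult> Proxies;

public:
  ToyOwningProxyResult &getProxy(const std::string &Function) {
    return Proxies[Function];
  }
};
```

A caller that only holds the outer manager obtains the inner one via getProxy(...).getManager(), and repeated queries for the same scop are served from the inner cache.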

Small updates:

  • Add the CodePreparation-Pass
  • Accept empty Scop-Pipelines
  • Add NotImplemented-assertions for the passes added via commandline options
philip.pfaffe edited the summary of this revision.

Do not preserve RegionInfo in polly-prepare. This fixes PR33876 for the New-PM.
The bug is still there in the legacy PM, as it is a bug in RegionInfo itself.

Also includes a rebase.

This revision was automatically updated to reflect the committed changes.