This is an archive of the discontinued LLVM Phabricator instance.

[New-PM] Lift Scop Pipeline to CGSCC-level
AcceptedPublic

Authored by lksbhm on Apr 3 2018, 7:48 AM.

Details

Summary

This patch is intended as a preparation to teach polly interprocedural analysis.

The reason we want to have polly passes run in CGSCC scope instead of function scope is the deterministic iteration order over functions: To be able to "look into" a region's callees, we ideally want ScopInfo to be already available for every callee. The CGSCC AnalysisManager ensures this - except for callees which are part of the analyzed function's SCC. But we aren't interested in those anyways. For that matter, the differential's most important part is the change from FunctionToScopPassAdaptor to CGSCCToScopPassAdaptor and buildDefaultPollyPipeline taking a CGSCC-AM now instead of a Function-AM.

However, this differential also includes some further CGSCC-related changes to prepare for upcoming patches: In RegisterPasses.cpp we add a parseCGSCCPipeline callback, allowing us to run the ScopInfoPrinterPass in a CGSCC-pipeline as well (making no difference at the moment, but using the CGSCC_PASS macro in PollyPasses.def).

Regarding tests: @philip.pfaffe is currently working on porting the existing lit test-suite to the new passmanager. I would suggest waiting for this effort to be complete before I add any more tests to the repo. That being said, I am not opposed to doing so if asked.

I apologize for the seemingly unnecessary small changes (e.g. renaming FAM->AM), but we had a lot of code churn while developing our IPO changes, and I didn't think it important enough to waste time reverting them again.

Diff Detail

Repository
rPLO Polly

Event Timeline

lksbhm created this revision.Apr 3 2018, 7:48 AM
lksbhm updated this revision to Diff 140801.Apr 3 2018, 8:22 AM

Remove OwningInnerAnalysisManagerProxy's friend class ... and AnalysisKey which weren't necessary.
Move OwningScopAnalysisManagerFunctionProxy-related code in ScopPass.cpp closer together [nfc]

Are you also going to downstream patches that need this? If yes, what exactly do they need to analyze the callee for?

lksbhm added a comment.EditedApr 3 2018, 1:31 PM

Are you also going to downstream patches that need this? If yes, what exactly do they need to analyze the callee for?

I hope I didn't misunderstand your question, but we intend to fully integrate our interprocedural analysis efforts into upstream polly. In fact, I intend to get the next patch ready for review tomorrow which probably will make things clearer.

Basically, we made an extended version of ScopInfo which stores additional information about how MemoryAccesses depend on function parameters for all accesses in a function. By querying this information for a callee, we can add the induced MemoryAccesses to the surrounding ScopStmt by projecting them into the caller-function's scope and therefore model the effects of the CallInst (I hope this explanation is both understandable and accurate. @philip.pfaffe can hopefully correct me if there is anything wrong or missing). I hope I was able to clarify things a little.

Thanks for your response. Looking forward to your follow-up patches. I am not too knowledgeable with the new pass manager, so I will let Philip review.
Would it make sense to keep code that analyzes SCoP's function-wise and only use a CGSCC pass when interprocedural analysis is enabled?

If I understand correctly, you are analyzing which data a function accesses. Similar to the "summary functions" in PENCIL.

Another possibility would have been to inline the callee's scop to get a single scop. We already have a 'hack' for this called ScopInliner.

Michael

Would it make sense to keep code that analyzes SCoP's function-wise and only use a CGSCC pass when interprocedural analysis is enabled?

Our approach doesn't actually change the level at which scop analysis happens, if that is the question. I.e. ScopInfoAnalysis remains a function analysis. The only reason we want to process the callgraph at cgscc-level when running a scop pass pipeline is the guaranteed ordering which maximizes the callee ScopInfo availability. Or, from the implementation point of view, we cannot run ScopInfoAnalysis on-demand for a callee but instead have to depend on it already having run before. This is where cgscc's post-order traversal comes in handy.
However, if you're asking whether it would make sense to have both FunctionToScopPassAdaptor alongside CGSCCToScopPassAdaptor, this might be something worth considering.

Similar to the "summary functions" in PENCIL.

From quickly glancing at the paper you linked to, the "function summaries" section seems very related to our approach :)

Another possibility would have been to inline the callee's scop to get a single scop. We already have a 'hack' for this called ScopInliner.

I'm not familiar with the way ScopInliner works. I'm assuming this is on llvm IR level(?)

philip.pfaffe added inline comments.Apr 4 2018, 1:36 PM
include/polly/ScopPass.h
36

Unrelated change

52

Should have three slashes

180

Unrelated change

258

No curly braces required here

276

unrelated change

lib/Analysis/ScopPass.cpp
15

This shouldn't be removed

lib/Support/RegisterPasses.cpp
719

Irrelevant

Philip and me talked about the idea keeping both Function- and CGSCC-level passes around. Philipp was convinced that there was no benefit. For reference, here were my concerns (I don't know the new pas manager that well):

  1. A function-level transformation pass wants to use Polly analysis. Since function passes can be run in any order by the pass manager, it does not necessarily correspond to the CGSCC (reverse? post-order) expected by the Polly passes. Callees have been already modified by this hypothetical pass after being analyzed by Polly.
  1. A JIT or similar (e.g. clang runs a few LLVM passes directly after codegening a function) may not have added or know all callers/callees to the llvm::Module yet, i.e. the call graph is not complete/available.

[Rational: There must be some reason for why not all LLVM function passes are CGSCC-passes]

Other than a few nitpicks: LGTM.

include/polly/ScopPass.h
276

The comments are added by clang-format, make check-polly probably already complained about format changes.

lib/Analysis/ScopInfo.cpp
5150–5159

[Style] You could just call the other overload than to duplicate its code.

lib/Analysis/ScopPass.cpp
149

[Nit] Unnecessary change

lib/Support/RegisterPasses.cpp
702

The meaning of the FIXME comment is not clear without the deleted code.
Suggestion:

// FIXME: When VerifyEachPass is enabled, we are supposed to add a VerifierPass to the pipeline here. However, there is no CGSCC-level verifier pass.
lksbhm updated this revision to Diff 141111.Apr 5 2018, 12:27 AM

Implement changes requested in review

lksbhm added inline comments.Apr 5 2018, 12:36 AM
include/polly/ScopPass.h
276

This is the only remark I didn't do any changes for with the latest update. Since the closing brace is for the CGSCCToScopPassAdaptor, the "namespace" comment is clearly misplaced. I ran the polly-update-format target with a clang-format built half an hour ago and it didn't re-insert the comment. However, RegisterPasses.cpp would get it's corresponding header moved above the other includes. I assume this is an unrelated change as well, though, so I didn't put it in the diff.

lksbhm marked 9 inline comments as done.Apr 5 2018, 12:39 AM
  1. A function-level transformation pass wants to use Polly analysis. Since function passes can be run in any order by the pass manager, it does not necessarily correspond to the CGSCC (reverse? post-order) expected by the Polly passes. Callees have been already modified by this hypothetical pass after being analyzed by Polly.

This patch only changes the layering for ScopPasses. Note that Scop analyses are still being proxied in and out of function. As a consequence, function layer passes are not affected by this change.

  1. A JIT or similar (e.g. clang runs a few LLVM passes directly after codegening a function) may not have added or know all callers/callees to the llvm::Module yet, i.e. the call graph is not complete/available.

This should probably be discussed as part of a review for interprocedural analysis. But fundamentally speaking, if the callee cannot be analyzed then the region containing the call cannot be modeled. So, worst case, nothing changes from what we're doing right now.

Meinersbur added inline comments.Apr 5 2018, 8:49 AM
include/polly/ScopPass.h
276

You are of course right that the // namespace polly is misplaced. Sorry, I didn't notice that. I just committed a change to correct that (if you have commit rights, and notice such things, you can just fix them in a commit tagged "NFC").

Regarding the header move: It looks like this patch moves #include "polly/RegisterPasses.h" to where it belongs alphabetically. However, clang-format identifies it as the header of the translation unit and moves it to the beginning (where, AFAICS it is in trunk).

Part of make check-polly is that it is clang-format clean. We have a problem if it is not, the buildbots (http://lab.llvm.org:8011/builders/polly-amd64-linux) will consider the build as "failed". This forces us to use the formatting as defined by clang-format (if clang-format itself changes, we apply its suggestions in a dedicated commit).

lksbhm updated this revision to Diff 141588.Apr 9 2018, 12:13 AM

Rebase onto top of trunk and run clang-format.
Only changes between previous patch version and this are // namespace polly comment removal (already in trunk) and #include RegisterPasses.h now first include in RegisterPasses.cpp

lksbhm marked an inline comment as done.Apr 9 2018, 12:14 AM
philip.pfaffe accepted this revision.Apr 10 2018, 1:22 AM

LGTM too!

Since we can do new-PM lit testing now, does it make sense to include test cases here?

This revision is now accepted and ready to land.Apr 10 2018, 1:22 AM
lksbhm updated this revision to Diff 141859.Apr 10 2018, 8:45 AM

Rebase patch and port some tests where the cgscc() nesting needs to be explicit to the new passmanager

We already have a fully ported testsuite (see D45493) and these are the tests that fail after this patch without the necessary modifications.
Therefore we have decided that it makes sense to attach them as tests for this change. NOTE however that this patch now depends on D45484 in order for the tests to be executable (and D45493 for the remaining ports)

lksbhm marked 3 inline comments as done.Apr 10 2018, 8:50 AM
lksbhm added inline comments.
test/Isl/CodeGen/loop_with_condition_nested.ll
226 ↗(On Diff #141859)

The BasicBlock order is different in new-pm it seems. So that's why we need to duplicate this

test/Isl/CodeGen/perf_monitoring_trip_counts_per_scop.ll
68 ↗(On Diff #141859)

Globals are printed in a different order as well

test/ScopInfo/aliasing_with_non_affine_access.ll
15 ↗(On Diff #141859)

Function iteration order on cgscc level is also different sometimes. Should make sense to have a different run for each test anyways

lksbhm updated this revision to Diff 142616.Apr 16 2018, 4:52 AM
  1. Add parseCGSCCPipeline callback to PassBuilder in RegisterPasses.cpp (no idea how this went missing before)
  2. Change some new-pm lit test RUN lines to use the function-level scop analysis printer pass. Those turned out to be function traversal ordering dependent, which isn't the same on cgscc-level as it is on function-level