This is an archive of the discontinued LLVM Phabricator instance.

[NPM] Do not run function simplification pipeline unnecessarily
AcceptedPublic

Authored by mtrofin on Mar 5 2021, 10:10 PM.

Details

Summary

The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].

This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since.

The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.

[1] A compile time tracker run with the option enabled.

Diff Detail

Event Timeline

mtrofin created this revision.Mar 5 2021, 10:10 PM
mtrofin requested review of this revision.Mar 5 2021, 10:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2021, 10:10 PM
xbolva00 added a subscriber: nikic.Mar 21 2021, 3:30 AM
mtrofin updated this revision to Diff 332488.Mar 22 2021, 6:01 PM

added a test

mtrofin updated this revision to Diff 336139.Apr 8 2021, 9:05 AM

updated patch

mtrofin retitled this revision from WIP: don't run function simplification pipeline unnecessarily to [NFC][NPM] Do not run function simplification pipeline unnecessarily.Apr 8 2021, 9:11 AM
mtrofin edited the summary of this revision. (Show Details)
mtrofin edited the summary of this revision. (Show Details)

Restating summary of offline discussion:
Up until now, the repetition of simplification passes has hidden phase ordering issues or the need to rerun certain passes. Turning this flag on will improve compile time and resolve potential exponential compile-time increases, but will also reveal all the cases where the pass pipeline needs to be adjusted to not miss optimizations that were previously done on the follow-up optimization runs.

Thank you for pushing this!

gentle reminder - thanks!

asbirlea accepted this revision.Apr 12 2021, 11:02 AM
This revision is now accepted and ready to land.Apr 12 2021, 11:02 AM

nit with the summary, I think of "[NFC]" as more of refactoring types of changes, this adds new functionality under a flag

llvm/lib/Analysis/CGSCCPassManager.cpp
552

adding a comment here would be good

906

can this be !FunctionPass?

llvm/lib/Passes/PassBuilder.cpp
18

unused?

llvm/test/Other/new-pass-manager-cgscc-fct-proxy.ll
39

also check if we rerun SROA on f3?

40

typo

mtrofin updated this revision to Diff 336926.Apr 12 2021, 12:09 PM
mtrofin marked 3 inline comments as done.

feedback

llvm/lib/Analysis/CGSCCPassManager.cpp
906

then we wouldn't invalidate when updateCGAndAnalysisManagerForFunctionPass is called. CoroSplit calls it, for instance.

mtrofin retitled this revision from [NFC][NPM] Do not run function simplification pipeline unnecessarily to [NPM] Do not run function simplification pipeline unnecessarily.Apr 12 2021, 12:09 PM
mtrofin updated this revision to Diff 336933.Apr 12 2021, 12:22 PM
mtrofin marked 2 inline comments as done.

test

addressed remaining 2 items

fhahn added a subscriber: fhahn.Apr 12 2021, 2:51 PM

Do you have any data on whether there are any binary differences for a set of programs/benchmarks with this enabled compared to disabled?

Do you have any data on whether there are any binary differences for a set of programs/benchmarks with this enabled compared to disabled?

There are, in the strict sense of binary diff. I did not try it on benchmarks.

fhahn added a comment.Apr 12 2021, 2:56 PM

Do you have any data on whether there are any binary differences for a set of programs/benchmarks with this enabled compared to disabled?

There are, in the strict sense of binary diff. I did not try it on benchmarks.

Interesting. But then doesn't this indicate that the passes were not run unnecessarily? Just curious why this happens, e.g. are some changes not reported properly?

Do you have any data on whether there are any binary differences for a set of programs/benchmarks with this enabled compared to disabled?

There are, in the strict sense of binary diff. I did not try it on benchmarks.

Interesting. But then doesn't this indicate that the passes were not run unnecessarily? Just curious why this happens, e.g. are some changes not reported properly?

I believe it's what @asbirlea pointed at: there are some optimization passes that, if re-run on an unchanged function, do find more work/changes; and/or optimization passes that should be re-run after others in the pipeline (maybe these are the same thing)

Maybe my choice of words - i.e. "unnecessarily" is not strictly correct; not sure what would concisely capture it though - maybe "unintentionally"?; in any case, the idea is that we arguably want to get to a place where 1) if the function isn't changed, and the function simplification passes were rerun, they are still unchanged, and 2) this flag is enabled by default.

I agree with @aeubanks on removing the NFC from the title. This is changing the functionality, even if it's under a disabled flag.

For additional context, with the flag enabled we're seeing performance regressions which indicate optimizations are being missed and picked up by the subsequent runs of the optimization pass.
The purpose of this initial patch is to stage the transition by first introducing the flag, then focus on understanding where the missed optimizations occur so we can fix (for lack of a better word) the NPM pass pipeline.
When the flag is eventually flipped, it's possible the compile-time benefits also disappear due to the introduction of additional optimization passes, but the current pathological cases of high compile times due to repeating (truly unnecessary) optimizations, should remain resolved.

some comments on the summary

  • we should clarify that this is under a flag and is not currently changing any defaults. [2] says that, but we should make sure it's very clear somewhere besides a footer
  • the simplification pipeline is known to not be idempotent (and it would be very hard to make it so without majorly regressing compile times), so should not have a code quality effect is not true at all. but we should note that in a perfect world this wouldn't change code quality if the function simplification were perfect
  • the function passes will be re-run -> the function simplification pipeline will be re-run is a bit clearer, same with This patch allows the function passes be skipped if they were run once and a function was not modified since.
llvm/lib/Analysis/CGSCCPassManager.cpp
557–558

I don't think this is necessarily true. ArgumentPromotion is a CGSCC pass that changes functions but doesn't change the call graph so it doesn't need to call updateCGAndAnalysisManagerForCGSCCPass(). It handles invalidating analyses itself.
Same with PostOrderFunctionAttrsPass, but it just invalidates everything if anything has changed.

588

nit: /*FunctionPass*/ true, /*InvalidateSA*/ false

906

I feel like I'm missing something obvious, why is this invalidate here necessary? Can we just see what the passes invalidate?

llvm/test/Other/new-pass-manager-cgscc-fct-proxy.ll
39

sorry, I also meant to check that we don't rerun it in NOREPS

40

still typo CHCEK-NEXT

mtrofin edited the summary of this revision. (Show Details)Apr 12 2021, 9:19 PM
mtrofin marked 4 inline comments as done.Apr 12 2021, 9:51 PM

I agree with @aeubanks on removing the NFC from the title. This is changing the functionality, even if it's under a disabled flag.

For additional context, with the flag enabled we're seeing performance regressions which indicate optimizations are being missed and picked up by the subsequent runs of the optimization pass.
The purpose of this initial patch is to stage the transition by first introducing the flag, then focus on understanding where the missed optimizations occur so we can fix (for lack of a better word) the NPM pass pipeline.
When the flag is eventually flipped, it's possible the compile-time benefits also disappear due to the introduction of additional optimization passes, but the current pathological cases of high compile times due to repeating (truly unnecessary) optimizations, should remain resolved.

llvm/lib/Analysis/CGSCCPassManager.cpp
557–558

ArgumentPromotion outright deletes the function and clears the AM. I made the comment more general.

906

When passes call into here, I we can assume FunctionStatusAnalysis can be invalidated because something changed.

But this is also entered from line 596, which happens after the function simplification pipeline finishes. The last thing that the pipeline does is it caches a FunctionStatusAnalysis::Result. Well, so next thing that would happen is we'd yank it here - unless we indicate this is the one case we don't want that to happen.

llvm/test/Other/new-pass-manager-cgscc-fct-proxy.ll
39

oh - we still run it there. I think there's one more inlining happening in f3.

mtrofin updated this revision to Diff 337042.Apr 12 2021, 9:52 PM
mtrofin marked 3 inline comments as done.

fixups

mtrofin marked an inline comment as done.Apr 13 2021, 9:48 AM
aeubanks added inline comments.Apr 14 2021, 1:47 PM
llvm/lib/Analysis/CGSCCPassManager.cpp
906

How would it be yanked? The RequireAnalysis pass after the function simplification pipeline preserves all analyses.
Removing these two lines and running check-llvm still passes.

mtrofin marked an inline comment as done.Apr 14 2021, 3:11 PM
mtrofin added inline comments.
llvm/lib/Analysis/CGSCCPassManager.cpp
906

Well, removing these 2 lines then doesn't invalidate FunctionAnalysisStatus ever :)

Talking offline with mtrofin, IMO in an ideal world:

  • the analysis should be invalidatable like a typical stateful analysis result (I hadn't noticed the analysis here was immutable, that's why I was confused, plus the invalidate vs clear distinction)
  • the logic around InvalidateSA shouldn't be necessary, the existing invalidation infra should handle it
  • all CGSCC passes specify that they preserve all function analyses and only manually invalidate function analyses on functions that they have modified
    • this wouldn't happen in updateCGAndAnalysisManagerForPass() since that's mostly just for keeping the call graph up to date, passes that don't change the call graph won't call that (e.g. PostOrderFunctionAttrsPass)
    • this could save some compile time in general since right now we're invalidating function analysis results for all functions in an SCC if any function has changed
mtrofin updated this revision to Diff 342827.May 4 2021, 12:35 PM
mtrofin marked an inline comment as done.

updated post consolidation of pass invalidation

aeubanks added inline comments.May 4 2021, 1:11 PM
llvm/lib/Analysis/CGSCCPassManager.cpp
588

this can go back to the original code

llvm/lib/Passes/PassBuilder.cpp
1033

do we need to add a pass to invalidate FunctionStatusAnalysis after the inliner pipeline so that CGSCC pipelines after the inliner pipeline aren't skipped?

mtrofin updated this revision to Diff 342852.May 4 2021, 1:37 PM
mtrofin marked 2 inline comments as done.

feedback

llvm/lib/Passes/PassBuilder.cpp
1033

done

aeubanks added inline comments.May 4 2021, 2:06 PM
llvm/lib/Passes/PassBuilder.cpp
1033

We already have InvalidateAnalysisPass<FunctionStatusAnalysis> for that.
It'd be nice if we could add it inside buildInlinerPipeline() since it's logically part of the inliner pipeline, but we only use it in one place so maybe it doesn't matter so much

mtrofin updated this revision to Diff 342866.May 4 2021, 2:23 PM

Using InvalidateAnalysisPass

mtrofin marked an inline comment as done.May 4 2021, 2:26 PM
mtrofin added inline comments.
llvm/lib/Passes/PassBuilder.cpp
1033

done for the InvalidateAnalysisPass, thanks for pointing it out!

I want to rationalize a bit better things in the module inliner wrapper pass, so the invalidation of the status analysis should fall in in that patch.

aeubanks accepted this revision.May 4 2021, 2:33 PM

looks good, just some nits on comments

llvm/lib/Analysis/CGSCCPassManager.cpp
558–559

this is no longer relevant

llvm/lib/Passes/PassBuilder.cpp
1033

actually, maybe having a bool param in createCGSCCToFunctionPassAdaptor on whether to check for FunctionStatusAnalysis is better than adding the invalidation pass
maybe we could even have the adaptor run the RequireAnalysisPass (or some manual version of it)

just something to consider, not blocking

mtrofin updated this revision to Diff 342873.May 4 2021, 2:37 PM
mtrofin marked 3 inline comments as done.

Comment

mtrofin updated this revision to Diff 343474.May 6 2021, 12:24 PM

updated test

This revision was landed with ongoing or failed builds.May 6 2021, 12:34 PM
This revision was automatically updated to reflect the committed changes.
This revision is now accepted and ready to land.Nov 1 2021, 4:32 PM