This is an archive of the discontinued LLVM Phabricator instance.

[WIP] Alwaysinliner time explosion with new pass manager
AbandonedPublic

Authored by davide on Nov 23 2022, 1:14 PM.

Details

Summary

There have been problems with the inliner taking a long time in the new pass manager before, discussed in https://reviews.llvm.org/D98481, https://reviews.llvm.org/D120584, and the finally landed https://reviews.llvm.org/D121084.

Unfortunately the fix that landed is a cost-model tweak to suppress exponential inlining so doesn't apply to the alwaysinline case. I've had some success with going the D98481 route internally, but that has the significant(!) disadvantage of actually disabling inlining of some alwaysinline functions (no-one seems to have noticed though).

Just before Clang removed support for the legacy pass-manager entirely, the soon-to-be-attached reduced case took ~9s with the new one, and 0.04s with the old one. ToT is (as expected) also 9s.

I understand this might not be a definitive solution, but I'd like to use this review to brainstorm a path forward.

see: https://github.com/llvm/llvm-project/issues/59126

Diff Detail

Event Timeline

davide created this revision.Nov 23 2022, 1:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 1:14 PM
davide requested review of this revision.Nov 23 2022, 1:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 1:14 PM

if alwaysinline is causing issues with recursive code, can you not mark the recursive code with alwaysinline (if I'm understanding the issue correctly)? that seems bad in general, we should respect alwaysinline as much as possible

when you say that the exponential cost model "fix" doesn't apply to alwaysinline, do you mean that the alwaysinliner (a separate pass) doesn't do the exponent cost modelling (fixable)? or that alwaysinline always trumps the cost multiplier (what I mentioned above)?

Hi @aeubanks -- admittedly I haven't touched the inliner in a very long time so I might be wrong here, but my understanding is that alwaysinline trumps the cost multiplier (that's why the previous patch is not enough).

To answer your question: most of this code is something that clients don't control (e.g. sometimes it's library code).
It took me a long time to even understand why the inliner went bananas. The old pass manager didn't really have this behavior, so clients internally upgrading the compilers get caught by surprise. To the best of my understanding, Rust and others hit this behavior too.

I concur with you that we should try respecting always_inline -- although in this case it seems to hit a pathological behavior.
If we don't want to take this patch "as-is", maybe a good tradeoff is that of having a cl:opt? WDYT?

To elaborate further, the testcase in the project issue reported is reduced from a preprocessed file that's approximately 1 million lines.
Even if people would like the remove always_inline -- it's very hard to understand where to start (and, realistically, it's a jenga tower of optimizations + templates).

@aeubanks @davidxl can we get some movement on this one, please?

sorry, this fell between the cracks

are the alwaysinline attributes added purely for optimization purposes (counterexample, I've seen cases for testing where they are to make sure the number of frames is consistent)? if so I'd try removing literally every alwaysinline attribute and seeing what performance difference that makes

and I assume this is -O2/3?

sorry, this fell between the cracks

are the alwaysinline attributes added purely for optimization purposes (counterexample, I've seen cases for testing where they are to make sure the number of frames is consistent)? if so I'd try removing literally every alwaysinline attribute and seeing what performance difference that makes

They're pulled from different libraries, so it's a mix of both. My main observation is that it was extremely hard even for compiler engineers to figure out the composition of all these attributes.
The only observable effect was something not terminating (after 5 days compiling ; )

and I assume this is -O2/3?

This is -Os

An interesting alternative might be limiting this to when we're optimizing for size. Not sure if it sounds good for you, but maybe it's better than a cl::opt.

I know I came up with the patch originally, but I think more emphasis needs to be put on the fact that it violates the required semantics of alwaysinline. I think this is a dirty hack that worked for us but isn't up to OSS quality.

@t.p.northover regardless of whether this meets the bar, this is a test case that takes forever reduced from a real project.
We should have a solution upstream and this is the purpose of the patch. Downstream can't live with hacks forever.
Do you have any thoughts?

ah I think I've found the issue

to accommodate another weird recursion + alwaysinline case, we added https://reviews.llvm.org/D86988 (later modified by https://reviews.llvm.org/D91567). that ended up running a round of alwaysinline before the normal inliner pass. with this case of mutually recursive alwaysinline functions, one round of inlining seems ok, but two rounds blows up. -mandatory-inlining-first=0 makes compile times reasonable in your reduced test case by disabling that feature

I never liked D86988, but it works around other abuses of alwaysinline

the alternative to D86988 I considered at the time, which would probably resolve this case, was to visit alwaysinline calls before normal calls, rather than running two rounds of inlining

It seems that the explosion can happen when the always-inline callee has multiple call edges within the nontrivial SCC. Should the check be more refined?

llvm/lib/Transforms/IPO/Inliner.cpp
906

typo : recursion

(not sure if the ping is for the author or reviewers, there are alternatives suggested already)

davide abandoned this revision.Jul 13 2023, 6:16 PM

Not relevant anymore

do you know what changed? was it https://reviews.llvm.org/D150989?

Yes, that was the one.