This is an archive of the discontinued LLVM Phabricator instance.

[Inliner][NewPM] Inline functions outside of current SCC first
Needs ReviewPublic

Authored by haicheng on Nov 15 2017, 1:05 PM.

Details

Summary

When we enabled the new PM, we noticed several big regressions. One reason is that the new PM has different inline orders. The legacy PM has a step that the new PM does not have which is moving the call sites calling functions in the current SCC to the end of the iteration list. I don't know whether the new PM omits this intentionally or not, but I can see two benefits of doing this.

  1. This step first inlines functions outside of the current SCC to discourage functions inside the same SCC inlining each other which can bloat up the code size.
  1. Inlining a callee inside the current SCC likely makes the caller recursive. LLVM does not inline any recursive functions.

One drawback I can think of is that callsites from the same caller are stored in two places instead of one. Thus, we may have to switch function proxies more often.

This patch just copied the code from the legacy PM to the new PM. Here is the SPEC performance and code size change

code size (%) (- is smaller)performance (%) (+ is faster)
spec2000/ammp-1.32+0.02
spec2000/vortex-1.06-0.21
spec2006/gobmk+0.8+1.22
spec2006/povray-0.11+31.90
spec2017/leela-1.67-0.78
spec2017/povray-0.24+27.12

Diff Detail

Repository
rL LLVM

Event Timeline

haicheng created this revision.Nov 15 2017, 1:05 PM

Kindly Ping

Please let me know if this is the right approach.

davidxl edited edge metadata.Nov 22 2017, 10:00 AM

Using heuristic like this for inline order decision is like tossing a coin. It is very likely that doing this can hurt some cases where inlining of inner edges are important but gets blocked.

Due to current practical limitation in the inliner such as lack of the ability to inline self recursive functions, this patch can help to workaround that limitation a little, so the patch looks fine for now, though we should not depend strongly on this inline behavior in the future.

Using heuristic like this for inline order decision is like tossing a coin. It is very likely that doing this can hurt some cases where inlining of inner edges are important but gets blocked.

Due to current practical limitation in the inliner such as lack of the ability to inline self recursive functions, this patch can help to workaround that limitation a little, so the patch looks fine for now, though we should not depend strongly on this inline behavior in the future.

Thank you for your reply.

I agree that this heuristic is not the best, but it is consistent with what the legacy PM does so that we wouldn't have unpleasant surprise (e.g. spec2006/2017 povray listed in the summary) when we switch to the new PM. When we have a better one, we can certainly replace this heuristic.

Or I can modify the heuristic in this way: only call sites which can make the callers recursive if the callees are inlined are moved to the end of the candidate list.

Kindly Ping.

Kindly Ping (#2).

Kindly Ping (#3)

haicheng updated this revision to Diff 128252.Dec 27 2017, 5:41 PM

Now I only move the callsites whose callees also call the callers to the end of the inline list to delay the creation of recursive functions. Is it more acceptable?

Unfortunately, I don't know much about the inliner to accept it, but propsy for finding it and fixing.
I am sorry to see that it takes so much time for the review, I hope someone who knows about the inliner will review it soon.

Thank you, Piotr.

This patch just lets NewPM be consistent with the legacy PM to prevent big performance drop we observed when turning on NewPM.

sanjoy resigned from this revision.Jan 29 2022, 5:33 PM