Page MenuHomePhabricator

Repeat DevirtSCCRepeatedPass if new indirect calls are found.
Needs ReviewPublic

Authored by yamauchi on Nov 12 2019, 2:45 PM.



This aims to fix a missed inlining case.

If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.

Event Timeline

yamauchi created this revision.Nov 12 2019, 2:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2019, 2:45 PM

This is an alternative approach to

Any compile time impact numbers?

davidxl added inline comments.Nov 12 2019, 3:24 PM

Alternatively, if compile time is a concern, the inline analysis 'shouldInline' can return a flag indicating if the callee has any indirect calls. If it is true and if the site is inlined, set this flag.

The way this patch currently is, the compile time impact in Clang's self build is ~5.3%.

The compile time impact looks too high. Might consider more fine grained check to trigger more iterations.

yamauchi retitled this revision from Repeat DevirtSCCRepeatedPass if any inlining happens. to Repeat DevirtSCCRepeatedPass if new indirect calls are found..Nov 15 2019, 1:31 PM

Now it would repeat DevirtSCCRepeatedPass if new indirect calls appear after inlining.

With this version, the compile time impact is ~1.48%.

Around line 1323 of InlineCost.cpp, there is code that detects if the new indirect callsites can become direct sites after inlining. This information can be passed back to the caller (via GetInlineCost and shouldInline interfaces). WIth that, I think the compile time impact can be minimized.

As I understand, that code around line 1323 of InlineCost.cpp would handle a simple case where a caller is passing a (constant) function pointer as a call argument and the callee simply calls it (constant propagation). But it wouldn't handle a (more complex) virtual call (vtable lookup though memory) on a stack allocated (in the caller) object passed to a callee, and would not work for this particular missed inlining case, at least, as is.

To handle such a case, it would have to add logic like the original patch (D69591) that detects a vtable look up pattern along with consideration for aliasing (since the vtable is in memory). But it'd be difficult to do because CallAnalyzer/InlineCost operates without actually having the code inlined (ie. interprocedural) and won't be able to reuse the facilities like llvm::FindAvailablePtrLoadStore.

It'd be conceivable to beef up CallAnalyzer to do this sort of things, but it'd be potentially non-trivial in complexity and compile-time-increasing.

The original patch D69591 (which, I think is, fairly fine grained to this case, btw) doesn't have this issue because it applies right after the call is actually inlined and the scope of analysis becomes intraprocedural.