Page MenuHomePhabricator

Devirtualize a call on alloca without waiting for post inline cleanup and next DevirtSCCRepeatedPass iteration.
Needs ReviewPublic

Authored by yamauchi on Oct 29 2019, 3:38 PM.

Details

Reviewers
davidxl
Summary

This aims to fix a missed inlining case.

If there's a virtual call in the callee on an alloca (stack allocated object) in
the caller, and the callee is inlined into the caller, the post-inline cleanup
would devirtualize the virtual call, but if the next iteration of
DevirtSCCRepeatedPass doesn't happen (under the new pass manager), which is
based on a heuristic to determine whether to reiterate, we may miss inlining the
devirtualized call.

This enables inlining in clang/test/CodeGenCXX/member-function-pointer-calls.cpp.

Event Timeline

yamauchi created this revision.Oct 29 2019, 3:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 29 2019, 3:38 PM

In this case (shown in godbot), why does PM decides to not iterate another round?

In this case (shown in godbot), why does PM decides to not iterate another round?

DevirtSCCRepeatedPass uses the following heuristic to decide whether to reiterate:

  1. It memorizes the indirect calls in the function body before one iteration. After the iteration, check if any of those indirect calls became direct. If so, reiterate.
  2. It counts the numbers of the direct calls and the indirect calls before one iteration. After the iteration, count gain. If the number of indirect calls decreased and the number of direct calls increased, reiterate.

Neither applies here because

  1. The virtual call in question isn't seen as a virtual call by DevirtSCCRepeatedPass because it appears after inlining and gets devirtualized during cleanup right after. DevirtSCCRepeatedPass only looks at the code before and after the whole inlining + cleanup pipeline.
  1. Also, the numbers of direct/indirect calls in this case are 2/0 before and 2/0 after. So, it doesn't kick in, either.

An old mailing list thread about a related issue: http://lists.llvm.org/pipermail/llvm-dev/2017-December/thread.html#119598. I checked that this patch also fixes the example cited there: https://godbolt.org/z/BP0jGT

An alternative idea from there is "to pass the value handles to the inliner through CGSCCUpdateResult and let the inliner augment it as the caller gets indirect calls due to inlining."
I think that would work. We can add to the list of indirect calls as we inline and find indirect calls (where this patch calls tryPromoteCall()) and then DevirtSCCRepeatedPass would check if any of the indirect call has become direct after the iteration. This would be a more general and yet precise approach to these cases.

Another from there is to repeat DevirtSCCRepeatedPass until a fix point. I think that would fix these cases, but as the DevirtSCCRepeatedPass termination is based on a heuristic (described above) and currently likely not repeated to a fix point, it may have other broader impact like potentially inlining a lot more, and increasing compile time. To detect a fix point, we'd need some way to communicate that sort of like the above CGSCCUpdateResult approach.

It turned out the approach of "to pass the value handles to the inliner through CGSCCUpdateResult and let the inliner augment it as the caller gets indirect calls due to inlining" won't work because the instcombine that turns the indirect call into a direct call creates a new call instruction rather than updating the existing call instruction, which the value handle can't keep track of.

https://reviews.llvm.org/D70147 attempts to implement the approach that repeats DevirtSCCRepeatedPass as long as new inlining happens (up to the existing max count).

davidxl added inline comments.Nov 19 2019, 9:36 AM
llvm/lib/Analysis/Loads.cpp
458 ↗(On Diff #226982)

what is the change intended for? If it fixes some bugs, it should be split out.

llvm/lib/Transforms/IPO/Inliner.cpp
897

The pattern recognition part can be extracted as a utility/helper function and put int perhaps IndirectCallPromotionAnalysis.h|cpp file.

yamauchi marked 2 inline comments as done.Nov 19 2019, 3:36 PM
yamauchi added inline comments.
llvm/lib/Analysis/Loads.cpp
458 ↗(On Diff #226982)

If AA were available (not null), this code wouldn't be necessary as it can handle the general AA. However, in the new code in Inliner.cpp, we currently call this function (FindAvailableLoadedValue) with a null AA as AA appears to be unavailable in the inliner. I may be confused but I am under the impression that AA is too expensive to recompute or update during inlining (inlining involves relatively a large amount of transformation of code) or that AA may be stateless in reality, but its interface doesn't give that guarantee and not to be relied upon. This code here attempts to handle a simple/special case of AA for stores/loads with different constant offsets off the same base pointer in the absence of AA, in the spirit of the code just above (lines 450-453). This is important for the targeted missed inlining case because it's about a virtual call on a stacked allocated object where it's quite likely some other fields of the object may be stored into besides the vtable pointer in the constructor. Without this code or AA, this function wouldn't go far enough through loads/stores to reach the vtable pointer store in the constructor and the target function pointer in it, and it wouldn't work.

llvm/lib/Transforms/IPO/Inliner.cpp
897

Will try.

yamauchi updated this revision to Diff 230343.Nov 20 2019, 3:14 PM

Moved tryPromoteCall to CallPromotionUtils.h/cpp.

yamauchi marked an inline comment as done.Nov 20 2019, 3:14 PM
yamauchi added inline comments.
llvm/lib/Transforms/IPO/Inliner.cpp
897

Done.

yamauchi updated this revision to Diff 233205.Dec 10 2019, 3:28 PM

Split the utility code into D71307 and D71308.

Use AAManager (include/llvm/Analysis/AliasAnalysis.h) by registering only BasicAA ?

Use AAManager (include/llvm/Analysis/AliasAnalysis.h) by registering only BasicAA ?

How do you mean?

It seems that what type of AA is used is up to how the pass manager is set up (eg. PassBuilder::buildDefaultAAPipeline), not up to an individual pass like the inline pass or a utility function like this.

yamauchi updated this revision to Diff 241852.Fri, Jan 31, 5:19 PM

Update clang/test/CodeGenCXX/member-function-pointer-calls.cpp which this change
combined with D71308 enables inlining/simplification.

Add a new test that's derived from
clang/test/CodeGenCXX/member-function-pointer-calls.cpp.

Herald added a project: Restricted Project. · View Herald TranscriptFri, Jan 31, 5:19 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
davidxl added inline comments.Fri, Jan 31, 5:30 PM
llvm/test/Transforms/Inline/devirtualize-4.ll
3

typo ? UN:

yamauchi updated this revision to Diff 242103.Mon, Feb 3, 9:03 AM

Fix typo.

yamauchi edited the summary of this revision. (Show Details)Tue, Feb 11, 10:09 AM
yamauchi updated this revision to Diff 243911.EditedTue, Feb 11, 10:24 AM

Rebase past D71308. PTAL.