This is an archive of the discontinued LLVM Phabricator instance.

[CGSCC] Detect devirtualization in more cases
ClosedPublic

Authored by aeubanks on Oct 16 2020, 12:56 PM.

Details

Summary

The devirtualization wrapper misses cases where if it wraps a pass
manager, an individual pass may devirtualize an indirect call created by
a previous pass. For example, inlining may create a new indirect call
which is devirtualized by instcombine. Currently the devirtualization
wrapper will not see that because it only checks cgscc edges at the very
beginning and end of the pass (manager) it wraps.

This fixes some tests testing this exact behavior in the legacy PM.

This piggybacks off of updateCGAndAnalysisManagerForPass()'s detection
of promoted ref to call edges.

This supercedes one of the previous mechanisms to detect
devirtualization by keeping track of potentially promoted call
instructions via WeakTrackingVHs.

There is one more existing way of detecting devirtualization, by
checking if the number of indirect calls has decreased and the number of
direct calls has increased in a function. It handles cases where calls
to functions without definitions are promoted, and some tests rely on
that. LazyCallGraph doesn't track edges to functions without
definitions so this part can't be removed in this change.

check-llvm and check-clang with -abort-on-max-devirt-iterations-reached
on by default doesn't show any failures outside of tests specifically
testing it so it doesn't needlessly rerun passes more than necessary.
(The NPM -O2/3 pipeline run the inliner/function simplification pipeline
under a devirtualization repeater pass up to 4 times by default).

Diff Detail

Event Timeline

aeubanks created this revision.Oct 16 2020, 12:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 16 2020, 12:56 PM
aeubanks requested review of this revision.Oct 16 2020, 12:56 PM
asbirlea accepted this revision.Oct 22 2020, 2:37 PM

This makes sense to me.

This revision is now accepted and ready to land.Oct 22 2020, 2:37 PM
This revision was automatically updated to reflect the committed changes.
aeubanks reopened this revision.Oct 24 2020, 3:14 PM
This revision is now accepted and ready to land.Oct 24 2020, 3:14 PM
aeubanks planned changes to this revision.Oct 24 2020, 3:14 PM

Will investigate compile time regression

Using sqlite3 in llvm-test-suite, without this change DevirtSCCRepeatedPass never reruns the pipeline, but with this change it does for lots of SCCs.
Off the top of my head the only thing I can think of that would rerun passes with this is that a call promoted in one pass that gets deleted in a later pass would trigger a rerun.

The issue is that the inliner is inserting ref edges right before it calls updateCGAndAnalysisManagerForFunctionPass, which then promotes those edges to call edges, making us think there's been a devirtualization.
https://github.com/llvm/llvm-project/blob/72ddd559b8aafef402091f8e192e025022e4ebef/llvm/lib/Transforms/IPO/Inliner.cpp#L944

I think the proper way to address this is to fix the TODOs in updateCGAndAnalysisManagerForPass where it doesn't allow non-trivial edges to be added. With that fixed, we can remove the hack in the inliner that adds ref edges just to be promoted in updateCGAndAnalysisManagerForPass. That will also include changing the call from the inliner from updateCGAndAnalysisManagerForFunctionPass to updateCGAndAnalysisManagerForCGSCCPass.

aeubanks updated this revision to Diff 305286.Nov 13 2020, 6:08 PM

rebase
use WeakTrackingVH solution instead of just looking for promoted edges

This revision is now accepted and ready to land.Nov 13 2020, 6:08 PM

It turns out that using promoted edges didn't catch the case where a function that returns a function pointer is inlined into another function that calls the returned function pointer. That test happened to pass previously because we were accidentally always detecting promoted edges so we always reran the pipeline at least once, which was fixed in https://reviews.llvm.org/D91046.
The WeakTrackingVH approach seems to cover all cases that the promoted edge method caught, so I just used that in updateCGAndAnalysisManagerForPass() as you suggested offline.

rnk added inline comments.Nov 16 2020, 2:56 PM
llvm/include/llvm/Analysis/CGSCCPassManager.h
321

Using a Value pointer as a key seems like it could suffer from the ABA problem. Suppose a CallInst is deallocated and another is reallocated at the same address. The map will contain a false entry for the new call.

It seems like the lookup functionality (find/insert) is only needed during the update scan. You could build a map from Value* to handle index before that loop, and then do the instruction scan.

Could you clarify why a map and not a SmallVector?

I was expecting the indirect calls to be added by all CGSCC passes, e.g. lib/Transforms/IPO/Inliner.cpp:750. Adding the calls in updateCG... instead, introduces a single point of failure so less chances of having passes forget to add them; the one thing I'm not clear is if the updateCG... call may be added too late in a CGSCC pass; for the inliner, it seems fine (thank you for adding the tests) but I'm not clear if this will hold for other passes. I don't have a use case now, just making a note to verify this for coroutines.
The check that all accesses are still indirect on the return looks good.

aeubanks updated this revision to Diff 305871.Nov 17 2020, 12:09 PM

add debug logging
add comment to IndirectVHs

Could you clarify why a map and not a SmallVector?

I was expecting the indirect calls to be added by all CGSCC passes, e.g. lib/Transforms/IPO/Inliner.cpp:750. Adding the calls in updateCG... instead, introduces a single point of failure so less chances of having passes forget to add them; the one thing I'm not clear is if the updateCG... call may be added too late in a CGSCC pass; for the inliner, it seems fine (thank you for adding the tests) but I'm not clear if this will hold for other passes. I don't have a use case now, just making a note to verify this for coroutines.
The check that all accesses are still indirect on the return looks good.

It is definitely possible to have a call to updateCG... miss a devirtualization in a CGSCC pass if an indirect call is introduced and then promoted before the call to updateCG..., but the inliner looks ok. I think other CGSCC passes don't really devirtualize much. As long as we call updateCG... after passes that can devirtualize (inliner, function passes like instcombine, etc) we should be good. And we can always revisit if we find missed cases.

llvm/include/llvm/Analysis/CGSCCPassManager.h
321

If a CallInst is deallocated, its corresponding WeakTrackingVH will become null, and that should be ok. We just want to make sure that we don't insert duplicate entries into IndirectVHs. If we never end up with a new CallInst at the same address, then we'll ignore the nulled WeakTrackingVH when scanning for promoted calls. If there is a new CallInst at the same address, it will overwrite the old one due to the !Entry->second check in updateCGAndAnalysisManager.

We could remove all null WeakTrackingVHs at the beginning of updateCGAndAnalysisManager to alleviate any ABA and memory (we never remove entries) concerns, although I was trying to avoid looking through all entries in WeakTrackingVHs on every call to updateCGAndAnalysisManager.

I tried changing IndirectVHs into a SmallVector<WeakTrackingVH>, but then we end up adding extra code (and runtime) to create a map on every call to updateCGAndAnalysisManager, and we might as well just have that map as IndirectVHs itself.

Could you clarify why a map and not a SmallVector?

I was expecting the indirect calls to be added by all CGSCC passes, e.g. lib/Transforms/IPO/Inliner.cpp:750. Adding the calls in updateCG... instead, introduces a single point of failure so less chances of having passes forget to add them; the one thing I'm not clear is if the updateCG... call may be added too late in a CGSCC pass; for the inliner, it seems fine (thank you for adding the tests) but I'm not clear if this will hold for other passes. I don't have a use case now, just making a note to verify this for coroutines.
The check that all accesses are still indirect on the return looks good.

It is definitely possible to have a call to updateCG... miss a devirtualization in a CGSCC pass if an indirect call is introduced and then promoted before the call to updateCG..., but the inliner looks ok. I think other CGSCC passes don't really devirtualize much. As long as we call updateCG... after passes that can devirtualize (inliner, function passes like instcombine, etc) we should be good. And we can always revisit if we find missed cases.

SG. Can you make a note/comment on this in the update function where adding IndirectVH?

aeubanks updated this revision to Diff 305958.Nov 17 2020, 7:24 PM

add comment
Is this what you were looking for? Or more explanation?

aeubanks updated this revision to Diff 307127.Nov 23 2020, 11:16 AM

don't insert if found entry in map

asbirlea accepted this revision.Nov 23 2020, 11:17 AM
This revision was landed with ongoing or failed builds.Nov 23 2020, 11:56 AM
This revision was automatically updated to reflect the committed changes.