This is an archive of the discontinued LLVM Phabricator instance.

Include GUIDs from the same module when computing GUIDs that needs to be imported.
ClosedPublic

Authored by danielcdh on Oct 31 2017, 2:46 PM.

Details

Summary

In the compile phase of SamplePGO+ThinLTO, ICP is not invoked. Instead, indirect call targets will be included as function metadata for ThinIndex to buidl the call graph. This should not only include functions defined in other modules, but also functions defined in the same module, otherwise ThinIndex may find the callee dead and eliminate it, while ICP in backend will revive the symbol, which leads to undefined symbol.

Event Timeline

danielcdh created this revision.Oct 31 2017, 2:46 PM
tejohnson edited edge metadata.Oct 31 2017, 2:58 PM

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Because we need to handle the indirect call chain too. E.g. foo() indirectly calls bar(), which indirectly calls baz(). Assuming these 2 calls are ICPed and inlined in the profiling binary, when processing foo's indirect callsite, marking bar's GUID in the VP is not enough, but adding baz's GUID to foo's callsite does not make sense as baz is not its call target. That's why we need to keep the transitive closure as function level metadata.

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Because we need to handle the indirect call chain too. E.g. foo() indirectly calls bar(), which indirectly calls baz(). Assuming these 2 calls are ICPed and inlined in the profiling binary, when processing foo's indirect callsite, marking bar's GUID in the VP is not enough, but adding baz's GUID to foo's callsite does not make sense as baz is not its call target. That's why we need to keep the transitive closure as function level metadata.

But the case here is when they are in fact in the same module. I'm assuming the situation you describe here already works when bar and baz are in different modules than foo. In this case we have foo and bar in the same module, and were missing the VP metadata showing that foo calls bar indirectly. If baz is also in the same module then the same situation applies to the callsite in bar. If it is in a different module then wouldn't the existing logic handle it? I.e. it would be on bar's function entry metadata?

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Because we need to handle the indirect call chain too. E.g. foo() indirectly calls bar(), which indirectly calls baz(). Assuming these 2 calls are ICPed and inlined in the profiling binary, when processing foo's indirect callsite, marking bar's GUID in the VP is not enough, but adding baz's GUID to foo's callsite does not make sense as baz is not its call target. That's why we need to keep the transitive closure as function level metadata.

But the case here is when they are in fact in the same module. I'm assuming the situation you describe here already works when bar and baz are in different modules than foo. In this case we have foo and bar in the same module, and were missing the VP metadata showing that foo calls bar indirectly. If baz is also in the same module then the same situation applies to the callsite in bar. If it is in a different module then wouldn't the existing logic handle it? I.e. it would be on bar's function entry metadata?

If baz is in the same module as bar and foo, then we need to have baz in foo's entry metadata. For the callsite in bar (standalone copy), it is possible that bar becomes cold after inlined into foo thus there is no profile to indicate bar has an indirect call to baz.

If baz is in another module, then in the current implementation (without this patch), it will be added in foo's entry metadata.

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Because we need to handle the indirect call chain too. E.g. foo() indirectly calls bar(), which indirectly calls baz(). Assuming these 2 calls are ICPed and inlined in the profiling binary, when processing foo's indirect callsite, marking bar's GUID in the VP is not enough, but adding baz's GUID to foo's callsite does not make sense as baz is not its call target. That's why we need to keep the transitive closure as function level metadata.

But the case here is when they are in fact in the same module. I'm assuming the situation you describe here already works when bar and baz are in different modules than foo. In this case we have foo and bar in the same module, and were missing the VP metadata showing that foo calls bar indirectly. If baz is also in the same module then the same situation applies to the callsite in bar. If it is in a different module then wouldn't the existing logic handle it? I.e. it would be on bar's function entry metadata?

If baz is in the same module as bar and foo, then we need to have baz in foo's entry metadata. For the callsite in bar (standalone copy), it is possible that bar becomes cold after inlined into foo thus there is no profile to indicate bar has an indirect call to baz.

If baz is in another module, then in the current implementation (without this patch), it will be added in foo's entry metadata.

After discussing offline with Dehao, I understand the issue:

  • just an intra-module indirect call could be handled by doing the promotion/inlining during the compile step (or annotating it on the VP metadata attached to the indirect call)
  • the issue is when you have a call out of the module, that then does an indirect call back into the module, where the whole call chain was promoted/inlined in the profiled binary: A.c:foo() ->(direct or indirect)-> B.c:bar() ->(indirect)-> A.c:baz()

If the indirect calls along this chain were promoted and then inlined into foo in the profiled binary, we cannot do the promotion/inlining of the call back to A.c:baz() during the compile step since it is intra-module, but we need to be able to annotate the call somewhere. And apparently we can't do this on the VP metadata for the indirect call in B.c:bar() since it was inlined into foo() in the profiled binary (Dehao - is that correct?).

In that case, I have a couple suggestions for the comments and test below.

include/llvm/ProfileData/SampleProf.h
354–355

Function name and comment need updating, since we aren't necessarily importing, but need to register the promoted indirect calls for liveness analysis.

test/Transforms/SampleProfile/function_metadata.ll
8

Suggest making this a different test with a different name since we aren't testing importing, but rather liveness analysis. In that case it would also be good to add a check for the situation described earlier as well.
A.c:foo() ->(direct or indirect)-> B.c:bar() ->(indirect)-> A.c:baz()

danielcdh marked 2 inline comments as done.Nov 1 2017, 11:15 AM

If the indirect calls along this chain were promoted and then inlined into foo in the profiled binary, we cannot do the promotion/inlining of the call back to A.c:baz() during the compile step since it is intra-module, but we need to be able to annotate the call somewhere. And apparently we can't do this on the VP metadata for the indirect call in B.c:bar() since it was inlined into foo() in the profiled binary (Dehao - is that correct?).

That's correct

tejohnson added inline comments.Nov 1 2017, 11:31 AM
include/llvm/ProfileData/SampleProf.h
375

Needs update for name change?

lib/Transforms/IPO/SampleProfile.cpp
1468

I would leave the old reason too - it's both right? Making the IR match before annotation and getting the correct liveness analysis, for anything that couldn't be inlined during the compile step.

test/Transforms/SampleProfile/function_metadata.ll
36

I think you also need to add a second module to test for the A.c:foo() ->(direct or indirect)-> B.c:bar() ->(indirect)-> A.c:baz() case.

danielcdh marked an inline comment as done.Nov 1 2017, 11:40 AM
danielcdh added inline comments.
test/Transforms/SampleProfile/function_metadata.ll
36

foo is not defined in this module, so in this new test it is A.c:test_liveness() ->direct-> B.c:foo() -> (indirect or direct) -> A.c: foo_available().

This revision is now accepted and ready to land.Nov 1 2017, 11:47 AM
danielcdh closed this revision.Nov 1 2017, 1:27 PM