This is an archive of the discontinued LLVM Phabricator instance.

Virtualization, needs to add unit test
AbandonedPublicDraft

Authored by mingmingl on Dec 6 2021, 4:23 PM.

Details

Reviewers
None

Diff Detail

Event Timeline

mingmingl created this revision.Dec 6 2021, 4:23 PM
mingmingl updated this revision to Diff 392269.Dec 6 2021, 7:58 PM

fix tests by adding new fields and remove verbose debugging logs

mingmingl updated this revision to Diff 392282.Dec 6 2021, 9:53 PM

Remove verbose logging again.

mingmingl updated this revision to Diff 392291.Dec 6 2021, 11:26 PM

Apply an incremental change to avoid pull full repo (slow build!)

Will git rebase later.

mingmingl removed a subscriber: mgorny.Dec 6 2021, 11:27 PM
mingmingl retitled this revision from draft to save progress to Virtualization, needs to add unit test.Dec 6 2021, 11:36 PM
tejohnson added inline comments.
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
961 ↗(On Diff #392291)

You want to look it up with the linkage as it is currently, since that was what would have been used to compute the guid when this symbol's summary was written into the index. So just ExportSummary->getValueInfo(TheFn->getGUID()) should suffice.

This will also allow you to remove the IsExported parameter here and it's computation in the caller.

973 ↗(On Diff #392291)

You'll also want to check Summary->isLive() and ignore any that aren't. If the prevailing symbol is not in the index, for example, the copies in the index should be marked dead and we should ignore the summaries and return false conservatively.

mingmingl updated this revision to Diff 392670.Dec 8 2021, 12:47 AM

save unit test to revision, yet no devirtualization was generated so far.

mingmingl updated this revision to Diff 392671.Dec 8 2021, 12:49 AM

merge two local commits

mingmingl added inline comments.Dec 8 2021, 2:14 PM
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
961 ↗(On Diff #392291)

thanks for pointing this out!

I added some verbose logging in ModuleSummaryAnalysis.cpp and WholeProgramDevirt.cpp, to log functions GUID before being added around this line (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp#L503), like [1]

And also dump the ExportSummary (call dump method) as well as the function GUID in mustBeUnreachableFunction like [2].

Before combining the IsExported information, ExportSummary->getValueInfo(TheFn->getGUID()) would return null, so read https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp#L1158-L1182 and found the linkage is rewritten.

Having said that, I might well miss a context here, and using a pre-computed IsExported indeed looks a little weird to me.

[1] `
errs() << "Going to add F " << F.getName() << " with global identifier as "

       << F.getGlobalIdentifier() << " to global value summary\n";
Index.addGlobalValueSummary(F, std::move(FuncSummary));
[2]

// ExportSummary is absent. No sufficient information to determine

// if the function is reachable.
if (ExportSummary == nullptr) {
  return false;
}
assert((TheFn != nullptr) && "Caller guarantees that TheFn is not nullptr");

errs() << "\tGlobal identifier with current linkage is "
       << TheFn->getGlobalIdentifier() << "\n";

// If a function will be exported, use external linkage to get its
// global identifier.
const std::string rewrittenFuncGlobalIdentifier =
    GlobalValue::getGlobalIdentifier(TheFn->getName(),
                                     IsExported ? GlobalValue::ExternalLinkage
                                                : TheFn->getLinkage(),
                                     TheFn->getParent()->getSourceFileName());
errs() << "\tGlobal identifier with rewritten linkage is "
       << rewrittenFuncGlobalIdentifier << "\n";
[3]  Example output from ModuleSummaryAnalysis
       Going to add F _ZN7Derived1xEv with global identifier as _ZN7Derived1xEv to global value summary

[4]

Global identifier with current linkage is ld-temp.o:_ZN4BaseD0Ev
Global identifier with rewritten linkage is _ZN4BaseD0Ev
Global identifier with current linkage is ld-temp.o:_ZN7DerivedD0Ev
Global identifier with rewritten linkage is _ZN7DerivedD0Ev

973 ↗(On Diff #392291)

thanks for pointing this out!

I'll dig around a little bit about how isLive is used currently and follow this up soon.

mingmingl updated this revision to Diff 392936.Dec 8 2021, 2:23 PM

add verbose logs in cpp for linkage discussion

mingmingl updated this revision to Diff 392939.Dec 8 2021, 2:38 PM

just to resolve build errors

mingmingl added inline comments.Dec 8 2021, 3:52 PM
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
961 ↗(On Diff #392291)

thanks for pointing this out!

I added some verbose logging in ModuleSummaryAnalysis.cpp and WholeProgramDevirt.cpp, to log functions GUID before being added around this line (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp#L503), like [1]

And also dump the ExportSummary (call dump method) as well as the function GUID in mustBeUnreachableFunction like [2].

Before combining the IsExported information, ExportSummary->getValueInfo(TheFn->getGUID()) would return null, so read https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp#L1158-L1182 and found the linkage is rewritten.

Having said that, I might well miss a context here, and using a pre-computed IsExported indeed looks a little weird to me.

[1] `
errs() << "Going to add F " << F.getName() << " with global identifier as "

       << F.getGlobalIdentifier() << " to global value summary\n";
Index.addGlobalValueSummary(F, std::move(FuncSummary));
[2]

// ExportSummary is absent. No sufficient information to determine

// if the function is reachable.
if (ExportSummary == nullptr) {
  return false;
}
assert((TheFn != nullptr) && "Caller guarantees that TheFn is not nullptr");

errs() << "\tGlobal identifier with current linkage is "
       << TheFn->getGlobalIdentifier() << "\n";

// If a function will be exported, use external linkage to get its
// global identifier.
const std::string rewrittenFuncGlobalIdentifier =
    GlobalValue::getGlobalIdentifier(TheFn->getName(),
                                     IsExported ? GlobalValue::ExternalLinkage
                                                : TheFn->getLinkage(),
                                     TheFn->getParent()->getSourceFileName());
errs() << "\tGlobal identifier with rewritten linkage is "
       << rewrittenFuncGlobalIdentifier << "\n";
[3]  Example output from ModuleSummaryAnalysis
       Going to add F _ZN7Derived1xEv with global identifier as _ZN7Derived1xEv to global value summary

[4]

Global identifier with current linkage is ld-temp.o:_ZN4BaseD0Ev
Global identifier with rewritten linkage is _ZN4BaseD0Ev
Global identifier with current linkage is ld-temp.o:_ZN7DerivedD0Ev
Global identifier with rewritten linkage is _ZN7DerivedD0Ev

It seems I cannot edit comment after a while, so just cleaned up the format (by adding a linebreak after[1]) of [1][2][3][4] and posted it .

The preview looks ok this time.
[1]

errs() << "Going to add F " << F.getName() << " with global identifier as "
         << F.getGlobalIdentifier() << " to global value summary\n";
  Index.addGlobalValueSummary(F, std::move(FuncSummary));

[2]

// ExportSummary is absent. No sufficient information to determine
  // if the function is reachable.
  if (ExportSummary == nullptr) {
    return false;
  }
  assert((TheFn != nullptr) && "Caller guarantees that TheFn is not nullptr");

  errs() << "\tGlobal identifier with current linkage is "
         << TheFn->getGlobalIdentifier() << "\n";

  // If a function will be exported, use external linkage to get its
  // global identifier.
  const std::string rewrittenFuncGlobalIdentifier =
      GlobalValue::getGlobalIdentifier(TheFn->getName(),
                                       IsExported ? GlobalValue::ExternalLinkage
                                                  : TheFn->getLinkage(),
                                       TheFn->getParent()->getSourceFileName());
  errs() << "\tGlobal identifier with rewritten linkage is "
         << rewrittenFuncGlobalIdentifier << "\n";

[3] Example output from ModuleSummaryAnalysis

Going to add F _ZN7Derived1xEv with global identifier as _ZN7Derived1xEv to global value summary

[4]

Global identifier with current linkage is ld-temp.o:_ZN4BaseD0Ev
Global identifier with rewritten linkage is _ZN4BaseD0Ev
Global identifier with current linkage is ld-temp.o:_ZN7DerivedD0Ev
Global identifier with rewritten linkage is _ZN7DerivedD0Ev
mingmingl updated this revision to Diff 393289.Dec 9 2021, 2:43 PM

update test IR

mingmingl updated this revision to Diff 393309.Dec 9 2021, 3:45 PM

Run "arc diff git merge-base HEAD origin --update D115203"

mingmingl updated this revision to Diff 393316.Dec 9 2021, 4:01 PM

thin impl with loggings, created as a back up

mingmingl updated this revision to Diff 393370.Dec 9 2021, 8:24 PM

clean up logs

mingmingl updated this revision to Diff 393376.Dec 9 2021, 10:11 PM
  • stash unit test that doesn't work in hybrid mode yet.
tejohnson added inline comments.Dec 10 2021, 11:10 AM
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
961 ↗(On Diff #392291)

I suspect this is due to regular LTO internalization which has already updated the linkage types in the IR much earlier, and that we are sort of getting lucky that we are going to re-export due to WPD which undoes the effect of the earlier internalization. To confirm, you could try disabling LTO internalization and see if that causes the global identifier printed in the "current linkage" message above to not have the path in it. To do so, use -Wl,-mllvm,-enable-lto-internalization=false.

If it is internalization like I suspect, I think the easier and more robust thing (rather than relying on checking whether WPD will export, which may not be correlated to the earlier internalization decision), would be to simply lookup in the index with the exported name unconditionally if there is no entry found with the current linkage. I.e. something like:

ValueInfo TheFnVI = ExportSummary->getValueInfo(TheFn->getGUID());
if (!TheFnVI)
   // Assume TheFn was internalized, lookup index with original name.
   TheFnVI = ExportSummary->getValueInfo(GlobalValue::getGUID(TheFn->getName());
tejohnson added inline comments.Dec 10 2021, 11:15 AM
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
961 ↗(On Diff #392291)

And we can guard the second index lookup on whether TheFn->getGUID() != GlobalValue::getGUID(TheFn->getName()). Probably save the GUIDs in locals so we don't have to repeat the hash computation.

And I think we should be able to assert that TheFnVI != null after this.

mingmingl marked 2 inline comments as done.Dec 13 2021, 9:21 AM
mingmingl added inline comments.
llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
961 ↗(On Diff #392291)

thanks for the suggestions!

It's indeed due to internalization. I made the changes in https://reviews.llvm.org/D115492.

961 ↗(On Diff #392291)

Thanks for the suggestion! Done in https://reviews.llvm.org/D115492

mingmingl abandoned this revision.Dec 16 2021, 2:29 PM
mingmingl marked an inline comment as done.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 16 2021, 2:29 PM