This is an archive of the discontinued LLVM Phabricator instance.

[extract_symbols.py] Better handling of templates
ClosedPublic

Authored by john.brawn on Jan 31 2023, 8:28 AM.

Details

Summary

Since commit 846b676 SmallVectorBase<uint32_t> has been explicitly instantiated, which means that clang.exe must export it for a plugin to be able to link against it, but the constructor is not exported as currently no template constructors or destructors are exported.

We can't just export all constructors and destructors, as that puts us over the symbol limit on Windows, so instead rewrite how we decide which templates need to be exported to be more precise. Currently we assume that templates instantiated many times have no explicit instantiations, but this isn't necessarily true and results also in exporting implicit template instantiations that we don't need to. Instead check for references to template members, as this indicates that the template must be explicitly instantiated (as if it weren't the template would just be implicitly instantiated on use).

Doing this reduces the number of symbols exported from clang from 66011 to 53993 (in the build configuration that I've been testing). It also lets us get rid of the special-case handling of Type::getAs, as its explicit instantiations are now being detected as such.

Diff Detail

Event Timeline

john.brawn created this revision.Jan 31 2023, 8:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2023, 8:28 AM
john.brawn requested review of this revision.Jan 31 2023, 8:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2023, 8:28 AM
glandium resigned from this revision.Jan 31 2023, 7:36 PM

Could you add before/after figures of the numbers of symbols in, say, clang.symbols in the summary?

Overall, this LGTM, but I'm not a reviewer.

llvm/utils/extract_symbols.py
183

would it be worth changing this to use parse_microsoft_mangling?

334

you probably want to pass the re.ASCII flag to these match invocations.

346

You can probably drop the \ before the first @.

john.brawn marked an inline comment as done.Feb 1 2023, 5:37 AM
john.brawn added a subscriber: glandium.

Could you add before/after figures of the numbers of symbols in, say, clang.symbols in the summary?

Will do.

llvm/utils/extract_symbols.py
183

The important thing here is detecting that we have a function type mangling, which parse_microsoft_mangling doesn't do.

334

Names in symbols aren't necessarily ASCII, e.g.

int あ() { return 0; }

results in the symbol ?あ@@YAHXZ. Though we probably don't use non-ASCII identifiers in LLVM, so it doesn't really matter.

john.brawn updated this revision to Diff 493925.Feb 1 2023, 6:02 AM
john.brawn edited the summary of this revision. (Show Details)
john.brawn removed a reviewer: glandium.

Adjusted summary, removed superfluous \.

simon_tatham accepted this revision.Feb 6 2023, 6:42 AM

LGTM, with a small commenting nit.

llvm/utils/extract_symbols.py
523–527

Grammar/clarity nit: this comment is now incoherent. Previously its overall structure was a single sentence saying "Print symbols which both [do this] and [are not that]", which made it clear how the two individual conditions combined into the overall decision. But now the "[are not that]" bullet point is replaced with a sentence rather than a predicate.

Suggest rewording to clarify how the revised two conditions are combined.

Reading the code, I _think_ it's "Print symbols which both: [appear in exactly one input] and [are not an unreferenced template]".

This revision is now accepted and ready to land.Feb 6 2023, 6:42 AM
This revision was automatically updated to reflect the committed changes.

Hi, this caused some failures on AIX, could you take a look?

Could not load library '/scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/unittests/Analysis/InlineAdvisorPlugin.so': rtld: 0712-001 Symbol _ZN4llvm15SmallVectorBaseIjE13mallocForGrowEPvmmRm was referenced
      from module /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/unittests/Analysis/InlineAdvisorPlugin.so(), but a runtime definition
	    of the symbol was not found.
rtld: 0712-001 Symbol _ZN4llvm13AllAnalysesOnINS_6ModuleEE6SetKeyE was referenced
      from module /scratch/powerllvm/powerllvm_env/aix-ppc64/clang-ppc64-aix/build/unittests/Analysis/InlineAdvisorPlugin.so(), but a runtime definition
	    of the symbol was not found.
LLVM-Unit :: Passes/./PluginsTests/1/2
LLVM-Unit :: Passes/./PluginsTests/0/2
LLVM-Unit :: Analysis/./AnalysisTests/42/81

https://lab.llvm.org/buildbot/#/builders/214/builds/5764

Hi, this caused some failures on AIX, could you take a look?

I'll try, though I don't know anything about AIX so I may not have much success.

Thank you - it seems only AIX uses -Wl,-brtl flags with these tests.

I could reproduce the failure in PluginInlineAdvisorTest.PluginLoad to find _ZN4llvm13AllAnalysesOnINS_6ModuleEE6SetKeyE on Linux as well, when using a suitable build configuration. Fixed in
https://reviews.llvm.org/rG78f13ea093afdebcaa3b5c5690530b9217bbdfac on Linux, and hopefully also on AIX. I don't expect it to fix the failure to find _ZN4llvm15SmallVectorBaseIjE13mallocForGrowEPvmmRm though, so I'll try to figure out what's going on there also.

Jake-Egan added a comment.EditedFeb 10 2023, 1:32 PM

Thank you for looking into this. Unfortunately I think 78f13ea093afdebcaa3b5c5690530b9217bbdfac removed another symbol _ZN4llvm4PassD2Ev and _ZN4llvm13AllAnalysesOnINS_6ModuleEE6SetKeyE is still missing as well. https://lab.llvm.org/buildbot/#/builders/214/builds/5781 (EDIT: Or maybe I spoke too soon and a clean build will resolve it: https://lab.llvm.org/buildbot/#/builders/214/builds/5782)

It's possible you don't see the _ZN4llvm15SmallVectorBaseIjE13mallocForGrowEPvmmRm issue on Linux because the symbol isn't called during the test. But on AIX, it is checked for at load time, regardless of a call or not.

I noticed that llvm-nm has slightly different output for undefined symbols to GNU nm: GNU nm doesn't list value and size, llvm-nm list value and size zero. I've pushed a fix for that in https://reviews.llvm.org/rGe5d914672233, which may possibly also fix the AIX failure if AIX nm behaves in a similar way to llvm-nm.

That did the trick! I'm going to go ahead and revert 78f13ea093afdebcaa3b5c5690530b9217bbdfac because it's causing some other failures and your most recent commit fixes things.