Page MenuHomePhabricator

[LTO] Prevent devirtualization for symbols dynamically exported
ClosedPublic

Authored by tejohnson on Nov 16 2020, 6:35 PM.

Details

Summary

Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
tejohnson edited the summary of this revision. (Show Details)Dec 30 2020, 4:03 PM

I implemented @MaskRay's suggestion and added a bit to convey whether a symbol is exported to the dynamic linker (via --export-dynamic[-symbol=] or --dynamic-list=), and use that to prevent the LTO visibility upgrade for WPD. I added support to both lld and gold plugin, and associated tests. Note that I couldn't use includeInDynsym in lld because that is not set for linkonce_odr symbols that were thus have canBeOmittedFromSymbolTable set (since any referencing module must have it's own copy) - we still want to block the LTO visibility upgrade for those symbols to avoid WPD. So I am using a slightly different interface that more directly checks whether export-dynamic is in effect.

ping - @MaskRay can you take a look?

Sorry for the delay. I did not review much over Christmas. I'll try to get to this sometime in the next couple of days.

Generally looks good.

lld/ELF/LTO.cpp
253

sym->exportDynamic || sym->inDynamicList

Then isExportDynamic does not need to be public.

lld/test/ELF/lto/devirt_vcall_vis_export_dynamic.ll
2

Nit: in LLD tests we use ;; to differentiate regular comments from CHECK RUN markers.

43

%s -> /dev/null

lld/test/ELF/lto/devirt_vcall_vis_public.ll
8

s/\t/ /

9

s/\t/ /

The two lines can be joined.

llvm/include/llvm/LTO/LTO.h
466

How about VisibleToOtherModules?

The name VisibleToDynamicLinker is too tied to the ELF binary format.

tejohnson added inline comments.Jan 12 2021, 3:05 PM
lld/ELF/LTO.cpp
253

sym->exportDynamic is false for linkonce_odr vtables, that was what I was referencing in this comment (otherwise I could use includeInDynsym which checks that):

Note that I couldn't use includeInDynsym in lld because that is not set for linkonce_odr symbols that were thus have canBeOmittedFromSymbolTable set (since any referencing module must have it's own copy) - we still want to block the LTO visibility upgrade for those symbols to avoid WPD. So I am using a slightly different interface that more directly checks whether export-dynamic is in effect.

lld/test/ELF/lto/devirt_vcall_vis_export_dynamic.ll
2

Can I do this in a follow on NFC commit? Otherwise it will make the diffs really noisy in this test.

llvm/include/llvm/LTO/LTO.h
466

VisibleToOtherModules sounds to me like it means LLVM Modules that are being linked together statically. I wanted to note that these are symbols that may have dynamic references not seen by the static link. Is there anything like --export-dynamic for other binary formats? If not, then it is ELF specific anyway.

tejohnson marked 3 inline comments as done.Jan 12 2021, 3:52 PM
tejohnson added inline comments.
lld/test/ELF/lto/devirt_vcall_vis_export_dynamic.ll
43

Fixed here and elsewhere - but not sure why it matters?

lld/test/ELF/lto/devirt_vcall_vis_public.ll
8

Done here and elsewhere

9

ditto

tejohnson updated this revision to Diff 316263.Jan 12 2021, 3:53 PM
tejohnson marked 3 inline comments as done.

Address comments

MaskRay added a subscriber: rnk.Jan 13 2021, 9:35 AM
MaskRay added inline comments.
llvm/include/llvm/LTO/LTO.h
466

@rnk for thoughts on COFF.

tejohnson updated this revision to Diff 316781.Jan 14 2021, 2:16 PM

Rebase and use ";;" instead of ";" for comments.

tejohnson added inline comments.Jan 14 2021, 2:18 PM
lld/test/ELF/lto/devirt_vcall_vis_export_dynamic.ll
2

I went ahead and fixed this in the existing tests in 5b42fd8dd4e7e29125a09a41a33af7c9cb57d144. I have updated this with a rebased version that fixes the comments in the new tests as well.

MaskRay added inline comments.Jan 14 2021, 2:24 PM
lld/ELF/LTO.cpp
253

Sorry, I don't understand the difference. If I replace this with sym->exportDynamic, I don't get a test failure...

llvm/include/llvm/LTO/LTO.h
466

Perhaps another name is Exported.

For ELF, the does not seem to be restricted to shared objects seen as in the input file.

tejohnson added inline comments.Jan 14 2021, 3:07 PM
lld/ELF/LTO.cpp
253

Ah, this is a test deficiency, looks like I need to make one or more of the vtables linkonce_odr to expose it. Will address that. The reason it is an issue for linkonce_odr can be seen in createBitcodeSymbol in lld/ELF/InputFiles.cpp, where it does:

if (canOmitFromDynSym)
  newSym.exportDynamic = false;

The canOmitFromDynSym gets propagated via the input file but is originally set in GlobalValue::canBeOmittedFromSymbolTable() for linkonce_odr with hasAtLeastLocalUnnamedAddr().

llvm/include/llvm/LTO/LTO.h
466

"Exported" is ambiguous - we use that throughout ThinLTO to mean exported from the current module (to other modules being LTO linked).

For ELF, the does not seem to be restricted to shared objects seen as in the input file.

Sorry I don't follow?

tejohnson updated this revision to Diff 316793.Jan 14 2021, 3:12 PM

Use linkonce_odr vtables to illustrate issue with Symbol exportDynamic

lld/ELF/LTO.cpp
253

I've improved the tests. Confirmed that the improved lld test fails if you make the change you proposed.

MaskRay added inline comments.Jan 23 2021, 11:19 AM
lld/ELF/LTO.cpp
253

I see that sym->isExportDynamic is used to prevent canOmitFromDynSym (unnamed_addr linkonce_odr or local_unnamed_addr linkonce_odr constant) logic.

There is one case where sym->isExportDynamic(sym->kind(), sym->visibility) may be false while sym->exportDynamic is true: a shared object with a STV_DEFAULT reference to the symbol can set exportDynamic (InputFiles.cpp:1557).

sym->isExportDynamic(...) || sym->exportDynamic should be safe.

llvm/include/llvm/LTO/LTO.h
466

Seems that the idea is just whether the symbol is exported and can be used by other linked images. A dynamic linker is the ELF concept but the idea can be used by other binary formats. In COFF it is called "export table". In Mach-O there are linker options -exported_*, and non-exported symbols are converted to private externs.

ThinLTO has already used exported to mean symbols exchanged among LLVM modules so export should not be used. Perhaps just exportDynamic? The name stills stems from ELF but users from other binary formats can still find similarity.

MaskRay added inline comments.Jan 23 2021, 11:47 AM
lld/ELF/LTO.cpp
253

The comprehensive rule for when exportDynamic is set:

* non-local `STV_DEFAULT/STV_PROTECTED` (this means it can be hid by `--exclude-libs`)
* logical OR of the following:
  + undefined
  + (`--export-dynamic` || `-shared`) && ! (unnamed_addr linkonce_odr GlobalVariable || local_unnamed_addr linkonce_odr constant GlobalVariable)
  + matched by `--dynamic-list/--export-dynamic-symbol-list/--export-dynamic-symbol`
  + defined or referenced by a shared object as `STV_DEFAULT`
  + `STV_PROTECTED` definition in a shared object preempted by copy relocation/canonical PLT when `--ignore-{data,function}-address-equality}` is specified
  + `-z ifunc-noplt` && has at least one relocation

The last two are edge cases (but works if you use sym->exportDynamic).

About the common case "defined or referenced by a shared object as STV_DEFAULT":
for ld.lld %t.o %t1.so -o %t, if %t1.so defines (linkonce_odr) the vtable, it can be preempted by the executable definition. %t thus needs to export the vtable.

This may deserve a test (sym->isExportDynamic(...) || sym->exportDynamic and `sym->isExportDynamic(...) have different behaviors)

tejohnson added inline comments.Jan 27 2021, 9:31 AM
lld/ELF/LTO.cpp
253

I see, so essentially sym->isExportDynamic(...) and sym->exportDynamic are non-overlapping and neither is a superset of the either. I will change the code to check them both. In terms of creating a test, can any of the cases where the latter is true but the former is not be triggered for a vtable? Since this is only being used for WPD right now, I'd need to be able to test this with a vtable def visible to a virtual call that would otherwise be devirtualized.

llvm/include/llvm/LTO/LTO.h
466

Sounds good, will change to simply ExportDynamic.

MaskRay added inline comments.Jan 27 2021, 9:36 AM
lld/ELF/LTO.cpp
253
llc -filetype=obj -relocation-model=pic %t.ll -o %t.o
lld -shared %t.o -o %t.so

The output has a dynamic symbol _ZTV1B.

Rename VisibleToDynamicLinker to ExportDynamic

lld/ELF/LTO.cpp
253

sym->isExportDynamic(...) already returns true if config->shared, so in the library link it should already be handled by the current patch, or am I misunderstanding?

MaskRay added inline comments.Jan 27 2021, 11:31 AM
lld/ELF/LTO.cpp
253
llc -filetype=obj -relocation-model=pic %t.ll -o %t.o
ld.lld -shared %t.o -o %t.so
ld.lld %t.o %t.so -o %t

The ELF semantic is that %t.o preempts every default visibility definition in %t.so. So, even in the absence of --export-dynamic/--dynamic-list/--export-dynamic-symbol, the definitions in %t need to be exported (to .dynsym) to allow preemption at runtime.

This is a case where sym->isExportDynamic(...) is false while sym->exportDynamic is true.

(sym->isExportDynamic(...) is true while sym->exportDynamic is false should be impossible, so no need for a test.)

tejohnson added inline comments.Jan 27 2021, 11:37 AM
lld/ELF/LTO.cpp
253
llc -filetype=obj -relocation-model=pic %t.ll -o %t.o
ld.lld -shared %t.o -o %t.so
ld.lld %t.o %t.so -o %t

The ELF semantic is that %t.o preempts every default visibility definition in %t.so. So, even in the absence of --export-dynamic/--dynamic-list/--export-dynamic-symbol, the definitions in %t need to be exported (to .dynsym) to allow preemption at runtime.

This is a case where sym->isExportDynamic(...) is false while sym->exportDynamic is true.

Ok let me create a test for this.

(sym->isExportDynamic(...) is true while sym->exportDynamic is false should be impossible, so no need for a test.)

It can happen which is why I added the check for sym->isExportDynamic(...) and not sym->exportDynamic in the first place. The current test case (with the linkonce_odr vtables) is exactly this case, because of canOmitFromDynSym (this was what I was describing upthread).

MaskRay added inline comments.Jan 27 2021, 11:42 AM
lld/ELF/LTO.cpp
253

Ah yes, I forgot again that sym->exportDynamic can be set to false due to canOmitFromDynSym (ThinLTO auto hiding) logic.

Also check sym->exportDynamic and add test

I think I've addressed all the comments now.

lld/ELF/LTO.cpp
253

Added check of sym->exportDynamic and added test case (confirmed it fails because we get devirtualizations without this change).

MaskRay accepted this revision.Jan 27 2021, 1:11 PM

LG. There are several minor required comment/description changes and test updates.

Under --export-dynamic[-symbol=] and --dynamic-list=, identify the exported symbols and prevent their LTO visibility from being upgraded.

The description needs a update now, something like: Identify the exported symbols (--export-dynamic[-symbol=], --dynamic-list=, or definitions needed to preempt shared objects)

This helps avoid use of whole program devirtualization when there may be overrides in dynamically loaded libraries.

Perhaps just dynamic libraries. "dynamically loaded libraries" gives me a feeling of dlopen, but the issue can arise with regular link-time shared objects as well: it is valid for a new version shared object to add more symbols (derived classes with vtables).

lld/test/ELF/lto/devirt_vcall_vis_export_dynamic.ll
3

This needs an update (due to a new case: shared object preemption): exported symbols prevent devirtualization.

102
108
This revision is now accepted and ready to land.Jan 27 2021, 1:11 PM

[LTO] Prevent devirtualization for symbols exported to dynamic linker

And the subject ("dynamic linker") needs a change

tejohnson updated this revision to Diff 319662.Jan 27 2021, 1:52 PM
tejohnson marked 3 inline comments as done.

Address comments

lld/test/ELF/lto/devirt_vcall_vis_export_dynamic.ll
102

Woops, good catch. I was testing the cases one at a time and missed fixing all the RUN lines again.

tejohnson retitled this revision from [LTO] Prevent devirtualization for symbols exported to dynamic linker to [LTO] Prevent devirtualization for symbols dynamically exported.Jan 27 2021, 1:53 PM
tejohnson edited the summary of this revision. (Show Details)
MaskRay accepted this revision.Jan 27 2021, 2:57 PM
This revision was landed with ongoing or failed builds.Jan 27 2021, 3:54 PM
This revision was automatically updated to reflect the committed changes.
lanza added a subscriber: lanza.EditedMar 7 2021, 10:17 PM

@tejohnson I'm not sure this change is working correctly -- either that or my builds are messed up.

BitcodeCompiler::add builds the resolutions for a bitcode module's symbols. For all symbols in the module r.ExportDynamic is set via isExportDynamic:

static bool isExportDynamic(Kind k, uint8_t visibility) {
  if (k == SharedKind)
    return visibility == llvm::ELF::STV_DEFAULT;
  return config->shared || config->exportDynamic;
}

The Kind for a BitcodeFile symbol is not SharedKind and thus config->shared causes this to always come back true. This is then used for GlobalResolutions.ExportDynamic which is used to build the DynamicExportSymbols list. Thus the DynamicExportSymbols list contains all the bitcode module's symbols and nothing gets vcall_visibility.

@tejohnson I'm not sure this change is working correctly -- either that or my builds are messed up.

BitcodeCompiler::add builds the resolutions for a bitcode module's symbols. For all symbols in the module r.ExportDynamic is set via isExportDynamic:

static bool isExportDynamic(Kind k, uint8_t visibility) {
  if (k == SharedKind)
    return visibility == llvm::ELF::STV_DEFAULT;
  return config->shared || config->exportDynamic;
}

The Kind for a BitcodeFile symbol is not SharedKind and thus config->shared causes this to always come back true. This is then used for GlobalResolutions.ExportDynamic which is used to build the DynamicExportSymbols list. Thus the DynamicExportSymbols list contains all the bitcode module's symbols and nothing gets vcall_visibility.

To me this is WAI. Why is "config->shared" true for your bitcode module? This should only affect when using the linker flags that assert you have whole program visibility during the link, which isn't true for a shared library and its symbols.

lanza added a comment.Mar 8 2021, 2:11 PM

To me this is WAI. Why is "config->shared" true for your bitcode module? This should only affect when using the linker flags that assert you have whole program visibility during the link, which isn't true for a shared library and its symbols.

Got ya. For our Android apps we compute the actual import and export lists exactly and thus can compute the symbol visibility during linking and use --lto-whole-program-visibility accordingly. (Though this is not yet used in production for build system reasons). This change makes the list of symbols equivalent to the list of DynamicExportSymbols, so even though we can tell lld that _ZTVN3xyz is internal-only it won't get vcall_visibility.

To me this is WAI. Why is "config->shared" true for your bitcode module? This should only affect when using the linker flags that assert you have whole program visibility during the link, which isn't true for a shared library and its symbols.

Got ya. For our Android apps we compute the actual import and export lists exactly and thus can compute the symbol visibility during linking and use --lto-whole-program-visibility accordingly. (Though this is not yet used in production for build system reasons). This change makes the list of symbols equivalent to the list of DynamicExportSymbols, so even though we can tell lld that _ZTVN3xyz is internal-only it won't get vcall_visibility.

The sym->isExportDynamic(sym->kind(), sym->visibility) || sym->exportDynamic || sym->inDynamicList condition is a bit conservative.
I sent D98220 to allow WPD with hidden/internal symbols.

To me this is WAI. Why is "config->shared" true for your bitcode module? This should only affect when using the linker flags that assert you have whole program visibility during the link, which isn't true for a shared library and its symbols.

Got ya. For our Android apps we compute the actual import and export lists exactly and thus can compute the symbol visibility during linking and use --lto-whole-program-visibility accordingly. (Though this is not yet used in production for build system reasons). This change makes the list of symbols equivalent to the list of DynamicExportSymbols, so even though we can tell lld that _ZTVN3xyz is internal-only it won't get vcall_visibility.

I see - so just to confirm, when compiling it isn't clear that these symbols are have internal or hidden visibility, but only during linking? Because otherwise clang should already have applied an appropriate vcall_visibility that allows WPD.

lanza added a comment.Mar 8 2021, 3:35 PM

I see - so just to confirm, when compiling it isn't clear that these symbols are have internal or hidden visibility, but only during linking? Because otherwise clang should already have applied an appropriate vcall_visibility that allows WPD.

Correct, when compiling we don't know what's exported and what's not, so nothing is applied. This is fundamentally similar to a problem you mentioned at a talk last year -- a source file shared between multiple apps might not be used the same everywhere, so you can only know after actually linking how it's used. So we apply stricter visibility via checking what was actually imported from the library. If _ZTV5Thing isn't used outside the current library we can mark it as such and then WPD operate with this extra info.

MaskRay added a comment.EditedMar 8 2021, 3:56 PM

I see - so just to confirm, when compiling it isn't clear that these symbols are have internal or hidden visibility, but only during linking? Because otherwise clang should already have applied an appropriate vcall_visibility that allows WPD.

Correct, when compiling we don't know what's exported and what's not, so nothing is applied. This is fundamentally similar to a problem you mentioned at a talk last year -- a source file shared between multiple apps might not be used the same everywhere, so you can only know after actually linking how it's used. So we apply stricter visibility via checking what was actually imported from the library. If _ZTV5Thing isn't used outside the current library we can mark it as such and then WPD operate with this extra info.

Do you use a local: version node in a version script to make vtable symbols local in a -shared link? LTO does not know the effective binding has become local in that case and can lose devirtualization opportunities.

The other possibility is -flto={full,thin} compile in one translation unit and -flto={full,thin} -fvisibility=hidden in another translation unit. Due to ELF visibility rule the most constraining one wins which may be more constraining than !vcall_visibility.

I see - so just to confirm, when compiling it isn't clear that these symbols are have internal or hidden visibility, but only during linking? Because otherwise clang should already have applied an appropriate vcall_visibility that allows WPD.

Correct, when compiling we don't know what's exported and what's not, so nothing is applied. This is fundamentally similar to a problem you mentioned at a talk last year -- a source file shared between multiple apps might not be used the same everywhere, so you can only know after actually linking how it's used. So we apply stricter visibility via checking what was actually imported from the library. If _ZTV5Thing isn't used outside the current library we can mark it as such and then WPD operate with this extra info.

Ok thanks. Dumb question - how do you know when you are linking the shared library what will be used outside of it (because presumably you don't know this until it is linked into the consuming binaries later on in the build process)? And do you communicate this info via --export-dynamic-symbol or the like?

lanza added a comment.Mar 8 2021, 5:23 PM

Do you use a local: version node in a version script to make vtable symbols local in a -shared link? LTO does not know the effective binding has become local in that case and can lose devirtualization opportunities.

Yup. This is exactly what I've started looking for over the past few days. LTO is clearly not taking full advantage of the fact that we can guarantee these symbols are hidden and local only. The ExportDynamic was the first obvious thing I ran into. I'm sure there's a good bit more.

Ok thanks. Dumb question - how do you know when you are linking the shared library what will be used outside of it (because presumably you don't know this until it is linked into the consuming binaries later on in the build process)? And do you communicate this info via --export-dynamic-symbol or the like?

Not a dumb question, it's a pretty reasonable one. We run the link twice. Once in the normal order -- leaf to root in the dependency graph and lld tells us the list of imported symbols. From those lists we generate the full list of symbols that need to be exported and pass a version script with local:\n*; included.