This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
ClosedPublic

Authored by tejohnson on Mar 22 2019, 11:32 AM.

Details

Summary

We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).

Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.

Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.

Diff Detail

Event Timeline

tejohnson created this revision.Mar 22 2019, 11:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2019, 11:32 AM

I am not quite sure what is the semantics for ELF linker. When thinLTO picks the linkonce_odr version and promote it to weak, it is not changing the visibility. It is a later linker optimization that auto hide the linkonce_odr symbol, correct?

If that is the case, would it be a problem for non-LTO link as well? ld64 always prefers the symbol that is not autohide so I don't think it will run into this situation.

I am not quite sure what is the semantics for ELF linker. When thinLTO picks the linkonce_odr version and promote it to weak, it is not changing the visibility. It is a later linker optimization that auto hide the linkonce_odr symbol, correct?

It is in fact ThinLTO that is changing the visibility (D43130). See the change I made to FunctionImport.cpp.

If that is the case, would it be a problem for non-LTO link as well? ld64 always prefers the symbol that is not autohide so I don't think it will run into this situation.

pcc@ looked at this and found that ELF semantics picks the most hidden visibility. However, without ThinLTO we aren't marking any of these as hidden, so it isn't an issue.

I am not quite sure what is the semantics for ELF linker. When thinLTO picks the linkonce_odr version and promote it to weak, it is not changing the visibility. It is a later linker optimization that auto hide the linkonce_odr symbol, correct?

It is in fact ThinLTO that is changing the visibility (D43130). See the change I made to FunctionImport.cpp.

Interesting, I made that change! But I somehow remembered I got pushed back because of ELF semantics and end up implementing that in ld64 instead. Let me dig up some context and get back to you.

If that is the case, would it be a problem for non-LTO link as well? ld64 always prefers the symbol that is not autohide so I don't think it will run into this situation.

pcc@ looked at this and found that ELF semantics picks the most hidden visibility. However, without ThinLTO we aren't marking any of these as hidden, so it isn't an issue.

I see, so if ELF linker doesn't have this autohide optimization, it should be fine in this specific case (but I thought there are some ELF linker actually does this).

I am not quite sure what is the semantics for ELF linker. When thinLTO picks the linkonce_odr version and promote it to weak, it is not changing the visibility. It is a later linker optimization that auto hide the linkonce_odr symbol, correct?

It is in fact ThinLTO that is changing the visibility (D43130). See the change I made to FunctionImport.cpp.

Interesting, I made that change! But I somehow remembered I got pushed back because of ELF semantics and end up implementing that in ld64 instead. Let me dig up some context and get back to you.

Are you thinking of https://reviews.llvm.org/D43361?

If that is the case, would it be a problem for non-LTO link as well? ld64 always prefers the symbol that is not autohide so I don't think it will run into this situation.

pcc@ looked at this and found that ELF semantics picks the most hidden visibility. However, without ThinLTO we aren't marking any of these as hidden, so it isn't an issue.

I see, so if ELF linker doesn't have this autohide optimization, it should be fine in this specific case (but I thought there are some ELF linker actually does this).

@pcc might have some info here. But I don't think so. We just can't mark hidden without the global visibility of all symbols, as we have here with ThinLTO (because of ELF picking the most hidden one).

I am not quite sure what is the semantics for ELF linker. When thinLTO picks the linkonce_odr version and promote it to weak, it is not changing the visibility. It is a later linker optimization that auto hide the linkonce_odr symbol, correct?

It is in fact ThinLTO that is changing the visibility (D43130). See the change I made to FunctionImport.cpp.

Interesting, I made that change! But I somehow remembered I got pushed back because of ELF semantics and end up implementing that in ld64 instead. Let me dig up some context and get back to you.

Are you thinking of https://reviews.llvm.org/D43361?

That is exactly what I was thinking.

If that is the case, would it be a problem for non-LTO link as well? ld64 always prefers the symbol that is not autohide so I don't think it will run into this situation.

pcc@ looked at this and found that ELF semantics picks the most hidden visibility. However, without ThinLTO we aren't marking any of these as hidden, so it isn't an issue.

I see, so if ELF linker doesn't have this autohide optimization, it should be fine in this specific case (but I thought there are some ELF linker actually does this).

@pcc might have some info here. But I don't think so. We just can't mark hidden without the global visibility of all symbols, as we have here with ThinLTO (because of ELF picking the most hidden one).

If this is really linker semantics difference, would it be easier if just make the visibility hidden change in D43130 be target specific?

pcc added a comment.Mar 22 2019, 3:49 PM

Does this has the correct behaviour in the case where the explicit instantiation is in a regular object file and the implicit instantiation is in bitcode?

@pcc might have some info here. But I don't think so. We just can't mark hidden without the global visibility of all symbols, as we have here with ThinLTO (because of ELF picking the most hidden one).

ELF linkers don't do auto-hide because it would require extending ELF to store the auto-hide bit (or information from which it can be recovered) somewhere. We have extended ELF with an address-significance table which tells us whether symbols are (local_)unnamed_addr, but we aren't keeping track of whether symbols are linkonce_odr.

If this is really linker semantics difference, would it be easier if just make the visibility hidden change in D43130 be target specific?

That would mean losing the auto hide optimization on ELF. I don't think we should do that.

include/llvm/IR/ModuleSummaryIndex.h
202

canAutoHide? (same elsewhere)

lib/Transforms/IPO/FunctionImport.cpp
984

Can't it also be local_unnamed_addr?

Thanks for explanation. Then the current approach in this patch LGTM. Thanks!

tejohnson marked 2 inline comments as done.Mar 22 2019, 9:10 PM
In D59709#1440303, @pcc wrote:

Does this has the correct behaviour in the case where the explicit instantiation is in a regular object file and the implicit instantiation is in bitcode?

Hmm, no. Nor will it work when the explicit instantiation is in a bitcode file without a summary. I can fix these cases by utilizing the VisibleOutsideSummary flag on the GlobalResolution to conservatively set CanAutoHide=false when that is set.

@pcc might have some info here. But I don't think so. We just can't mark hidden without the global visibility of all symbols, as we have here with ThinLTO (because of ELF picking the most hidden one).

ELF linkers don't do auto-hide because it would require extending ELF to store the auto-hide bit (or information from which it can be recovered) somewhere. We have extended ELF with an address-significance table which tells us whether symbols are (local_)unnamed_addr, but we aren't keeping track of whether symbols are linkonce_odr.

If this is really linker semantics difference, would it be easier if just make the visibility hidden change in D43130 be target specific?

That would mean losing the auto hide optimization on ELF. I don't think we should do that.

include/llvm/IR/ModuleSummaryIndex.h
202

Will do.

lib/Transforms/IPO/FunctionImport.cpp
984

I don't know but I wanted to maintain the status quo on that - we only did this in the global unnamed addr case before. If it can be changed to local_unnamed_addr that should probably be done as a follow up.

In D59709#1440303, @pcc wrote:

Does this has the correct behaviour in the case where the explicit instantiation is in a regular object file and the implicit instantiation is in bitcode?

Hmm, no. Nor will it work when the explicit instantiation is in a bitcode file without a summary. I can fix these cases by utilizing the VisibleOutsideSummary flag on the GlobalResolution to conservatively set CanAutoHide=false when that is set.

The GUIDPreservedSymbols set already has the set of GUIDs that are visible outside the summary, so I just made a change to pass that down and check it as well. Added a case to the test that checks for this situation (one module without summary).

tejohnson updated this revision to Diff 192560.Mar 27 2019, 9:19 PM

Address comments

tejohnson updated this revision to Diff 192561.Mar 27 2019, 9:39 PM

Added a gold test so that we can test the native object case too.

ping - @pcc any more comments or can this fix go in now?

Ping - ok if I submit this bugfix?

pcc added a comment.May 10 2019, 9:56 AM

I think there might still be a problem where we're dropping the distinction between visible to other translation units and visible outside the linkage unit. This is related to the distinction between LDPR_PREVAILING_DEF_IRONLY and LDPR_PREVAILING_DEF_IRONLY_EXP in the gold plugin interface. Let me look into it a bit more and get back to you.

pcc accepted this revision.May 10 2019, 10:33 AM

LGTM

Sorry for the long delay in reviewing this.

In D59709#1498204, @pcc wrote:

I think there might still be a problem where we're dropping the distinction between visible to other translation units and visible outside the linkage unit. This is related to the distinction between LDPR_PREVAILING_DEF_IRONLY and LDPR_PREVAILING_DEF_IRONLY_EXP in the gold plugin interface. Let me look into it a bit more and get back to you.

I don't think this is a problem. After having reminded myself about how things work, I realise that the linkers already keep track of this distinction and reflect it in the VisibleToRegularObj field.

lib/Transforms/IPO/FunctionImport.cpp
984

Okay, I missed that you were checking for global unnamed_addr when setting the flag to begin with.

This revision is now accepted and ready to land.May 10 2019, 10:33 AM
This revision was automatically updated to reflect the committed changes.

Hi Teresa, I am seeing some regressions because of this commit. ThinLTO is now producing more weak symbols than before and I am looking for a good fix. See inline comments for details.

llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

The regression I saw is exactly the situation mentioned in the comments but I am not so sure about the conclusion here. For ld64, it can see whether a symbol can be auto hide or not, and it prefers the one that is not auto hide. This means if there are other copies of the symbol that is not autohide outside the summary and not autohide, the one can be autohide should never be prevailing.

Is that the case for other linker?

tejohnson marked an inline comment as done.Sep 5 2019, 9:01 AM
tejohnson added inline comments.
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

The regression I saw is exactly the situation mentioned in the comments but I am not so sure about the conclusion here.

By the situation mentioned in the comments I assume you mean the part about symbols in the GUIDPreservedSymbols set because they are visible outside the summary? In your situation, why are there so many symbols in the preserved symbols set? Are there a lot of symbols in native code being linked in?

I guess it is different in lld since in the case I encountered (in an internal application), the linkonce_odr copy was prevailing, not the weak_odr copy. If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere, then we may need to figure out a way to communicate that.

tejohnson marked an inline comment as done.Sep 5 2019, 9:05 AM
tejohnson added inline comments.
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere

I meant if there is a *non*-autohide (weak_odr) copy available of course...

steven_wu added inline comments.Sep 5 2019, 9:51 AM
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

By the situation mentioned in the comments I assume you mean the part about symbols in the GUIDPreservedSymbols set because they are visible outside the summary? In your situation, why are there so many symbols in the preserved symbols set? Are there a lot of symbols in native code being linked in?

Yes. The situation is the code is linking a static c++ library (native code) with linkonce autohide libcxx symbols. Since ld64 decides to coalesce away the copy in the native code and use the version inside the LTO, it has to add it to mustPreserve symbols. Now even both copies can be autohide, it is not autohide because it is in the preserve list.

I guess it is different in lld since in the case I encountered (in an internal application), the linkonce_odr copy was prevailing, not the weak_odr copy. If ld64 guarantees that the autohide (linkonce_odr) copy will never be prevailing if there is an autohide (weak_odr) copy available somewhere, then we may need to figure out a way to communicate that.

This sounds like a linker bug but I am not familiar with lld LTO pipeline to be sure. Should lld mark weak_odr copy as prevailing? I also would like to understand how lld is using mustPreserve symbols because I thought lld doesn't need it.
For ld64, it uses mustPreserve symbol to communicate if the weak/linkonce symbols are prevailing in this case.

tejohnson marked an inline comment as done.Sep 6 2019, 11:35 AM
tejohnson added inline comments.
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

In the new LTO API, symbols are added to the GUIDPreservedSymbols set if they are visible outside the summary (either in a bitcode file without a summary or a native object).

We could do better if the linker indicated whether these symbols in objects without summaries were eligible for auto hide, which in the compiler is based both on the linkage type and on it having the global unnamed addr flag - does the linker know both of these for native objects?

I'm not an lld expert, but looking back at my original internal discussion with @pcc on this issue, it appears to be a difference in ld64 (Mach-O) semantics and ELF semantics. Here is an excerpt:

pcc wrote:

tejohnson wrote:

Is there some other criteria that should be used to prevent the symbol from being marked Hidden in this case?

It seems that the rule should be: if we see at least one weak_odr definition (i.e. explicit instantiation) that should prevent the relaxation from default to hidden. The reason is that the weak_odr makes it possible to have another translation unit that does not implicitly instantiate the function, including a translation unit in another DSO. Similar logic is implemented in ld64: if one symbol is weak_odr unnamed_addr (i.e. non-auto-hide) and the other is linkonce_odr unnamed_addr (i.e. auto-hide), the linker selects the non-auto-hide one.
https://github.com/apple-opensource/ld64/blob/master/src/ld/SymbolTable.cpp#L228

Unfortunately, this is at odds with ELF semantics where the "most hidden" visibility wins, so I think this means that we would not be able to mark linkonce_odr unnamed_addr symbols as hidden in the non-LTO case, as was suggested at one point on llvm-dev I believe.

So perhaps the best thing to do in this case is have an interface for the linker to indicate the correct semantics with regard to whether most or least hidden wins?

steven_wu added inline comments.Sep 6 2019, 1:11 PM
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

When you say visible outside the summary, what exactly is the definition of visible? I failed the find the relevant in lld that interacts with GUIDPreservedSymbols. For ld64, weak symbols are only in the GUIDPreservedSymbols when it is in the final image, so I am not sure if we share the same definition. If the weak symbols are visible outside the summary just to get coalesced away later, it doesn't have to be preserved.

I remember there were few other semantics difference in symbol resolution rules. IRLinker follows ELF visibility rules as well. It might be a good idea to clean that up sometimes.

If the ELF is choosing the most hidden version, why is the case described in the commit message a problem? Is it because weak_odr beats linkonce_odr before visibility comes into consideration?

tejohnson marked an inline comment as done.Sep 6 2019, 1:57 PM
tejohnson added inline comments.
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

When you say visible outside the summary, what exactly is the definition of visible? I failed the find the relevant in lld that interacts with GUIDPreservedSymbols. For ld64, weak symbols are only in the GUIDPreservedSymbols when it is in the final image, so I am not sure if we share the same definition. If the weak symbols are visible outside the summary just to get coalesced away later, it doesn't have to be preserved.

lld doesn't create this set directly, it is done in the new LTO API here:
http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#910

And VisibleOutsideSummary is set here:
http://llvm-cs.pcc.me.uk/lib/LTO/LTO.cpp#539

VisibleToRegularObj is set by lld if the symbol is used in a regular (native) object
InSummary is true of the bitcode object had a summary

If the ELF is choosing the most hidden version, why is the case described in the commit message a problem? Is it because weak_odr beats linkonce_odr before visibility comes into consideration?

ELF is picking the linkonce_odr symbol (most hidden) as prevailing. If you look at the change in FunctionImport.cpp, before this patch we were unconditionally marking the prevailing linkonce_odr copy as hidden when "promoting" to weak_odr.

In the case that was failing we were first linking into a (native) shared library. The translation units in the shared library link contained both some implicit instantiations (linkonce_odr) and an explicit instantiation (weak_odr). Without LTO, the weak_odr would have resulted in a non-hidden weak definition in the shared library. Then this shared library was linked with code that contained references to this symbol that were expecting to be resolved by the explicit instantiation. If we instead mark that symbol as hidden in the shared library, those references are no longer satisfied.

steven_wu added inline comments.Sep 6 2019, 4:01 PM
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

ELF is picking the linkonce_odr symbol (most hidden) as prevailing. If you look at the change in FunctionImport.cpp, before this patch we were unconditionally marking the prevailing linkonce_odr copy as hidden when "promoting" to weak_odr.

In the case that was failing we were first linking into a (native) shared library. The translation units in the shared library link contained both some implicit instantiations (linkonce_odr) and an explicit instantiation (weak_odr). Without LTO, the weak_odr would have resulted in a non-hidden weak definition in the shared library. Then this shared library was linked with code that contained references to this symbol that were expecting to be resolved by the explicit instantiation. If we instead mark that symbol as hidden in the shared library, those references are no longer satisfied.

I guess I was missing something here for how ELF handles weak links during static link time. It sounds it is correct the the linkonce_odr version is prevailing under ELF rule but I don't understand the reason why the weak_odr version is expected to be exported?

Here is my understanding for ELF linking and let me know if it is correct or not.

For linking without LTO:

  • There is no bits in ELF object file to indicate auto hide. linkonce_odr and weak_odr are both weak external symbols so you get a weak copy in the dylib.

For linking with fullLTO:

  • IRLinker will pick weak_odr linkage, remove unamed_addr, resulting the same output as native

For linking with thinLTO (no native or fullLTO):

  • autohide disabled because weak_odr is cannot autohide, weak external symbol produced.

Since every situation produces weak external symbol, would it be easier just make lld mark weak_odr as prevailing (can you tell linkonce_odr from weak_odr in ELF object)?

Because native object for ELF doesn't have autohide optimization, missing autohide during thinLTO is not a regression.

MaskRay added a subscriber: MaskRay.Sep 9 2019, 8:48 AM
tejohnson marked an inline comment as done.Sep 27 2019, 4:53 PM
tejohnson added inline comments.
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

Sorry I forgot I hadn't responded here. @pcc should be better able to answer some of your specific questions about ELF and lld.

For linking with thinLTO (no native or fullLTO):

  • autohide disabled because weak_odr is cannot autohide, weak external symbol produced.

This should also be the same as a non-LTO after the link completes - there is a non-hidden weak symbol in the library (the prevailing linkonce_odr that was "promoted" to weak_odr and now no longer marked hidden).

Since every situation produces weak external symbol, would it be easier just make lld mark weak_odr as prevailing (can you tell linkonce_odr from weak_odr in ELF object)?

I am not an expert on ELF semantics, deferring to @pcc here.

steven_wu added inline comments.Oct 7 2019, 8:55 AM
llvm/trunk/lib/LTO/LTO.cpp
334 ↗(On Diff #199067)

I also forgot to mention here. I figure out later even for ld64, it should behave correctly even when the libLTO are not sure if the symbol can be autohide or not. It should just treat them the same as local_unname_addr.

I don't think we need to touch the logic here.

int3 added a subscriber: int3.EditedFeb 14 2022, 1:40 PM

Sorry to necro an old diff, but I'm trying to implement this behavior in LLD's Mach-O backend right now, and I'm running into some issues with ThinLTO not hiding the symbols that I expect it to. D119767: [lld-macho] Improve hiding of unnamed_addr symbols and https://discourse.llvm.org/t/mach-o-lto-handling-of-linkonce-odr-unnamed-addr/60015 have a bit more context

llvm/trunk/lib/Analysis/ModuleSummaryAnalysis.cpp
423 ↗(On Diff #199067)

@tejohnson should we account for local_unnamed_addr constants here too? canBeOmittedFromSymbolTable() checks for both: https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Globals.cpp#L354

tejohnson added inline comments.Feb 18 2022, 6:58 AM
llvm/trunk/lib/Analysis/ModuleSummaryAnalysis.cpp
423 ↗(On Diff #199067)

Yes it looks like it, at least for const variables (as well as for all functions since they are const, and aliases that meet the same criteria). In fact, reading through the old comments, I see pcc asked about this and I wanted to do that as a follow up (since the previous version already only handled that case - see the change in FunctionImport.cpp), which apparently never happened. Can you send a patch to change this? Note that you will need to adjust the assert in FunctionImport.cpp. I would just change all of these places that check for LinkOnceODR && GlobalUnnamedAddr to simply call canBeOmittedFromSymbolTable().