This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Fix import of multiply defined global variables
ClosedPublic

Authored by krisb on Feb 3 2021, 5:13 AM.

Details

Summary

Currently, if there is a module that contains a strong definition of
a global variable and a module that has both a weak definition for
the same global and a reference to it, it may result in an undefined symbol error
while linking with ThinLTO.

It happens because:

  • the strong definition got internalized because it is read-only and can be imported;
  • the weak definition got replaced by a declaration because it's non-prevailing;
  • the strong definition failed to be imported because the destination module already contains another definition of the global yet this def is non-prevailing.

The patch adds a check to computeImportForReferencedGlobals() that allows
considering a global variable for being imported even if the module contains
a definition of it in the case this def has an interposable linkage type.

Note that currently the check is based only on the linkage type
(and this seems to be enough), but it might be worth to account the information
whether the def is prevailing or not.

Diff Detail

Event Timeline

krisb created this revision.Feb 3 2021, 5:13 AM
krisb requested review of this revision.Feb 3 2021, 5:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 3 2021, 5:13 AM

Thanks. AFAICT this is a reasonable fix. Suggestion below for a better comment and FIXME to add.

llvm/lib/Transforms/IPO/FunctionImport.cpp
314

Can you add a comment here about how this is required for correctness?
Also, add a FIXME that we can refine this further if we pass down the prevailing symbol information, and only need to allow imports if this is not the prevailing copy, and further if the prevailing copy is marked read only?

ychen added a subscriber: ychen.Feb 3 2021, 3:59 PM
krisb updated this revision to Diff 321443.Feb 4 2021, 8:18 AM

Add the comment and FIXME.

tejohnson added inline comments.Feb 5 2021, 8:48 AM
llvm/lib/Transforms/IPO/FunctionImport.cpp
320

Can you add that this only affects interposable symbols since they are converted to declarations.

321

I think we would still want to check linkage type, since it is only interposable that are converted to declarations. The prevailing information would be useful to refine the extra importing further. Hmm, actually I think there might be a correctness issue with this fix as is. I notice in your test below that the imported symbol becomes internalized. I believe this will need to stay interposable. Normally interposable symbols are blocked from inlining and other IPO since they can be overridden at link time. By making the imported copy internalized, it can now be inlined and no longer overridden at link time. If we did ensure that we imported the prevailing copy this might be ok. However, there are going to be cases where that is not possible (e.g. doing ThinLTO on a shared library link, or if the prevailing copy is in an ELF object/shared library and therefore not available for importing). So I think the best thing here is to ensure that when it gets imported it retains the weak linkage type. I'm not sure if is being internalized in this particular test case because the imported copy is read only - can you confirm where/how that is happening?

tejohnson added inline comments.Feb 5 2021, 8:58 AM
llvm/lib/Transforms/IPO/FunctionImport.cpp
321

(ignore the comment about inlining, since we're dealing with variables, but other IPO applies)

krisb added a comment.Feb 8 2021, 5:01 AM

Thanks, @tejohnson, for reviewing this!
I'll make the comment more accurate.

I think we would still want to check linkage type, since it is only interposable that are converted to declarations. The prevailing information would be useful to refine the extra importing further. Hmm, actually I think there might be a correctness issue with this fix as is. I notice in your test below that the imported symbol becomes internalized. I believe this will need to stay interposable. Normally interposable symbols are blocked from inlining and other IPO since they can be overridden at link time. By making the imported copy internalized, it can now be inlined and no longer overridden at link time. If we did ensure that we imported the prevailing copy this might be ok. However, there are going to be cases where that is not possible (e.g. doing ThinLTO on a shared library link, or if the prevailing copy is in an ELF object/shared library and therefore not available for importing). So I think the best thing here is to ensure that when it gets imported it retains the weak linkage type. I'm not sure if is being internalized in this particular test case because the imported copy is read only - can you confirm where/how that is happening?

The imported copy becomes internal because the source def is read-only indeed (the imported copy inherits 'thinlto-internalize' attribute from the source one thus it got internalized in internalizeGVsAfterImport()). This seems okay since access from an ELF object/shared library prevents the source copy to be marked ReadOnly and at that time we already have attributes propagated.
In general, allow importing here we let it follow the normal importing rules as if the destination module doesn't contain a def at all. So that we will not import another copy with interposable linkage (it doesn't pass the check from canImportGlobalVar()). For other cases, the final linkage type will be whatever FunctionImportGlobalProcessing::getLinkage() returns for the source copy. Unless I'm missing something, this seems to be safe.

Thanks, @tejohnson, for reviewing this!
I'll make the comment more accurate.

I think we would still want to check linkage type, since it is only interposable that are converted to declarations. The prevailing information would be useful to refine the extra importing further. Hmm, actually I think there might be a correctness issue with this fix as is. I notice in your test below that the imported symbol becomes internalized. I believe this will need to stay interposable. Normally interposable symbols are blocked from inlining and other IPO since they can be overridden at link time. By making the imported copy internalized, it can now be inlined and no longer overridden at link time. If we did ensure that we imported the prevailing copy this might be ok. However, there are going to be cases where that is not possible (e.g. doing ThinLTO on a shared library link, or if the prevailing copy is in an ELF object/shared library and therefore not available for importing). So I think the best thing here is to ensure that when it gets imported it retains the weak linkage type. I'm not sure if is being internalized in this particular test case because the imported copy is read only - can you confirm where/how that is happening?

The imported copy becomes internal because the source def is read-only indeed (the imported copy inherits 'thinlto-internalize' attribute from the source one thus it got internalized in internalizeGVsAfterImport()). This seems okay since access from an ELF object/shared library prevents the source copy to be marked ReadOnly and at that time we already have attributes propagated.
In general, allow importing here we let it follow the normal importing rules as if the destination module doesn't contain a def at all. So that we will not import another copy with interposable linkage (it doesn't pass the check from canImportGlobalVar()). For other cases, the final linkage type will be whatever FunctionImportGlobalProcessing::getLinkage() returns for the source copy. Unless I'm missing something, this seems to be safe.

Ok I think I have convinced myself that this is correct:

  1. If the prevailing copy is not available for import (e.g. in an ELF object), these IR copies would have been marked dead and therefore we would not have marked them as read only.
  2. Any copy available for import, i.e. not interposable, would have to be a linkage type that cannot be overridden. So it would have to either be the prevailing copy, or one odr equivalent.

Hmm, it seems like we could do something similar for function importing, to allow importing and inlining of the prevailing copy when the copy in the module is interposable. Can you add a FIXME where we check "if (DefinedGVSummaries.count(VI.getGUID()))" in computeImportForFunction?

krisb updated this revision to Diff 323686.Feb 15 2021, 1:24 AM

Made comments more detailed and acccurate.

tejohnson accepted this revision.Feb 15 2021, 9:01 AM

lgtm with some comment tweaks suggested below for clarity. Thanks!

llvm/lib/Transforms/IPO/FunctionImport.cpp
318

Since the next two points (about the non-prevailing interposable becoming a decl, and then the read-only prevailing being internalized) are both about the correctness and need to go together to create the failure, can you reorder this to:
"This is required for the correctness purpose: First of all, if this copy is non-prevailing...."
Then reorder the second point slightly:
"On the other side, if the prevailing copy is read-only and suitable for import, it will be later internalized and thus not available for linking, and therefore must be imported to satisfy any references."

405

Nit: For clarity, add: "The prevailing copy can safely be imported."

This revision is now accepted and ready to land.Feb 15 2021, 9:01 AM
MaskRay added inline comments.
llvm/lib/Transforms/IPO/FunctionImport.cpp
314

Nit: flip the condition so that the "then" branch can be deleted, and the comment can be dedented by one level.

MaskRay added inline comments.Feb 15 2021, 10:35 AM
llvm/test/ThinLTO/X86/weak_globals_import.ll
2

Option: you may consider the split-file utility to keep the auxiliary file in this file. It may help readers in some cases as they don't need to open another file.

krisb updated this revision to Diff 323971.Feb 16 2021, 5:47 AM
krisb marked 6 inline comments as done.
krisb retitled this revision from [ThinLTO] Fix import of a global variable to modules contain its weak def to [ThinLTO] Fix import of multiply defined global variables.

Applied review comments, reduced the test.

llvm/lib/Transforms/IPO/FunctionImport.cpp
314

Extracted the check into a separate function. Hope it can be reused for computeImportForFunction() as well.

318

Thank you for the suggestion! I rephrased the comment and hope it's more clear now.

llvm/test/ThinLTO/X86/weak_globals_import.ll
2

Thank you! I haven't heard about split-file.

lgtm - thanks!

This revision was automatically updated to reflect the committed changes.