This is an archive of the discontinued LLVM Phabricator instance.

Support for ThinLTO function importing and symbol linking.
ClosedPublic

Authored by tejohnson on Oct 7 2015, 9:59 AM.

Details

Summary

Support for necessary linkage changes and symbol renaming during
ThinLTO function importing.

Also includes llvm-link support for manually importing functions
and associated llvm-link based tests.

Note that this does not include support for intelligently importing
metadata, which is currently imported duplicate times. That support will
be in the follow-on patch, and currently is ignored by the tests.

Diff Detail

Event Timeline

tejohnson updated this revision to Diff 36756.Oct 7 2015, 9:59 AM
tejohnson retitled this revision from to Support for ThinLTO function importing and symbol linking..
tejohnson updated this object.
tejohnson added a subscriber: llvm-commits.

Ping.
Thanks,
Teresa

davidxl added inline comments.Oct 14 2015, 10:38 AM
include/llvm/IR/FunctionInfo.h
235

comdat functions should have been merged at this point, right?

lib/Linker/LinkModules.cpp
441

Do we want to support importing a list of functions in a batch mode?

450–470

Is it better to add another flag to indicate whether it is for primary module or imported modules.

466

assert dstM == srcM?

573

How about just getName or getLinkName? The name may or may not be new.

676

Unless the address taken bit is set (to handle the case address is used in comparison).

688

For non promoted constant local globals, the name also needs to be uniqued -- otherwise there may be conflicts in the assembly file (some constants are materialized using memory read).

719

isPerformingImport() check is redundant here.

745

I think it should stay as WeakODRLinkage -- there is no guarantee that it will be emitted in other modules (if there is no reference).

771

The comment seems to mean the opposite.

Thanks for the comments. Responses below.

include/llvm/IR/FunctionInfo.h
235

Merged where? At the point we invoke this we are just getting a module ready to import from - we would not have merged comdats across modules. And the FunctionInfoList may contain multiple copies of a comdat, unless we decided in the stage-2 global indexing step to merge their entries in the combined index. Presumably due to the way this is being used to guard renaming and promotion of locals, we couldn't remove an entry for any comdat copy that accessed a local - but I doubt that could happen?

lib/Linker/LinkModules.cpp
441

Good question, and something I was pondering. For now I think we should support one at a time, and I can extend this to support importing a set in the future if it seems useful. Part of the issue is that until you import a function, you won't know what other function in the same module are called by it that you may now want to import. But presumably you may have multiple functions from the same source module called from the current primary module that you know you want to import, and batch mode could help there. But I would prefer to add that handling as a follow-on if it is necessary.

450–470

That is implied by whether FuncToImport is null or not. A flag would be redundant.

466

That should never be the case - the dstM is the "combined" module being created by the link. Even if we don't import, the module being linked is not the same as the srcM.

573

Ok, I can change to getName. I will accordingly change getNewLinkage below to getLinkage.

676

Good catch. And unfortunately for GVars there is no address taken bit or easy check for this. For functions, there is a hasAddressTaken function that walks the uses, but not for GVars. But that isn't useful here in the importing context in any case, since we haven't even parsed all of the functions where it may be address taken. It will have to be recorded in the function index in the front end. For now I will simply promote all locals (and leave a comment about what would need to be done to reduce this scope).

688

Good point. Since I will now promote all locals, this is somewhat moot, but will resurface if we do something more narrow to identify the address taken locals that need promotion. I will change this to rename all locals when we are importing or exporting.

719

Will remove.

745

WeakODR is not discardable, unlike LinkOnceODR. So it will be emitted in the original module.

771

You're right, will fix the comment.

davidxl added inline comments.Oct 16 2015, 1:39 PM
include/llvm/IR/FunctionInfo.h
235

yes, In theory, the combined index generator can do comdat de-duplication before writing out the combined summary -- but since that step is mandatory, handling duplication is still required here.

lib/Linker/LinkModules.cpp
441

ok.

450–470

ok.

466

sounds like this can be done in place (making dstM the same as srcM), but as it is existing code, ok.

676

When promoting Constant Global, make sure the initializer is made available so that const prop can still happen.

688

ok.

745

ok. I got it confused with LinkOnceODR.

tobiasvk added inline comments.
lib/Linker/LinkModules.cpp
676

Could you use the unnamed_addr bit?

tejohnson added inline comments.Oct 16 2015, 1:56 PM
lib/Linker/LinkModules.cpp
676

Thanks for the tip!

I am not familiar with this attribute, but see it documented at http://llvm.org/docs/LangRef.html#global-variables:

"Global variables can be marked with unnamed_addr which indicates that the address is not significant, only the content. Constants marked like this can be merged with other constants if they have the same initializer. Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant."

So it sounds like unnamed_addr variables are safe to be treated as non-address taken. I'm not sure under what context it is set though or if it covers all non-address taken const variables?

tejohnson added inline comments.Oct 16 2015, 2:01 PM
include/llvm/IR/FunctionInfo.h
235

Right, that is what this loop is handling.

lib/Linker/LinkModules.cpp
676

When promoting Constant Global, make sure the initializer is made available so that const prop can still happen.

Right, the initializers are imported along with the variables.

tobiasvk added inline comments.Oct 16 2015, 2:18 PM
lib/Linker/LinkModules.cpp
676

So it sounds like unnamed_addr variables are safe to be treated as non-address taken. I'm not sure under what context it is set though or if it covers all non-address taken const variables?

It's set by the GlobalOpt pass (mainly). GlobalOpt uses a helper, GlobalStatus, that does the analysis. I think you're safe to assume that it's non-address taken *if* the flag is set, but obviously GlobalOpt may not have been run.

tejohnson added inline comments.Oct 16 2015, 2:24 PM
lib/Linker/LinkModules.cpp
676

Great, I will guard the handling of constant variables by hasUnnamedAddr as well so we can catch those cases where GlobalOpt has been run and set it.
Thanks.

tejohnson updated this revision to Diff 37768.Oct 19 2015, 10:18 AM

Updated as per review comments from davidxl, dexonsmith and tobiasvk,
including updating test to use new llvm-lto functionality instead of gold.

tejohnson updated this revision to Diff 38532.Oct 27 2015, 6:07 AM
  • Handle aliases that were already imported from the same module, and add a test for that scenario.

Also, pinging the patch review. Duncan, did you have any more comments?

Thanks,
Teresa

slarin added a subscriber: slarin.Nov 2 2015, 8:43 AM
This revision was automatically updated to reflect the committed changes.