This is an archive of the discontinued LLVM Phabricator instance.

Move comdat setup after global value body linking in ModuleLinker
AbandonedPublic

Authored by tejohnson on Nov 5 2015, 10:03 AM.

Details

Summary

The verifier complains if a declaration is in a comdat. However,
the module linker speculatively adds linked global values that are
in comdats in the source module into comdats in the destination module,
which assumes their bodies will also be linked.

For ThinLTO we may not link the body if it isn't an imported function.
Also, with currently reverted patch r251926 (D14195: Move metadata
linking after lazy global materialization/linking), during an LTO build
a function referenced only from metadata will not have its definition
linked and will become a declaration. If it was in a comdat it will also
hit the verifier error.

To handle this, delay setting up comdats on the Dest module until after
all global value bodies have been linked, so that we know which are
definitions vs declarations in Dest.

This change is an enabler for resubmitting r251926, and also for some
follow-on fixes for ThinLTO comdat handling I discovered in the process
of working on these changes.

Diff Detail

Event Timeline

tejohnson updated this revision to Diff 39379.Nov 5 2015, 10:03 AM
tejohnson retitled this revision from to Move comdat setup after global value body linking in ModuleLinker.
tejohnson updated this object.
tejohnson added reviewers: dexonsmith, mehdi_amini.
tejohnson added a subscriber: llvm-commits.
dexonsmith edited edge metadata.Nov 5 2015, 11:22 AM
dexonsmith added a subscriber: dexonsmith.

This seems reasonable to me, but we don't have comdats in Mach-O (I know they're similar to linkonce/etc., but don't really understand the full semantics) so I'm not sure I'm the best person to review this.

Rafael, maybe you could have a look?

mehdi_amini added inline comments.Nov 5 2015, 11:29 AM
lib/Linker/LinkModules.cpp
2007

Is is possible to just iterate over ComdatsChosen?

tejohnson added inline comments.
lib/Linker/LinkModules.cpp
2007

I don't think that is straightforward since AFAIK there is no way to get directly from the comdat to all the global values that are in the comdat.

tobiasvk added inline comments.
lib/Linker/LinkModules.cpp
2007

Purely stylistic.. omit braces on the single-statement for loops. Otherwise, makes sense!

rafael edited edge metadata.Nov 5 2015, 12:19 PM

It has been awhile since I looked at lib/Linker, but correct if I am wrong, it has pretty much two modes

  • "real" linker semantics. This handles comdats, weaks, etc.
  • A mode where we only copy a user specified list of GVs and internal dependencies.

The second mode is used by thin lto. It could also be used with a few modifications by the gold plugin since gold did the "real linker" work already.

It looks like both modes are somewhat mixed. Could the linker be split in two? The low level one gets a list of globals. The high level one is not used by thinlto. What it does is decide based on comdats and linkages what a real linker would do and pass that list to the lower level class.

Following up after discussing with Rafael on IRC.

Splitting the linker will require quite a bit of work since LTO discovers what symbols it needs as it copies in function bodies.

As an alternative to this patch:

  1. For ThinLTO we can strip decls from comdats if we aren't importing that comdat group (note that the importer eventually needs to import the entire comdat group to get correct linker behavior).
  2. For LTO (and llvm-lto -only-needed) this situation only arises with my currently reverted patch D14195 to move metadata linking after all global value body linking when there is metadata that references a non-materialized global. At HEAD, a metadata reference causes the def to be linked in, which is unnecessary. With D14195 it was just leaving a decl. This caused LTO bootstrapping issues for global values in comdats which can't be decls (and aliases - which requires a separate fix to turn those into non-alias decls). The question is what to do about these metadata references - I assume they need to reference something in the dest module. Eventually with a follow on patch I have support for not linking in DISubprogram metadata that isn't needed. We would need to do the same for references to unmaterialized variables from DIVariable (any other metadata that can reference GVs?). These would have to be done along with the patch to move the metadata linking. For now, maybe just strip the decls from comdats in this case as well, but this assumes that the unreferenced global values and associated metadata will be removed by a later pass. Any other suggestions?
tejohnson abandoned this revision.Nov 6 2015, 8:25 AM

Based on discussions with Rafael and Duncan, I am going to abandon this change. See the new fix for the LTO bootstrapping issue suggested by Duncan included with D14447. I will fix the ThinLTO comdat issue in a separate patch that has some other fixes for comdats.