This is an archive of the discontinued LLVM Phabricator instance.

Drop comdats from the dst module if they are not selected
ClosedPublic

Authored by rafael on Mar 17 2016, 1:25 PM.

Details

Summary

A really unfortunate design of llvm-link and related libraries is that they operate one module at a time.

This means they can copy a GV to the destination module that should not be there in the final result because a later bitcode file takes precedence.

We already handled cases like a strong GV replacing a weak for example.

One case that is not currently handled is a comdat replacing another. This doesn't happen in ELF, but with COFF largest selection kind it is possible.

In "llvm-link a.ll b.ll" if the selected comdat was from a.ll, everything will work and we will not copy the comdat from b.ll.

But if we run "llvm-link b.ll a.ll", we fail to delete the already copied comdat from b.ll. This patch fixes that.

Diff Detail

Event Timeline

rafael updated this revision to Diff 50972.Mar 17 2016, 1:25 PM
rafael retitled this revision from to Drop comdats from the dst module if they are not selected.
rafael updated this object.
rafael added reviewers: majnemer, mehdi_amini.
rafael added a subscriber: llvm-commits.
mehdi_amini added inline comments.Mar 18 2016, 8:53 PM
llvm/lib/Linker/LinkModules.cpp
476 ↗(On Diff #50972)

doc

503 ↗(On Diff #50972)

What's happening here is not super nice in the alias case, since you're creating a new GlobalVariable to trash it right after.
Can't this be done before the if with an early return like below?

if (!ReplacedDstComdats.count(C))
    return;
if (Declaration->use_empty()) {
    Declaration->eraseFromParent();
    return;
}
  if (auto *F = dyn_cast<Function>(&GV))
    F->deleteBody();
 else if (auto *Var = dyn_cast<GlobalVariable>(&GV)) 
    Var->setInitializer(nullptr);
 else { // alias case
       auto &Alias = cast<GlobalAlias>(GV);
    Module &M = *Alias.getParent();
    PointerType &Ty = *cast<PointerType>(Alias.getType());
    GlobalValue::LinkageTypes L = Alias.getLinkage();
    auto Declaration =
        new GlobalVariable(M, Ty.getElementType(), /*isConstant*/ false, L,
                           /*Initializer*/ nullptr);
    Declaration->takeName(&Alias);
    Alias.replaceAllUsesWith(Declaration);
    Alias.eraseFromParent();
}
rafael updated this revision to Diff 51150.Mar 21 2016, 5:32 AM

Address review comments.

tejohnson added inline comments.
lib/Linker/LinkModules.cpp
105

Needs doxygen comment

479

nit: just invoke Alias.getLinkage() in below constructor call and remove local L.

Also is it even correct to use the original linkage as-is in all cases? Should it be ExternalLinkage?

Finally, what if this is ThinLTO and the alias was defined in the original module we are importing into? Won't we have an undef at link time? Ditto for full LTO - where would this new declaration be defined? I'm not even sure what the right behavior is in that case.

481

What if this was a function alias not variable alias?

541

Why not combine this into the above loop? I.e. when the ComdatsChosen array is initialized for C, you already have its LinkFromSrc. And that step is done exactly once for each comdat due to the earlier check that continues if C is already in ComdatsChosen.

546

Just invoke C->getName() and remove ComdatName local?

test/Linker/comdat-rm-dst.ll
13

s/ailas/alias/ to get desired check

tejohnson added inline comments.Mar 21 2016, 10:10 AM
lib/Linker/LinkModules.cpp
479

Finally, what if this is ThinLTO and the alias was defined in the original module we are importing into? Won't we have an undef at link time? Ditto for full LTO - where would this new declaration be defined? I'm not even sure what the right behavior is in that case.

Should the original dest aliasee simply be removed from the comdat, renamed if necessary, and the alias still use it?

Otherwise it isn't clear to me where these are being defined.

Also is it even correct to use the original linkage as-is in all cases? Should it be ExternalLinkage?

Good catch. I was expecting the verifier to catch out of comdat
references to locals, but it doesn't.

Finally, what if this is ThinLTO and the alias was defined in the original module we are importing into? Won't we have an undef at link time? Ditto for full LTO - where would this new declaration be defined? I'm not even sure what the right behavior is in that case.

With comdats the same symbols have to be defined in the comdat group
we selected. If that is not the case, an undefined error is the
correct result. The ELF wording is:


A symbol table entry with STB_GLOBAL or STB_WEAK binding that is
defined relative to one of a group's sections, and that is contained
in a symbol table section that is not part of the group, must be
converted to an undefined symbol (its section index must be changed to
SHN_UNDEF) if the group members are discarded. References to this

symbol table entry from outside the group are allowed.

Comment at: lib/Linker/LinkModules.cpp:479
@@ +478,3 @@
+ GlobalValue *Declaration =
+ new GlobalVariable(M, Ty.getElementType(), /*isConstant*/ false, L,

+ /*Initializer*/ nullptr);

What if this was a function alias not variable alias?

That is fine. It is a bit odd that we have declaration versions of 2
of the 3 GlobalObjects (there should be only 1 IMHO), but the linker
can handle a mismatch.

Comment at: lib/Linker/LinkModules.cpp:504
@@ +503,3 @@
+ bool LinkFromSource = P.second.second;
+ if (!LinkFromSource)

+ continue;

Why not combine this into the above loop? I.e. when the ComdatsChosen array is initialized for C, you already have its LinkFromSrc. And that step is done exactly once for each comdat due to the earlier check that continues if C is already in ComdatsChosen.

Excellent suggestion. Thanks!

I will upload a new patch in a sec.

rafael updated this revision to Diff 51282.Mar 22 2016, 7:17 AM
tejohnson accepted this revision.Mar 22 2016, 2:06 PM
tejohnson added a reviewer: tejohnson.

LGTM with a couple small nits.

lib/Linker/LinkModules.cpp
519

s/no able/not able/

test/Linker/comdat-rm-dst.ll
3

Why can't these two FileCheck invocations be combined? They are scanning the same output file.

This revision is now accepted and ready to land.Mar 22 2016, 2:06 PM
mehdi_amini requested changes to this revision.Mar 22 2016, 2:07 PM
mehdi_amini edited edge metadata.
mehdi_amini added inline comments.
llvm/lib/Linker/LinkModules.cpp
503 ↗(On Diff #50972)

ping?

This revision now requires changes to proceed.Mar 22 2016, 2:07 PM
mehdi_amini accepted this revision.Mar 22 2016, 2:21 PM
mehdi_amini edited edge metadata.

I have to learn how to use phabricator, wasn't looking at the right diff...

This revision is now accepted and ready to land.Mar 22 2016, 2:21 PM
rafael closed this revision.Mar 22 2016, 2:41 PM