This is an archive of the discontinued LLVM Phabricator instance.

[AsmPrinter] Emit .weak directive for weak linkage on COFF
ClosedPublic

Authored by mstorsjo on Mar 15 2018, 2:25 PM.

Details

Summary

MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that *declarations* marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.

Weak*/linkonce* symbols without an associated comdat (in particular, ones generated with attribute((weak)) in C/C++) were earlier emitted as normal unique globals, as the comdat is required to provide the linkonce semantics. This change makes sure they are emitted as .weak instead, allowing other symbols to override them.

Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.

Add a new coff-weak.ll which tests the new directive emission.

Diff Detail

Event Timeline

smeenai created this revision.Mar 15 2018, 2:25 PM
rnk added a comment.Mar 15 2018, 3:17 PM

I don't think we can use LLVM linkage to mean ELF weak. According to langref, weak/weak_odr are 100% equivalent to linkonce/linkonce_odr with the exception that they are non-discardable. Based on that, we apply weak_odr linkage to wide variety of things that must be comdat deduplicated but must not be discarded. For example, if you dllexport a class with inline methods, it is weak_odr. Similarly, if you do explicit template instantiation, the result is weak_odr.

Are we sure we want MC to emulate ELF weak symbols for all these things? I'm not sure we do.

That suggests that __attribute__((weak)) should probably be encoded into IR in some other way.

Yeah, you're right. Introducing a new linkage type for __attribute__((weak)) definitions is probably the way to go then?

Yeah, you're right. Introducing a new linkage type for __attribute__((weak)) definitions is probably the way to go then?

I don't think so.

The cases Reid lists are not really weak when targeting COFF. If they were not in a comdat we would get a linker error.

The solution I think is to have weak in the IR really mean weak (I.E., this change is fine once the dependencies are done) but change the producers to not use weak/linkonce if they don't want weak semantics.

This would require first splitting linkonce out of linkage to be just a can_drop_if_unused bit.

rnk added a comment.Mar 15 2018, 5:10 PM

Yeah, you're right. Introducing a new linkage type for __attribute__((weak)) definitions is probably the way to go then?

I don't think so.

The cases Reid lists are not really weak when targeting COFF. If they were not in a comdat we would get a linker error.

The solution I think is to have weak in the IR really mean weak (I.E., this change is fine once the dependencies are done) but change the producers to not use weak/linkonce if they don't want weak semantics.

This would require first splitting linkonce out of linkage to be just a can_drop_if_unused bit.

I think my perspective is informed by the text in LangRef and whatever LLVM used to do for COFF when I started working on it. Meaning, weak and linkonce linkages just create a comdat, nothing else. The weak vs. linkonce bit indicates what's discardable. The ODR bit indicates whether you can optimize.

Being able to have undefined weak symbols was always just some weird ELF feature that wasn't spelled out in LangRef and didn't work on COFF so it must not be inside the model. It's entirely possible that I've given that old model too much weight and we should update the LangRef to talk about how weak declarations can be undefined at link time.

Now that we have comdats, weak_odr and linkonce_odr are pretty weird linkages. We could, for example, migrate weak_odr+comdat things to just external+comdat, and that would work pretty well. External GVs aren't discardable, but most of the optimizer still reasons from the linkage without looking at the comdat, so this would be a pretty big change.

I think my perspective is informed by the text in LangRef and whatever LLVM used to do for COFF when I started working on it. Meaning, weak and linkonce linkages just create a comdat, nothing else. The weak vs. linkonce bit indicates what's discardable. The ODR bit indicates whether you can optimize.

Being able to have undefined weak symbols was always just some weird ELF feature that wasn't spelled out in LangRef and didn't work on COFF so it must not be inside the model. It's entirely possible that I've given that old model too much weight and we should update the LangRef to talk about how weak declarations can be undefined at link time.

Now that we have comdats, weak_odr and linkonce_odr are pretty weird linkages. We could, for example, migrate weak_odr+comdat things to just external+comdat, and that would work pretty well. External GVs aren't discardable, but most of the optimizer still reasons from the linkage without looking at the comdat, so this would be a pretty big change.

We would still need a can_be_dropped bit. A comdat can be dropped if all of its members can be dropped.

For ELF and COFF removing weak_odr would work in practice. Replacing weak_odr with external would allow the same optimizations and in practice I don't think anyone uses the fact that the template instantiations are weak. That is, I don't expect anyone to have a C file with a _Z<something> function that is not weak and not in a comdat an expects to be able to use that to replace the result of a template instantiation during link.

For MachO dropping weak_odr is harder. We could use comdats and have codegen map "in a comdat" to weak, but that is a bit of a hack.

So what I would suggest is

  • Add a can_drop bit to GVs.
  • Remove linkonce and linkonce_odr. They become just weak+can_drop and weak_odr+can_drop.
  • Change clang to use external+can_drop instead of weak_odr+can_drop when targeting COFF.
  • Apply this patch.

@rnk, does that make sense to you?

I'm not very familiar with this part of LLVM; I ran into this issue because an internal user needed __attribute__((weak)) semantics on COFF, and having inline assembly with a manual .weak directive for a gigantic mangled C++ name isn't the greatest workaround (though it does work). I'm happy to follow through if there's agreement on how to proceed here though.

I do have a question though. linkonce doesn't appear to do anything by itself on COFF right now. E.g. if I have

define linkonce void @f() {
  ret void
}

and run bin/llc -mtriple=x86_64-windows-msvc on it, the object file produced is the same both with and without the linkonce. Is that expected? As in, linkonce by itself only has meaningful semantics as far as IR is concerned, not at the object file level?

(For inline functions and such, it looks like the linkonce is always paired with a comdat, which is what provides the actual COMDAT semantics at the object file level.)

FWIW, GCC does support __attribute__((weak)) on Windows, but with slightly different semantics than what I think it has on ELF.

With GCC, you can't have a symbol declared with __attribute__((weak)) not be present at link time, that fails. However, you can have a local fallback function definition within a TU defined weak, which only gets linked if no other TU defines the same symbol. (IIRC I did test that lld manages to link object files with this mechanism.)

@rnk, does that make sense to you?

I'm not very familiar with this part of LLVM; I ran into this issue because an internal user needed __attribute__((weak)) semantics on COFF, and having inline assembly with a manual .weak directive for a gigantic mangled C++ name isn't the greatest workaround (though it does work). I'm happy to follow through if there's agreement on how to proceed here though.

I do have a question though. linkonce doesn't appear to do anything by itself on COFF right now. E.g. if I have

define linkonce void @f() {
  ret void
}

and run bin/llc -mtriple=x86_64-windows-msvc on it, the object file produced is the same both with and without the linkonce.

This is almost certainly an oversight. I don't think I have ever seen a plain linkonce used in practice, so it was probably never implemented.

rnk added a comment.Mar 22 2018, 3:57 PM

One way that we could make forward progress without rewriting LLVM's linkage is to not apply .weak for weak_odr linkages. Then we'd say, weak_odr is exactly like linkonce_odr except it is non-discardable, pending a cleanup to make discardability a separate bit (probably along with a separate "odr" bit that informs the inliner and constant propagation). weak linkage would be reserved for things that truly use __attribute__((weak)).

I keep running into this issue in various places, that __attribute__((weak)) doesn't work on COFF in clang.

@rnk, does that make sense to you?

I'm not very familiar with this part of LLVM; I ran into this issue because an internal user needed __attribute__((weak)) semantics on COFF, and having inline assembly with a manual .weak directive for a gigantic mangled C++ name isn't the greatest workaround (though it does work). I'm happy to follow through if there's agreement on how to proceed here though.

I do have a question though. linkonce doesn't appear to do anything by itself on COFF right now. E.g. if I have

define linkonce void @f() {
  ret void
}

and run bin/llc -mtriple=x86_64-windows-msvc on it, the object file produced is the same both with and without the linkonce.

This is almost certainly an oversight. I don't think I have ever seen a plain linkonce used in practice, so it was probably never implemented.

If I've understood things correctly, this is the norm, see e.g. D71572 for needing to add a comdat to make linkonce behave as intended.

In D44543#1046298, @rnk wrote:

One way that we could make forward progress without rewriting LLVM's linkage is to not apply .weak for weak_odr linkages. Then we'd say, weak_odr is exactly like linkonce_odr except it is non-discardable, pending a cleanup to make discardability a separate bit (probably along with a separate "odr" bit that informs the inliner and constant propagation). weak linkage would be reserved for things that truly use __attribute__((weak)).

I'm also hesitant to adjusting the generic rules for how this behaves on a whim (even though the current hasLinkOnceDirective case essentially is COFF only), but some change like the one above (in one form or another) does seem to be needed. As the current code only checks for hasLinkOnceDirective, it does decide that it can use .globl as the platform in general should be able to describe linkonce semantics, while it only can describe linkonce semantics if there's a comdat.

One way of handling it, which seems to run fine without regression in a larger set of code, is this:

     if (MAI->hasWeakDefDirective()) {
       // .globl _foo
       OutStreamer->emitSymbolAttribute(GVSym, MCSA_Global);
 
       if (!canBeHidden(GV, *MAI))
         // .weak_definition _foo
         OutStreamer->emitSymbolAttribute(GVSym, MCSA_WeakDefinition); 
       else
         OutStreamer->emitSymbolAttribute(GVSym, MCSA_WeakDefAutoPrivate);
+    } else if (TT.isOSWindows() && TT.isOSBinFormatCOFF() &&
+               GlobalValue::isWeakLinkage(Linkage) && !GV->hasComdat()) {
+      // .weak _foo
+      OutStreamer->emitSymbolAttribute(GVSym, MCSA_Weak);
     } else if (MAI->hasLinkOnceDirective()) {
       // .globl _foo
       OutStreamer->emitSymbolAttribute(GVSym, MCSA_Global); 
       //NOTE: linkonce is handled by the section the symbol was assigned to.
     } else {
       // .weak _foo
       OutStreamer->emitSymbolAttribute(GVSym, MCSA_Weak);
     }

This retains the distinction between linkonce/linkonce_odr and weak/weak_odr that @smeenai was making so far.

Another way of doing it, with a much smaller diff, would be this:

     if (MAI->hasWeakDefDirective()) {
       // .globl _foo
       OutStreamer->emitSymbolAttribute(GVSym, MCSA_Global);
    
       if (!canBeHidden(GV, *MAI))
         // .weak_definition _foo
         OutStreamer->emitSymbolAttribute(GVSym, MCSA_WeakDefinition);
       else
         OutStreamer->emitSymbolAttribute(GVSym, MCSA_WeakDefAutoPrivate);
-    } else if (MAI->hasLinkOnceDirective()) {
+    } else if (MAI->hasLinkOnceDirective() && GV->hasComdat()) {
+      // hasLinkOnceDirective() is essentially COFF-only, and COFF needs a comdat for expressing the linkonce semantics
       // .globl _foo
       OutStreamer->emitSymbolAttribute(GVSym, MCSA_Global);
       //NOTE: linkonce is handled by the section the symbol was assigned to.
     } else {
       // .weak _foo
       OutStreamer->emitSymbolAttribute(GVSym, MCSA_Weak);
     }

(This one I haven't testrun in the same way yet.)

If @smeenai doesn't have the same need for this one any longer, I could comandeer this revision, to start pestering people to get this finished.

Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2020, 7:17 AM

@rnk - what do you think about the two alternative solutions I commented above?

mstorsjo commandeered this revision.Mar 16 2020, 4:38 AM
mstorsjo edited reviewers, added: smeenai; removed: mstorsjo.
mstorsjo updated this revision to Diff 250521.Mar 16 2020, 4:44 AM
mstorsjo edited the summary of this revision. (Show Details)
mstorsjo changed the repository for this revision from rL LLVM to rG LLVM Github Monorepo.

Rebased the patch, adjusted the condition to another way of expressing roughly the same, added a comment to the common code.

The current version does emit .weak for both weak_odr, linkonce and linkonce_odr unless they have a comdat associated. Currently, weak/weak_odr/linkonce/linkonce_odr have no effect at all for COFF targets, unless a comdat has been set, so in practice, those cases should be rare (and in most cases not working as intended at the moment anyway).

Or would it be best to extend the generic MCAsmInfo with another function, and change the existing condition into (MAI->hasLinkOnceDirective() && (GV->hasComdat() || !MAI->linkOnceDirectiveRequiresComdat()))? That makes the condition both generic and clearer.

rnk added a comment.Mar 25 2020, 4:46 PM

@rnk - what do you think about the two alternative solutions I commented above?

The second change seems simpler, but maybe we should adjust the linkonce predicate name. As I understand it, both solutions will preserve the behavior of existing linkonce* and weak* linkages if a comdat is present, which it pretty much always is unless the user is using the __attribute__((weak)) extension.

It seems to me like the hasLinkOnceDirective predicate isn't really helping us here. I dug into the history, and I see that back around 2010, it controlled the emission of .linkonce discard. It got moved to section emission in rG76a07580a. Then, later, when we made all the object file sections use the standard, non-suffixed names, we folded that into the .section directive and stopped using the .linkonce directive entirely. So, it seems like maybe the MCAsmInfo feature that we really want here should read something like "avoidWeakIfComdat" or "shouldWeakComdatSymbolsBeExternal" or "isWeakComdatExternal" or something like that. shouldUpgradeWeakComdatToExternal...

In D44543#1942495, @rnk wrote:

@rnk - what do you think about the two alternative solutions I commented above?

The second change seems simpler, but maybe we should adjust the linkonce predicate name. As I understand it, both solutions will preserve the behavior of existing linkonce* and weak* linkages if a comdat is present, which it pretty much always is unless the user is using the __attribute__((weak)) extension.

It seems to me like the hasLinkOnceDirective predicate isn't really helping us here. I dug into the history, and I see that back around 2010, it controlled the emission of .linkonce discard. It got moved to section emission in rG76a07580a. Then, later, when we made all the object file sections use the standard, non-suffixed names, we folded that into the .section directive and stopped using the .linkonce directive entirely. So, it seems like maybe the MCAsmInfo feature that we really want here should read something like "avoidWeakIfComdat" or "shouldWeakComdatSymbolsBeExternal" or "isWeakComdatExternal" or something like that. shouldUpgradeWeakComdatToExternal...

Thanks for looking into this! That explains the current situation well, and I agree that sounds like a good way of moving forward, not holding on to old mismatched abstractions needlessly.

mstorsjo updated this revision to Diff 252803.Mar 26 2020, 4:59 AM
mstorsjo edited the summary of this revision. (Show Details)

Changed hasLinkOnceDirective into avoidWeakIfComdat and adjusted the condition.

@rnk Does this form of the change look decent to you?

rnk accepted this revision.Mar 28 2020, 7:21 AM

Yep, lgtm, thanks!

This revision is now accepted and ready to land.Mar 28 2020, 7:21 AM
This revision was automatically updated to reflect the committed changes.
MaskRay added a subscriber: MaskRay.EditedAug 19 2021, 4:30 PM

} else if (MAI->avoidWeakIfComdat() && GV->hasComdat()) { in AsmPrinter.cpp makes the comdat leader symbol unconditionally .globl.
Will it be possible to allow .weak leader?

If allowed, clang -fprofile-instr-generate a.c can place the weak __profc_weak in an associative comdat (comdat nodeduplicate) for the following code:

__attribute__((weak)) void foo() {}

and not getting "duplicate definition error" with lld-link -lldmingw.

} else if (MAI->avoidWeakIfComdat() && GV->hasComdat()) { in AsmPrinter.cpp makes the comdat leader symbol unconditionally .globl.
Will it be possible to allow .weak leader?

I'm not sure...

If allowed, clang -fprofile-instr-generate a.c can place the weak __profc_weak in an associative comdat (comdat nodeduplicate) for the following code:

__attribute__((weak)) void foo() {}

and not getting "duplicate definition error" with lld-link -lldmingw.

Can you elaborate on the issue? Right now I can successfully build and link code that uses __attribute__((weak)) with -fprofile-instr-generate in mingw mode with clang+lld. Or is the situation relating to the upcoming state after some other patches? (I'm entirely unfamiliar with the internals of the PGO implementation.)

MaskRay added a comment.EditedAug 19 2021, 11:45 PM

} else if (MAI->avoidWeakIfComdat() && GV->hasComdat()) { in AsmPrinter.cpp makes the comdat leader symbol unconditionally .globl.
Will it be possible to allow .weak leader?

I'm not sure...

If allowed, clang -fprofile-instr-generate a.c can place the weak __profc_weak in an associative comdat (comdat nodeduplicate) for the following code:

__attribute__((weak)) void foo() {}

and not getting "duplicate definition error" with lld-link -lldmingw.

Can you elaborate on the issue? Right now I can successfully build and link code that uses __attribute__((weak)) with -fprofile-instr-generate in mingw mode with clang+lld. Or is the situation relating to the upcoming state after some other patches? (I'm entirely unfamiliar with the internals of the PGO implementation.)

For __attribute__((weak)) void foo() {}, currently the profd and profc variables don't use comdat.
compiler-rt/test/profile/Windows/coverage-weak-lld.cpp works because in lld-link -lldmingw mode, weak definitions are allowed to be duplicated.

If I use comdat nodeduplicate (rGfbb8e772ec501a1b71643db90e9c6445e17d7cac sorry, I accidentally pushed it, but reverted) for profd and profc, the comdat leader symbol (profc) will become .globl.
In the ninja check-profile result, compiler-rt/test/profile/Windows/coverage-weak-lld.cpp has a "duplicate definition" error.