Comdat groups override weak symbol behavior, allowing the linker to keep
the comdats for weak symbols in favor of comdats for strong symbols.
Fixes the issue described in:
https://bugs.chromium.org/p/chromium/issues/detail?id=918662
|  Differential  D56516  
[SanitizerCoverage] Don't create comdat for interposable functions. Authored by morehouse on Jan 9 2019, 3:08 PM. 
Details Comdat groups override weak symbol behavior, allowing the linker to keep Fixes the issue described in: 
Diff Detail 
 Event TimelineComment Actions Can you create a minimal reproducer? If objcopy is complaining about sh_link = 0 I suspect that there's some issue where a global referenced via !associated is being dropped. Comment Actions Minimal repro: $ cat weak.c #include <stdio.h> __attribute__((weak)) void f() {fprintf(stderr, "Weak!\n");} $ cat strong.c #include <stdio.h> void f() {fprintf(stderr, "Strong!\n");} $ cat fuzz.c #include <stdint.h> #include <stddef.h> void f(); int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) { f(); return 1; } $ cat main.c void f(); int main() { f(); return 0; } $ clang main.c weak.c strong.c && ./a.out Strong! $ clang -fsanitize=fuzzer fuzz.c weak.c strong.c && ./a.out ... Weak! Comment Actions I wonder if it is the same problem as here: https://reviews.llvm.org/D53234 which is a case of the global referenced via !associated is being dropped. Having said that there is definitely a bug in compiler here at least for ELF. In ELF given the duplication behavior specified for GRP_COMDAT, each set of duplicate COMDATs has an "api" made up of the globals that are part of the COMDATs in that set and are referenced from anywhere in the program outside of that COMDAT set. Each instance of a COMDAT group must have this "api" in common, otherwise the program might fail to link, or have arbitrarily different run-time behavior (depending on which COMDAT is selected). Comment Actions That looks like it's addressing a different problem. In the case I'm concerned about, globals are never dropped from the COMDAT. Rather, a COMDAT containing a weak symbol is chosen over a COMDAT with a strong symbol. 
 Does weak-vs-strong count towards that "api"? Because that's the only difference in the case I'm seeing... And does the COMDAT need to have a SelectionKind of ExactMatch to enforce the API, or does Any suffice? Comment Actions Weak-vs-strong count *if* program behaviour is affected by weak vs strong symbol resolution semantics (and I guess that it is in your case otherwise COMDAT selection wouldn't be causing you problems). This is because the ELF standard provides no guarantees as to which COMDAT instance from each duplicate group will be selected (i.e in ELF the selection kind behaviour is always "Any"). 
 In ELF (and WebAssembly, apparently, according to the LLVM language reference at least) the only valid selection kind is "Any". My understanding is that this is not the case for COFF where COMDAT processing is more involved. I am not an expert on COFF; but, I don't think that you would need to take weak symbols out of COMDATs for the COFF case to work (In fact I was told that COFF doesn't have weak symbols in the ELF sense). Comment Actions @bd1976llvm: So do you still think this is a compiler bug? It sounds like ELF provides no guarantees about which COMDAT is kept. Which would mean we need to keep the weak symbols from going in a COMDAT. Comment Actions What I mean't by a compiler bug is that, at least on ELF targets, you can't have weak and strong versions of a symbol in the same duplicate COMDAT set (the set of COMDATs with the same name) and expect the linker to always pick a strong version of the symbol, as we have been discussing; therefore, the current output looks incorrect on ELF. Is not putting weak symbols into COMDATs the way to fix this? 
 I'm afraid I don't have the time to look into the details of your bug ATM. Some other solutions might be: 
 Hopefully, someone who is an expert on the other file formats might comment (Reid?). Comment Actions Yes, but this is definitely preferred over breaking weak/strong semantics for our users. 
 @rnk: How do weak/strong symbols in COMDATs interact on COFF? Comment Actions Yes, I think I agree with your conclusions for ELF. With weak vs. strong linkage, the strong symbol can appear later, and it will win. With comdats, the first symbol will win, regardless of strength. Putting such symbols into a comdat isn't going to work. However, it looks like this code still creates sancov metadata for such a weak function, even though it might not be linked in. Is that a problem? Will it create relocations to discarded data? Is the !associated metadata enough to make it get discarded when the strong user symbol overrides the weak one? 
 Well, GCC has some extensions for implementing weak symbols in COFF. I forget if we implement them. Maybe @smeenai remembers, I think Facebook wanted weak symbols for mingw. The Visual C++ linker doesn't implement weak symbols. I think this change will be fine with my suggestion to keep applying comdats when the function is weak_odr. We should also test all the relevant linkages, like plain linkonce and weak_odr. 
 Comment Actions In my local tests, the weak function and its sancov metadata both get discarded with bfd, gold, and lld. Comment Actions Great! This looks good to me, but maybe ask @pcc his opinion before landing. 
 
 Comment Actions COFF has weak externals, and I think gcc emulates weak symbols on top of weak external semantics. LLVM does too, if you use the weak + alias or weakref attributes. I'm not sure how weak externals interact with COMDATs. Note that there are multiple semantics for weak externals themselves, and LLVM/LLD might not be implementing all of them correctly ... https://developercommunity.visualstudio.com/content/problem/384278/link-not-handling-coff-weak-externals.html is a behavior difference between link and LLD, for example. | ||||||||||||||||||||||
hasWeakLinkage will include weak_odr linkages, which are often used for explicit template instantiations and dllexport functions. In those cases, the user has promised that there is only "one definition", so there cannot be some overriding strong definition. I think you really want to say hasWeakAnyLinkage() or isInterposable() here instead. That will affect these linkages:
external_weak can only appear on declarations, common can only be on globals, and a linkonce function would be a discardable-but-weak function. There just aren't many use cases for it. I'm not sure when we'd generate it, but I think you might want to avoid putting such things in comdats. I guess the use case would be to emit an overridable aligned operator new, which could be discarded if unused after optimization. You probably don't want to put such things in a comdat either.
So, in conclusion, isInterposable seems like the right thing to use here.