Comdat groups override weak symbol behavior, allowing the linker to keep
the comdats for weak symbols in favor of comdats for strong symbols.
Fixes the issue described in:
https://bugs.chromium.org/p/chromium/issues/detail?id=918662
Differential D56516
[SanitizerCoverage] Don't create comdat for interposable functions. morehouse on Jan 9 2019, 3:08 PM. Authored by
Details Comdat groups override weak symbol behavior, allowing the linker to keep Fixes the issue described in:
Diff Detail
Event TimelineComment Actions Can you create a minimal reproducer? If objcopy is complaining about sh_link = 0 I suspect that there's some issue where a global referenced via !associated is being dropped. Comment Actions Minimal repro: $ cat weak.c #include <stdio.h> __attribute__((weak)) void f() {fprintf(stderr, "Weak!\n");} $ cat strong.c #include <stdio.h> void f() {fprintf(stderr, "Strong!\n");} $ cat fuzz.c #include <stdint.h> #include <stddef.h> void f(); int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) { f(); return 1; } $ cat main.c void f(); int main() { f(); return 0; } $ clang main.c weak.c strong.c && ./a.out Strong! $ clang -fsanitize=fuzzer fuzz.c weak.c strong.c && ./a.out ... Weak! Comment Actions I wonder if it is the same problem as here: https://reviews.llvm.org/D53234 which is a case of the global referenced via !associated is being dropped. Having said that there is definitely a bug in compiler here at least for ELF. In ELF given the duplication behavior specified for GRP_COMDAT, each set of duplicate COMDATs has an "api" made up of the globals that are part of the COMDATs in that set and are referenced from anywhere in the program outside of that COMDAT set. Each instance of a COMDAT group must have this "api" in common, otherwise the program might fail to link, or have arbitrarily different run-time behavior (depending on which COMDAT is selected). Comment Actions That looks like it's addressing a different problem. In the case I'm concerned about, globals are never dropped from the COMDAT. Rather, a COMDAT containing a weak symbol is chosen over a COMDAT with a strong symbol.
Does weak-vs-strong count towards that "api"? Because that's the only difference in the case I'm seeing... And does the COMDAT need to have a SelectionKind of ExactMatch to enforce the API, or does Any suffice? Comment Actions Weak-vs-strong count *if* program behaviour is affected by weak vs strong symbol resolution semantics (and I guess that it is in your case otherwise COMDAT selection wouldn't be causing you problems). This is because the ELF standard provides no guarantees as to which COMDAT instance from each duplicate group will be selected (i.e in ELF the selection kind behaviour is always "Any").
In ELF (and WebAssembly, apparently, according to the LLVM language reference at least) the only valid selection kind is "Any". My understanding is that this is not the case for COFF where COMDAT processing is more involved. I am not an expert on COFF; but, I don't think that you would need to take weak symbols out of COMDATs for the COFF case to work (In fact I was told that COFF doesn't have weak symbols in the ELF sense). Comment Actions @bd1976llvm: So do you still think this is a compiler bug? It sounds like ELF provides no guarantees about which COMDAT is kept. Which would mean we need to keep the weak symbols from going in a COMDAT. Comment Actions What I mean't by a compiler bug is that, at least on ELF targets, you can't have weak and strong versions of a symbol in the same duplicate COMDAT set (the set of COMDATs with the same name) and expect the linker to always pick a strong version of the symbol, as we have been discussing; therefore, the current output looks incorrect on ELF. Is not putting weak symbols into COMDATs the way to fix this?
I'm afraid I don't have the time to look into the details of your bug ATM. Some other solutions might be:
Hopefully, someone who is an expert on the other file formats might comment (Reid?). Comment Actions Yes, but this is definitely preferred over breaking weak/strong semantics for our users.
@rnk: How do weak/strong symbols in COMDATs interact on COFF? Comment Actions Yes, I think I agree with your conclusions for ELF. With weak vs. strong linkage, the strong symbol can appear later, and it will win. With comdats, the first symbol will win, regardless of strength. Putting such symbols into a comdat isn't going to work. However, it looks like this code still creates sancov metadata for such a weak function, even though it might not be linked in. Is that a problem? Will it create relocations to discarded data? Is the !associated metadata enough to make it get discarded when the strong user symbol overrides the weak one?
Well, GCC has some extensions for implementing weak symbols in COFF. I forget if we implement them. Maybe @smeenai remembers, I think Facebook wanted weak symbols for mingw. The Visual C++ linker doesn't implement weak symbols. I think this change will be fine with my suggestion to keep applying comdats when the function is weak_odr. We should also test all the relevant linkages, like plain linkonce and weak_odr.
Comment Actions In my local tests, the weak function and its sancov metadata both get discarded with bfd, gold, and lld. Comment Actions Great! This looks good to me, but maybe ask @pcc his opinion before landing.
Comment Actions COFF has weak externals, and I think gcc emulates weak symbols on top of weak external semantics. LLVM does too, if you use the weak + alias or weakref attributes. I'm not sure how weak externals interact with COMDATs. Note that there are multiple semantics for weak externals themselves, and LLVM/LLD might not be implementing all of them correctly ... https://developercommunity.visualstudio.com/content/problem/384278/link-not-handling-coff-weak-externals.html is a behavior difference between link and LLD, for example. |