Depends on D73228
This change passed in our internal huge code base.
Differential D73230
[X86][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local MaskRay on Jan 22 2020, 2:14 PM. Authored by
Details
Diff Detail
Event TimelineComment Actions @craig.topper @rnk This is even safer than I thought. // clang/lib/CodeGen/CodeGenModule:shouldAssumeDSOLocal const auto &CGOpts = CGM.getCodeGenOpts(); llvm::Reloc::Model RM = CGOpts.RelocationModel; const auto &LOpts = CGM.getLangOpts(); if (RM != llvm::Reloc::Static && !LOpts.PIE) return false; -fpic/-fPIC does not set dso_local so this change does not affect that compile model. Comment Actions Thanks! When -fsemantic-interposition is ready, I want to check if we can do some aggressive thing: infer dso_local for -fPIC, i.e. require -fPIC users to specify -fsemantic-interposition to get the interposition behavior. As the summary of D73228 says, the existing -fno-semantic-interposition behaviors in various IPO optimization and the previous assembly behavior may have given us enough license to do this. Comment Actions Unit tests: fail. 62352 tests passed, 1 failed and 839 were skipped. failed: libc++.std/containers/sequences/array/array_creation/to_array.fail.cpp clang-tidy: pass. clang-format: pass. Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project. Comment Actions @MaskRay - this change causes a behaviour difference for --wrap. Here is the --wrap behaviour before this change: ben@ben-VirtualBox:~/tests/wrap$ more main.c void __wrap_foo () { puts ("__wrap_foo"); __real_foo(); } void foo () { puts("foo()"); } int main() { __real_foo(); puts("---"); __wrap_foo(); puts("---"); foo(); return 0; } ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf foo() --- __wrap_foo foo() --- __wrap_foo foo() … and here is the behaviour after this change: ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf foo() --- __wrap_foo foo() --- foo() There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds. Comment Actions Passing-by remark: This change passed in our internal huge code base. isn't a great expanded description for the change.. Comment Actions I think you missed a point in the description of --wrap: You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed. If you do this, you should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call before the linker has a chance to wrap it to "malloc". Providing foo definition in the translation unit where they are referenced is not reliable when you are using --wrap. If you want to get guaranteed semantics, don't define foo when it is referenced. You may also try gcc and gcc -fPIC -fno-semantic-interposition, the behavior is similar to latest clang. Comment Actions Thanks for the summary. I am not particularly concerned about which behaviour we have w.r.t. wrapping intra-translation-unit references (although I have seen some evidence that lld's behaviour is useful e.g. https://stackoverflow.com/questions/13961774/gnu-gcc-ld-wrapping-a-call-to-symbol-with-caller-and-callee-defined-in-the-sam). However, you stated in https://sourceware.org/bugzilla/show_bug.cgi?id=26358 that for lld -r, lto, and normal links have the same behaviour - that is not true after this change. Furthermore, with the current clang it is not possible to go back to the old behaviour using -fsemantic-interposition for hidden symbols. IIUC I think that hidden symbols are probably the majority of opensource symbols now as the GNU toolchain encourages the use of -fvisiblity=hidden. Comment Actions Let me summarize the cases:
I think your make an integration between this commit and 872c5fb1432493c0a09b6f210765c0d94ce9b5d0, so for -fno-PIC or -fPIE, you observe the -fPIC -fno-semantic-interposition behavior as well. If you cherry pick 872c5fb1432493c0a09b6f210765c0d94ce9b5d0 and don't use -fno-semantic-interposition, and use LLD, you should get a wrapping behavior. (Clang traditionally has some -fno-semantic-interposition behaviors, so in the future we might be able to make -fno-semantic-interposition default for -fPIC.) I indeed prefer the LLD behavior, so I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26358 yesterday, but I cannot say the wrapping behavior is promised. If you want better portability, make foo weak (and be aware of side effects with the change). Comment Actions @MaskRay - Thanks for taking the time to look into this :) Here are my results from an e912fffd3a8c6c9f6e09d2eac4c1ee3a32800a22 clang using my previous example. I have included a bit more context in the first relocation dump so you can see which relocation.. in the other dumps I strip out more. With -fpic: $ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... Relocation section '.rela.text.main' at offset 0x330 contains 7 entries: Offset Info Type Symbol's Value Symbol's Name + Addend ... 0000000000000041 0000000600000004 R_X86_64_PLT32 0000000000000000 .text.foo - 4 ... Relocation section '.rela.eh_frame' at offset 0x3d8 contains 3 entries: ... $ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000a00000004 R_X86_64_PLT32 0000000000000000 foo - 4 ... $ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000600000004 R_X86_64_PLT32 0000000000000000 .text.foo - 4 ... $ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000600000004 R_X86_64_PLT32 0000000000000000 .text.foo - 4 … With -fpie: $ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000a00000004 R_X86_64_PLT32 0000000000000000 foo - 4 ... $ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000a00000004 R_X86_64_PLT32 0000000000000000 foo - 4 ... $ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000a00000004 R_X86_64_PLT32 0000000000000000 foo - 4 ... $ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o ... 0000000000000041 0000000a00000004 R_X86_64_PLT32 0000000000000000 foo - 4 ... So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior. Comment Actions Sorry for replying to my own post - I have realised that I have made a mistake -fno-semantic-interposition is not the default for -fpic so there has only been a change in behaviour for hidden symbols: $ /c/u/br2/bin/clang main.c -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration bdunbobbin@BENDB-W10-3 MINGW64 /c/temp/interpos $ /c/u/br2/bin/llvm-readelf -r test.o … 0000000000000041 0000000600000004 R_X86_64_PLT32 0000000000000000 .text.foo - 4 … $ /c/u/br2/bin/clang main.c -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration $ /c/u/br2/bin/llvm-readelf -r test.o … 0000000000000041 0000000a00000004 R_X86_64_PLT32 0000000000000000 foo - 4 ... Comment Actions I think the summary is correct. -fvisibility=hidden nullifies -fsemantic-interposition when the definition is available in the same translation unit. Given how GCC and GNU ld handle/document it, I'd say the previous hidden clang behavior working with -Wl,--wrap=foo is accidental rather than intentional. If you don't pass explicit -fsemantic-interposition, in -fPIC mode clang can freely inline foo into call sites, which will also defeat the intended -Wl,--wrap=foo behavior.
I think the reasonably portable approach making the wrapping scheme work is __attribute__((weak)). An alternative is to move the definitions to a separate translation unit (it does not work with -r or GCC LTO, though). Comment Actions .. but now we have this difference in behaviour for -normal vs flto links: ben@ben-VirtualBox:~/tests/wrap$ cat smaller.c void __wrap_foo () { puts ("__wrap_foo"); __real_foo(); } void foo () { puts("foo()"); } int main() { foo(); } ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old.elf -fvisibility=hidden -fuse-ld=lld ben@ben-VirtualBox:~/tests/wrap$ ./old.elf __wrap_foo foo() ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old_lto.elf -fvisibility=hidden -fuse-ld=lld -flto ben@ben-VirtualBox:~/tests/wrap$ ./old_lto.elf __wrap_foo foo() ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new.elf -fvisibility=hidden -fuse-ld=lld ben@ben-VirtualBox:~/tests/wrap$ ./new.elf foo() ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new_lto.elf -fvisibility=hidden -fuse-ld=lld -flto ben@ben-VirtualBox:~/tests/wrap$ ./new_lto.elf __wrap_foo foo() ... and the same will be true for default symbols + -fpic when we enable -fno-semantic-interposition by default for -fpic :( Comment Actions I think the non-LTO vs Full LTO difference is due to the way we implement --wrap for LTO: D33621. clang -fuse-ld=lld smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -fvisibility=hidden -flto -Wl,--save-temps % llvm-dis < a.out.0.0.preopt.bc ... define weak hidden void @foo() #0 { # Note the weakany linkage ... I think the only way making --wrap sufficiently reliable is to communicate --wrap to the LTO code generation. It was considered unworthy at the implementation time https://bugs.llvm.org/show_bug.cgi?id=33145#c7 To make the non-LTO case behave like the LTO case, add __attribute__((weak)) Comment Actions I had come to the same conclusion and I agree with your assessment of what is needed to improve the LTO behavior. Another way of obtaining consistent behavior would be to adjust the current canBenefitFromLocalAlias(): bool GlobalValue::canBenefitFromLocalAlias() const { // See AsmPrinter::getSymbolPreferLocal(). - return GlobalObject::isExternalLinkage(getLinkage()) && !isDeclaration() && + return hasDefaultVisibility() && GlobalObject::isExternalLinkage(getLinkage()) && !isDeclaration() && !isa<GlobalIFunc>(this) && !hasComdat(); } Doing this would:
Comment Actions The proposed canBenefitFromLocalAlias change looks good to me. Mind sending a patch? I think 4 is the important one. I agree that it happens to provide 2 and 3 here but the --wrap behavior still looks unreliable to me. Glad that you find a solution addressing your problems:) |