Page MenuHomePhabricator

[X86][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local
ClosedPublic

Authored by MaskRay on Jan 22 2020, 2:14 PM.

Details

Diff Detail

Event Timeline

MaskRay created this revision.Jan 22 2020, 2:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 22 2020, 2:14 PM

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

@craig.topper @rnk This is even safer than I thought.

// clang/lib/CodeGen/CodeGenModule:shouldAssumeDSOLocal
  const auto &CGOpts = CGM.getCodeGenOpts();
  llvm::Reloc::Model RM = CGOpts.RelocationModel;
  const auto &LOpts = CGM.getLangOpts();
  if (RM != llvm::Reloc::Static && !LOpts.PIE)
    return false;

-fpic/-fPIC does not set dso_local so this change does not affect that compile model.

rnk accepted this revision.Jan 30 2020, 5:24 PM

-fpic/-fPIC does not set dso_local so this change does not affect that compile model.

Yep, that's how it's supposed to work. :)

lgtm

This revision is now accepted and ready to land.Jan 30 2020, 5:24 PM
In D73230#1850895, @rnk wrote:

-fpic/-fPIC does not set dso_local so this change does not affect that compile model.

Yep, that's how it's supposed to work. :)

lgtm

Thanks!

When -fsemantic-interposition is ready, I want to check if we can do some aggressive thing: infer dso_local for -fPIC, i.e. require -fPIC users to specify -fsemantic-interposition to get the interposition behavior.

As the summary of D73228 says, the existing -fno-semantic-interposition behaviors in various IPO optimization and the previous assembly behavior may have given us enough license to do this.

MaskRay updated this revision to Diff 241617.Jan 30 2020, 5:42 PM

Fix fold-add-pcrel.ll

This revision was automatically updated to reflect the committed changes.

Unit tests: fail. 62352 tests passed, 1 failed and 839 were skipped.

failed: libc++.std/containers/sequences/array/array_creation/to_array.fail.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

@MaskRay - this change causes a behaviour difference for --wrap.

Here is the --wrap behaviour before this change:

ben@ben-VirtualBox:~/tests/wrap$ more main.c
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() {
	__real_foo();
	puts("---");
	__wrap_foo();
	puts("---");
	foo();
	return 0;
}
ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration
ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
__wrap_foo
foo()

… and here is the behaviour after this change:

ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
foo()

There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds.

Passing-by remark: This change passed in our internal huge code base. isn't a great expanded description for the change..

@MaskRay - this change causes a behaviour difference for --wrap.

Here is the --wrap behaviour before this change:

ben@ben-VirtualBox:~/tests/wrap$ more main.c
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() {
	__real_foo();
	puts("---");
	__wrap_foo();
	puts("---");
	foo();
	return 0;
}
ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration
ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
__wrap_foo
foo()

… and here is the behaviour after this change:

ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
foo()

There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds.

I think you missed a point in the description of --wrap:

You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed.  If you do this, you
should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call
before the linker has a chance to wrap it to "malloc".

Providing foo definition in the translation unit where they are referenced is not reliable when you are using --wrap.
Actually, this is where GNU ld and LLD differ. See https://sourceware.org/bugzilla/show_bug.cgi?id=26358 and the history of lld/test/ELF/wrap-shlib-undefined.s

If you want to get guaranteed semantics, don't define foo when it is referenced. You may also try gcc and gcc -fPIC -fno-semantic-interposition, the behavior is similar to latest clang.

@MaskRay - this change causes a behaviour difference for --wrap.

Here is the --wrap behaviour before this change:

ben@ben-VirtualBox:~/tests/wrap$ more main.c
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() {
	__real_foo();
	puts("---");
	__wrap_foo();
	puts("---");
	foo();
	return 0;
}
ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration
ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
__wrap_foo
foo()

… and here is the behaviour after this change:

ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
foo()

There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds.

I think you missed a point in the description of --wrap:

You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed.  If you do this, you
should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call
before the linker has a chance to wrap it to "malloc".

Providing foo definition in the translation unit where they are referenced is not reliable when you are using --wrap.
Actually, this is where GNU ld and LLD differ. See https://sourceware.org/bugzilla/show_bug.cgi?id=26358 and the history of lld/test/ELF/wrap-shlib-undefined.s

If you want to get guaranteed semantics, don't define foo when it is referenced. You may also try gcc and gcc -fPIC -fno-semantic-interposition, the behavior is similar to latest clang.

Thanks for the summary. I am not particularly concerned about which behaviour we have w.r.t. wrapping intra-translation-unit references (although I have seen some evidence that lld's behaviour is useful e.g. https://stackoverflow.com/questions/13961774/gnu-gcc-ld-wrapping-a-call-to-symbol-with-caller-and-callee-defined-in-the-sam). However, you stated in https://sourceware.org/bugzilla/show_bug.cgi?id=26358 that for lld -r, lto, and normal links have the same behaviour - that is not true after this change. Furthermore, with the current clang it is not possible to go back to the old behaviour using -fsemantic-interposition for hidden symbols. IIUC I think that hidden symbols are probably the majority of opensource symbols now as the GNU toolchain encourages the use of -fvisiblity=hidden.

MaskRay added a comment.EditedAug 10 2020, 9:52 AM

@MaskRay - this change causes a behaviour difference for --wrap.

Here is the --wrap behaviour before this change:

ben@ben-VirtualBox:~/tests/wrap$ more main.c
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() {
	__real_foo();
	puts("---");
	__wrap_foo();
	puts("---");
	foo();
	return 0;
}
ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration
ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
__wrap_foo
foo()

… and here is the behaviour after this change:

ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
foo()

There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds.

I think you missed a point in the description of --wrap:

You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed.  If you do this, you
should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call
before the linker has a chance to wrap it to "malloc".

Providing foo definition in the translation unit where they are referenced is not reliable when you are using --wrap.
Actually, this is where GNU ld and LLD differ. See https://sourceware.org/bugzilla/show_bug.cgi?id=26358 and the history of lld/test/ELF/wrap-shlib-undefined.s

If you want to get guaranteed semantics, don't define foo when it is referenced. You may also try gcc and gcc -fPIC -fno-semantic-interposition, the behavior is similar to latest clang.

Thanks for the summary. I am not particularly concerned about which behaviour we have w.r.t. wrapping intra-translation-unit references (although I have seen some evidence that lld's behaviour is useful e.g. https://stackoverflow.com/questions/13961774/gnu-gcc-ld-wrapping-a-call-to-symbol-with-caller-and-callee-defined-in-the-sam). However, you stated in https://sourceware.org/bugzilla/show_bug.cgi?id=26358 that for lld -r, lto, and normal links have the same behaviour - that is not true after this change. Furthermore, with the current clang it is not possible to go back to the old behaviour using -fsemantic-interposition for hidden symbols. IIUC I think that hidden symbols are probably the majority of opensource symbols now as the GNU toolchain encourages the use of -fvisiblity=hidden.

Let me summarize the cases:

  • {clang,gcc} -fuse-ld=lld main.c -Wl,--wrap=foo => wrapped (in LLD, --wrap is done after (global) symbol resolution. Definitions are wrapped as well)
  • {clang,gcc} -fuse-ld=bfd main.c -Wl,--wrap=foo => not wrapped (in GNU ld, --wrap is per object file. --wrap is not effective when the symbol is defined)
  • {clang,gcc} -fuse-ld=lld main.c -fPIC -fno-semantic-interposition -Wl,--wrap=foo => not wrapped (references go through .Lfoo$local which cannot be wrapped)

I think your make an integration between this commit and 872c5fb1432493c0a09b6f210765c0d94ce9b5d0, so for -fno-PIC or -fPIE, you observe the -fPIC -fno-semantic-interposition behavior as well. If you cherry pick 872c5fb1432493c0a09b6f210765c0d94ce9b5d0 and don't use -fno-semantic-interposition, and use LLD, you should get a wrapping behavior. (Clang traditionally has some -fno-semantic-interposition behaviors, so in the future we might be able to make -fno-semantic-interposition default for -fPIC.)

I indeed prefer the LLD behavior, so I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26358 yesterday, but I cannot say the wrapping behavior is promised. If you want better portability, make foo weak (and be aware of side effects with the change).

bd1976llvm added a comment.EditedAug 10 2020, 10:44 AM

@MaskRay - this change causes a behaviour difference for --wrap.

Here is the --wrap behaviour before this change:

ben@ben-VirtualBox:~/tests/wrap$ more main.c
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() {
	__real_foo();
	puts("---");
	__wrap_foo();
	puts("---");
	foo();
	return 0;
}
ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration
ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
__wrap_foo
foo()

… and here is the behaviour after this change:

ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
foo()

There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds.

I think you missed a point in the description of --wrap:

You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed.  If you do this, you
should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call
before the linker has a chance to wrap it to "malloc".

Providing foo definition in the translation unit where they are referenced is not reliable when you are using --wrap.
Actually, this is where GNU ld and LLD differ. See https://sourceware.org/bugzilla/show_bug.cgi?id=26358 and the history of lld/test/ELF/wrap-shlib-undefined.s

If you want to get guaranteed semantics, don't define foo when it is referenced. You may also try gcc and gcc -fPIC -fno-semantic-interposition, the behavior is similar to latest clang.

Thanks for the summary. I am not particularly concerned about which behaviour we have w.r.t. wrapping intra-translation-unit references (although I have seen some evidence that lld's behaviour is useful e.g. https://stackoverflow.com/questions/13961774/gnu-gcc-ld-wrapping-a-call-to-symbol-with-caller-and-callee-defined-in-the-sam). However, you stated in https://sourceware.org/bugzilla/show_bug.cgi?id=26358 that for lld -r, lto, and normal links have the same behaviour - that is not true after this change. Furthermore, with the current clang it is not possible to go back to the old behaviour using -fsemantic-interposition for hidden symbols. IIUC I think that hidden symbols are probably the majority of opensource symbols now as the GNU toolchain encourages the use of -fvisiblity=hidden.

Let me summarize the cases:

  • {clang,gcc} -fuse-ld=lld main.c -Wl,--wrap=foo => wrapped (in LLD, --wrap is done after (global) symbol resolution. Definitions are wrapped as well)
  • {clang,gcc} -fuse-ld=bfd main.c -Wl,--wrap=foo => not wrapped (in GNU ld, --wrap is per object file. --wrap is not effective when the symbol is defined)
  • {clang,gcc} -fuse-ld=lld main.c -fPIC -fno-semantic-interposition -Wl,--wrap=foo => not wrapped (references go through .Lfoo$local which cannot be wrapped)

I think your make an integration between this commit and 872c5fb1432493c0a09b6f210765c0d94ce9b5d0, so for -fno-PIC or -fPIE, you observe the -fPIC -fno-semantic-interposition behavior as well. If you cherry pick 872c5fb1432493c0a09b6f210765c0d94ce9b5d0 and don't use -fno-semantic-interposition, and use LLD, you should get a wrapping behavior. (Clang traditionally has some -fno-semantic-interposition behaviors, so in the future we might be able to make -fno-semantic-interposition default for -fPIC.)

I indeed prefer the LLD behavior, so I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26358 yesterday, but I cannot say the wrapping behavior is promised. If you want better portability, make foo weak (and be aware of side effects with the change).

@MaskRay - Thanks for taking the time to look into this :)

Here are my results from an e912fffd3a8c6c9f6e09d2eac4c1ee3a32800a22 clang using my previous example. I have included a bit more context in the first relocation dump so you can see which relocation.. in the other dumps I strip out more.

With -fpic:

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
Relocation section '.rela.text.main' at offset 0x330 contains 7 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
...
0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4
...
Relocation section '.rela.eh_frame' at offset 0x3d8 contains 3 entries:
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4

With -fpie:

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

@MaskRay - this change causes a behaviour difference for --wrap.

Here is the --wrap behaviour before this change:

ben@ben-VirtualBox:~/tests/wrap$ more main.c
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() {
	__real_foo();
	puts("---");
	__wrap_foo();
	puts("---");
	foo();
	return 0;
}
ben@ben-VirtualBox:~/tests/wrap$ gcc main.c -Wl,--wrap=foo -ffunction-sections -fuse-ld=lld -o lld.elf -Wno-implicit-function-declaration
ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
__wrap_foo
foo()

… and here is the behaviour after this change:

ben@ben-VirtualBox:~/tests/wrap$ ./lld.elf 
foo()
---
__wrap_foo
foo()
---
foo()

There is no behaviour change for -flto builds so the behaviour for --wrap is now effectively different for LTO vs normal builds.

I think you missed a point in the description of --wrap:

You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed.  If you do this, you
should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call
before the linker has a chance to wrap it to "malloc".

Providing foo definition in the translation unit where they are referenced is not reliable when you are using --wrap.
Actually, this is where GNU ld and LLD differ. See https://sourceware.org/bugzilla/show_bug.cgi?id=26358 and the history of lld/test/ELF/wrap-shlib-undefined.s

If you want to get guaranteed semantics, don't define foo when it is referenced. You may also try gcc and gcc -fPIC -fno-semantic-interposition, the behavior is similar to latest clang.

Thanks for the summary. I am not particularly concerned about which behaviour we have w.r.t. wrapping intra-translation-unit references (although I have seen some evidence that lld's behaviour is useful e.g. https://stackoverflow.com/questions/13961774/gnu-gcc-ld-wrapping-a-call-to-symbol-with-caller-and-callee-defined-in-the-sam). However, you stated in https://sourceware.org/bugzilla/show_bug.cgi?id=26358 that for lld -r, lto, and normal links have the same behaviour - that is not true after this change. Furthermore, with the current clang it is not possible to go back to the old behaviour using -fsemantic-interposition for hidden symbols. IIUC I think that hidden symbols are probably the majority of opensource symbols now as the GNU toolchain encourages the use of -fvisiblity=hidden.

Let me summarize the cases:

  • {clang,gcc} -fuse-ld=lld main.c -Wl,--wrap=foo => wrapped (in LLD, --wrap is done after (global) symbol resolution. Definitions are wrapped as well)
  • {clang,gcc} -fuse-ld=bfd main.c -Wl,--wrap=foo => not wrapped (in GNU ld, --wrap is per object file. --wrap is not effective when the symbol is defined)
  • {clang,gcc} -fuse-ld=lld main.c -fPIC -fno-semantic-interposition -Wl,--wrap=foo => not wrapped (references go through .Lfoo$local which cannot be wrapped)

I think your make an integration between this commit and 872c5fb1432493c0a09b6f210765c0d94ce9b5d0, so for -fno-PIC or -fPIE, you observe the -fPIC -fno-semantic-interposition behavior as well. If you cherry pick 872c5fb1432493c0a09b6f210765c0d94ce9b5d0 and don't use -fno-semantic-interposition, and use LLD, you should get a wrapping behavior. (Clang traditionally has some -fno-semantic-interposition behaviors, so in the future we might be able to make -fno-semantic-interposition default for -fPIC.)

I indeed prefer the LLD behavior, so I filed https://sourceware.org/bugzilla/show_bug.cgi?id=26358 yesterday, but I cannot say the wrapping behavior is promised. If you want better portability, make foo weak (and be aware of side effects with the change).

@MaskRay - Thanks for taking the time to look into this :)

Here are my results from an e912fffd3a8c6c9f6e09d2eac4c1ee3a32800a22 clang using my previous example. I have included a bit more context in the first relocation dump so you can see which relocation.. in the other dumps I strip out more.

With -fpic:

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
Relocation section '.rela.text.main' at offset 0x330 contains 7 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
...
0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4
...
Relocation section '.rela.eh_frame' at offset 0x3d8 contains 3 entries:
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4

With -fpie:

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fno-semantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

$ /c/u/br2/bin/clang main.c -fsemantic-interposition -c -fpie -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o                                           
...
0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

Sorry for replying to my own post - I have realised that I have made a mistake -fno-semantic-interposition is not the default for -fpic so there has only been a change in behaviour for hidden symbols:

$ /c/u/br2/bin/clang main.c -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=hidden -Wno-implicit-function-declaration
bdunbobbin@BENDB-W10-3 MINGW64 /c/temp/interpos
$ /c/u/br2/bin/llvm-readelf -r test.o

0000000000000041  0000000600000004 R_X86_64_PLT32         0000000000000000 .text.foo - 4


$ /c/u/br2/bin/clang main.c -c -fpic -target x86_64-linux-gnu -o test.o -ffunction-sections -fdata-sections -fvisibility=default -Wno-implicit-function-declaration
$ /c/u/br2/bin/llvm-readelf -r test.o

0000000000000041  0000000a00000004 R_X86_64_PLT32         0000000000000000 foo - 4
...

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

I think the summary is correct. -fvisibility=hidden nullifies -fsemantic-interposition when the definition is available in the same translation unit. Given how GCC and GNU ld handle/document it, I'd say the previous hidden clang behavior working with -Wl,--wrap=foo is accidental rather than intentional. If you don't pass explicit -fsemantic-interposition, in -fPIC mode clang can freely inline foo into call sites, which will also defeat the intended -Wl,--wrap=foo behavior.

Any undefined reference to symbol will be resolved to "__wrap_symbol".

I think the reasonably portable approach making the wrapping scheme work is __attribute__((weak)). An alternative is to move the definitions to a separate translation unit (it does not work with -r or GCC LTO, though).

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

I think the summary is correct. -fvisibility=hidden nullifies -fsemantic-interposition when the definition is available in the same translation unit. Given how GCC and GNU ld handle/document it, I'd say the previous hidden clang behavior working with -Wl,--wrap=foo is accidental rather than intentional. If you don't pass explicit -fsemantic-interposition, in -fPIC mode clang can freely inline foo into call sites, which will also defeat the intended -Wl,--wrap=foo behavior.

Any undefined reference to symbol will be resolved to "__wrap_symbol".

I think the reasonably portable approach making the wrapping scheme work is __attribute__((weak)). An alternative is to move the definitions to a separate translation unit (it does not work with -r or GCC LTO, though).

.. but now we have this difference in behaviour for -normal vs flto links:

ben@ben-VirtualBox:~/tests/wrap$ cat smaller.c 
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() { foo(); }
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./old.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./old_lto.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./new.elf
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./new_lto.elf
__wrap_foo
foo()

... and the same will be true for default symbols + -fpic when we enable -fno-semantic-interposition by default for -fpic :(

MaskRay added a comment.EditedAug 10 2020, 12:23 PM

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

I think the summary is correct. -fvisibility=hidden nullifies -fsemantic-interposition when the definition is available in the same translation unit. Given how GCC and GNU ld handle/document it, I'd say the previous hidden clang behavior working with -Wl,--wrap=foo is accidental rather than intentional. If you don't pass explicit -fsemantic-interposition, in -fPIC mode clang can freely inline foo into call sites, which will also defeat the intended -Wl,--wrap=foo behavior.

Any undefined reference to symbol will be resolved to "__wrap_symbol".

I think the reasonably portable approach making the wrapping scheme work is __attribute__((weak)). An alternative is to move the definitions to a separate translation unit (it does not work with -r or GCC LTO, though).

.. but now we have this difference in behaviour for -normal vs flto links:

ben@ben-VirtualBox:~/tests/wrap$ cat smaller.c 
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() { foo(); }
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./old.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./old_lto.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./new.elf
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./new_lto.elf
__wrap_foo
foo()

... and the same will be true for default symbols + -fpic when we enable -fno-semantic-interposition by default for -fpic :(

I think the non-LTO vs Full LTO difference is due to the way we implement --wrap for LTO: D33621.
It adds WeakAny linkage which obtains wrapping behavior.

clang -fuse-ld=lld smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -fvisibility=hidden -flto -Wl,--save-temps

% llvm-dis < a.out.0.0.preopt.bc
...
define weak hidden void @foo() #0 {  # Note the weakany linkage
...

I think the only way making --wrap sufficiently reliable is to communicate --wrap to the LTO code generation. It was considered unworthy at the implementation time https://bugs.llvm.org/show_bug.cgi?id=33145#c7

To make the non-LTO case behave like the LTO case, add __attribute__((weak))

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

I think the summary is correct. -fvisibility=hidden nullifies -fsemantic-interposition when the definition is available in the same translation unit. Given how GCC and GNU ld handle/document it, I'd say the previous hidden clang behavior working with -Wl,--wrap=foo is accidental rather than intentional. If you don't pass explicit -fsemantic-interposition, in -fPIC mode clang can freely inline foo into call sites, which will also defeat the intended -Wl,--wrap=foo behavior.

Any undefined reference to symbol will be resolved to "__wrap_symbol".

I think the reasonably portable approach making the wrapping scheme work is __attribute__((weak)). An alternative is to move the definitions to a separate translation unit (it does not work with -r or GCC LTO, though).

.. but now we have this difference in behaviour for -normal vs flto links:

ben@ben-VirtualBox:~/tests/wrap$ cat smaller.c 
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() { foo(); }
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./old.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./old_lto.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./new.elf
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./new_lto.elf
__wrap_foo
foo()

... and the same will be true for default symbols + -fpic when we enable -fno-semantic-interposition by default for -fpic :(

I think the non-LTO vs Full LTO difference is due to the way we implement --wrap for LTO: D33621.
It adds WeakAny linkage which obtains wrapping behavior.

clang -fuse-ld=lld smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -fvisibility=hidden -flto -Wl,--save-temps

% llvm-dis < a.out.0.0.preopt.bc
...
define weak hidden void @foo() #0 {  # Note the weakany linkage
...

I think the only way making --wrap sufficiently reliable is to communicate --wrap to the LTO code generation. It was considered unworthy at the implementation time https://bugs.llvm.org/show_bug.cgi?id=33145#c7

To make the non-LTO case behave like the LTO case, add __attribute__((weak))

I had come to the same conclusion and I agree with your assessment of what is needed to improve the LTO behavior.

Another way of obtaining consistent behavior would be to adjust the current canBenefitFromLocalAlias():

 bool GlobalValue::canBenefitFromLocalAlias() const {
   // See AsmPrinter::getSymbolPreferLocal().
-  return GlobalObject::isExternalLinkage(getLinkage()) && !isDeclaration() &&
+  return hasDefaultVisibility() && GlobalObject::isExternalLinkage(getLinkage()) && !isDeclaration() &&
          !isa<GlobalIFunc>(this) && !hasComdat();
 }

Doing this would:

  1. Make the code read a bit better (it was not clear to me without going back to the code reviews why this function doesn't consider visibility).
  2. Remove the introduced difference in behavior for --wrap and hidden definitions.
  3. Remove the difference in --wrap behavior between LTO and normal builds.
  4. Improve the size of object files with hidden symbols (and most symbols are hidden now IIUC) by preventing the need for superfluous STT_SECTION symbols.
MaskRay added a comment.EditedAug 10 2020, 5:07 PM

So, clang's behavior has changed so that --wrap no longer wraps symbol definitions for default symbols + -fpic + -fno-semantic-interposition (-fno-semantic-interposition is the default); however, you can restore the old behavior via -fsemantic-interposition. For hidden symbols + -fpic --wrap no longer wraps symbol definitions and there is no way to restore the old definition wrapping behavior.

I think the summary is correct. -fvisibility=hidden nullifies -fsemantic-interposition when the definition is available in the same translation unit. Given how GCC and GNU ld handle/document it, I'd say the previous hidden clang behavior working with -Wl,--wrap=foo is accidental rather than intentional. If you don't pass explicit -fsemantic-interposition, in -fPIC mode clang can freely inline foo into call sites, which will also defeat the intended -Wl,--wrap=foo behavior.

Any undefined reference to symbol will be resolved to "__wrap_symbol".

I think the reasonably portable approach making the wrapping scheme work is __attribute__((weak)). An alternative is to move the definitions to a separate translation unit (it does not work with -r or GCC LTO, though).

.. but now we have this difference in behaviour for -normal vs flto links:

ben@ben-VirtualBox:~/tests/wrap$ cat smaller.c 
void __wrap_foo () {
	puts ("__wrap_foo");
	__real_foo();
}

void foo () { puts("foo()"); }

int main() { foo(); }
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./old.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o old_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./old_lto.elf
__wrap_foo
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new.elf -fvisibility=hidden -fuse-ld=lld
ben@ben-VirtualBox:~/tests/wrap$ ./new.elf
foo()
ben@ben-VirtualBox:~/tests/wrap$ ~/u/build/bin/clang smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -o new_lto.elf -fvisibility=hidden -fuse-ld=lld -flto
ben@ben-VirtualBox:~/tests/wrap$ ./new_lto.elf
__wrap_foo
foo()

... and the same will be true for default symbols + -fpic when we enable -fno-semantic-interposition by default for -fpic :(

I think the non-LTO vs Full LTO difference is due to the way we implement --wrap for LTO: D33621.
It adds WeakAny linkage which obtains wrapping behavior.

clang -fuse-ld=lld smaller.c -Wno-implicit-function-declaration -fpic -ffunction-sections -Wl,--wrap=foo -fvisibility=hidden -flto -Wl,--save-temps

% llvm-dis < a.out.0.0.preopt.bc
...
define weak hidden void @foo() #0 {  # Note the weakany linkage
...

I think the only way making --wrap sufficiently reliable is to communicate --wrap to the LTO code generation. It was considered unworthy at the implementation time https://bugs.llvm.org/show_bug.cgi?id=33145#c7

To make the non-LTO case behave like the LTO case, add __attribute__((weak))

I had come to the same conclusion and I agree with your assessment of what is needed to improve the LTO behavior.

Another way of obtaining consistent behavior would be to adjust the current canBenefitFromLocalAlias():

 bool GlobalValue::canBenefitFromLocalAlias() const {
   // See AsmPrinter::getSymbolPreferLocal().
-  return GlobalObject::isExternalLinkage(getLinkage()) && !isDeclaration() &&
+  return hasDefaultVisibility() && GlobalObject::isExternalLinkage(getLinkage()) && !isDeclaration() &&
          !isa<GlobalIFunc>(this) && !hasComdat();
 }

Doing this would:

  1. Make the code read a bit better (it was not clear to me without going back to the code reviews why this function doesn't consider visibility).
  2. Remove the introduced difference in behavior for --wrap and hidden definitions.
  3. Remove the difference in --wrap behavior between LTO and normal builds.
  4. Improve the size of object files with hidden symbols (and most symbols are hidden now IIUC) by preventing the need for superfluous STT_SECTION symbols.

The proposed canBenefitFromLocalAlias change looks good to me. Mind sending a patch? I think 4 is the important one. I agree that it happens to provide 2 and 3 here but the --wrap behavior still looks unreliable to me. Glad that you find a solution addressing your problems:)

bd1976llvm added a comment.EditedAug 11 2020, 12:29 PM

The proposed canBenefitFromLocalAlias change looks good to me. Mind sending a patch?

Incoming just struggling with the testing a bit.

The proposed canBenefitFromLocalAlias change looks good to me. Mind sending a patch?

I have put up https://reviews.llvm.org/D85782.