Page MenuHomePhabricator

[MC][ELF] Emit a relocation if target is defined in the same section and is non-local
ClosedPublic

Authored by MaskRay on Jan 3 2020, 11:23 PM.

Details

Summary

For a target symbol defined in the same section, currently we don't emit a
relocation (with few exceptions like RISC-V relaxation), while GNU as emits
one. This can break symbol interposition if the assembly is used by a shared
object.

.globl foo
foo:
  call foo   # on various targets, may be b foo, etc

ARM/thumb2-beq-fixup.s: we now emit a relocation to global_thumb_fn as GNU as does.
X86/Inputs/align-branch-64-2.s: we now emit R_X86_64_PLT32 to foo as GNU does.

ELF/relax.s: rewrite the test as target-in-same-section.s .
We omitted relocations to global and now emit R_X86_64_PLT32.
Note, GNU as does not emit a relocation for jmp global (maybe its own
bug). Our new behavior is compatible except jmp global.

Diff Detail

Event Timeline

MaskRay created this revision.Jan 3 2020, 11:23 PM

Unit tests: fail. 61237 tests passed, 1 failed and 729 were skipped.

failed: lld.ELF/global-offset-table-position-aarch64.s

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

MaskRay updated this revision to Diff 236169.Jan 4 2020, 12:08 AM

Fix lld/test/ELF/global-offset-table-position-aarch64.s

Unit tests: pass. 61238 tests passed, 0 failed and 729 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

Also, this code is missing a case for STB_GLOBAL and visibility != STV_DEFAULT. Actually, I have noticed before that this case is missing in various places in MC. As it stands this patch may cause a performance regression on platforms that use -fvisibility=hidden.

I'm not sure of the history of LLVM policy on this. I note that as of today it is possible to get different program behaviour (either performance regressions due to extra PLT entries, or interposing) with -ffunction-sections and without as there will always be a relocation for a branch to another global STT_FUNC symbol with -ffunction-sections as it will be in a different section.

As well as thumb2-beq-fixup.s the Thumb-1 branch and conditional branch relocations will be affected.

        .syntax unified
        .text
        .thumb
        .global foo
foo:
        b.n foo
        beq.n foo

These will now generate, like GNU as:

00000000 <foo>:
   0:   e7fe            b.n     0 <foo>
                        0: R_ARM_THM_JUMP11     foo
   2:   d0fe            beq.n   0 <foo>
                        2: R_ARM_THM_JUMP8      foo

There is an existing test thumb1-branch-reloc.s for these but it only tests external symbols, may be worth adding a global defined in the section.

I don't see the behavior you say gnu as has.

On x86_64-linux-gnu, running as -o /tmp/test.o llvm/test/MC/ELF/target-in-same-section.s && objdump -r /tmp/test.o gives only one relocation record:

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE 
0000000000000009 R_X86_64_PLT32    weak-0x0000000000000004

I have: GNU assembler (GNU Binutils for Debian) 2.31.1

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

I had a talk about this once I believe with @hfinkel (or maybe @dblaikie), and this behavior was intentional. Essentially LLVM may perform IPO between the 2 functions, in which case its no longer appropriate to allow the callee to be preempted at runtime. The back-ends aren't consistent on this behaviour either. For example:

__attribute__((noinline))
int bar(int i) {
    return i;
}


int caller(int i) {
    if (bar(i) == i)
      return 0;

    return 1;
}

Compiling this for PowerPC and AArch64 Linux and -fPIC I get no relocation for the call. Compiling for X86_64 Linux and -fPIC I do get a relocation for the call. So it seems to be pick your poison: Emit a relocation and expose a potential runtime problem, or don't emit a relocation and break ELF interposition semantics. I'm also not sure if the optimizer actually skips all IPO when the functions are in different sections, so we might end up with both broken shared object interposition and potentially unsafe codegen 😦

With 'dso_local' and 'dso_premptable' generated by clang we should have enough information in the IR now to be able to properly implement '-fsemantic-interposition`, if someone was so inclined.

I don't see the behavior you say gnu as has.

Oh..the behavior of gnu as is apparently different for jmp (and conditional variants thereof), versus call. It behaves as you say for call. I wonder if it's a bug that it doesn't do so for jmp.

It does cause some odd inconsistencies:

#include <stddef.h>
#include <string.h>

__attribute__((noinline)) int memcmp2(const void *s1, const void *s2, size_t n) {
  return memcmp(s1, s2, n);
}

int memcmp3(const void *s1, const void *s2, size_t n) {
  return memcmp2(s1, s2, n);
}

int memcmp3_plus1(const void *s1, const void *s2, size_t n) {
  return memcmp2(s1, s2, n) + 1;
}

Running gcc -c -o /tmp/test.o /tmp/test.c -O2 && objdump -dr /tmp/test.o, you can see that a relocation was emitted for the tail-call JMP from memcmp2 to memcmp, and for the CALL from memcmp3_plus1 to memcmp2, but NOT for the tail-call JMP from memcmp3 to memcmp2.

Disassembly of section .text:

0000000000000000 <memcmp2>:
   0: e9 00 00 00 00        jmpq   5 <memcmp2+0x5>
   1: R_X86_64_PLT32 memcmp-0x4
   5: 66 66 2e 0f 1f 84 00  data16 nopw %cs:0x0(%rax,%rax,1)
   c: 00 00 00 00 

0000000000000010 <memcmp3>:
  10: eb ee                 jmp    0 <memcmp2>
  12: 66 66 2e 0f 1f 84 00  data16 nopw %cs:0x0(%rax,%rax,1)
  19: 00 00 00 00 
  1d: 0f 1f 00              nopl   (%rax)

0000000000000020 <memcmp3_plus1>:
  20: 48 83 ec 08           sub    $0x8,%rsp
  24: e8 00 00 00 00        callq  29 <memcmp3_plus1+0x9>
   25: R_X86_64_PLT32 memcmp2-0x4
  29: 48 83 c4 08           add    $0x8,%rsp
  2d: 83 c0 01              add    $0x1,%eax
  30: c3                    retq
MaskRay added a comment.EditedJan 6 2020, 9:52 AM

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

Also, this code is missing a case for STB_GLOBAL and visibility != STV_DEFAULT. Actually, I have noticed before that this case is missing in various places in MC. As it stands this patch may cause a performance regression on platforms that use -fvisibility=hidden.

Visibility is irrelevant here. For a symbol defined in the same section as the instruction, the relocation record will reference a non-SECTION symbol instead of a STT_SECTION after this change. If the symbol is STV_HIDDEN, the linker will think the symbol is non-preemptible. So, no performance regression.

There are many ways to make a symbol non-preemptible: binding (STB_LOCAL), visibility (STV_INTERNAL, STV_PROTECTED or STV_HIDDEN), --dynamic-list, VER_NDX_LOCAL, -Bsymbolic, -Bsymbolic-functions, etc. I have made enough refactorings to lld Symbol::isPreemptible and I am confident with these things :)

GCC r212049 (first included in GCC 5) introduced -fsemantic-interposition. Yes, current Clang/LLVM behavior is like -fno-semantic-interposition. Our behavior is not ideal and @hfinkel had an RFC https://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html . This change should help -fsemantic-interposition if we decide to move towards that goal.

MaskRay added a comment.EditedJan 6 2020, 10:49 AM

I don't see the behavior you say gnu as has.

Oh..the behavior of gnu as is apparently different for jmp (and conditional variants thereof), versus call. It behaves as you say for call. I wonder if it's a bug that it doesn't do so for jmp.

It does cause some odd inconsistencies:

#include <stddef.h>
#include <string.h>

__attribute__((noinline)) int memcmp2(const void *s1, const void *s2, size_t n) {
  return memcmp(s1, s2, n);
}

int memcmp3(const void *s1, const void *s2, size_t n) {
  return memcmp2(s1, s2, n);
}

int memcmp3_plus1(const void *s1, const void *s2, size_t n) {
  return memcmp2(s1, s2, n) + 1;
}

Running gcc -c -o /tmp/test.o /tmp/test.c -O2 && objdump -dr /tmp/test.o, you can see that a relocation was emitted for the tail-call JMP from memcmp2 to memcmp, and for the CALL from memcmp3_plus1 to memcmp2, but NOT for the tail-call JMP from memcmp3 to memcmp2.

Disassembly of section .text:

0000000000000000 <memcmp2>:
   0: e9 00 00 00 00        jmpq   5 <memcmp2+0x5>
   1: R_X86_64_PLT32 memcmp-0x4
   5: 66 66 2e 0f 1f 84 00  data16 nopw %cs:0x0(%rax,%rax,1)
   c: 00 00 00 00 

0000000000000010 <memcmp3>:
  10: eb ee                 jmp    0 <memcmp2>
  12: 66 66 2e 0f 1f 84 00  data16 nopw %cs:0x0(%rax,%rax,1)
  19: 00 00 00 00 
  1d: 0f 1f 00              nopl   (%rax)

0000000000000020 <memcmp3_plus1>:
  20: 48 83 ec 08           sub    $0x8,%rsp
  24: e8 00 00 00 00        callq  29 <memcmp3_plus1+0x9>
   25: R_X86_64_PLT32 memcmp2-0x4
  29: 48 83 c4 08           add    $0x8,%rsp
  2d: 83 c0 01              add    $0x1,%eax
  30: c3                    retq

Yes, GNU as's behavior can be different for STB_GLOBAL call/jmp. I'll add the test case.

I wanted to mention this but got the description "ELF/relax.s: we emitted 3 relocations while GNU as emits 0." wrong. I'll add call test cases and fix the description. (Our handling of call var@plt and call local@plt may also be weird. But since var and local are STB_LOCAL symbols, I'll assume that the compiler should not emit @plt in the first place. Though, ideally, I hope assemblers don't make no-suffix and @plt different.)

MaskRay updated this revision to Diff 236422.Jan 6 2020, 10:54 AM

Add in-section global/local to ARM/thumb1-branch-reloc.s as peter.smith suggested.
Improve ELF/target-in-same-section.s

MaskRay edited the summary of this revision. (Show Details)Jan 6 2020, 10:55 AM
MaskRay added a comment.EditedJan 6 2020, 11:08 AM

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

I had a talk about this once I believe with @hfinkel (or maybe @dblaikie), and this behavior was intentional. Essentially LLVM may perform IPO between the 2 functions, in which case its no longer appropriate to allow the callee to be preempted at runtime. The back-ends aren't consistent on this behaviour either. For example:

__attribute__((noinline))
int bar(int i) {
    return i;
}


int caller(int i) {
    if (bar(i) == i)
      return 0;

    return 1;
}

Compiling this for PowerPC and AArch64 Linux and -fPIC I get no relocation for the call. Compiling for X86_64 Linux and -fPIC I do get a relocation for the call. So it seems to be pick your poison: Emit a relocation and expose a potential runtime problem, or don't emit a relocation and break ELF interposition semantics. I'm also not sure if the optimizer actually skips all IPO when the functions are in different sections, so we might end up with both broken shared object interposition and potentially unsafe codegen 😦

Behavior without this patch.

  • clang -target {aarch64,ppc64le} -fPIC -c => no relocation referencing bar
  • clang -target {aarch64,ppc64le} -fPIC -c -ffunction-sections => a branch relocation referencing bar

Look, we are already inconsistent. With this patch, we will consistently emit a relocation. I thinks this patch makes the situation better.

x86_64 emits R_X86_64_PLT32 before this patch. This is because the MCSymbolRefExpr uses VK_PLT, instead of VK_None as used by aarch64/ppc64 (bl bar)

///////// 
 if (A->getKind() != MCSymbolRefExpr::VK_None || SA.isUndefined()) {
   IsResolved = false;
 } else if (auto *Writer = getWriterPtr()) {
   IsResolved = Writer->isSymbolRefDifferenceFullyResolvedImpl(
       *this, SA, *DF, false, true);
 }

With 'dso_local' and 'dso_premptable' generated by clang we should have enough information in the IR now to be able to properly implement '-fsemantic-interposition`, if someone was so inclined.

(You added dso_local/dso_preemptable in D20217.) I suspect we can simplify some linkage kinds with dso/local/dso_preemptable. -fsemantic-interposition will be nice to have and Clang should still default to -fno-semantic-interposition.

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

I had a talk about this once I believe with @hfinkel (or maybe @dblaikie), and this behavior was intentional. Essentially LLVM may perform IPO between the 2 functions, in which case its no longer appropriate to allow the callee to be preempted at runtime. The back-ends aren't consistent on this behaviour either. For example:

__attribute__((noinline))
int bar(int i) {
    return i;
}


int caller(int i) {
    if (bar(i) == i)
      return 0;

    return 1;
}

Compiling this for PowerPC and AArch64 Linux and -fPIC I get no relocation for the call. Compiling for X86_64 Linux and -fPIC I do get a relocation for the call. So it seems to be pick your poison: Emit a relocation and expose a potential runtime problem, or don't emit a relocation and break ELF interposition semantics. I'm also not sure if the optimizer actually skips all IPO when the functions are in different sections, so we might end up with both broken shared object interposition and potentially unsafe codegen 😦

Behavior without this patch.

  • clang -target {aarch64,ppc64le} -fPIC -c => no relocation referencing bar
  • clang -target {aarch64,ppc64le} -fPIC -c -ffunction-sections => a branch relocation referencing bar

    Look, we are already inconsistent. With this patch, we will consistently emit a relocation. I thinks this patch makes the situation better.

I wouldn't call that inconsistent - it's necessary if the two functions are in separate sections, and isn't (if you aren't supporting symbol interposition) if they aren't, right?

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

Unit tests: pass. 61256 tests passed, 0 failed and 736 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

MaskRay added a comment.EditedJan 6 2020, 11:30 AM

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

@dblaikie I came from a ppc64 use case https://reviews.llvm.org/D71639#1803561 where I could not construct a good test due to the MC issue this patch intends to fix. The advantages include at least the following:

  • Improve compatibility with GNU as. This improves portability of hand-written assembly.
  • Resolve a program behavior difference (-ffunction-sections vs w/o). See @peter.smith's comment above. The new behavior is consistent with the -ffunction-sections case before.
  • Make MC simpler (it deleted isWeak)

LLVM's CMake build system uses -ffunction-sections on non-Windows-non-Darwin platforms (e.g. Linux) if CMAKE_BUILD_TYPE!=Debug. Such configurations have been thoroughly tested by various build bots and downstream users.

# cmake/modules/HandleLLVMOptions.cmake
if(NOT CYGWIN AND NOT WIN32)
  if(NOT ${CMAKE_SYSTEM_NAME} MATCHES "Darwin" AND
     NOT uppercase_CMAKE_BUILD_TYPE STREQUAL "DEBUG")
    check_c_compiler_flag("-Werror -fno-function-sections" C_SUPPORTS_FNO_FUNCTION_SECTIONS)
    if (C_SUPPORTS_FNO_FUNCTION_SECTIONS)
      # Don't add -ffunction-section if it can be disabled with -fno-function-sections.
      # Doing so will break sanitizers.
      add_flag_if_supported("-ffunction-sections" FFUNCTION_SECTIONS)
    endif()
    add_flag_if_supported("-fdata-sections" FDATA_SECTIONS)
  endif()
endif()

Lack of -fsemantic-interposition is a larger issue that we only have a partial solution currently. Many aspects of that issue is irrelevant to this patch but I believe this patch fixes the assembler side issue any efforts on that feature will face.

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

IMO it's useful to differentiate the high-level desires, vs the low-level implementation details. Whatever clang/LLVM-IR wants its policy to be for semantic interposition, that choice needs to be represented somehow in assembly.

And it seems a reasonable choice (especially given gnu as compatibility) that on ELF platforms, referencing a global symbol in assembly should always use a relocation, as this patch implements. If we want to have the other behavior in some case, we can emit a local symbol alias for a function, and call via that, instead.

I hope the jmp difference doesn't cause an issue -- I think it makes more sense as you have, but that kind of thing always worries me. :)

MaskRay added a comment.EditedJan 6 2020, 12:22 PM

I hope the jmp difference doesn't cause an issue -- I think it makes more sense as you have, but that kind of thing always worries me. :)

I agree that GNU as should treat jmp global like jmp weak. Some projects noticed problems w/o -ffunction-sections in gcc/binutils and enforced -ffunction-sections to (hopefully) eliminate the issues https://git.musl-libc.org/cgit/musl/commit/?id=27c1eccf33ce5cb7508ef5e541daa9b6441b4a51

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

@dblaikie I came from a ppc64 use case https://reviews.llvm.org/D71639#1803561 where I could not construct a good test due to the MC issue this patch intends to fix. The advantages include at least the following:

  • Improve compatibility with GNU as. This improves portability of hand-written assembly.

That's fair - though I'd still wonder if this is a consistently applied idea, that llvm's assembler does support symbol interposition, or is this one of many ways in which it assumes that is not going to happen? (in which case fixing one bug without a plan to test & fix the full surface area seems like not a good idea to me)

  • Resolve a program behavior difference (-ffunction-sections vs w/o). See @peter.smith's comment above. The new behavior is consistent with the -ffunction-sections case before.

This seems like a red herring/not relevant & more reflective of LLVM's more general lack of support for function interposition - if functions can't be interposed then it makes sense not to use a relocation between two functions in the same section, I think?

  • Make MC simpler (it deleted isWeak)

I'm not too concerned about the code complexity (I haven't looked at the implementation of the patch at all) - only wondering about the semantic choices.

Lack of -fsemantic-interposition is a larger issue that we only have a partial solution currently. Many aspects of that issue is irrelevant to this patch but I believe this patch fixes the assembler side issue any efforts on that feature will face.

I'm concerned about partial solutions that might make someone start depending on this in ways that aren't actually supported.

But if this is a full solution to the issue of hand-written assembly code, that seems like a coherent specific supported/supportable feature without needing to address the broader issue.

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

IMO it's useful to differentiate the high-level desires, vs the low-level implementation details. Whatever clang/LLVM-IR wants its policy to be for semantic interposition, that choice needs to be represented somehow in assembly.

And it seems a reasonable choice (especially given gnu as compatibility) that on ELF platforms, referencing a global symbol in assembly should always use a relocation, as this patch implements. If we want to have the other behavior in some case, we can emit a local symbol alias for a function, and call via that, instead.

I hope the jmp difference doesn't cause an issue -- I think it makes more sense as you have, but that kind of thing always worries me. :)

Yeah, fair enough - if no one knows of other major outstanding bugs/unimplemented functionality needed to support symbol interposition at the assembly level, I'm good with this patch/direction. [if there are other major outstanding things - then I'd worry about this bug being fixed in isolation & letting some project limp along expecting full support where it doesn't actually exist/where there are traps]

I'm bowing out for now then - thanks for explaining & sorry for the noise.

MaskRay added a comment.EditedJan 6 2020, 4:16 PM

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

@dblaikie I came from a ppc64 use case https://reviews.llvm.org/D71639#1803561 where I could not construct a good test due to the MC issue this patch intends to fix. The advantages include at least the following:

  • Improve compatibility with GNU as. This improves portability of hand-written assembly.

That's fair - though I'd still wonder if this is a consistently applied idea, that llvm's assembler does support symbol interposition, or is this one of many ways in which it assumes that is not going to happen? (in which case fixing one bug without a plan to test & fix the full surface area seems like not a good idea to me)

MC supports symbol interposition, but the status is not ideal. It requires a non-VK_None MCSymbolRefExpr::VariantKind, or the target symbol is not defined in the same section. Both codegen and MC need to be fixed, and the efforts are independent. This patch fixes one of the MC side issues and is safe on its own.

  • Resolve a program behavior difference (-ffunction-sections vs w/o). See @peter.smith's comment above. The new behavior is consistent with the -ffunction-sections case before.

This seems like a red herring/not relevant & more reflective of LLVM's more general lack of support for function interposition - if functions can't be interposed then it makes sense not to use a relocation between two functions in the same section, I think?

As @jyknight suggested above, LLVM should create a STB_LOCAL symbol alias and call via that symbol. Otherwise you will see program behavior differences between clang -no-integrated-as -fno-function-sections and clang -integrated-as -fno-function-sections.

  • Make MC simpler (it deleted isWeak)

I'm not too concerned about the code complexity (I haven't looked at the implementation of the patch at all) - only wondering about the semantic choices.

I believe I have picked the ideal semantic choice. I'll also report a GNU as bug.

Lack of -fsemantic-interposition is a larger issue that we only have a partial solution currently. Many aspects of that issue is irrelevant to this patch but I believe this patch fixes the assembler side issue any efforts on that feature will face.

I'm concerned about partial solutions that might make someone start depending on this in ways that aren't actually supported.

But if this is a full solution to the issue of hand-written assembly code, that seems like a coherent specific supported/supportable feature without needing to address the broader issue.

There are several issues. For example, clang does not support -fsemantic-interposition. The various global variable linkage types and dso_local/dso_preemptable overlap in functionality. LinkOnceAny/LinkOnceODR should be folded. WeakAny/WeakODR should be folded. I think the discussion has diverged too much from the main topic.

I think the question I have: What is the value (I don't mean to assume there is no value, but I want to understand the specific use case) of supporting this if the full generality of symbol interposition is not supported? (is there a well defined use case that is usable with this patch and doesn't need the full generality? Or is the use case going to be subtly broken if it changes even slighlty and happens to trip over another part of LLVM that doesn't support symbol interposition? (in which case this is a great patch, but only as part of an effort to fully support the general case))

IMO it's useful to differentiate the high-level desires, vs the low-level implementation details. Whatever clang/LLVM-IR wants its policy to be for semantic interposition, that choice needs to be represented somehow in assembly.

And it seems a reasonable choice (especially given gnu as compatibility) that on ELF platforms, referencing a global symbol in assembly should always use a relocation, as this patch implements. If we want to have the other behavior in some case, we can emit a local symbol alias for a function, and call via that, instead.

I hope the jmp difference doesn't cause an issue -- I think it makes more sense as you have, but that kind of thing always worries me. :)

Yeah, fair enough - if no one knows of other major outstanding bugs/unimplemented functionality needed to support symbol interposition at the assembly level, I'm good with this patch/direction. [if there are other major outstanding things - then I'd worry about this bug being fixed in isolation & letting some project limp along expecting full support where it doesn't actually exist/where there are traps]

Sounds like you worried that this is a "partial" solution. OK, it is not. There is no prerequisite or follow-up functionality patch that needs to be done.

If you place it into the larger picture of semantic interposition fixes/linkage type cleanups in LLVM, in some sense you may argue that it is partial:) It does interfere with other fixes or gets in the away. (I'd like to investigate how far away we are from making -fsemantic-interposition work.)

I don't know of other major outstanding MC level bugs blocking symbol interposition. Maybe others know. (I am still learning MC but have made some refactoring/improvement through the process.)

I'm bowing out for now then - thanks for explaining & sorry for the noise.

Still thanks to you for asking these questions:) Hope my explanation clarifies things. These questions can be check lists when I want to do some larger refactoring of something.

Ping :)

The discussions are long, but most are not directly related.

If I could summarise the comments so far:

  • There is some concern that this is not the full solution and there maybe other things that can break interposing.
  • The response is that fixing the assembler will be a necessary part of that work.
  • There maybe some extra via-PLT calls in shared libraries if -ffunction-sections is not used, but these can be mitigated by giving the symbols non-default visibility, and in any case these would exist with -ffunction-sections.

I'm personally in favour of the change, I'd like to see if there any firm objections from others?

Thanks for the summary:)

Any firm objections?

I think the change is correct, and desirable for the reasons you mentioned previously: we get consistent semantics whether function sections are used or not, and consistent behavior whether we use the integrated assembler or not. There is a chance this patch affects some users negatively despite being correct. Ideally we would commit to at least supporting part of the semantic-interposition option (so the user can opt in to disabling IPO when the callee is runtime replaceable) in the same release we land this fix. If I had the bandwidth to do this I would offer but I don't :(. Because of that user impact I don't feel confident enough to approve the patch myself, but I have no strong objection against it landing.

MaskRay added a comment.EditedJan 9 2020, 2:50 PM

I think I've got weak approvals from @jyknight, @peter.smith and @sfertile .

I built clang+lld at f937b43fdb30b67facf616ad394976b08001ee89 and this patch (b352577b4967e2cb5bffce5aec55e5fbdc291bb1). Use that to build clang at f937b43fdb30b67facf616ad394976b08001ee89.

I then compare the executables and shared objects

for i in Play/bin/clang-10 Play/lib/lib*.so.10git; do cmp -l $i /tmp/$i | wc -l; done. Everything except .comment is byte identical. .comment are different because the two clangs have different VCS information.

% readelf -p.comment Play/bin/clang-10 /tmp/Play/bin/clang-10

File: Play/bin/clang-10

String dump of section '.comment':
  [     0]  GCC: (Debian 8.3.0-6) 8.3.0
  [    1c]  clang version 10.0.0 (git@github.com:MaskRay/llvm-project.git f937b43fdb30b67facf616ad394976b08001ee89)
  [    84]  Linker: LLD 10.0.0 (git@github.com:MaskRay/llvm-project.git f937b43fdb30b67facf616ad394976b08001ee89)


File: /tmp/Play/bin/clang-10

String dump of section '.comment':
  [     0]  GCC: (Debian 8.3.0-6) 8.3.0
  [    1c]  clang version 10.0.0 (git@github.com:MaskRay/llvm-project.git b352577b4967e2cb5bffce5aec55e5fbdc291bb1)
  [    84]  Linker: LLD 10.0.0 (git@github.com:MaskRay/llvm-project.git b352577b4967e2cb5bffce5aec55e5fbdc291bb1)

Does the result look good enough?

There is a chance this patch affects some users negatively despite being correct.

I don't think this is more dangerous than some other MC or linker changes.

peter.smith accepted this revision.Jan 10 2020, 2:06 AM

I'm personally strongly enough in favour to approve given no strong objections so far. Will be worth holding off to next week to see if the approval provokes any.

This revision is now accepted and ready to land.Jan 10 2020, 2:06 AM
This revision was automatically updated to reflect the committed changes.

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

Also, this code is missing a case for STB_GLOBAL and visibility != STV_DEFAULT. Actually, I have noticed before that this case is missing in various places in MC. As it stands this patch may cause a performance regression on platforms that use -fvisibility=hidden.

Visibility is irrelevant here. For a symbol defined in the same section as the instruction, the relocation record will reference a non-SECTION symbol instead of a STT_SECTION after this change. If the symbol is STV_HIDDEN, the linker will think the symbol is non-preemptible. So, no performance regression.

For STB_GLOBAL and visibility != STV_DEFAULT symbols I don't think we need to emit a relocation record for pc-relative references within a section, no? That's what I thought the MC code did for STB_GLOBAL symbols. Unless I am mistaken, it seems that for projects that are not using -ffunction/data-sections we may be emitting many unnecessary relocation records with this patch? That is what I meant by a performance regression. However, I think that you trying to explain that this optimization isn't done and, rather than no relocation, we currently emit a relocation to the STT_SECTION symbol in the STB_GLOBAL case? If so, LGTM as long as there is a code comment added explaining why STB_LOCAL is the only case in which we consider such relocations resolvable.

There are many ways to make a symbol non-preemptible: binding (STB_LOCAL), visibility (STV_INTERNAL, STV_PROTECTED or STV_HIDDEN), --dynamic-list, VER_NDX_LOCAL, -Bsymbolic, -Bsymbolic-functions, etc. I have made enough refactorings to lld Symbol::isPreemptible and I am confident with these things :)

GCC r212049 (first included in GCC 5) introduced -fsemantic-interposition. Yes, current Clang/LLVM behavior is like -fno-semantic-interposition. Our behavior is not ideal and @hfinkel had an RFC https://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html . This change should help -fsemantic-interposition if we decide to move towards that goal.

Thanks for clarifying the situation. I would love to see your suggested refactorings for dso_local/dso_preemptable, LinkOnceAny/LinkOnceODR, and WeakAny/WeakODR implemented. I strongly agree with these even if we are not going to support -fsemantic-interposition.

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

Also, this code is missing a case for STB_GLOBAL and visibility != STV_DEFAULT. Actually, I have noticed before that this case is missing in various places in MC. As it stands this patch may cause a performance regression on platforms that use -fvisibility=hidden.

Visibility is irrelevant here. For a symbol defined in the same section as the instruction, the relocation record will reference a non-SECTION symbol instead of a STT_SECTION after this change. If the symbol is STV_HIDDEN, the linker will think the symbol is non-preemptible. So, no performance regression.

For STB_GLOBAL and visibility != STV_DEFAULT symbols I don't think we need to emit a relocation record for pc-relative references within a section, no?

If this symbols is not STT_GNU_IFUNC, MC can choose whether a relocation referencing a STB_GLOBAL and visibility != STV_DEFAULT symbol should be resolved at assembly time. The symbol will be considered non-preemptible and the linked output will be the same. For non--fvisibility=hidden-non--fvisibility-inlines-hidden users, there aren't going to have many non-STV_DEFAULT symbols. I think this optimization will not be valuable.

Unless I am mistaken, it seems that for projects that are not using -ffunction/data-sections we may be emitting many unnecessary relocation records with this patch? That is what I meant by a performance regression. However, I think that you trying to explain that this optimization isn't done and, rather than no relocation, we currently emit a relocation to the STT_SECTION symbol in the STB_GLOBAL case? If so, LGTM as long as there is a code comment added explaining why STB_LOCAL is the only case in which we consider such relocations resolvable.

The old code did not emit relocations referencing STT_SECTION symbols.
Those projects not using -ffunction-sections will see an increase of relocations referencing in-section non-STB_LOCAL functions.

We should make -fno-semantic-interposition behave more like -fno-semantic-interposition and do it at the correct layer (clang codegen). I'll investigate how to make clang emit STB_LOCAL .L (PrivateGlobalPrefix) aliases and jump through the aliases for the non-STB_LOCAL case. Relocations referencing in-section STT_SECTION symbols will be resolved at assembly time.

pcc added a subscriber: pcc.Jan 16 2020, 1:23 PM

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.
% arm-linux-gnueabi-as /tmp/c/a.s -o /tmp/c/a.o
/tmp/c/a.s: Assembler messages:
/tmp/c/a.s:6: Error: invalid Hi register with immediate
/tmp/c/a.s:6: Error: invalid immediate for address calculation (value = 0xFFFFFFFFFFFFFFFE)

How can I make GNU as recognize it?

pcc added a comment.Jan 16 2020, 2:05 PM
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.
% arm-linux-gnueabi-as /tmp/c/a.s -o /tmp/c/a.o
/tmp/c/a.s: Assembler messages:
/tmp/c/a.s:6: Error: invalid Hi register with immediate
/tmp/c/a.s:6: Error: invalid immediate for address calculation (value = 0xFFFFFFFFFFFFFFFE)

How can I make GNU as recognize it?

It looks like you need to add .syntax unified to the top.

MaskRay added a comment.EditedJan 16 2020, 2:49 PM
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.

foo is a hidden definition in the current object file. Can the ART code be changed to use a local symbol (adr lr, .Lfoo) instead?

GNU as resolves R_ARM_THM_ALU_PREL_11_0 at assembly time, regardless of the visibility. It also rejects an undefined symbol. This looks likes a relocation-specific behavior.
I think a local symbol can make the intention more explicit, just don't know whether this is a broader issue.

(ELFObjectWriter::isSymbolRefDifferenceFullyResolvedImpl is in an anonymous namespace, so is not convenient to be overridden in an ARM specific file.)

pcc added a comment.Jan 16 2020, 3:28 PM
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.

foo is a hidden definition in the current object file. Can the ART code be changed to use a local symbol (adr lr, .Lfoo) instead?

It could, but I don't really see a good argument for local symbols having different behaviour to strong hidden symbols.

GNU as resolves R_ARM_THM_ALU_PREL_11_0 at assembly time, regardless of the visibility. It also rejects an undefined symbol. This looks likes a relocation-specific behavior.

I think it's more like it doesn't know about the relocation, so it tries to resolve it locally.

I wouldn't put too much stock into GNU as's behaviour here, there seem to be bugs. For example, at least the version I'm using will accept this:

.thumb
.thumb_func
.syntax unified
.globl foo
.hidden foo
foo:
adr lr, bar

.data
bar:

and won't emit a relocation for the adr instruction at all.

I think a local symbol can make the intention more explicit, just don't know whether this is a broader issue.

I'm not sure about that, to me it would seem more like a workaround for an assembler bug.

In D72197#1825390, @pcc wrote:
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.

foo is a hidden definition in the current object file. Can the ART code be changed to use a local symbol (adr lr, .Lfoo) instead?

It could, but I don't really see a good argument for local symbols having different behaviour to strong hidden symbols.

Not special casing hidden in the general case has the nice property that: the assembler behavior is not affected by the visibility. Only the binding can.

We can thus postpone decisions to the linker, and all the logic can be implemented on the linker side. Both a hidden defined symbol and a local symbol is considered non-preemptive, thus they have the same behavior at the linker stage.

The downside is that we increase object file sizes. Hidden symbols are not that common so the downside may be acceptable.

GNU as resolves R_ARM_THM_ALU_PREL_11_0 at assembly time, regardless of the visibility. It also rejects an undefined symbol. This looks likes a relocation-specific behavior.

I think it's more like it doesn't know about the relocation, so it tries to resolve it locally.

I wouldn't put too much stock into GNU as's behaviour here, there seem to be bugs. For example, at least the version I'm using will accept this:

.thumb
.thumb_func
.syntax unified
.globl foo
.hidden foo
foo:
adr lr, bar

.data
bar:

and won't emit a relocation for the adr instruction at all.

I think a local symbol can make the intention more explicit, just don't know whether this is a broader issue.

I'm not sure about that, to me it would seem more like a workaround for an assembler bug.

I'll need to make more analysis.

In D72197#1825390, @pcc wrote:
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.

foo is a hidden definition in the current object file. Can the ART code be changed to use a local symbol (adr lr, .Lfoo) instead?

It could, but I don't really see a good argument for local symbols having different behaviour to strong hidden symbols.

Not special casing hidden in the general case has the nice property that: the assembler behavior is not affected by the visibility. Only the binding can.

We can thus postpone decisions to the linker, and all the logic can be implemented on the linker side. Both a hidden defined symbol and a local symbol is considered non-preemptive, thus they have the same behavior at the linker stage.

The downside is that we increase object file sizes. Hidden symbols are not that common so the downside may be acceptable.

Please add a code comment with this rationale.

GNU as resolves R_ARM_THM_ALU_PREL_11_0 at assembly time, regardless of the visibility. It also rejects an undefined symbol. This looks likes a relocation-specific behavior.

I think it's more like it doesn't know about the relocation, so it tries to resolve it locally.

I wouldn't put too much stock into GNU as's behaviour here, there seem to be bugs. For example, at least the version I'm using will accept this:

.thumb
.thumb_func
.syntax unified
.globl foo
.hidden foo
foo:
adr lr, bar

.data
bar:

and won't emit a relocation for the adr instruction at all.

I think a local symbol can make the intention more explicit, just don't know whether this is a broader issue.

I'm not sure about that, to me it would seem more like a workaround for an assembler bug.

I was feeling ok about the design decision to always emit relocations for references to STB_GLOBAL symbols. However, an alternative approach might be to base the behaviour on a switch? I might suggest something like --emit-relocations-for-references=<always,always-for-globals,only-when-required>? We used to have a downstream patch-set to force the emission of relocations for references to enable linker optimizations so I think that this is useful behaviour.

I'll need to make more analysis.

It looks to me like LLD is simply missing support for that relocation?

In D72197#1825390, @pcc wrote:
In D72197#1825012, @pcc wrote:

It looks like this change caused us to start rejecting:

.thumb
.thumb_func
.globl foo
.hidden foo
foo:
adr lr, foo + 1

with:

test.s:6:1: error: unsupported relocation on symbol
adr lr, foo + 1
^

This is exposed by this Android ART code: https://cs.android.com/android/platform/superproject/+/master:art/runtime/arch/arm/quick_entrypoints_arm.S;l=1826

I see a couple of possible fixes for this:

  1. We could go back to the previous behaviour for global hidden symbols.
  2. We could teach MC and LLD about the R_ARM_THM_ALU_PREL_11_0 relocation required to relocate this instruction. Arguably the ART code shouldn't be adding the 1 for Thumb here (because R_ARM_THM_ALU_PREL_11_0 adds the 1 itself), so ART would then need to be fixed.

foo is a hidden definition in the current object file. Can the ART code be changed to use a local symbol (adr lr, .Lfoo) instead?

It could, but I don't really see a good argument for local symbols having different behaviour to strong hidden symbols.

Not special casing hidden in the general case has the nice property that: the assembler behavior is not affected by the visibility. Only the binding can.

We can thus postpone decisions to the linker, and all the logic can be implemented on the linker side. Both a hidden defined symbol and a local symbol is considered non-preemptive, thus they have the same behavior at the linker stage.

The downside is that we increase object file sizes. Hidden symbols are not that common so the downside may be acceptable.

Please add a code comment with this rationale.

We seem to be consistent with GNU as on most targets. ARM/Thumb2 ADR seems a bit special. I can send a separate patch to you.

GNU as resolves R_ARM_THM_ALU_PREL_11_0 at assembly time, regardless of the visibility. It also rejects an undefined symbol. This looks likes a relocation-specific behavior.

I think it's more like it doesn't know about the relocation, so it tries to resolve it locally.

I wouldn't put too much stock into GNU as's behaviour here, there seem to be bugs. For example, at least the version I'm using will accept this:

.thumb
.thumb_func
.syntax unified
.globl foo
.hidden foo
foo:
adr lr, bar

.data
bar:

and won't emit a relocation for the adr instruction at all.

I think a local symbol can make the intention more explicit, just don't know whether this is a broader issue.

I'm not sure about that, to me it would seem more like a workaround for an assembler bug.

I was feeling ok about the design decision to always emit relocations for references to STB_GLOBAL symbols. However, an alternative approach might be to base the behaviour on a switch? I might suggest something like --emit-relocations-for-references=<always,always-for-globals,only-when-required>? We used to have a downstream patch-set to force the emission of relocations for references to enable linker optimizations so I think that this is useful behaviour.

Do you mind sharing a bit more information how enabling relocations everywhere can help linker optimizations?

I'll need to make more analysis.

It looks to me like LLD is simply missing support for that relocation?

The assembler does not produce R_ARM_THM_ALU_PREL_11_0. I created D72892 for @pcc's Thumb issue.

xiangzhangllvm added a subscriber: xiangzhangllvm.EditedSat, Feb 8, 11:15 PM

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

Also, this code is missing a case for STB_GLOBAL and visibility != STV_DEFAULT. Actually, I have noticed before that this case is missing in various places in MC. As it stands this patch may cause a performance regression on platforms that use -fvisibility=hidden.

Visibility is irrelevant here. For a symbol defined in the same section as the instruction, the relocation record will reference a non-SECTION symbol instead of a STT_SECTION after this change. If the symbol is STV_HIDDEN, the linker will think the symbol is non-preemptible. So, no performance regression.

There are many ways to make a symbol non-preemptible: binding (STB_LOCAL), visibility (STV_INTERNAL, STV_PROTECTED or STV_HIDDEN), --dynamic-list, VER_NDX_LOCAL, -Bsymbolic, -Bsymbolic-functions, etc. I have made enough refactorings to lld Symbol::isPreemptible and I am confident with these things :)

GCC r212049 (first included in GCC 5) introduced -fsemantic-interposition. Yes, current Clang/LLVM behavior is like -fno-semantic-interposition. Our behavior is not ideal and @hfinkel had an RFC https://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html . This change should help -fsemantic-interposition if we decide to move towards that goal.

Hi Fangrui, for "protected" visibility, that references within the defining module (mostly same section) will bind to the local symbol. So I think here should except "protected":

-   if (SymA.getBinding() != ELF::STB_LOCAL
+   if (SymA.getBinding() != ELF::STB_LOCAL  && (SymA.getBinding() != ELF::STB_GLOBAL && SymA.getVisibility() != ELF::STV_PROTECTED)

Now Clang also get a bug about the following test :

$cat t.c
__attribute__((visibility("protected"))) void * foo (void) { return (void *)foo; }
$clang -c  -O0 -g  -fpic t.c
$clang -o t.so  -O0 -g  -fpic t.o  -shared
/usr/bin/ld: t.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
/usr/bin/ld: final link failed: Bad value
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
$/usr/bin/ld -v
GNU ld version 2.30-49.el8
MaskRay added a comment.EditedSun, Feb 9, 8:59 PM

I wonder if is this really a bug or intentional. Clang and LLVM seem to ignore the possibility of symbol interposition (see: -fno-semantic-interposition). What is the LLVM policy for this?

Also, this code is missing a case for STB_GLOBAL and visibility != STV_DEFAULT. Actually, I have noticed before that this case is missing in various places in MC. As it stands this patch may cause a performance regression on platforms that use -fvisibility=hidden.

Visibility is irrelevant here. For a symbol defined in the same section as the instruction, the relocation record will reference a non-SECTION symbol instead of a STT_SECTION after this change. If the symbol is STV_HIDDEN, the linker will think the symbol is non-preemptible. So, no performance regression.

There are many ways to make a symbol non-preemptible: binding (STB_LOCAL), visibility (STV_INTERNAL, STV_PROTECTED or STV_HIDDEN), --dynamic-list, VER_NDX_LOCAL, -Bsymbolic, -Bsymbolic-functions, etc. I have made enough refactorings to lld Symbol::isPreemptible and I am confident with these things :)

GCC r212049 (first included in GCC 5) introduced -fsemantic-interposition. Yes, current Clang/LLVM behavior is like -fno-semantic-interposition. Our behavior is not ideal and @hfinkel had an RFC https://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html . This change should help -fsemantic-interposition if we decide to move towards that goal.

Hi Fangrui, for "protected" visibility, that references within the defining module (mostly same section) will bind to the local symbol. So I think here should except "protected":

-   if (SymA.getBinding() != ELF::STB_LOCAL
+   if (SymA.getBinding() != ELF::STB_LOCAL  && (SymA.getBinding() != ELF::STB_GLOBAL && SymA.getVisibility() != ELF::STV_PROTECTED)

Now Clang also get a bug about the following test :

$cat t.c
__attribute__((visibility("protected"))) void * foo (void) { return (void *)foo; }
$clang -c  -O0 -g  -fpic t.c
$clang -o t.so  -O0 -g  -fpic t.o  -shared
/usr/bin/ld: t.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
/usr/bin/ld: final link failed: Bad value
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
$/usr/bin/ld -v
GNU ld version 2.30-49.el8

@xiangzhangllvm This is not related to this patch, though I'd like to make some explanations here.

gcc -fpic a.c -shared -fuse-ld=bfd # relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object

binutils 2.26 introduced a regression: R_X86_64_PC32 can no longer be used against a protected symbol https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=ca3fe95e469b9daec153caa2c90665f5daaec2b5

The original issue is "Copy relocation against protected symbol doesn't work".
I agree with Rich Felker (https://gcc.gnu.org/ml/gcc/2016-04/msg00168.html) and
Cary Coutant (https://sourceware.org/ml/binutils/2016-03/msg00407.html https://gcc.gnu.org/ml/gcc/2016-04/msg00158.html https://gcc.gnu.org/ml/gcc/2016-04/msg00169.html) that we should
keep using direct access against protected symbols and disallow copy relocations against protected symbols.

I appreciate that Cary Coutant and Rafael Ávila de Espíndola added diagnostics to gold and lld, respectively:

@xiangzhangllvm Perhaps you are in a good position to change the resolutions to the following issues:)

GCC 5 x86-64 introduced a regression (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248)
i386 was flagged as a reproduce (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55012)

__attribute__((visibility("protected"))) int a;
int foo() { return a; } // GCC>=5 uses R_X86_64_GOTPCREL/R_X86_64_REX_GOTPCRELX instead of R_X86_64_PC32

binutils 2.26 introduced a regression R_X86_64_PC32 can no longer be used against a protected symbol https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=ca3fe95e469b9daec153caa2c90665f5daaec2b5

Thanks for so detailed information, I am happy to push this thing go ahead.
But except the failed case, I feel let "protected symbol refer in same section" to directly calculate the symbol Vaule is better.
Does it will affected "symbol interposition"? I just know IFUN will use "symbol interposition".

MaskRay added a comment.EditedMon, Feb 10, 11:29 PM

Thanks for so detailed information, I am happy to push this thing go ahead.
But except the failed case, I feel let "protected symbol refer in same section" to directly calculate the symbol Vaule is better.
Does it will affected "symbol interposition"? I just know IFUN will use "symbol interposition".

It is just not very necessary. Protected symbols are rare. Such optimizations are not needed on the assembler side because the linker will optimize anyway (except GNU ld x86, but I don't think we need to work around such kind of regressions.)
These are serious GCC+BFD regressions that cannot be easily worked around in llvm-mc and I prefer not to work around them.

// On clang's side, this triggers GNU ld>=2.26 error after this patch
__attribute__((visibility("protected"))) int a;
int foo(void) { return a; }

// On GCC's side, GNU ld>=2.26 will error on x86-64.
__attribute__((visibility("protected"))) void *bar(void) { return bar; }

We probably don't need to work around GNU ld>=2.26 regression in MC. We can move forward with D73865

Protected symbols are rare.

I'm not taking a stance on the general topic under discussion here, but this statement is not necessarily true for all systems. For example, a system might use protected symbols for exports in general, and not use default visibility, to avoid symbol interposition issues.

Just noticed that PR44929 https://bugs.llvm.org/show_bug.cgi?id=44929 had been traced to this change. I've put a suggestion on how we might be able to fix it in the PR, if you have any thoughts please comment there.