Page MenuHomePhabricator

[libunwind] LIBUNWIND_HERMETIC_STATIC_LIBRARY fixes
Needs RevisionPublic

Authored by rprichard on Dec 9 2020, 11:14 PM.

Details

Reviewers
phosek
mstorsjo
ldionne
compnerd
Group Reviewers
Restricted Project
Summary

ELF: The unw_getcontext symbol is an alias for __unw_getcontext, a
function written in assembly. Ensure that it is hidden when
LIBUNWIND_HERMETIC_STATIC_LIBRARY is enabled, like the unw_* aliases
for functions written in C++.

Mach-O: For aliases to C++ functions, the weak_import declaration
doesn't seem to have an effect if the unw_* function isn't used in the
TU. If it is used, a .weak_reference is generated, but the assembler
seems to ignore that. assembly.h also outputs .weak_reference, but
frequently with the wrong symbol name (the __USER_LABEL_PREFIX__ is
missing). Make the aliases non-weak instead, like on Windows.

Rename the _LIBUNWIND_DISABLE_VISIBILITY_ANNOTATIONS C define to
_LIBUNWIND_HERMETIC_STATIC_LIBRARY, because a hermetic library needs to
add .hidden/.private_extern for the alias symbols, not just disable the
visibility annotations.

This change also adds a few SYMBOL_NAME calls that were missing from
the ELF code path in assembly.h.

Fixes PR46709.

Diff Detail

Event Timeline

rprichard created this revision.Dec 9 2020, 11:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2020, 11:14 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
rprichard requested review of this revision.Dec 9 2020, 11:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 9 2020, 11:14 PM

Fix LIBUNWIND_HERMETIC_STATIC_LIBRARY on macOS.

rprichard retitled this revision from [libunwind] Fix ELF/Mach-O visibility for unw_getcontext to [libunwind] LIBUNWIND_HERMETIC_STATIC_LIBRARY fixes.Dec 10 2020, 12:38 AM
rprichard edited the summary of this revision. (Show Details)

I assume the LIBUNWIND_HERMETIC_STATIC_LIBRARY setting isn't used on Darwin -- it leaves unw_* symbols as Extern and not PrivateExtern. Maybe the CMake file could reject that flag on Darwin instead of trying to make it work.

AFAICT, the existing unw_* aliases aren't weak on macOS. For aliases to C++ functions, the __attribute__((weak_import)) didn't seem to do anything when I tested it. I think the declared unw_* functions aren't referenced in the TU that declares them, so the compiler doesn't do anything with the declaration. When I experimented, if I *did* reference the function, then the compiler outputted a .weak_reference directive. However, AFAIK, the assembler ignores the .weak_reference directive and uses the weakdef/weakref flags from the target of the alias instead (i.e. the __unw_* function).

For unw_getcontext specifically, assembly.h's WEAK_SYMBOL(aliasname) SEPARATOR omitted the SYMBOL_NAME macro on Darwin, so I think the .weak_reference would have referred to the wrong symbol. For arm64 iphoneos, I saw this preprocessed output:

.globl _unw_getcontext %% .weak_reference unw_getcontext %% _unw_getcontext = ___unw_getcontext

Adding SYMBOL_NAME doesn't make unw_getcontext weak.

I'm not sure if the aliases can actually be marked weak on Darwin, so instead, I removed the code making them weak. Windows already leaves the symbols non-weak. I'm not very familiar with Mach-O, though, so maybe I'm wrong here. In particular, maybe there's some configuration where it works (e.g. where the assembler behaves differently).

libunwind/CMakeLists.txt
333

This code block effectively duplicates the LIBUNWIND_HERMETIC_STATIC_LIBRARY block in libunwind/src/CMakeLists.txt, assuming that the -fvisibility flags are ignored for Windows. It's on by default, though, whereas the LIBUNWIND_HERMETIC_STATIC_LIBRARY option is off by default. Maybe this code block should be removed, but I suppose that would break current users, quietly.

libunwind/src/CMakeLists.txt
163

e.g. Maybe this could be: if(LIBUNWIND_HERMETIC_STATIC_LIBRARY OR WIN32)

Or maybe we can default LIBUNWIND_HERMETIC_STATIC_LIBRARY to ON for WIN32? I'm not sure what's possible with CMake.

165

I think this -fvisibility-global-new-delete-hidden flag has no effect, but if it did have an effect, it could break the NDK's libc++_shared.so when it's linked with libunwind.a.

i.e. AFAIK, libunwind doesn't define or use the global new/delete operators. (D57251 switched to placement new.) If libunwind merely referenced new/delete, then this flag would make the undefined symbol reference hidden, and then linking libunwind.a into the NDK's libc++_shared.so could make the new/delete definitions hidden (because undef-hidden + def-default ==> def-hidden).

It was added with D57107.

compnerd requested changes to this revision.Wed, Jan 13, 8:44 AM
compnerd added a subscriber: compnerd.
compnerd added inline comments.
libunwind/CMakeLists.txt
332

Similar to the argument for the naming for the compiler-rt builtins, would you mind renaming this to _LIBUNWIND_HIDE_SYMBOLS please?

333

IIRC, -fvisibility= is not ignored and is an error. Either way, -fvisibility= is meaningless on Windows - you *must* control the behaviour via macros and __declspec(dllexport) for the DLL build, and nothing for the static build (which makes it internal).

libunwind/src/CMakeLists.txt
163

-fvisibility= should not be set for Win32. This is correct as is.

libunwind/src/assembly.h
82

Does this not change the behaviour on MachO? This symbol is now private_extern rather than a weak_reference. A weak reference will be set to 0 by the loader if it is not found, and a private_extern is a strong internal reference.

104

I don't understand how this works. This is making the symbol a forward declaration, which is not a weak symbol definition.

This revision now requires changes to proceed.Wed, Jan 13, 8:44 AM
rprichard added inline comments.Sat, Jan 16, 12:38 AM
libunwind/CMakeLists.txt
332

Should I also rename the CMake variable? And if so, do I need to continue recognizing the old name for backwards compatibility?

If we change the CMake variable name, at least ./clang/cmake/caches/Fuchsia-stage2.cmake will need to be updated.

libunwind/src/assembly.h
82

Is .weak_reference the right directive to use here, instead of .weak_definition? We're defining a symbol (aliasname) and setting its value to that of another symbol (name).

I think marking unw_* weak is intended to let some other strong definition override it. Its value won't ever be set to 0.

Currently on Mach-O, the hide-symbols flag hides almost everything (including _Unwind_*) but leaves all of the unw_* alias symbols as extern (and not private-extern) and not weak. With my change, they're still not weak, but they're private-extern.

libunwind's current assembly.h behavior for a weak alias:

.globl aliasname
.weak_reference aliasname
aliasname = name

The LLVM Mach-O assembler ignores the .weak_reference directive. If I change it to .weak_definition, it is still ignored. AFAICT, the LLVM assembler uses the WeakDef/WeakRef attributes from the name symbol.

e.g.

$ cat test.S    
    .text
    .space 0x42

    // Define foo.
    .globl foo
foo:
    ret

    // Define a weak alias, bar.
    .globl bar
    .weak_reference bar
    bar = foo

$ ~/clang11/bin/clang test.S -c && ~/clang11/bin/llvm-readobj --syms test.o

File: test.o
Format: Mach-O 64-bit x86-64
Arch: x86_64
AddressSize: 64bit
Symbols [
  Symbol {
    Name: bar (1)
    Extern
    Type: Section (0xE)
    Section: __text (0x1)
    RefType: UndefinedNonLazy (0x0)
    Flags [ (0x0)
    ]
    Value: 0x42
  }
  Symbol {
    Name: foo (5)
    Extern
    Type: Section (0xE)
    Section: __text (0x1)
    RefType: UndefinedNonLazy (0x0)
    Flags [ (0x0)
    ]
    Value: 0x42
  }
]

The Flags are empty.

OTOH, if I flip things around:

    .text
    .space 0x42

    // Define a weak function, foo.
    .globl foo
    .weak_reference foo
foo:
    ret

    // Define an alias, bar.
    .globl bar
    bar = foo

Now both symbols are WeakRef:

File: test.o
Format: Mach-O 64-bit x86-64
Arch: x86_64
AddressSize: 64bit
Symbols [
  Symbol {
    Name: bar (1)
    Extern
    Type: Section (0xE)
    Section: __text (0x1)
    RefType: UndefinedNonLazy (0x0)
    Flags [ (0x40)
      WeakRef (0x40)
    ]
    Value: 0x42
  }
  Symbol {
    Name: foo (5)
    Extern
    Type: Section (0xE)
    Section: __text (0x1)
    RefType: UndefinedNonLazy (0x0)
    Flags [ (0x40)
      WeakRef (0x40)
    ]
    Value: 0x42
  }
]

I'm wondering if this LLVM behavior is actually correct, but I'm also unfamiliar with Mach-O. (i.e. We want to copy the symbol's address, but should we be copying the WeakRef/WeakDef flags?) It looks like the behavior is controlled in MachObjectWriter::writeNlist. writeNlist finds the aliased symbol and uses its flags:

// The Mach-O streamer uses the lowest 16-bits of the flags for the 'desc'
// value.
bool EncodeAsAltEntry =
  IsAlias && cast<MCSymbolMachO>(OrigSymbol).isAltEntry();
W.write<uint16_t>(cast<MCSymbolMachO>(Symbol)->getEncodedFlags(EncodeAsAltEntry));

The PrivateExtern attribute, on the other hand, isn't part of these encoded flags:

if (Data.isPrivateExtern())
  Type |= MachO::N_PEXT;

Data continues to refer to the alias symbol rather than the aliased symbol. However, apparently the author isn't completely sure about things?

// FIXME: Should this update Data as well?

In libunwind, there is one usage of assembly.h's WEAK_ALIAS. UnwindRegistersSave.S defines a hidden non-weak __unw_getcontext function, and also a weak alias unw_getcontext. My patch's goal is to make unw_getcontext hidden or not-hidden depending on a CMake config variable.

Currently, on Mach-O and on Windows, WEAK_ALIAS doesn't actually make the alias weak. I'm just making it a bit more explicit on Mach-O.

Alternatively:

  • Instead of a weak alias, we could output a weak thunk -- a weak function with a branch to the internal hidden symbol. That's more of a functional change, though.
  • Or, on Mach-O, both the unw_* and __unw_* functions could be WeakDef.
  • Maybe the hide-symbols flag should only affect ELF?
104

Currently, there's only one use of WEAK_ALIAS, at the end of UnwindRegistersSave.S, which defines unw_getcontext to have the same value as __unw_getcontext, which was defined earlier in the file. It's generating one of:

.hidden unw_getcontext
.weak unw_getcontext
unw_getcontext = __unw_getcontext

or:

.weak unw_getcontext
unw_getcontext = __unw_getcontext