Page MenuHomePhabricator

[libunwind] Export the unw_* symbols as weak symbols

Authored by phosek on Mar 28 2019, 12:45 AM.



libunwind defines the _Unwind_* ABI used by libc++abi. This ABI is a
stable quasi-standard common between multiple implementations such as
LLVM and GNU. The _U* symbol name space is also safely within the symbol
name space that standard C & C++ reserve for the implementation.

Furthermore, libunwind also defines several unw_* symbols, and references
these from the _Unwind_* entry points so the standard/reserved part of
the ABI is dependent on the unw_* part of the ABI. This is not OK for a
C or C++ implementation. The unw_* symbols are reserved for C and extern
"C" used by application code.

This change renames each unw_* function to _unw* and adds a weak alias
unw_* to keep the public <libunwind.h> ABI unchanged for backwards
compatibility. Every reference to unw_* in the implementation has been
changed to use _unw* so that if other unw_* definitions are in force
because nothing uses <libunwind.h> in a particular program, no _Unwind*
code path depends on any unw_* symbol. Furthemore, _unw_* symbols are
hidden, which saves PLT overhead in the shared library case.

In the future, we should consider untangling the unw_* API/ABI from the
_Unwind_* API/ABI. The internal API backing the _Unwind_* ABI
implementation should not rely on any nonstandard symbols not in the
implementation-reserved name space. This would then allow separating the
_Unwind_* API/ABI from unw_* entirely, but that's a more substantial
change that's going to require more significant refactoring.

Diff Detail


Event Timeline

phosek created this revision.Mar 28 2019, 12:45 AM
phosek updated this revision to Diff 192573.Mar 28 2019, 12:45 AM
phosek added a reviewer: mcgrathr.
mstorsjo added a subscriber: mstorsjo.
mstorsjo added inline comments.
95 ↗(On Diff #192573)

What assembler syntax is this alias<somesymbol> = <othersymbol>? The LLVM assembler certainly doesn't support it at least. Is it some thing for a microsoft assembler? The rest of this code doesn't assemble with any such tool anyway. Or is it a leftover marker indicating that you meant to do something?

If we go with /alternatename via a pragma on the C level, the corresponding thing here would be an alternatename directive in the exact same way as EXPORT_SYMBOL2 above. If we go with __attribute__((weak,alias())), or __attribute__((alias())) the corresponding thing is the same as for ELF above, with or without the .weak directive.

77 ↗(On Diff #192573)

This kind of pragma requires building with -fms-extensions for mingw mode (which is the only windows configuration where there's any need to use libunwind at all, as far as I know), which breaks building with gcc (and the directive is only supported by lld, not by ld.bfd). This requires adding unwind_append_if(LIBUNWIND_COMPILE_FLAGS MINGW -fms-extensions) to the main CMakeLists.txt.

That's not a blocker for this change by any means, but I'll just let it be noted that it's a change from how things were before.

The __attribute__((weak, alias())) form does work in both gcc and clang actually though, so that might almost also be an option, but it's a bit messy (both clang and ld.bfd seem to have a bit of gotchas relating to it), so perhaps we shouldn't.

Alternatively, we could also just postpone making them properly weak on mingw (as it's not strictly necessary contrary to e.g. sanitizers), and just do this with a plain __attribute__((alias(#name))) as above, without the weak.

Also, I'm pretty sure that /alternatename requires using __USER_LABEL_PREFIX__ here (judging from how it's used in sanitizers).

ldionne requested changes to this revision.Mar 28 2019, 7:31 AM
ldionne added inline comments.
443 ↗(On Diff #192573)

I don't think a single leading underscore followed by a lowercase letter is a reserved name (or maybe it is, but only in the global namespace?). I suggest we use two leading underscores for consistency.

This revision now requires changes to proceed.Mar 28 2019, 7:31 AM
mcgrathr added inline comments.Mar 28 2019, 10:25 AM
443 ↗(On Diff #192573)

Single leading _ is reserved for external linkage. _[A-Z_] is reserved for all identifiers in C/C++. So using a single _ is actually fine but since the rules are somewhat arcane and using double __ is safe for all purposes, __ is the most common choice for this purpose just to avoid such confusion.

rnk added a subscriber: compnerd.Mar 28 2019, 1:56 PM
rnk added inline comments.
77 ↗(On Diff #192573)

Re: __USER_LABEL_PREFIX__, I think that's true. It's a pretty big drawback to /alternatename.

I guess @phosek wants to make llvm libunwind consistently not provide strong definitions of unw_* across all platforms, otherwise I would say just make this whole thing ELF-only, and provide strong aliases of unw_* = _unw_* for non-ELF platforms.

Does libunwind build with MSVC on Windows? I have no idea, personally. I feel like it's only ever brought in if Itanium-style EH is in use, which usually means mingw, or @compnerd's emergent *-windows-itanium ABI, so theoretically, it might build with MSVC.

mstorsjo added inline comments.Mar 28 2019, 2:21 PM
77 ↗(On Diff #192573)

I don't think there's anything particularly tricky at all on the C level so far, so I would expect it to build with MSVC just fine. But the assembly files require a gas style assembler.

phosek marked 2 inline comments as done.Mar 28 2019, 11:15 PM
phosek added inline comments.
95 ↗(On Diff #192573)

It's the masm syntax, but I'll replace it with the alternative as you suggested.

77 ↗(On Diff #192573)

We're only concerned with ELF for our case, but I'd prefer to keep all platforms as similar as possible unless there's technical reason why it cannot be done. It seems like in this case it's still feasible to use the same approach for ELF, COFF and Mach-O.

mstorsjo added inline comments.Mar 29 2019, 12:04 AM
77 ↗(On Diff #192573)

Ok. It's probably simplest to proceed with this using alternatename then, and if someone has a concrete need of libunwind built with gcc/ld.bfd, they can propose patches to make it into non-weak aliases for that target, or see if the gcc/mingw/ld.bfd weak aliases really work for their case.

phosek updated this revision to Diff 192840.Mar 29 2019, 8:56 AM
phosek marked 2 inline comments as done.

I plan to do some more testing on Windows to ensure that the updated version works properly.

mstorsjo added inline comments.Mar 29 2019, 12:46 PM
97 ↗(On Diff #192840)

Thanks, this looks great overall now! Just one minor mistake.

The mingw version of the WEAK_ALIAS macro lacks a definition of WEAK_SYMBOL() - to match the C version we probably should just leave it out? With that changed, this compiles and links just fine.

Didn't test the non-mingw codepaths though.

phosek updated this revision to Diff 192952.Mar 29 2019, 6:49 PM
phosek marked an inline comment as done.Mar 29 2019, 10:45 PM
mstorsjo added inline comments.Mar 30 2019, 1:52 AM
97 ↗(On Diff #192840)

Sorry, I keep coming up with more things...

The alias needs to be exported to be visible outside of the dll. (In that case there's no issue with conflicting non-reserved symbols at all.) And it needs a .globl.

The non-mingw version is missing some # to make string literals. And that requires a few layers of macro indirection.

diff --git a/src/assembly.h b/src/assembly.h
index 3ff959d..1d47e4d 100644
--- a/src/assembly.h
+++ b/src/assembly.h
@@ -93,13 +93,21 @@
 #if defined(__MINGW32__)
 #define WEAK_ALIAS(name, aliasname)                                            \
+  .globl SYMBOL_NAME(aliasname) SEPARATOR                                      \
+  EXPORT_SYMBOL(aliasname) SEPARATOR                                           \
   SYMBOL_NAME(aliasname) = SYMBOL_NAME(name)
-#define WEAK_ALIAS(name, aliasname)                                            \
+#define WEAK_ALIAS3(name, aliasname)                                           \
   .section .drectve,"yn" SEPARATOR                                             \
-  .ascii "-alternatename:", SYMBOL_NAME(aliasname), "=",                       \
-                            SYMBOL_NAME(name), "\0" SEPARATOR                  \
+  .ascii "-alternatename:", #aliasname, "=", #name, "\0" SEPARATOR             \
+#define WEAK_ALIAS2(name, aliasname)                                           \
+  WEAK_ALIAS3(name, aliasname)
+#define WEAK_ALIAS(name, aliasname)                                            \
+  EXPORT_SYMBOL(aliasname) SEPARATOR                                           \
mstorsjo added inline comments.Mar 30 2019, 2:01 AM
97 ↗(On Diff #192840)

I also forgot to mention, the EXPORT_SYMBOL macro needs to add the symbol prefix when building for msvc (or any non-mingw I guess), but not for mingw. But that's a preexisting issue, unrelated to this patch.

phosek updated this revision to Diff 193028.Mar 31 2019, 4:09 PM
phosek marked 2 inline comments as done.
mstorsjo added inline comments.Apr 1 2019, 1:35 AM
107 ↗(On Diff #193028)

Ok, folding the msvc specific SYMBOL_NAME() here seems to work fine. If there were other places where this macro is used, it might have been good to factorize it in there, but this seems to work.

This version of the patch seems to be ok wrt all windows configurations, as far as I can see.

(I don't have a windows-itanium setup available for testing easily, and making clang-cl build this requires a few other hacks, but I managed to get these aspects tested.)

ldionne accepted this revision.Apr 1 2019, 9:07 AM

My only objection was with the single underscore naming, which has been addressed. Removing my "requires changes" and deferring to other reviewers.

This revision is now accepted and ready to land.Apr 1 2019, 9:07 AM

My only objection was with the single underscore naming, which has been addressed. Removing my "requires changes" and deferring to other reviewers.

@mstorsjo is this OK with you as well? If yes then I'm going to go ahead and land it. Thanks a lot for all the testing and help on the Windows side!

mstorsjo accepted this revision.Apr 2 2019, 11:37 AM

Yes, this is fine with me. Thanks for your patience, and sorry that the windows stuff turned out to require so many roundtrips.

Btw, correcting an earlier statement of mine:

I don't think there's anything particularly tricky at all on the C level so far, so I would expect it to build with MSVC just fine. But the assembly files require a gas style assembler.

When I tried, I did run into one issue. UnwindRegistersRestore.S implements functions with itanium C++ mangled signatures, so if built with MSVC, they don't match, and need to be passed via a function with C name mangling. But being able to build it with MSVC is only a theoretical thing anyway (while the windows-itanium case probably is more plausible).

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2019, 2:51 PM