This is an archive of the discontinued LLVM Phabricator instance.

[llvm-libgcc] initial commit
ClosedPublic

Authored by cjdb on Aug 19 2021, 3:22 PM.

Details

Summary

Note: the term "libgcc" refers to the all of libgcc.a, libgcc_eh.a,
and libgcc_s.so.

Enabling libunwind as a replacement for libgcc on Linux has proven to be
challenging since libgcc_s.so is a required dependency in the [Linux
standard base][5]. Some software is transitively dependent on libgcc
because glibc makes hardcoded calls to functions in libgcc_s. For example,
the function __GI___backtrace eventually makes its way to a [hardcoded
dlopen to libgcc_s' _Unwind_Backtrace][1]. Since libgcc_{eh.a,s.so} and
libunwind have the same ABI, but different implementations, the two
libraries end up [cross-talking, which ultimately results in a
segfault][2].

To solve this problem, libunwind needs to build a “libgcc”. That is, link
the necessary functions from compiler-rt and libunwind into an archive
and shared object that advertise themselves as libgcc.a, libgcc_eh.a,
and libgcc_s.so, so that glibc’s baked calls are diverted to the
correct objects in memory. Fortunately for us, compiler-rt and libunwind
use the same ABI as the libgcc family, so the problem is solvable at the
llvm-project configuration level: no program source needs to be edited.
Thus, the end result is for a user to configure their LLVM build with a
flag that indicates they want to archive compiler-rt/unwind as libgcc.
We achieve this by compiling libunwind with all the symbols necessary
for compiler-rt to emulate the libgcc family, and then generate symlinks
named for our "libgcc" that point to their corresponding libunwind
counterparts.

We alternatively considered patching glibc so that the source doesn't
directly refer to libgcc, but rather _defaults_ to libgcc, so that a
system preferring compiler-rt/libunwind can point to these libraries
at the config stage instead. Even if we modified the Linux standard
base, this alternative won't work because binaries that are built using
libgcc will still end up having crosstalk between the differing
implementations.

This problem has been solved in this manner for [FreeBSD][3], and this
CL has been tested against [Chrome OS][4].

[1]: https://github.com/bminor/glibc/blob/master/sysdeps/arm/backtrace.c#L68
[2]: https://bugs.chromium.org/p/chromium/issues/detail?id=1162190#c16
[3]: https://github.com/freebsd/freebsd-src/tree/main/lib/libgcc_s
[4]: https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/2945947
[5]: https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/libgcc-s.html

Diff Detail

Event Timeline

cjdb created this revision.Aug 19 2021, 3:22 PM
cjdb requested review of this revision.Aug 19 2021, 3:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2021, 3:22 PM
cjdb added a comment.Aug 19 2021, 3:24 PM

The core missing functionality for this revision is that I'm struggling to work out how to get this to build in-tree.

My suggestion for symbol versioning from D106703 still applies.

Also [5]: https://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/libgcc-s.html should be replaced with a link to 5.0.0

MaskRay added inline comments.Aug 20 2021, 11:58 AM
llvm/libgcc/docs/LLVMLibgcc.rst
68 ↗(On Diff #367632)

Does LLVM_DEFAULT_TARGET_TRIPLE help?

183 ↗(On Diff #367632)

readelf -W --dyn-syms

186 ↗(On Diff #367632)

There are many symbols of other versions.

llvm/libgcc/lib/CMakeLists.txt
3 ↗(On Diff #367632)

The name "unwind" may be confused with llvm-project/libunwind

Is this a new top-level project or something hidden in the llvm directory? In the later case, why isn't in llvm/tools or llvm/utils?

gcc_s-armhf.ver version script (copy/symlink of armv7) is needed too.

llvm/libgcc/FindCompilerRT.cmake
1 ↗(On Diff #367632)

CMake complained about the missing reference:
+include(CheckLibraryExists)

48–49 ↗(On Diff #367632)

maybe just say:
check_library_exists("libclang_rt.builtins-${LLVM_LIBGCC_TARGET}.a" emutls_key_destructor "${LLVM_LIBGCC_COMPILER_RT_DIR}")

llvm/libgcc/docs/LLVMLibgcc.rst
168 ↗(On Diff #367632)

typo

llvm/libgcc/lib/CMakeLists.txt
16–17 ↗(On Diff #367632)

This worked for me, other wise the path was wrong:

-  -Wl,${LLVM_LIBGCC_SYSROOT}/lib${LLVMLIB_DIR_SUFFIX}/${LLVM_LIBGCC_COMPILER_RT}
-  -Wl,${LLVM_LIBGCC_SYSROOT}/lib${LLVMLIB_DIR_SUFFIX}/${LLVM_LIBGCC_UNWIND_STATIC}
+  -Wl,${LLVM_LIBGCC_COMPILER_RT}
+  -Wl,${LLVM_LIBGCC_UNWIND_STATIC}
cjdb updated this revision to Diff 399780.Jan 13 2022, 1:50 PM
cjdb marked 5 inline comments as done.
cjdb retitled this revision from [WORK IN PROGRESS][llvm-libgcc] initial commit to [llvm-libgcc] initial commit.
cjdb edited the summary of this revision. (Show Details)
  • applies most of the feedback in the CL
  • adds a Python script that generates new linker version scripts instead of requiring developers to do it by hand
llvm/libgcc/docs/LLVMLibgcc.rst
68 ↗(On Diff #367632)

Possibly, but there can be ambiguity between x86 and x86_64 since x86 chips aren't really a thing any more and the two terms are often interchangeable.

186 ↗(On Diff #367632)

Yes. It's not really clear to me why you're pointing this out though.

tschuett added a comment.EditedJan 13 2022, 1:59 PM

I prefer: #!/usr/bin/env python3

and

def main():
  return 0

if __name__ == "__main__":
    sys.exit(main())

https://docs.python.org/3/library/__main__.html#idiomatic-usage

MaskRay added inline comments.Jan 18 2022, 7:40 PM
llvm/tools/llvm-libgcc/generate_version_script.py
10

docstring formatting is not aligned

68

llvm-readelf and drop --wide

127

Single quotes

llvm/tools/llvm-libgcc/version-scripts/gcc_s-i386.ver
124 ↗(On Diff #399780)

delete trailing blank lines

cjdb updated this revision to Diff 403111.Jan 25 2022, 7:22 PM
cjdb edited the summary of this revision. (Show Details)

updating to C symver file

cjdb updated this revision to Diff 403428.Jan 26 2022, 3:54 PM

restores the version scripts, but instead makes them generated when invoking the build

cjdb updated this revision to Diff 403429.Jan 26 2022, 3:58 PM

updates documentation to reflect status quo

cjdb updated this revision to Diff 403735.Jan 27 2022, 11:32 AM
  • collapses the version script preprocessing into a single version script that's now run through the C preprocessor
  • adds pseudo-detection for arm-*-*-gnueabihf
cjdb added inline comments.Jan 27 2022, 11:57 AM
llvm/tools/llvm-libgcc/FindCompilerRT.cmake
48

We might need to do some magic here to cover libclang_rt.builtins-x86_64.a, etc.

Thanks for the update.

${CMAKE_C_COMPILER}" "-E" "-xc" "${CMAKE_CURRENT_SOURCE_DIR}/gcc_s.ver" "-o" "${CMAKE_CURRENT_BINARY_DIR}/gcc_s.ver"

Yeah. This CC -E preprocessed version script is what I like to happen in a comment last year (https://reviews.llvm.org/D106703#2940320) and the style favored by jyknight.

You may use llvm-nm -jUD /lib/x86_64-linux-gnu/libgcc_s.so.1 to get versioned defined symbols. Could you please comparing the symbol set with GCC libgcc_s.so.1 for these architectures you mostly care about?
Knowing which symbols are missing and have mismatched versions will be useful. I spot checked a small set of symbols and found some minor issues.
If 32-bit arm is so different from other architectures, perhaps use a global ifdef like

#ifdef __arm__
....
#else
...
#endif

instead of special casing #ifdef __arm__ (or the like) in every version node.

ld.lld has diagnostics if

  • an exact pattern occurs in two version nodes
  • or an exact pattern does not match any symbol

You may want to check there is no such warning.

llvm/tools/llvm-libgcc/lib/gcc_s.ver
50

Looks like __addtf3 for x86-64 is in GCC_4.3.0. You may need a comment documenting which architecture is the outlier here.

65

__register_frame_info and its friends are generic. See llvm-nm -jDU /usr/aarch64-linux-gnu/lib/libgcc_s.so.1 if you have installed aarch64 GCC cross compiler on a Debian/Ubuntu like distro.

180

__addtf3 and its friends are missing here.

% llvm-nm -jDU /lib/x86_64-linux-gnu/libgcc_s.so.1 | grep addtf3     
__addtf3@@GCC_4.3.0
198

This symbol is also in /usr/powerpc64le-linux-gnu/lib/libgcc_s.so.1.

If just arm is weird, exclude it.

Thanks for update.

llvm/tools/llvm-libgcc/lib/CMakeLists.txt
31–32

These two libs could be added as dependency too as gcc_s_ver.

llvm/tools/llvm-libgcc/lib/gcc_s.ver
5

that should work as
__ARM_FP means HW floating point is available as
• Bit 1 - half precision (16-bit).
• Bit 2 - single precision (32-bit).
• Bit 3 - double precision (64-bit).
or
• 0x04 for single-support.
• 0x0C for single- and double-support.
• 0x0E for half-, single-, and double-support.

cjdb updated this revision to Diff 404082.Jan 28 2022, 10:25 AM
cjdb marked 4 inline comments as done.
  • re-adds generate_version_script.py
  • responds to @MaskRay's feedback
cjdb added a comment.Jan 28 2022, 10:28 AM

Could you please comparing the symbol set with GCC libgcc_s.so.1 for these architectures you mostly care about?

Done for x86_64 just now, will move on to the others shortly.
I'd actually like to get some sort of pre-commit CI on this subproject if at all possible, but I'm not entirely sure how to do that.

Knowing which symbols are missing and have mismatched versions will be useful. I spot checked a small set of symbols and found some minor issues.
If 32-bit arm is so different from other architectures, perhaps use a global ifdef like

#ifdef __arm__
....
#else
...
#endif

instead of special casing #ifdef __arm__ (or the like) in every version node.

Is it really that unwieldy?

llvm/tools/llvm-libgcc/generate_version_script.py
68

Why drop --wide?

llvm/tools/llvm-libgcc/lib/gcc_s.ver
65

Yep, but they're apparently versioned under GLIBC_2.0 for everything else :-(

180

__addtf3 is in neither libclang_rt.builtins-x86_64.a nor libunwind.a for x86-64, which is why it's omitted here. I think we'd need to add it to libclang_rt.builtins-x86_64 in order for it to be usable?

198

Looks like this is a 64-bit global symbol. I've updated the documentation to indicate which platforms are supported at the moment, and how to expand support for platforms like PowerPC.

cjdb updated this revision to Diff 404118.Jan 28 2022, 11:52 AM
cjdb marked an inline comment as done.

applies @danielkiss' feedback

llvm/tools/llvm-libgcc/lib/gcc_s.ver
5

Should it be __ARM_FP == 0xC or __ARM_FP >= 0x04?

cjdb updated this revision to Diff 404168.Jan 28 2022, 2:33 PM

changes how the version script is laid out so that global symbols are all listed first, and exclusive symbols are listed afterward

MaskRay added inline comments.Jan 28 2022, 7:34 PM
llvm/tools/llvm-libgcc/lib/gcc_s.ver
17

You may use __LP64__

21

__LP64__

Unsure anyone will use this for ILP64. Sticking with LP64 and !__LP64__ should be fine.

cjdb updated this revision to Diff 404607.Jan 31 2022, 10:39 AM
cjdb marked 2 inline comments as done.

changes how GLOBAL_32BIT and GLOBAL_64BIT are enabled

llvm/tools/llvm-libgcc/lib/gcc_s.ver
21

Eh, it's not that much extra work to support it, especially if we use #else on the tail of 64-bit detection.

arichardson added inline comments.Jan 31 2022, 3:43 PM
llvm/tools/llvm-libgcc/lib/gcc_s.ver
21

If you use __SIZEOF_POINTER__ >= 8 it should also do the right thing when compiling for Arm Morello (which is L64P128)

cjdb updated this revision to Diff 404773.Jan 31 2022, 5:37 PM

changes to SIZEOF_POINTER >= 8

cjdb marked an inline comment as done.Jan 31 2022, 5:37 PM
danielkiss added inline comments.Feb 1 2022, 3:02 AM
llvm/tools/llvm-libgcc/lib/gcc_s.ver
5

__ARM_FP >= 0x04 is better because on some targets the 64bit operations are available.

cjdb updated this revision to Diff 405035.Feb 1 2022, 12:06 PM

broadens armhf detection

cjdb marked an inline comment as done.Feb 1 2022, 12:38 PM
MaskRay added a comment.EditedFeb 1 2022, 3:30 PM

Thanks for the update. From a high level,

  • llvm-libgcc looks useful to me
  • the refactored gcc_s.ver looks good to me.
  • I'd wish that some CMake experts (e.g. @phosek @compnerd) can review the CMake logic, which is a significant portion of this patch. I am hesitant to LGTM this part.
  • the motivation in the summary and the documentation focus on glibc loading libgcc_s.so.1, and people may wonder whether other systems (e.g. Linux musl) need this? If glibc is patched to not dlopen libgcc_so.so.1, is it still a problem? I think the symbol version will remain as a problem for GCC prebuilt objects. The documentation just doesn't make this as a clear motivation.
  • apparently many folks are interested (therefore the comments). Ideally hope someone can stand up to test that the symbol versioning result is indeed closer to libgcc_s.so.1 ... (Sorry, it may not be me.)
tschuett added inline comments.Feb 1 2022, 5:22 PM
llvm/tools/llvm-libgcc/generate_version_script.py
2

#!/usr/bin/env python3 looks safer.

compnerd requested changes to this revision.Feb 7 2022, 11:07 AM
compnerd added inline comments.
llvm/tools/llvm-libgcc/CMakeLists.txt
8

Perhaps we can default the path since the layout is now fairly uniform due to the merged repository layout.

37

I think that bin rather than lib is appropriate. The CMAKE_RUNTIME_OUTPUT_DIRECTORY is used for the executables and dlls on Windows.

llvm/tools/llvm-libgcc/FindCompilerRT.cmake
2

I think that this is not really the way to do this. This should follow the standard CMake expectations around the behaviour of Find*.cmake. Alternatively, you could name this something else and completely elide the find_* usage.

92

This should be hoisted into the top level CMake. The FindCompilerRT.cmake should only search for the prebuilt version.

llvm/tools/llvm-libgcc/FindLLVMLibunwind.cmake
2

Similar

llvm/tools/llvm-libgcc/lib/CMakeLists.txt
18

add_custom_command by itself is difficult to use, please create a target to track the output so that dependencies can be wired properly.

31

Is there a reason to not use -nostdlib as a driver level option?

34

Is there a guarantee that we are not building these libraries? If we are building these libraries, I think that it would be better to use the target objects (i.e. $<TARGET_OBJECTS:unwind> rather than the --whole-archive, --no-whole-archive path.

35

I think that this should be outside of the -(, -).

39

This seems like an odd check. I think that we should drop it, libc is required and unless you are suggesting that it is possible to have a freestanding build of this library, we should assume that it is available. (I think that it is still problematic for non-Unix platforms due to spelling but we can deal with that if/when necessary as libgcc is not something that really applies there).

56

I would rather use ${CMAKE_SHARED_LIBRARY_PREFIX} and ${CMAKE_SHARED_LIBRARY_SUFFIX}. rather than lib and .so.

60

I think that the versioned symlinks should point to the same names not to the peers.

This revision now requires changes to proceed.Feb 7 2022, 11:07 AM
cjdb updated this revision to Diff 408689.Feb 14 2022, 6:58 PM
cjdb marked 4 inline comments as done.

gets CMake scripts into a decent state

Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added subscribers: libcxx-commits, Restricted Project, sstefan1. · View Herald Transcript
cjdb updated this revision to Diff 408690.Feb 14 2022, 7:16 PM

quick CMake fix

compnerd added inline comments.Feb 15 2022, 8:10 AM
llvm-libgcc/CMakeLists.txt
10 ↗(On Diff #408690)

Any reason for insert rather that append?

36 ↗(On Diff #408690)

I tend to prefer all caps for the On and Off, but its equivalent really.

llvm-libgcc/lib/CMakeLists.txt
3 ↗(On Diff #408690)

newline before this line would be nice.

8 ↗(On Diff #408690)

I tend to prefer the custom command before the custom target so you can see what the target is generating.

17 ↗(On Diff #408690)

What differentiates this from the gcc_s_ver target? I think that you have two targets for the same output.

19 ↗(On Diff #408690)

I think that breaking after the ( is a bit unnecessary:

set_target_properties(gcc_s_version_script PROPERTIES
  OUTPUT_PATH "${CMAKE_CURRENT_BINARY_DIR}/gcc_s.ver")

Similar throughout.

54 ↗(On Diff #408690)

Dead code?

56 ↗(On Diff #408690)

Why is this cached?

compnerd accepted this revision.Feb 15 2022, 8:12 AM

I'd like to see at least https://reviews.llvm.org/D108416#inline-1146215, https://reviews.llvm.org/D108416#inline-1146214, and https://reviews.llvm.org/D108416#inline-1146209 addressed pre-commit. Would be nice to have the other ones resolved as well, but they are more stylistic.

zero9178 added inline comments.
llvm-libgcc/docs/LLVMLibgcc.rst
96 ↗(On Diff #408690)

neither compiler-rt, nor llvm-libgcc, are listed in LLVM_ENABLE_RUNTIMES

Is this a typo and is meant to be libunwind instead of llvm-libgcc?

EricWF accepted this revision.Feb 15 2022, 10:35 AM
EricWF added a subscriber: EricWF.

LGTM on behalf of libc++. But please address the comments left by the other reviewers.

This revision is now accepted and ready to land.Feb 15 2022, 10:35 AM
MaskRay added inline comments.Feb 15 2022, 10:35 AM
llvm-libgcc/docs/LLVMLibgcc.rst
97 ↗(On Diff #408690)

I think some folks prefer that compiler-rt is added to LLVM_ENABLE_RUNTIMES. How difficult is it to make this work?

Note that current plan is that libunwind in DLLVM_ENABLE_PROJECTS will be unsupported.

llvm-libgcc/generate_version_script.py
1 ↗(On Diff #408690)

/usr/bin/env python3

Add a file-level comment what this script does.

MaskRay accepted this revision.Feb 15 2022, 10:36 AM
ldionne added inline comments.
runtimes/CMakeLists.txt
181–200 ↗(On Diff #408690)

I think this can be simplified like this:

if("llvm-libgcc" IN_LIST runtimes)
  if("compiler-rt" IN_LIST runtimes OR "compiler-rt" IN_LIST LLVM_ENABLE_PROJECTS)
    message(FATAL_ERROR
      "Attempting to build both compiler-rt and llvm-libgcc will cause irreconcilable "
      "target clashes. Please choose one or the other, but not both.")
  endif()

  if("libunwind" IN_LIST runtimes)
    message(
      FATAL_ERROR
      "Attempting to build both libunwind and llvm-libgcc will cause irreconcilable "
      "target clashes. Please choose one or the other, but not both.")
  endif()
endif()
cjdb updated this revision to Diff 409017.Feb 15 2022, 12:43 PM
cjdb marked 8 inline comments as done.

responds to feedback

llvm-libgcc/CMakeLists.txt
10 ↗(On Diff #408690)

I copied this directly from either libunwind/CMakeLists.txt or runtimes/CMakeLists.txt. Looks as if it's being prepended though?

36 ↗(On Diff #408690)

Acknowledged.

llvm-libgcc/docs/LLVMLibgcc.rst
96 ↗(On Diff #408690)

Thanks!

97 ↗(On Diff #408690)

compiler-rt must be listed in neither LLVM_ENABLE_RUNTIMES, nor LLVM_ENABLE_PROJECTS when llvm-libgcc is present. llvm-libgcc will take care of that for you. You can still have LLVM_ENABLE_RUNTIMES=compiler-rt, but not when llvm-libgcc is also in the list.

llvm-libgcc/generate_version_script.py
1 ↗(On Diff #408690)

Done, but what's the benefit of this?

llvm-libgcc/lib/CMakeLists.txt
19 ↗(On Diff #408690)

I find the current way to improve readability a lot.

56 ↗(On Diff #408690)

I think this was user-settable at one point, but per yesterday, this entire variable is redundant.

runtimes/CMakeLists.txt
181–200 ↗(On Diff #408690)

Thanks, this is way simpler!

compnerd accepted this revision.Feb 15 2022, 12:49 PM

I believe that the break after ( in CMake is counter the style of the rest of the project. I'd recommend against it.

cjdb updated this revision to Diff 409026.Feb 15 2022, 1:01 PM

cleans up a few things

cjdb updated this revision to Diff 409066.Feb 15 2022, 2:51 PM
  • makes it harder to accidentally build llvm-libgcc by removing it from LLVM_ENABLE_RUNTIMES=all
  • installs shared objects properly
cjdb updated this revision to Diff 409158.Feb 15 2022, 11:04 PM

rebases to activate CI

phosek accepted this revision.Feb 15 2022, 11:59 PM

LGTM

This revision was landed with ongoing or failed builds.Feb 16 2022, 9:07 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2023, 10:26 AM
Herald added a subscriber: Enna1. · View Herald Transcript