Page MenuHomePhabricator

[Driver] Support compiler-rt crtbegin.o/crtend.o for Linux
ClosedPublic

Authored by phosek on Mar 12 2019, 9:36 AM.

Details

Summary

When compiler-rt is selected as the runtime library for Linux
use its crtbegin.o/crtend.o implementation rather than platform one.

Diff Detail

Repository
rL LLVM

Event Timeline

phosek created this revision.Mar 12 2019, 9:36 AM
E5ten added a subscriber: E5ten.Mar 12 2019, 2:56 PM
MaskRay added inline comments.
clang/lib/Driver/ToolChains/Gnu.cpp
563 ↗(On Diff #190281)

I believe crtbegin.o crtend.o should just work. It is not necessary to use crtbegin_shared.o crtend_shared.o.

phosek marked an inline comment as done.Mar 12 2019, 10:33 PM
phosek added inline comments.
clang/lib/Driver/ToolChains/Gnu.cpp
563 ↗(On Diff #190281)

This is related to your comments on D28791, specifically that we should be using crtbegin_shared.o for -shared or -pie and crtbegin.o otherwise, is that not the case?

MaskRay added inline comments.Mar 12 2019, 11:55 PM
clang/lib/Driver/ToolChains/Gnu.cpp
563 ↗(On Diff #190281)

Yes. I think we can rename crtbegin_shared.o to crtbegin.o and use it for every configuration: -no-pie -pie -shared -static -static -pie.

phosek marked an inline comment as done.Mar 13 2019, 12:27 AM
phosek added inline comments.
clang/lib/Driver/ToolChains/Gnu.cpp
563 ↗(On Diff #190281)

We've checked the glibc implementation of __cxa_finalize. A nonzero __dso_handle has to match the value passed to __cxa_atexit but a zero __dso_handle matches every function registered. So it matters that DSO fini calls use &__dso_handle to match their registrations for the dlclose case, but it also matters that the main executable fini call use zero to run all the dtors at exit time. It's not clear it really needs to be that way, but it would affect how the dtors get run which might affect some use cases. Hence, I don't think we can combine crtbegin.o and crtbegin_shared.o.

E5ten added inline comments.Mar 13 2019, 5:32 AM
clang/lib/Driver/ToolChains/Gnu.cpp
563 ↗(On Diff #190281)

I may be wrong but from what I can see crtend and crtend_shared are identical, so while the crtbegins must stay separate can't the 2 crtends be merged into one that gets used in all cases instead of having a duplicate object under a different name?

manojgupta added inline comments.
clang/lib/Driver/ToolChains/Gnu.cpp
447 ↗(On Diff #190281)

This is currently unconditional on using compiler-rt. Given that compiler-rt provides the crt*.o files only under the CMake flag COMPILER_RT_BUILD_CRT, shouldn't there be an equivalent clang CMake flag to conditionally enable this.

phosek marked an inline comment as done.Mar 13 2019, 2:26 PM
phosek added inline comments.
clang/lib/Driver/ToolChains/Gnu.cpp
447 ↗(On Diff #190281)

COMPILER_RT_BUILD_CRT is ON by default now in D28791, but even if we change that, I think this logic should be independent since you should be able to build Clang separately from compiler-rt.

manojgupta added inline comments.Mar 13 2019, 4:14 PM
clang/lib/Driver/ToolChains/Gnu.cpp
447 ↗(On Diff #190281)

Since it is possible to build compiler-rt without CRT files, I believe that clang should also be able to use compiler-rt without compiler-rt provided CRT files.
Using them unconditionally will also break the case of using newer clang with older compiler-rt.

srhines added a subscriber: enh.Mar 13 2019, 4:59 PM
MaskRay added a comment.EditedMar 14 2019, 2:15 AM

A nonzero dso_handle has to match the value passed to cxa_atexit but a zero dso_handle matches every function registered. So it matters that DSO fini calls use &dso_handle to match their registrations for the dlclose case

Yes, but this won't matter if we change void *__dso_handle = 0; to void *__dso_handle = &__dso_handle;.

static void __do_global_dtors_aux() { // static void __do_fini() in D28791
  if (__cxa_finalize)
    __cxa_finalize(__dso_handle); // what happens if we change `void *__dso_handle = 0;` to `void *__dso_handle = &__dso_handle;`.
  ...
}

exit calls __run_exit_handlers, which goes through __exit_funcs and calls the hooks one by one. _dl_fini is the last in the list because __cxa_atexit(_dl_fini,0,__dso_handle) runs before other atexit registered functions.[1]

__do_global_dtors_aux is called by _dl_fini. When it gets called and it calls __cxa_finalize(0), other atexit registered functions have run, thus __cxa_finalize(0) will do nothing.

The differentiation of crtbegin.o crtbeginS.o is unnecessary. It adds complexity for little size benefit (crtbegin.o is a bit smaller than crtbeginS.o).
While we are adding support for crtbegin.o, it'd be really good to get this fixed.

[1]: If a shared object has a constructor that calls __cxa_atexit, __cxa_atexit(_dl_fini,...) will not be the first. [2]

[2]: If you try really hard you can break that in glibc (but not in FreeBSD libc), but I'll call that an unsupported land as functions in the main executable (register_in_exe) shouldn't be called before it is initialized: __attribute__((constructor)) void init_in_dso() { register_in_exe(); atexit(fini_in_dso); }
Moreover, if you have several DSOs, the global reverse order of atexit-registered functions won't be guaranteed. __cxa_finalize(__dso_handle in exe) __cxa_finalize(__dso_handle in b.so) __cxa_finalize(__dso_handle c.so)

phosek added a comment.Apr 1 2019, 1:13 PM

A nonzero dso_handle has to match the value passed to cxa_atexit but a zero dso_handle matches every function registered. So it matters that DSO fini calls use &dso_handle to match their registrations for the dlclose case

Yes, but this won't matter if we change void *__dso_handle = 0; to void *__dso_handle = &__dso_handle;.

static void __do_global_dtors_aux() { // static void __do_fini() in D28791
  if (__cxa_finalize)
    __cxa_finalize(__dso_handle); // what happens if we change `void *__dso_handle = 0;` to `void *__dso_handle = &__dso_handle;`.
  ...
}

exit calls __run_exit_handlers, which goes through __exit_funcs and calls the hooks one by one. _dl_fini is the last in the list because __cxa_atexit(_dl_fini,0,__dso_handle) runs before other atexit registered functions.[1]

__do_global_dtors_aux is called by _dl_fini. When it gets called and it calls __cxa_finalize(0), other atexit registered functions have run, thus __cxa_finalize(0) will do nothing.

The differentiation of crtbegin.o crtbeginS.o is unnecessary. It adds complexity for little size benefit (crtbegin.o is a bit smaller than crtbeginS.o).
While we are adding support for crtbegin.o, it'd be really good to get this fixed.

[1]: If a shared object has a constructor that calls __cxa_atexit, __cxa_atexit(_dl_fini,...) will not be the first. [2]

[2]: If you try really hard you can break that in glibc (but not in FreeBSD libc), but I'll call that an unsupported land as functions in the main executable (register_in_exe) shouldn't be called before it is initialized: __attribute__((constructor)) void init_in_dso() { register_in_exe(); atexit(fini_in_dso); }
Moreover, if you have several DSOs, the global reverse order of atexit-registered functions won't be guaranteed. __cxa_finalize(__dso_handle in exe) __cxa_finalize(__dso_handle in b.so) __cxa_finalize(__dso_handle c.so)

I sent an email to glibc maintainers to clarify their position on this: https://sourceware.org/ml/libc-alpha/2019-04/msg00028.html. If they're fine with this change, I'll update the implementation as you suggested.

MaskRay removed a subscriber: MaskRay.
phosek updated this revision to Diff 193733.Apr 4 2019, 8:45 AM
phosek marked 5 inline comments as done.

@MaskRay can you review this again now that there's just a single crtbegin.o and crtend.o version?

MaskRay accepted this revision.Apr 15 2019, 11:48 PM

Thank you for unifying crtbegin{,S,T}.o 😊 I hope Android people can make their crtbegin*.o files simpler, one day..

This revision is now accepted and ready to land.Apr 15 2019, 11:48 PM
niosHD added a subscriber: niosHD.Apr 18 2019, 8:30 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2019, 12:33 PM