This is an archive of the discontinued LLVM Phabricator instance.

ELF: Create unique SHF_GNU_RETAIN sections for llvm.used global objects
ClosedPublic

Authored by MaskRay on Feb 24 2021, 11:41 PM.

Details

Summary

If a global object is listed in @llvm.used, place it in a unique section with
the SHF_GNU_RETAIN flag. The section is a GC root under ld --gc-sections
with LLD>=13 or GNU ld>=2.36.

For front ends which do not expect to see multiple sections of the same name,
consider emitting @llvm.compiler.used instead of @llvm.used.

SHF_GNU_RETAIN is restricted to ELFOSABI_GNU and ELFOSABI_FREEBSD in
binutils. We don't do the restriction - see the rationale in D95749.

The integrated assembler has supported SHF_GNU_RETAIN since D95730.
GNU as>=2.36 supports section flag 'R'.
We don't need to worry about GNU ld support because older GNU ld just ignores
the unknown SHF_GNU_RETAIN.

With this change, __attribute__((retain)) functions/variables emitted
by clang will get the SHF_GNU_RETAIN flag.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 24 2021, 11:41 PM
MaskRay requested review of this revision.Feb 24 2021, 11:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2021, 11:41 PM
rnk accepted this revision.Feb 26 2021, 1:02 PM

This will mostly only affect users that are using the integrated assembler, or who are using clang -fno-integrated-as -fbinutils-version=2.36+. I think that means all the compatibility concerns about the new R section flag are addressed. Nice.

Looks good to me.

This revision is now accepted and ready to land.Feb 26 2021, 1:02 PM
MaskRay edited the summary of this revision. (Show Details)Feb 26 2021, 1:36 PM
MaskRay edited the summary of this revision. (Show Details)Feb 26 2021, 4:25 PM
This revision was landed with ongoing or failed builds.Feb 26 2021, 4:38 PM
This revision was automatically updated to reflect the committed changes.
nikic added a subscriber: nikic.Aug 8 2021, 2:21 PM

Just FYI this patch exposes a bug in at least GNU gold (GNU Binutils for Ubuntu 2.34) 1.16 when used in conjunction with section(".init_array").

; RUN: clang -fuse-ld=gold %s
#include <stdio.h>
static void do_init(void) { printf("Foobar\n"); }

__attribute__((used, retain, section(".init_array")))
static void (*init)(void) = &do_init;

int main() { return 0; }

The symbol is not placed between __init_array_start/__init_array_end and thus the init function is not run.

Following your recommendation, we switched the #[used] attribute in rust to use llvm.compiler.used instead of llvm.used. Hopefully that won't have other fallout.

Just FYI this patch exposes a bug in at least GNU gold (GNU Binutils for Ubuntu 2.34) 1.16 when used in conjunction with section(".init_array").

; RUN: clang -fuse-ld=gold %s
#include <stdio.h>
static void do_init(void) { printf("Foobar\n"); }

__attribute__((used, retain, section(".init_array")))
static void (*init)(void) = &do_init;

int main() { return 0; }

The symbol is not placed between __init_array_start/__init_array_end and thus the init function is not run.

The gold bug is fixed in 2.36 (https://sourceware.org/bugzilla/show_bug.cgi?id=27039)

Following your recommendation, we switched the #[used] attribute in rust to use llvm.compiler.used instead of llvm.used. Hopefully that won't have other fallout.

I informed Rust folks in March. https://github.com/rust-lang/rust/issues/40289#issuecomment-788567604
Switching to llvm.compiler.used improves compatibility with GCC/Clang.

nikic added a comment.Aug 9 2021, 12:46 AM

Thanks for the reference! The part that wasn't obvious to me (both from the patch summary here and the gold bug) is that using SHF_GNU_RETAIN with an older binutils version will not only ignore the retain semantics (which is entirely expected), but will lead to silent miscompiles unrelated to retain semantics. From the description here and rnk's comment, it sounded like there aren't any toolchain compatibility concerns.

It would be good to update LangRef in this regard. Currently the llvm.used documentation says:

This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.

And the llvm.compiler.used documentation says:

This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.

While now the recommendation seems to be the other way around: Frontends should map "used"-like attributes to llvm.compiler.used rather than llvm.used, and using the latter may be dangerous if the frontend doesn't control the used toolchain.

Thanks for the reference! The part that wasn't obvious to me (both from the patch summary here and the gold bug) is that using SHF_GNU_RETAIN with an older binutils version will not only ignore the retain semantics (which is entirely expected), but will lead to silent miscompiles unrelated to retain semantics. From the description here and rnk's comment, it sounded like there aren't any toolchain compatibility concerns.

It would be good to update LangRef in this regard. Currently the llvm.used documentation says:

This is commonly used to represent references from inline asms and other things the compiler cannot “see”, and corresponds to “attribute((used))” in GNU C.

LangRef has mentioned the linker behavior since 2009 (rGae76db5edd9251d76af488d601be01d903805f8e), despite the fact that ELF did not really distinguish llvm.used and llvm.compiler.used before this patch.

GCC never adds linker semantics to __attribute__((used)).
The armcc documentation mentions the linker behavior "Data marked with attribute((used)) is tagged in the object file to avoid removal by linker unused section removal."
I don't have access to armcc to check whether/how this is actually satisfied.
Clang Mach-O may use the semantics for a long time.

And the llvm.compiler.used documentation says:

This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.

While now the recommendation seems to be the other way around: Frontends should map "used"-like attributes to llvm.compiler.used rather than llvm.used, and using the latter may be dangerous if the frontend doesn't control the used toolchain.

Using is the latter is not more dangerous:/ The compiler just did not do what LangRef promised on ELF for a long time.
Rust likely misused llvm.used...

That said, sounds good to make llvm.compiler.used more supported, since the GCC/Clang/(Rust?) behavior has become more popular.