This is an archive of the discontinued LLVM Phabricator instance.

Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it free of global initializer
ClosedPublic

Authored by mehdi_amini on Jul 13 2021, 10:07 PM.

Details

Summary

We can build it with -Werror=global-constructors now. This helps
in situation where libSupport is embedded as a shared library,
potential with dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kind of behavior.

Diff Detail

Event Timeline

mehdi_amini created this revision.Jul 13 2021, 10:07 PM
mehdi_amini requested review of this revision.Jul 13 2021, 10:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2021, 10:07 PM

I've briefly skimmed this, and the general approach looks good. Unfortunately, I don't have the time to review this further at this point.

lattner accepted this revision.Jul 15 2021, 9:51 AM
lattner added a subscriber: lattner.

This is really great Mehdi, thank you for making libsupport be static ctor clean!

This revision is now accepted and ready to land.Jul 15 2021, 9:51 AM
jpienaar accepted this revision.Jul 15 2021, 11:05 AM

Nice!

llvm/lib/Support/Debug.cpp
83–84

Would it make sense to have these above the flags instead of above the structs?

94–109

Why not in anonymous namespace too? (consistenly everywhere)

llvm/lib/Support/DebugOptions.h
2

Missing c++?

31

rm?

If you have installed valgrind:

/home/maskray/llvm/llvm/lib/Support/Valgrind.cpp:30:19: error: declaration requires a global constructor [-Werror,-Wglobal-constructors]
static const bool NotUnderValgrind = InitNotUnderValgrind();
llvm/lib/Support/ARMBuildAttrs.cpp
13

This change is unrelated to removing dynamic initializations.

const objects at namespace level has internal linkage so you can omit static.

llvm/lib/Support/CMakeLists.txt
7 ↗(On Diff #358513)

Perhaps move this to a separate change.

llvm/lib/Support/CommandLine.cpp
2439–2532

unnamed namespace

mehdi_amini marked 7 inline comments as done.

Address comment and fix build errors

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 15 2021, 6:54 PM
mehdi_amini added inline comments.Jul 15 2021, 6:54 PM
llvm/lib/Support/CMakeLists.txt
7 ↗(On Diff #358513)

Sure, I can commit this separately.

llvm/lib/Support/Debug.cpp
94–109

I think it is already?

This is a really welcome change! Multiple registration issues were really an inconvenience - I had no clue this was the pattern to use to fix it. Thanks!

Fix windows build

This is a really welcome change! Multiple registration issues were really an inconvenience - I had no clue this was the pattern to use to fix it. Thanks!

To be fair: we can do it transparently only for libSupport, as it contains the entry point for command line parsing it can explicitly trigger the registration of the options for itself. Other libraries in LLVM don't have this luxury...

This revision was landed with ongoing or failed builds.Jul 15 2021, 8:34 PM
This revision was automatically updated to reflect the committed changes.

This is a really welcome change! Multiple registration issues were really an inconvenience - I had no clue this was the pattern to use to fix it. Thanks!

To be fair: we can do it transparently only for libSupport, as it contains the entry point for command line parsing it can explicitly trigger the registration of the options for itself. Other libraries in LLVM don't have this luxury...

We could probably combine this with the technique used to avoid static constructors in compiler-rt/lib/profile to pick up other libraries in the same final linked image. The idea would be to ask the linker to build an array of initializer functions that LLVMSupport could iterate through to initialize options it doesn't know about.

For example, on Darwin, this code adds an initializer to a global array in section __DATA,__llvmopts:

using OptionConstructorT = void (*)(); // Declared in LLVMSupport headers to get safety.

void initOptions();

// Add an options constructor the array of cl::opt initializers.
__attribute__((section("__DATA,__llvmopts"),visibility("hidden")))
OptionConstructorT OptionsConstructor = initOptions;

Also on Darwin, this could go in LLVMSupport to access it:

using OptionConstructorT = void (*)(); // Declared in LLVMSupport headers to get safety.

// Declare symbols to access the global array.
extern OptionConstructorT *ExternalOptionsInitBegin
  __asm("section$start$__DATA$__llvmopts");
extern OptionConstructorT *ExternalOptionsInitEnd
  __asm("section$end$__DATA$__llvmopts");

void initCommonOptions();
void initOptions() {
  initCommonOptions();
  // Iterate through external initializers for options.
  for (auto I = ExternalOptionsInitBegin, E = ExternalOptionsInitEnd; I != E; ++I)
    (*I)();
}

The relevant code for other platforms is in compiler-rt/lib/profile/InstrProfilingPlatform*.c.

If we did this, we'd maybe want to control it with a CMake configuration? When off (always for platforms where we don't have/know the magic incantations), the various initOptions() would be added to static initializers as before. When on, we could add -Werror=global-constructors to other libraries (once they're clean).

That's interesting!

I'm not sure how to get there though: a complete solution should be able to "degrade" to global constructor when the platform-specific logic isn't available (unless we're confident that no such environment can exist?).

Something else I'm not sure about is how it works across DSOs: when each LLVM library is linked into its own shared library, is the dynamic linker creating this array when loading the libraries?
(I'm starting to doubt about how this would even work on windows)

It's just a standard linker set; the static linker combines all the input sections containing the array elements into a single output section. It's similar to init_array etc, just without the special meaning (i.e. you have to write the code to iterate over the array and do what you want).

That's interesting!

I'm not sure how to get there though: a complete solution should be able to "degrade" to global constructor when the platform-specific logic isn't available (unless we're confident that no such environment can exist?).

Right, it needs to downgrade (and I think there should be a CMake flag to turn it off in case there are reasons some LLVM user doesn't want it).

I think this would work:

#ifdef USE_DARWIN_CLOPT
// Add F to the list of initializers for LLVMSupport to iterate through.
#define LLVM_ADD_CLOPT_CONSTRUCTOR(F, V) \
  __attribute__((section("__DATA,__llvmopts"),visibility("hidden"))) \
  void (*V)() = F;
#elif defined(...) // Handle magic for other platforms.
#else
// Create a local symbol with a static initializer that calls F.
#define LLVM_ADD_CLOPT_CONSTRUCTOR(F, V) \
  namespace { struct V##Type { V##Type() { F(); } } V; }
#endif

void initOptions();
// Add an options constructor the array of cl::opt initializers.
LLVM_ADD_CLOPT_CONSTRUCTOR(initOptions, OptionsConstructorVariable);

but maybe there's a cleaner downgrade that doesn't add an extra type.

Something else I'm not sure about is how it works across DSOs: when each LLVM library is linked into its own shared library, is the dynamic linker creating this array when loading the libraries? (I'm starting to doubt about how this would even work on windows)

Yeah, this technique only works within a single final linked image. It'd need to be off when the library isn't statically linked to LLVMSupport.

I imagine we'd want CMake to just "get this right" for us, but we can also enforce this at link time to catch configuration errors -- add a (dead-strippable) symbol to LLVMSupport that's visibility("hidden"), and manufacture a (dead-strippable) reference to it in any TU that uses the macro. If the TU ends up in a different linked image than LLVMSupport you'll get a link error.