We can build it with -Werror=global-constructors now. This helps
in situation where libSupport is embedded as a shared library,
potential with dlopen/dlclose scenario, and when command-line
parsing or other facilities may not be involved. Avoiding the
implicit construction of these cl::opt can avoid double-registration
issues and other kind of behavior.
Details
- Reviewers
jhenderson lattner jpienaar - Commits
- rG76374573ce82: Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it…
rGaf9321739b20: Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it…
rG42f588f39c5c: Use ManagedStatic and lazy initialization of cl::opt in libSupport to make it…
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I've briefly skimmed this, and the general approach looks good. Unfortunately, I don't have the time to review this further at this point.
If you have installed valgrind:
/home/maskray/llvm/llvm/lib/Support/Valgrind.cpp:30:19: error: declaration requires a global constructor [-Werror,-Wglobal-constructors] static const bool NotUnderValgrind = InitNotUnderValgrind();
llvm/lib/Support/ARMBuildAttrs.cpp | ||
---|---|---|
13 | This change is unrelated to removing dynamic initializations. const objects at namespace level has internal linkage so you can omit static. | |
llvm/lib/Support/CMakeLists.txt | ||
7 | Perhaps move this to a separate change. | |
llvm/lib/Support/CommandLine.cpp | ||
2440 | unnamed namespace |
This is a really welcome change! Multiple registration issues were really an inconvenience - I had no clue this was the pattern to use to fix it. Thanks!
To be fair: we can do it transparently only for libSupport, as it contains the entry point for command line parsing it can explicitly trigger the registration of the options for itself. Other libraries in LLVM don't have this luxury...
We could probably combine this with the technique used to avoid static constructors in compiler-rt/lib/profile to pick up other libraries in the same final linked image. The idea would be to ask the linker to build an array of initializer functions that LLVMSupport could iterate through to initialize options it doesn't know about.
For example, on Darwin, this code adds an initializer to a global array in section __DATA,__llvmopts:
using OptionConstructorT = void (*)(); // Declared in LLVMSupport headers to get safety. void initOptions(); // Add an options constructor the array of cl::opt initializers. __attribute__((section("__DATA,__llvmopts"),visibility("hidden"))) OptionConstructorT OptionsConstructor = initOptions;
Also on Darwin, this could go in LLVMSupport to access it:
using OptionConstructorT = void (*)(); // Declared in LLVMSupport headers to get safety. // Declare symbols to access the global array. extern OptionConstructorT *ExternalOptionsInitBegin __asm("section$start$__DATA$__llvmopts"); extern OptionConstructorT *ExternalOptionsInitEnd __asm("section$end$__DATA$__llvmopts"); void initCommonOptions(); void initOptions() { initCommonOptions(); // Iterate through external initializers for options. for (auto I = ExternalOptionsInitBegin, E = ExternalOptionsInitEnd; I != E; ++I) (*I)(); }
The relevant code for other platforms is in compiler-rt/lib/profile/InstrProfilingPlatform*.c.
If we did this, we'd maybe want to control it with a CMake configuration? When off (always for platforms where we don't have/know the magic incantations), the various initOptions() would be added to static initializers as before. When on, we could add -Werror=global-constructors to other libraries (once they're clean).
That's interesting!
I'm not sure how to get there though: a complete solution should be able to "degrade" to global constructor when the platform-specific logic isn't available (unless we're confident that no such environment can exist?).
Something else I'm not sure about is how it works across DSOs: when each LLVM library is linked into its own shared library, is the dynamic linker creating this array when loading the libraries?
(I'm starting to doubt about how this would even work on windows)
It's just a standard linker set; the static linker combines all the input sections containing the array elements into a single output section. It's similar to init_array etc, just without the special meaning (i.e. you have to write the code to iterate over the array and do what you want).
Right, it needs to downgrade (and I think there should be a CMake flag to turn it off in case there are reasons some LLVM user doesn't want it).
I think this would work:
#ifdef USE_DARWIN_CLOPT // Add F to the list of initializers for LLVMSupport to iterate through. #define LLVM_ADD_CLOPT_CONSTRUCTOR(F, V) \ __attribute__((section("__DATA,__llvmopts"),visibility("hidden"))) \ void (*V)() = F; #elif defined(...) // Handle magic for other platforms. #else // Create a local symbol with a static initializer that calls F. #define LLVM_ADD_CLOPT_CONSTRUCTOR(F, V) \ namespace { struct V##Type { V##Type() { F(); } } V; } #endif void initOptions(); // Add an options constructor the array of cl::opt initializers. LLVM_ADD_CLOPT_CONSTRUCTOR(initOptions, OptionsConstructorVariable);
but maybe there's a cleaner downgrade that doesn't add an extra type.
Yeah, this technique only works within a single final linked image. It'd need to be off when the library isn't statically linked to LLVMSupport.
I imagine we'd want CMake to just "get this right" for us, but we can also enforce this at link time to catch configuration errors -- add a (dead-strippable) symbol to LLVMSupport that's visibility("hidden"), and manufacture a (dead-strippable) reference to it in any TU that uses the macro. If the TU ends up in a different linked image than LLVMSupport you'll get a link error.
clang-tidy: warning: invalid case style for variable 'tagData' [readability-identifier-naming]
not useful