This patch adds more fine-grained support over which information is output from the libomptarget runtime when run with the environment variable LIBOMPTARGET_INFO set. An extensible set of flags can be used to pick and choose which information the user is interested in.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I like the direction. Looks easily applicable to amdgpu and granular debug printing is a good feature.
As commented above, not convinced the -1 sentinel is worth the complexity. If the goal is to only call getenv once (which seems sensible), that's probably worth guarding with something threadsafe instead.
openmp/libomptarget/include/Debug.h | ||
---|---|---|
40 | Why not uint32, with zero for all flags clear as the default? |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
40 | I wanted to have a method for getting the info / debug levels that didn't expose them as global variables, so I put them in a function as static variables. But this required having a special value for uninitialized so the value is only read from getenv the first time it's called. I could change it to use a boolean or something and avoid the complexity of reserving the highest order bit. |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
40 | How about code that runs on library load? We have a few getenv calls and it's probably good to do all of them once. struct env { env() {all the getenv calls} uint32_t debug_level() {return dl;} uint32_t info_level() {} private: uint32_t dl; // etc } environment; Makes the variables read-only after construction. Or use const member variables. That will be thread safe, removes some branches, no sentinel. No teardown problems as destructor is a no-op. |
Changing the info and debug functions to use booleans to check if they've been initialized already. Another benefit is that allows every flag to be enabled by setting LIBOMPTARGET_INFO=1.
openmp/libomptarget/include/Debug.h | ||
---|---|---|
40 | This is basically how it was done previously, I didn't like needing to initialize it in every plugin separately when they're all sharing the same function. I could change it to work this way but I feel like the new approach using an extra boolean works alright. |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
40 | That is worth avoiding. The variable in the header doesn't achieve that though - each plugin will have it's own copy of the static variable + wrapper function. If we move the init code, however it is done, into libomptarget then each plugin will need to make a cross-dl call to get the variable, but getenv is only called once. I think that's a win. A comprehensively overengineered solution is to make the getenv calls once, from libomptarget, and also have the plugin read&cache the result of said call. |
@JonChesterfield Let's do this firt and we can revisit a single global in libomptarget later. WDYT?
Oh - didn't mention it's still signed. Would prefer move to uint32 as the -1 is no longer in use, but that's not blocking.
openmp/libomptarget/include/Debug.h | ||
---|---|---|
49 | Is 0x0008 reserved for something else? | |
54 | I would suggest you declare initialized as std::once_flag and call std::call_once() to initialize InfoLevel. This is guaranteed to be thread-safe (unlike the implementation posted here) while also making sure that InfoLevel is initialized exactly once. We initialize RTLs in RTLsTy::RegisterLib() in the very same way. | |
openmp/libomptarget/src/interface.cpp | ||
237 | Wouldn't we want to have if (getInfoLevel() & OMP_INFOTYPE_KERNEL_ARGS) printKernelArguments(loc, device_id, arg_num, arg_sizes, arg_types, arg_names); here as well? It is useful to see what arguments we have if something goes wrong at a target exit data directive or upon exiting a target data scope. |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
49 | No real reason, I just felt like keeping a space open, so if we add another libomptarget flag the first nibble just applies to libomptarget, since the fourth one is for the CUDA plugin right now. | |
54 | Seems like a good solution. Should I push this first and address that in a future patch? Or just do it here. | |
openmp/libomptarget/src/interface.cpp | ||
237 | This method just prints the arguments to __tgt_target_data_begin_mapper and __tgt_target_data_end_mapper, which should have the same arguments right? I had it in previously but it didn't add any useful information when I tested it. I could add a separate message that just says something like "Exiting data region started at (loc)". Would that be good? |
Changing environment variables to initialize using std::call_once. Added a message when exiting an OpenMP data kernel.
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
237 | For scoped constructs (e.g. omp target data) this is true, whatever arguments we have in begin_mapper the very same arguments we have in end_mapper. But what if we have a standalone directive (e.g. omp target exit data)? If I recall correctly (currently on vacation with no access to a working llvm environment), omp target exit data is compiled into a __tgt_target_data_end_mapper call. In this case there may be no matching enter data directive so if we don't call printKernelArguments() here the user will not be able to see what arguments end_mapper was called with. |
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
237 | That's true, I should probably just make it print whether its a data region entry, device kernel, or data region exit. That way it's clear what the flow is. Otherwise it would be confusing because it would still say that a variable is mapped "to" at the end of a data region. |
Fixing a bug with passing a nullptr and making the kernel arguments print a different message for which type of region was executed.
openmp/libomptarget/include/Debug.h | ||
---|---|---|
73–82 | Hello. Am I the only one who is getting libomptarget.rtl.x86_64.so build fail with -DCMAKE_BUILD_TYPE=Debug?
Here is a small reproducer: #include <mutex> static inline uint32_t getInfoLevel() { static uint32_t InfoLevel = 0; static std::once_flag Flag{}; std::call_once(Flag, []() { InfoLevel = 1; }); return InfoLevel; } I cannot build a shared library from it using gcc: g++ -fPIC -std=c++14 test.cpp -shared -o test.so
The issue is reproducible with GCC 5.5.0 and 6.5.0, and does not reproduce with 7.3.0. I believe LLVM's minimal requirement is 5.1, so it is a problem. The issue seems to be caused by the fact that getInfoLevel() is not referenced in openmp/libomptarget/plugins/generic-elf-64bit/src/rtl.cpp |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
73–82 | I have no issues compiling the reproducer you provided. It compiles without errors locally and here https://godbolt.org/z/oEKq3E for the compilers you listed. There might be a problem when the compiler doesn't create a static variable when the function in unused. Does the problem go away if you create a dummy usage of getInfoLevel()? |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
73–82 | Actually you're right, it doesn't work on Godbolt, the error went off screen and didn't list as an error in the compilation. It definitely works locally on the machine I tested it on so who knows what's up with that. The error goes away if I just add a dummy usage to force the compiler to actually generate a symbol for InfoLevel. It's a solution but probably not the best. |
openmp/libomptarget/include/Debug.h | ||
---|---|---|
73–82 | Of course, it does compile with a dummy reference. I guess you have more recent GCC on your machine. The issue seems to be fixed somewhere between 6.5.0 and 7.1.0. |
I tried inserting the following forward-declaration right before the function definition and the problem seems to go away:
static inline uint32_t getInfoLevel() __attribute__((used));
I guess it's better to have the above rather than a dummy call?
Can mark the function itself as ((used)), for slightly less typing. Seems a reasonable workaround to me, with a comment like 'workaround bug in gcc 5.ish'
Yeah, that's an option. Adding a dummy variable would defeat the original purpose of initializing the variables this way, rather than having each plugin initialize it individually and globally. This problem only seems to occur on GCC x86 from the compilers I've checked. The reason It wasn't failing on my machine was because it was GCC 6.3 on PPC. I'm assuming any other compilers will just ignore the attribute if they don't understand it. This should be sufficient.
static inline uint32_t __attribute__((used)) getInfoLevel () { ... } static inline uint32_t __attribute__((used)) getDebugLevel () { ... }
offloading/dynamic_module_load.c fails due to this commit in my environment with x86_64 offload. It fails because std::call_once cannot be used without libpthread, and libomptarget.rtl.x86_64.so is not linked with libpthread. Basically, gdb shows that include/c++/7.3.0/x86_64-linux-gnu/bits/gthr-default.h::__gthread_active_p() returns false, when std::call_once is called from libomptarget.rtl.x86_64.so. It does return true, though, when std::call_once is called from libomptarget.so, because libomptarget.so is linked with libpthread.
It looks like GCC11 will have a fix: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55394
For the time being, all plugins have to be linked with libpthread, since Debug.h is considered to be common code for all plugins.
Any disagreement?
Why not uint32, with zero for all flags clear as the default?