This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Add using bit flags to select Libomptarget Information
ClosedPublic

Authored by jhuber6 on Dec 22 2020, 12:44 PM.

Details

Summary

This patch adds more fine-grained support over which information is output from the libomptarget runtime when run with the environment variable LIBOMPTARGET_INFO set. An extensible set of flags can be used to pick and choose which information the user is interested in.

Diff Detail

Event Timeline

jhuber6 created this revision.Dec 22 2020, 12:44 PM
jhuber6 requested review of this revision.Dec 22 2020, 12:44 PM

I like the direction. Looks easily applicable to amdgpu and granular debug printing is a good feature.

As commented above, not convinced the -1 sentinel is worth the complexity. If the goal is to only call getenv once (which seems sensible), that's probably worth guarding with something threadsafe instead.

openmp/libomptarget/include/Debug.h
40

Why not uint32, with zero for all flags clear as the default?

jhuber6 added inline comments.Dec 22 2020, 1:05 PM
openmp/libomptarget/include/Debug.h
40

I wanted to have a method for getting the info / debug levels that didn't expose them as global variables, so I put them in a function as static variables. But this required having a special value for uninitialized so the value is only read from getenv the first time it's called. I could change it to use a boolean or something and avoid the complexity of reserving the highest order bit.

openmp/libomptarget/include/Debug.h
40

How about code that runs on library load? We have a few getenv calls and it's probably good to do all of them once.

struct env {
env() {all the getenv calls}

uint32_t debug_level() {return dl;}
uint32_t info_level() {}

private:
  uint32_t dl; // etc
} environment;

Makes the variables read-only after construction. Or use const member variables.

That will be thread safe, removes some branches, no sentinel. No teardown problems as destructor is a no-op.

jhuber6 updated this revision to Diff 313431.Dec 22 2020, 2:15 PM

Changing the info and debug functions to use booleans to check if they've been initialized already. Another benefit is that allows every flag to be enabled by setting LIBOMPTARGET_INFO=1.

jhuber6 added inline comments.Dec 22 2020, 2:18 PM
openmp/libomptarget/include/Debug.h
40

This is basically how it was done previously, I didn't like needing to initialize it in every plugin separately when they're all sharing the same function. I could change it to work this way but I feel like the new approach using an extra boolean works alright.

openmp/libomptarget/include/Debug.h
40

That is worth avoiding. The variable in the header doesn't achieve that though - each plugin will have it's own copy of the static variable + wrapper function.

If we move the init code, however it is done, into libomptarget then each plugin will need to make a cross-dl call to get the variable, but getenv is only called once. I think that's a win.

A comprehensively overengineered solution is to make the getenv calls once, from libomptarget, and also have the plugin read&cache the result of said call.

@JonChesterfield Let's do this firt and we can revisit a single global in libomptarget later. WDYT?

JonChesterfield accepted this revision.Dec 22 2020, 4:46 PM

Sure, may as well. If there's a race here it's preexisting.

This revision is now accepted and ready to land.Dec 22 2020, 4:46 PM

Oh - didn't mention it's still signed. Would prefer move to uint32 as the -1 is no longer in use, but that's not blocking.

jhuber6 updated this revision to Diff 313460.Dec 22 2020, 5:52 PM

Changing info and debug values to uint32_t.

JonChesterfield accepted this revision.Dec 23 2020, 2:09 AM

Nice, thanks!

grokos added a subscriber: grokos.Dec 23 2020, 2:27 PM
grokos added inline comments.
openmp/libomptarget/include/Debug.h
48

Is 0x0008 reserved for something else?

53

I would suggest you declare initialized as std::once_flag and call std::call_once() to initialize InfoLevel. This is guaranteed to be thread-safe (unlike the implementation posted here) while also making sure that InfoLevel is initialized exactly once. We initialize RTLs in RTLsTy::RegisterLib() in the very same way.

openmp/libomptarget/src/interface.cpp
237–240

Wouldn't we want to have

if (getInfoLevel() & OMP_INFOTYPE_KERNEL_ARGS)
  printKernelArguments(loc, device_id, arg_num, arg_sizes, arg_types,
                       arg_names);

here as well? It is useful to see what arguments we have if something goes wrong at a target exit data directive or upon exiting a target data scope.

jhuber6 added inline comments.Dec 24 2020, 11:46 AM
openmp/libomptarget/include/Debug.h
48

No real reason, I just felt like keeping a space open, so if we add another libomptarget flag the first nibble just applies to libomptarget, since the fourth one is for the CUDA plugin right now.

53

Seems like a good solution. Should I push this first and address that in a future patch? Or just do it here.

openmp/libomptarget/src/interface.cpp
237–240

This method just prints the arguments to __tgt_target_data_begin_mapper and __tgt_target_data_end_mapper, which should have the same arguments right? I had it in previously but it didn't add any useful information when I tested it. I could add a separate message that just says something like "Exiting data region started at (loc)". Would that be good?

jhuber6 updated this revision to Diff 313864.Dec 28 2020, 9:14 AM

Changing environment variables to initialize using std::call_once. Added a message when exiting an OpenMP data kernel.

grokos added inline comments.Dec 28 2020, 10:07 AM
openmp/libomptarget/src/interface.cpp
237–240

For scoped constructs (e.g. omp target data) this is true, whatever arguments we have in begin_mapper the very same arguments we have in end_mapper. But what if we have a standalone directive (e.g. omp target exit data)? If I recall correctly (currently on vacation with no access to a working llvm environment), omp target exit data is compiled into a __tgt_target_data_end_mapper call. In this case there may be no matching enter data directive so if we don't call printKernelArguments() here the user will not be able to see what arguments end_mapper was called with.

jhuber6 added inline comments.Dec 28 2020, 11:51 AM
openmp/libomptarget/src/interface.cpp
237–240

That's true, I should probably just make it print whether its a data region entry, device kernel, or data region exit. That way it's clear what the flow is. Otherwise it would be confusing because it would still say that a variable is mapped "to" at the end of a data region.

jhuber6 updated this revision to Diff 313885.Dec 28 2020, 12:34 PM

Fixing a bug with passing a nullptr and making the kernel arguments print a different message for which type of region was executed.

grokos accepted this revision.Jan 4 2021, 6:46 AM

Thanks for the changes! LGTM.

This revision was landed with ongoing or failed builds.Jan 4 2021, 9:05 AM
This revision was automatically updated to reflect the committed changes.
vzakhari added inline comments.
openmp/libomptarget/include/Debug.h
54–63

Hello. Am I the only one who is getting libomptarget.rtl.x86_64.so build fail with -DCMAKE_BUILD_TYPE=Debug?

ld: CMakeFiles/omptarget.rtl.x86_64.dir/__/generic-elf-64bit/src/rtl.cpp.o: in function `getInfoLevel()::{lambda()#1}::operator()() const':
.../openmp/libomptarget/include/Debug.h:61: undefined reference to `getInfoLevel()::InfoLevel'
ld: CMakeFiles/omptarget.rtl.x86_64.dir/__/generic-elf-64bit/src/rtl.cpp.o: relocation R_X86_64_PC32 against undefined symbol `_ZZL12getInfoLevelvE9InfoLevel' can not be used when making a shared object; recompile with -fPIC

Here is a small reproducer:

#include <mutex>

static inline uint32_t getInfoLevel() {
  static uint32_t InfoLevel = 0;
  static std::once_flag Flag{};
  std::call_once(Flag, []() {
      InfoLevel = 1;
  });

  return InfoLevel;
}

I cannot build a shared library from it using gcc: g++ -fPIC -std=c++14 test.cpp -shared -o test.so

ld: /tmp/ccBpTXXs.o: relocation R_X86_64_PC32 against undefined symbol `_ZZL12getInfoLevelvE9InfoLevel' can not be used when making a shared object; recompile with -fPIC

The issue is reproducible with GCC 5.5.0 and 6.5.0, and does not reproduce with 7.3.0. I believe LLVM's minimal requirement is 5.1, so it is a problem.

The issue seems to be caused by the fact that getInfoLevel() is not referenced in openmp/libomptarget/plugins/generic-elf-64bit/src/rtl.cpp

jhuber6 added inline comments.Jan 26 2021, 12:26 PM
openmp/libomptarget/include/Debug.h
54–63

I have no issues compiling the reproducer you provided. It compiles without errors locally and here https://godbolt.org/z/oEKq3E for the compilers you listed. There might be a problem when the compiler doesn't create a static variable when the function in unused. Does the problem go away if you create a dummy usage of getInfoLevel()?

jhuber6 added inline comments.Jan 26 2021, 12:34 PM
openmp/libomptarget/include/Debug.h
54–63

Actually you're right, it doesn't work on Godbolt, the error went off screen and didn't list as an error in the compilation. It definitely works locally on the machine I tested it on so who knows what's up with that. The error goes away if I just add a dummy usage to force the compiler to actually generate a symbol for InfoLevel. It's a solution but probably not the best.

vzakhari added inline comments.Jan 26 2021, 12:49 PM
openmp/libomptarget/include/Debug.h
54–63

Of course, it does compile with a dummy reference. I guess you have more recent GCC on your machine. The issue seems to be fixed somewhere between 6.5.0 and 7.1.0.

I tried inserting the following forward-declaration right before the function definition and the problem seems to go away:

static inline uint32_t getInfoLevel() __attribute__((used));

I guess it's better to have the above rather than a dummy call?

Can mark the function itself as ((used)), for slightly less typing. Seems a reasonable workaround to me, with a comment like 'workaround bug in gcc 5.ish'

Can mark the function itself as ((used)), for slightly less typing. Seems a reasonable workaround to me, with a comment like 'workaround bug in gcc 5.ish'

Yeah, that's an option. Adding a dummy variable would defeat the original purpose of initializing the variables this way, rather than having each plugin initialize it individually and globally. This problem only seems to occur on GCC x86 from the compilers I've checked. The reason It wasn't failing on my machine was because it was GCC 6.3 on PPC. I'm assuming any other compilers will just ignore the attribute if they don't understand it. This should be sufficient.

static inline uint32_t __attribute__((used)) getInfoLevel  () { ... }
static inline uint32_t __attribute__((used)) getDebugLevel  () { ... }

Can mark the function itself as ((used)), for slightly less typing. Seems a reasonable workaround to me, with a comment like 'workaround bug in gcc 5.ish'

Yeah, that's an option. Adding a dummy variable would defeat the original purpose of initializing the variables this way, rather than having each plugin initialize it individually and globally. This problem only seems to occur on GCC x86 from the compilers I've checked. The reason It wasn't failing on my machine was because it was GCC 6.3 on PPC. I'm assuming any other compilers will just ignore the attribute if they don't understand it. This should be sufficient.

static inline uint32_t __attribute__((used)) getInfoLevel  () { ... }
static inline uint32_t __attribute__((used)) getDebugLevel  () { ... }

Thank you all for the investigation. Fixed in https://reviews.llvm.org/D95486

offloading/dynamic_module_load.c fails due to this commit in my environment with x86_64 offload. It fails because std::call_once cannot be used without libpthread, and libomptarget.rtl.x86_64.so is not linked with libpthread. Basically, gdb shows that include/c++/7.3.0/x86_64-linux-gnu/bits/gthr-default.h::__gthread_active_p() returns false, when std::call_once is called from libomptarget.rtl.x86_64.so. It does return true, though, when std::call_once is called from libomptarget.so, because libomptarget.so is linked with libpthread.

It looks like GCC11 will have a fix: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55394
For the time being, all plugins have to be linked with libpthread, since Debug.h is considered to be common code for all plugins.

Any disagreement?

offloading/dynamic_module_load.c fails due to this commit in my environment with x86_64 offload. It fails because std::call_once cannot be used without libpthread, and libomptarget.rtl.x86_64.so is not linked with libpthread. Basically, gdb shows that include/c++/7.3.0/x86_64-linux-gnu/bits/gthr-default.h::__gthread_active_p() returns false, when std::call_once is called from libomptarget.rtl.x86_64.so. It does return true, though, when std::call_once is called from libomptarget.so, because libomptarget.so is linked with libpthread.

It looks like GCC11 will have a fix: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55394
For the time being, all plugins have to be linked with libpthread, since Debug.h is considered to be common code for all plugins.

Any disagreement?

Linking libpthread also into the plugins seems plausible. Can you write the patch?