Page MenuHomePhabricator

[OpenMP] Improve ref count debug messages

Authored by jdenny on Jun 18 2021, 11:10 AM.



For example, without this patch:

$ cat test.c
int main() {
  int x;
  #pragma omp target enter data map(alloc: x)
  #pragma omp target exit data map(release: x)
  return 0;
$ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda test.c
$ LIBOMPTARGET_DEBUG=1 ./a.out |& grep 'Creating\|Mapping exists'
Libomptarget --> Creating new map entry with HstPtrBegin=0x00007ffcace8e448, TgtPtrBegin=0x00007f12ef600000, Size=4, Name=unknown
Libomptarget --> Mapping exists with HstPtrBegin=0x00007ffcace8e448, TgtPtrBegin=0x00007f12ef600000, Size=4, updated RefCount=1

There are two problems in this example:

  • RefCount is not reported when a mapping is created, but it might be 1 or infinite. In this case, because it's created by omp target enter data, it's 1. Seeing that would make later RefCount messages easier to understand.
  • RefCount is still 1 at the omp target exit data, but it's reported as updated. The reason it's still 1 is that, upon deletions, the reference count is generally not updated in DeviceTy::getTgtPtrBegin, where the report is produced. Instead, it's zeroed later in DeviceTy::deallocTgtPtr, where it's actually removed from the mapping table.

This patch makes the following changes:

  • Report the reference count when creating a mapping.
  • Where an existing mapping is reported, always report a reference count action:
    • update suppressed when UpdateRefCount=false
    • incremented
    • decremented
    • deferred final decrement, which replaces the misleading updated in the above example
  • Add comments to DeviceTy::getTgtPtrBegin to explain why it does not zero the reference count. (Please advise if these comments miss the point.)
  • For unified shared memory, don't report confusing messages like RefCount= or RefCount= updated given that reference counts are irrelevant in this case. Instead, just report for unified shared memory.
  • Use INFO not DP consistently for Mapping exists messages.
  • Fix device table dumps to print INF instead of -1 for an infinite reference count.

Diff Detail

Event Timeline

jdenny created this revision.Jun 18 2021, 11:10 AM
jdenny requested review of this revision.Jun 18 2021, 11:10 AM
jdenny set the repository for this revision to rG LLVM Github Monorepo.Jun 18 2021, 11:11 AM
jdenny added a subscriber: openmp-commits.
Herald added a project: Restricted Project. · View Herald TranscriptJun 18 2021, 11:11 AM

Thanks for addressing this. Please update the documentation as well ./openmp/docs/design/Runtimes.rst.


Could probably do the same here, having nested ternary might look a little silly however.


We'll also want to do this for dumping the device table, since anything with an infinite ref count will be printed at the end of the program.

Thanks for addressing this. Please update the documentation as well ./openmp/docs/design/Runtimes.rst.

Thanks, I'll work on that.


I would prefer to determine the action and the record of the action at the same time instead of duplicating the logic. I think that's easier to maintain. However, this case is simple, so I can change it if you feel strongly.


In this more complicated case (and D104560 makes it even more complicated), I feel strongly that we should determine the action and the record of the action at the same time instead of duplicating the logic. Is that ok with you?


Thanks, I'll change that. Is it just dumpTargetPointerMappings?

jhuber6 added inline comments.Jun 18 2021, 12:14 PM

It's fine as is if you're going to be using it for something more complicated. If it was just printing it this would be the best option.


Yes, where it calls getRefCount.

jdenny updated this revision to Diff 353377.Jun 21 2021, 8:07 AM
jdenny marked 5 inline comments as done.
jdenny edited the summary of this revision. (Show Details)

Applied jhuber6's suggestions:

  • Fixed INF case in dumpTargetPointerMappings too. Extended existing test to cover it.
  • For the simple RefCountAction case, used the conditional operator.
  • Updated openmp/docs/design/Runtimes.rst.
jhuber6 accepted this revision.Jun 21 2021, 12:59 PM

LGTM if you fix the documentation size.



This needs to fit in the page. I omitted the libomptarget device 0 stuff to help that. Also I had to manually dro psome lined down.

This revision is now accepted and ready to land.Jun 21 2021, 12:59 PM
jdenny updated this revision to Diff 353493.Jun 21 2021, 2:30 PM

Reformatted documentation change to fit better on the page.

jdenny marked an inline comment as done.Jun 21 2021, 2:31 PM
jdenny added inline comments.

I'm not sure what page width you're going for, but I think it's similar to the original now.

grokos added inline comments.Jun 21 2021, 2:56 PM

Just to restore the faster emplace method, we can have something like this:

auto newEntry = HostDataToTargetMap.emplace(HostDataToTargetTy(base, begin,...));
INFO(..., newEntry.refCountToStr().c_str(), ...);
grokos added inline comments.Jun 22 2021, 1:22 AM

Actually, you don't even need to create a HostDataToTargetTy object inside emplace, just call the function with the arguments directly (like in D104580):

auto newEntry = HostDataToTargetMap.emplace(base, begin,...);
INFO(..., newEntry.refCountToStr().c_str(), ...);
jdenny updated this revision to Diff 353638.EditedJun 22 2021, 7:09 AM
jdenny marked 3 inline comments as done.

Applied grokos's suggestions: used emplace instead of insert in DeviceTy::getOrAllocTgtPtr. Did the same for the existing case in DeviceTy::associatePtr.

Thanks for the review.

grokos accepted this revision.Jun 22 2021, 1:47 PM
This revision was landed with ongoing or failed builds.Jun 23 2021, 6:59 AM
This revision was automatically updated to reflect the committed changes.
jdenny added inline comments.Jun 23 2021, 7:08 AM

When I rebased before pushing, there were conflicts with D104580, which makes similar adjustments here for emplace. I resolved to the version in this patch because it uses LLVM's typical function argument comment style. I'm happy to push another patch to make any desired adjustments.