For example, without this patch:
    $ cat test.c
    int main() {
      int x;
      #pragma omp target enter data map(alloc: x)
      #pragma omp target exit data map(release: x)
      ;
      return 0;
    }
    $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda test.c
    $ LIBOMPTARGET_DEBUG=1 ./a.out |& grep 'Creating\|Mapping exists'
    Libomptarget --> Creating new map entry with HstPtrBegin=0x00007ffcace8e448, TgtPtrBegin=0x00007f12ef600000, Size=4, Name=unknown
    Libomptarget --> Mapping exists with HstPtrBegin=0x00007ffcace8e448, TgtPtrBegin=0x00007f12ef600000, Size=4, updated RefCount=1
There are two problems in this example:
- RefCount is not reported when a mapping is created, but it might be 1 or infinite. In this case, because it's created by omp target enter data, it's 1. Seeing that would make later RefCount messages easier to understand.
- RefCount is still 1 at the omp target exit data, but it's reported as "updated". The reason it's still 1 is that, upon deletions, the reference count is generally not updated in DeviceTy::getTgtPtrBegin, where the report is produced. Instead, it's zeroed later in DeviceTy::deallocTgtPtr, where the entry is actually removed from the mapping table.
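
The second problem can be illustrated with a small toy model. This is only a sketch: ToyEntry, ToyDevice, toyLookupForRelease, and toyDealloc are hypothetical names invented here, not the libomptarget data structures or functions, and the message strings are illustrative. It shows a lookup step that leaves the count at 1 on the final release and a separate step that removes the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy model of a mapping table; not the libomptarget types.
struct ToyEntry {
  uint64_t RefCount;
};

struct ToyDevice {
  std::map<const void *, ToyEntry> Table;

  // Lookup on a release. If other references remain, decrement now.
  // If this is the last reference, leave the count at 1 and report
  // that the final decrement is deferred to toyDealloc.
  std::string toyLookupForRelease(const void *HstPtr, bool &IsLast) {
    auto It = Table.find(HstPtr);
    assert(It != Table.end() && "mapping must exist");
    ToyEntry &E = It->second;
    IsLast = (E.RefCount == 1);
    if (IsLast)
      return "RefCount=1 (deferred final decrement)";
    --E.RefCount;
    return "RefCount=" + std::to_string(E.RefCount) + " (decremented)";
  }

  // Removal step: the entry leaves the table here, not in the lookup.
  void toyDealloc(const void *HstPtr) {
    std::size_t Erased = Table.erase(HstPtr);
    assert(Erased == 1 && "entry must exist when deallocating");
    (void)Erased; // avoid an unused-variable warning when NDEBUG is set
  }
};
```

In this model, a caller that sees IsLast set to true still finds RefCount equal to 1 in the table until toyDealloc runs, which is why a report produced during the lookup cannot honestly say the count was updated.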
This patch makes the following changes:
- Report the reference count when creating a mapping.
- Where an existing mapping is reported, always report a reference count action:
    - update suppressed when UpdateRefCount=false
    - incremented
    - decremented
    - deferred final decrement, which replaces the misleading "updated" in the above example
- Add comments to DeviceTy::getTgtPtrBegin to explain why it does not zero the reference count. (Please advise if these comments miss the point.)
- For unified shared memory, don't report confusing messages like "RefCount=" or "RefCount= updated", given that reference counts are irrelevant in this case. Instead, just report "for unified shared memory".
- Use INFO not DP consistently for Mapping exists messages.
- Fix device table dumps to print INF instead of -1 for an infinite reference count.
We'll also want to do this for dumping the device table, since anything with an infinite ref count will be printed at the end of the program.
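
As a rough illustration of the reporting changes listed above, the sketch below formats a reference count together with its action. All names (ToyInfRefCount, ToyAction, toyRefCountToStr, toyDescribe) are hypothetical, and the sketch assumes the infinite count is stored as the maximum unsigned 64-bit value, which is what would show up as -1 if printed as a signed integer. The exact message layout is an assumption as well; only the four action names come from the list above.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Assumption: "infinite" is encoded as the all-ones 64-bit value.
constexpr uint64_t ToyInfRefCount = std::numeric_limits<uint64_t>::max();

// The four actions named in the patch description.
enum class ToyAction {
  UpdateSuppressed,
  Incremented,
  Decremented,
  DeferredFinalDecrement
};

// Print INF rather than the raw sentinel value.
std::string toyRefCountToStr(uint64_t RefCount) {
  return RefCount == ToyInfRefCount ? "INF" : std::to_string(RefCount);
}

const char *toyActionToStr(ToyAction A) {
  switch (A) {
  case ToyAction::UpdateSuppressed:
    return "update suppressed";
  case ToyAction::Incremented:
    return "incremented";
  case ToyAction::Decremented:
    return "decremented";
  case ToyAction::DeferredFinalDecrement:
    return "deferred final decrement";
  }
  return "unknown"; // unreachable for valid enum values
}

// Hypothetical message fragment: the count followed by its action.
std::string toyDescribe(uint64_t RefCount, ToyAction A) {
  return "RefCount=" + toyRefCountToStr(RefCount) + " (" +
         toyActionToStr(A) + ")";
}
```

Routing both the per-mapping messages and the device table dump through one helper like toyRefCountToStr would keep the INF spelling consistent between them.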