This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)
ClosedPublic

Authored by jdenny on Jul 2 2020, 10:32 AM.

Details

Summary

This implements OpenMP runtime support for the OpenMP TR8 present
map type modifier. The previous patch in this series implements Clang
front end support. See that patch summary for behaviors that are not
yet supported.

Diff Detail

Event Timeline

jdenny created this revision.Jul 2 2020, 10:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2020, 10:32 AM
grokos added inline comments.Jul 2 2020, 1:01 PM
openmp/libomptarget/src/device.cpp
209–212

I think this else-if should be moved right after the else-if that checks for explicit extension of mapping (i.e. after line 194) and outside the else if (Size) branch. If we have the present modifier then the data must be mapped already no matter whether a size is specified.

openmp/libomptarget/src/omptarget.cpp
321–338

Correct. The only case we can have a PTR_AND_OBJ entry is if we have a pointer which is member of a struct or element of an array. This means that whenever we have something like:

#pragma omp target map(s.p[0:N])

clang will first map the struct/array and then we'll visit the PTR_AND_OBJ entry, so the space for the pointer itself has been allocated already. I think it's safe to skip this check here (or you can leave it as a form of defensive programming).

jdoerfert added inline comments.Jul 2 2020, 1:29 PM
openmp/libomptarget/src/omptarget.cpp
494

Can we print the error in non debug mode as well? Is there a way we can provide more information on the location and variable?

jdenny marked 3 inline comments as done.Jul 7 2020, 10:46 AM
jdenny added inline comments.
openmp/libomptarget/src/device.cpp
209–212

In the case of Size == 0, as far as I can tell, nothing is ever placed in HostDataToTargetMap, with or without the patch under review. Instead, getOrAllocTgtPtr always handles this case as already mapped and returns NULL for it. I believe your suggested change would not affect this behavior. However, when HasPresentModifier, your suggested change would cause the Mapping required but does not exist debug message to be printed for this case, incorrectly implying that this case is never already mapped and that a runtime error should follow. A runtime error won't follow because Size == 0.

In the case of Size != 0, your suggested change would affect unified shared memory. As far as I can tell, unified shared memory is handled similarly to the case of Size == 0: it's never placed in HostDataToTargetMap, and getOrAllocTgtPtr always handles it as already mapped. However, when HasPresentModifier, your suggested change would cause getOrAllocTgtPtr to always handle unified shared memory as unmapped, and a runtime error will follow, so unified shared memory is then unusable with the present modifier.

In both cases, your suggested change seems to produce the wrong behavior. Do you agree?

I need to add tests to cover these cases.

openmp/libomptarget/src/omptarget.cpp
321–338

Thanks for confirming. I'll leave it but adjust the comment not to be a todo.

485–498

This should be TgtPtrBegin || !data_size, mirroring similar code in target_data_begin.

494

I would prefer that too. When I investigated this originally, I didn't find a good precedent in this runtime implementation, so I decided to be conservative. Is there an existing diagnostic I should model this after, or is this a unique case?

jdenny updated this revision to Diff 276199.Jul 7 2020, 1:44 PM
jdenny marked an inline comment as done.
jdenny edited the summary of this revision. (Show Details)

Added tests for cases of size-zero mappings and unified shared memory.

Fixed handling of size-zero mapping in target_data_end.

Converted TODO into a normal comment.

jdenny marked 3 inline comments as done.Jul 7 2020, 1:46 PM
jdenny added inline comments.
openmp/libomptarget/src/device.h
183

I see no remaining getOrAllocTgtPtr calls that make use of these default arguments. Any objection to removing them in this patch?

grokos added inline comments.Jul 7 2020, 1:51 PM
openmp/libomptarget/src/device.cpp
209–212

In the case of Size == 0, as far as I can tell, nothing is ever placed in HostDataToTargetMap, with or without the patch under review. Instead, getOrAllocTgtPtr always handles this case as already mapped and returns NULL for it.

This is not always true. We can have zero size and still get a valid device address. Consider the example below:

int *p = malloc(N);
#pragma omp target enter data map(to: p[0:N])
...
#pragma omp target
{
  p[0] = 1;
}

Here p[0:N] is mapped via enter data and address &p[0] is inserted into HostDataToTargetMap. When we encounter the target region, the compiler captures pointer p implicitly as it appears inside the target region, however the compiler doesn't know its size (it's a pointer, not a named array), therefore it will be mapped as p[0:0], i.e. as a zero-size pointer. getOrAllocTgtPtr will find address &p[0] in HostDataToTargetMap (lr.isContained) and return the corresponding device address. If we have demanded that p[0:0] is present on the device, then libomptarget must check for its presence even if size is 0. So if you invoke the target region like this:

#pragma omp target map(present, alloc : p[0:0])

then libomptarget must either:

  1. Return the corresponding device address of &p[0] if p[0:N] has been mapped before or
  2. Error out if &p[0] cannot be found in HostDataToTargetMap (e.g. if we had omitted the enter data directive in the example above) but we requested it to be present

The latter case is not covered by this version of the patch.

Regarding unified shared memory you're right. So my proposed change should be:

if (contained) {
...
} else if (explicit extension) {
...
} else if (HasPresentModifier && (!USM || (USM && Close) ) {
  error out
} else if (Size) {
...
}
openmp/libomptarget/src/omptarget.cpp
494

Unfortunately, libomptarget does not have access to source code location. __tgt_ API functions do not have a related argument, like the ident_t *loc argument of __kmpc_ API functions in libomp. Of course, if there is demand for such a feature we can implement it, although we would need to change *EVERY* API function in libomptarget.

jdenny marked an inline comment as done.Jul 7 2020, 2:34 PM
jdenny added inline comments.
openmp/libomptarget/src/device.cpp
209–212

Thanks for explaining. What is the correct behavior in the following case?

#pragma omp target data map(alloc : p[0:0])
#pragma omp target data map(present, alloc : p[0:0])
jdenny updated this revision to Diff 276254.Jul 7 2020, 4:24 PM

Changed present modifier to fail for zero-length array section if the base address is not already mapped, and updated tests.

jdenny marked 2 inline comments as done.Jul 7 2020, 4:30 PM
jdenny added inline comments.
openmp/libomptarget/src/device.cpp
209–212

Thanks for explaining. What is the correct behavior in the following case?

#pragma omp target data map(alloc : p[0:0])
#pragma omp target data map(present, alloc : p[0:0])

My latest update assumes the second directive should produce a runtime error because the first directive doesn't actually map anything. Please let me know if that's not right.

So my proposed change should be:

if (contained) {
...
} else if (explicit extension) {
...
} else if (HasPresentModifier && (!USM || (USM && Close) ) {

error out

} else if (Size) {
...
}

I organized this a little differently to avoid repeating the USM conditions, but I intend it to be equivalent.

grokos added inline comments.Jul 7 2020, 4:52 PM
openmp/libomptarget/src/device.cpp
209–212

Thanks for explaining. What is the correct behavior in the following case?

#pragma omp target data map(alloc : p[0:0])
#pragma omp target data map(present, alloc : p[0:0])

For the first one, if the object p is pointing to has been mapped before then there is no noticeable effect, the only thing that happens is that getOrAllocTgtPtr increments the RefCount of the mapping and returns the corresponding device address. If the object hasn't been mapped before and because size is zero, getOrAllocTgtPtr will return NULL (there is nothing to be allocated).

With the present modifier, if the object is already mapped, getOrAllocTgtPtr increments the RefCount and returns the corresponding device address (just like in the former case). If the object is not mapped, the present modifier will trigger an error - in other words NULL is not an option if we have requested something to be present.

openmp/libomptarget/src/device.h
183

Sure, they can go.

jdenny marked an inline comment as done.Jul 7 2020, 4:58 PM
jdenny added inline comments.
openmp/libomptarget/src/device.cpp
209–212

Sorry, I meant that these directives are nested and the array was not previously mapped. It sounds like the outer directive does not map it in that case, and so the inner directive produces a runtime error.

grokos added inline comments.Jul 7 2020, 5:00 PM
openmp/libomptarget/src/device.cpp
209–212

Correct.

jdenny updated this revision to Diff 276268.Jul 7 2020, 5:02 PM
jdenny marked an inline comment as done.

Removed unused default arguments from getOrAllocTgtPtr.

jdenny marked 2 inline comments as done.Jul 7 2020, 5:03 PM
jdenny marked 2 inline comments as done.Jul 7 2020, 5:07 PM
jdenny added inline comments.
openmp/libomptarget/src/device.cpp
209–212

Thanks for all the careful explanations! I believe the logic is now correct for zero-length array sections.

jdenny marked 3 inline comments as done.Jul 7 2020, 5:20 PM
jdenny added inline comments.
openmp/libomptarget/src/omptarget.cpp
494

Unfortunately, libomptarget does not have access to source code location. __tgt_ API functions do not have a related argument, like the ident_t *loc argument of __kmpc_ API functions in libomp. Of course, if there is demand for such a feature we can implement it, although we would need to change *EVERY* API function in libomptarget.

That would be nice, but in a later patch.

For this patch, perhaps just a simple message clarifying that the error is due to the present modifier would be sufficient. Should we define another macro next to FATAL_MESSAGE to print messages to stderr without exiting? The exit happens later when FATAL_MESSAGE is actually called.

grokos added inline comments.Jul 8 2020, 11:45 AM
openmp/libomptarget/src/omptarget.cpp
494

That would be good, such a message is missing from other places in libomptarget as well (e.g. when we explicitly try to extend an existing mapping). Currently such messages are only printed in debug output. It's important for the user to know why offload failed even if libomptarget is running in release mode.

jdenny updated this revision to Diff 276581.Jul 8 2020, 3:29 PM
jdenny marked an inline comment as done.

I added stderr messages for present violations, as discussed. I've named the new macro MESSAGE to contrast with FATAL_MESSAGE, but I'm open to suggestions.

I adjusted the new tests to check for these new messages instead of the LIBOMPTARGET_DEBUG=1 output.

jdenny marked 2 inline comments as done.Jul 8 2020, 3:30 PM
grokos accepted this revision.Jul 9 2020, 12:40 PM

The patch looks good now, thanks for all the work!

This revision is now accepted and ready to land.Jul 9 2020, 12:40 PM
jdenny added a comment.Jul 9 2020, 1:13 PM

Thanks for the review!

The Clang part of this (D83061) must land first, so I'll wait for that review.

jdenny added inline comments.Jul 16 2020, 8:54 AM
openmp/libomptarget/src/omptarget.cpp
492

This is right for omp target exit data, as in openmp/libomptarget/test/mapping/present/target_exit_data.c.

I think it's wrong for omp target data. My read of TR8 is that the present assertion does not happen on exit from a region. For example:

#pragma omp target data map(alloc:x)
{ 
  #pragma omp target data map(present,alloc:x)
  {
    #pragma omp target exit data map(delete:x)
  } // fails here but shouldn't
}
jdenny added inline comments.Jul 16 2020, 3:10 PM
openmp/libomptarget/src/omptarget.cpp
323

This comment is incorrect. We can also have PTR_AND_OBJ in the case of omp declare target link as discussed at http://lists.llvm.org/pipermail/openmp-dev/2020-July/003586.html.

grokos added inline comments.Jul 16 2020, 3:17 PM
openmp/libomptarget/src/omptarget.cpp
323

Right, the comment is not relevant anymore. But I think the conveyed message is still correct. Apart from structs/arrays, if we have a PTR_AND_OBJ for a global pointer then by definition the pointer itself has been mapped (since it's a global). Maybe update the comment? The logic is still correct I think.

jdenny added inline comments.Jul 16 2020, 3:34 PM
openmp/libomptarget/src/device.cpp
215

The \n in each of my MESSAGE calls should be dropped. MESSAGE already adds that.

jdenny added inline comments.Jul 16 2020, 4:07 PM
openmp/libomptarget/src/omptarget.cpp
323

Yes, I believe the code is right. I'll try to generalize the comment.

grokos added inline comments.Jul 16 2020, 4:54 PM
openmp/libomptarget/src/omptarget.cpp
323

I was just reminded that globals marked with declare target link are also PTR_AND_OBJ entries. declare target link already works in the exact same way as the proposed change we discussed at the July 15 telecon.

jdenny updated this revision to Diff 278782.Jul 17 2020, 8:29 AM

Rebased onto latest D83061. Updated tests to use -fopenmp-version=51.

Removed extra newlines from MESSAGE calls.

Tried to fix PTR_AND_OBJ comment.

jdenny marked 5 inline comments as done.Jul 17 2020, 8:43 AM
jdenny added inline comments.
openmp/libomptarget/src/omptarget.cpp
323

If we're so sure it's impossible to have !Pointer_TgtPtrBegin here, should we have an assert?

That way, if the situation evolves, testing will hopefully reveal when our assumptions here are wrong. Based on my understanding now, it would probably indicate a bug somewhere. But I guess it might just mean the assert then needs to be removed and the comment needs to be updated. Either way, it will be good to know about the change.

My understanding is that assert would only be active for debug builds. For release builds, we could still have the if and return OFFLOAD_FAIL for the sake of defensive programming.

492

In order to fix this, it looks like I could modify Clang to pass a new argument to __tgt_target_data_end to indicate whether it's just the end of a region (as in omp target exit data) or both the start and end of a region (as in omp target data).

Before I work on that, can somewhere confirm my understanding of TR8 behavior, described in the previous comment?

jdenny updated this revision to Diff 278878.Jul 17 2020, 12:33 PM
jdenny marked an inline comment as done.

Updated OMP_TGT_MAPTYPE_PRESENT to match new value of OMP_MAP_PRESENT in D83061.

jdenny updated this revision to Diff 278935.Jul 17 2020, 4:56 PM

Rebased.

grokos added inline comments.Jul 17 2020, 6:58 PM
openmp/libomptarget/src/omptarget.cpp
323

Yes, we can include an assertion here.

330

It's declare target link, not plain declare target.

jdenny added inline comments.Jul 20 2020, 2:10 PM
openmp/libomptarget/src/omptarget.cpp
323

The assert would currently fail for the following:

int *x;
#pragma omp declare target link(x)

int main() {
  #pragma omp target map(present, tofrom: x[0:3])
  ;
  return 0;
}
jdenny updated this revision to Diff 279636.Jul 21 2020, 2:56 PM
jdenny marked 6 inline comments as done.

Tried to fix comments on PTR_AND_OBJ.

Added fixme for case of present at exit from omp target data.

openmp/libomptarget/src/omptarget.cpp
323

I didn't add the assert. I updated the comments to describe the situation as I understand it. Please advise if it's not right.

492

In order to fix this, it looks like I could modify Clang to pass a new argument to __tgt_target_data_end to indicate whether it's just the end of a region (as in omp target exit data) or both the start and end of a region (as in omp target data).

I reversed "omp target exit data" and "omp target data" there. Otherwise, I believe that comment is correct.

For now, I've decided to add a fixme and handle this in a later patch after pushing this series.

grokos accepted this revision.Jul 21 2020, 3:08 PM

OK, let's proceed with this version now and later on we can fine-tune the behavior.

Thanks again for the review.

This revision was automatically updated to reflect the committed changes.
jdenny added inline comments.Jul 22 2020, 8:15 AM
openmp/libomptarget/src/omptarget.cpp
493

It looks like DPxMOD and DPxPTR are not defined for release builds, and so bots are failing. I had only tested debug builds.

Here's an excerpt of the relevant code from omptarget.h:

#ifdef OMPTARGET_DEBUG
#include <stdio.h>
#define DEBUGP(prefix, ...)                                                    \
  {                                                                            \
    fprintf(stderr, "%s --> ", prefix);                                        \
    fprintf(stderr, __VA_ARGS__);                                              \
  }

#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS
#endif

#include <inttypes.h>
#define DPxMOD "0x%0*" PRIxPTR
#define DPxPTR(ptr) ((int)(2*sizeof(uintptr_t))), ((uintptr_t) (ptr))

Any objection to moving relevant parts before the #ifdef OMPTARGET_DEBUG? Or is there a better solution?

Sure, go ahead!