This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] libomptarget: Don't map alignment padding to host
ClosedPublic

Authored by jdenny on May 2 2023, 1:26 PM.

Details

Summary

In the case of partially mapped structs, libomptarget sometimes adds
padding to device allocations to ensure they are aligned properly.
However, without this patch, it considers that padding to be mapped to
the host, which can cause presence checks (e.g.,
omp_target_is_present or a present modifier) to misbehave for
unmapped parts of the struct. This patch keeps the padding but treats
it as unmapped. See the new test case for examples.

Diff Detail

Event Timeline

jdenny created this revision.May 2 2023, 1:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2023, 1:26 PM
jdenny requested review of this revision.May 2 2023, 1:26 PM
jdenny edited the summary of this revision. (Show Details)May 9 2023, 6:40 AM

Ping.

I understand what you are trying to achieve here and indeed this patch does exactly what you want to. But my question is, do we really want to allow this behavior? The test case which was added to demonstrate what problem this patch fixes looks illegal from an OpenMP perspective. It is a case of "explicitly mapping more" of the same object, which is forbidden in OpenMP. I wasn't sure whether that restriction has been relaxed and it slipped under my radar, but the latest TR11 spec still says about map clauses (https://www.openmp.org/wp-content/uploads/openmp-TR11.pdf, pg. 159, lines 29-32):

If a list item is an element of a structure, and a different element of the structure has a
corresponding list item in the device data environment prior to a task encountering the
construct associated with the map clause, then the list item must also have a corresponding
list item in the device data environment prior to the task encountering the construct.

In other words, if you want to map a struct member via a later map clause and other members of the same struct have been mapped via an earlier map clause, then the former member should also have been mapped via the earlier map clause.

If there is a valid scenario that doesn't work without this patch then:

  1. The valid scenario should be added as a new test case to highlight what this patch fixes
  2. Modify this patch so that the illegal scenario in padding_not_mapped.c still fails, as libomptarget correctly detects the attempt to "explicitly map more of the same struct"
jdenny abandoned this revision.May 10 2023, 1:56 PM

It is a case of "explicitly mapping more" of the same object, which is forbidden in OpenMP.

Indeed, I misunderstood. Sorry for the noise, and thanks for explaining. I'll abandon the patch.

If there is a valid scenario that doesn't work without this patch

I'm not sure there is.

Modify this patch so that the illegal scenario in padding_not_mapped.c still fails, as libomptarget correctly detects the attempt to "explicitly map more of the same struct"

For the record, while I designed that test to trigger detection, libomptarget does not always detect such cases. For example, the second pair of directives in the existing test openmp/libomptarget/test/mapping/low_alignment.c seem to be designed to avoid detection. In other cases, detection depends on the amount of padding on the device, which depends on how the host memory happens to be aligned, which can vary from run to run of the same program. That is, detection is non-deterministic.

Now that I have a better handle on what's happening, it seems to me that the second pair of directives in openmp/libomptarget/test/mapping/low_alignment.c should be removed. I think that test only passes when all of the following conditions are true, but OpenMP doesn't guarantee them to be true:

  1. The runtime doesn't diagnose the restriction violation because map(from : t.i, t.j) has too little device padding for map(from : t.a) to collide with it.
  1. t.i = 21; and t.j = 31; do not produce memory access violations even though they write to the offsets of t.i and t.j computed from the base device address of t from map(from : t.a) and even though those offsets are not guaranteed to be allocated on the device.
  1. The desired values for t.i and t.j are copied back to the host even though t.i = 21; and t.j = 31; do *not* write to the offset of t.i and t.j computed from the base device address of t from map(from : t.i, t.j). The reason the correct values are copied back is that the previous kernel wrote the same values to s.i and s.j, which happened to have the same device addresses.

Agreed?

jdenny updated this revision to Diff 529421.Jun 7 2023, 3:35 PM
jdenny edited the summary of this revision. (Show Details)

If there is a valid scenario that doesn't work without this patch then:

After further thought, I believe there are several such scenarios. For a struct member that only maps to device padding and thus shouldn't be considered present, omp target update should have no effect, omp_target_is_present should return false, and present modifiers should produce runtime errors. Did I miss anything in the spec that makes these scenarios invalid?

I updated the test and patch summary for these scenarios, and I rebased.

jdoerfert accepted this revision.Jun 30 2023, 9:48 AM

I think the motivation is valid. The code looks good, though it's early for me. It seems to pass all our alignment tests, which is a good start. LG

This revision is now accepted and ready to land.Jun 30 2023, 9:48 AM
This revision was automatically updated to reflect the committed changes.
jdenny added a comment.Jul 3 2023, 7:25 AM

Thanks for the review.

cseo added a subscriber: cseo.Jul 4 2023, 7:40 AM

I think there's a corner case you may have missed. I'm still seeing this on aarch64:

/llvm-project/openmp/libomptarget/test/mapping/target_derefence_array_pointrs.cpp:28:12: error: CHECK: expected string not found in input
 // CHECK: 3
           ^
<stdin>:1:103: note: scanning from here
Libomptarget message: explicit extension not allowed: host address specified is 0x0000ffffd26ff7c0 (12 bytes), but device allocation maps to host at 0x0000ffffd26ff7c0 (8 bytes)
                                                                                                      ^
<stdin>:1:104: note: possible intended match here
Libomptarget message: explicit extension not allowed: host address specified is 0x0000ffffd26ff7c0 (12 bytes), but device allocation maps to host at 0x0000ffffd26ff7c0 (8 bytes)

Your patch fixes most failures, except this one. Fails with and without LTO enabled. Could you please check?

I think there's a corner case you may have missed. I'm still seeing this on aarch64:

/llvm-project/openmp/libomptarget/test/mapping/target_derefence_array_pointrs.cpp:28:12: error: CHECK: expected string not found in input
 // CHECK: 3
           ^
<stdin>:1:103: note: scanning from here
Libomptarget message: explicit extension not allowed: host address specified is 0x0000ffffd26ff7c0 (12 bytes), but device allocation maps to host at 0x0000ffffd26ff7c0 (8 bytes)
                                                                                                      ^
<stdin>:1:104: note: possible intended match here
Libomptarget message: explicit extension not allowed: host address specified is 0x0000ffffd26ff7c0 (12 bytes), but device allocation maps to host at 0x0000ffffd26ff7c0 (8 bytes)

Your patch fixes most failures, except this one. Fails with and without LTO enabled. Could you please check?

I'm not testing on aarch64, but that test fails for me both before and after this patch. This patch is about padding generated for a partial struct mapping, which is irrelevant to that test, as far as I can tell.

cseo added a comment.Jul 6 2023, 4:53 PM

I think there's a corner case you may have missed. I'm still seeing this on aarch64:

/llvm-project/openmp/libomptarget/test/mapping/target_derefence_array_pointrs.cpp:28:12: error: CHECK: expected string not found in input
 // CHECK: 3
           ^
<stdin>:1:103: note: scanning from here
Libomptarget message: explicit extension not allowed: host address specified is 0x0000ffffd26ff7c0 (12 bytes), but device allocation maps to host at 0x0000ffffd26ff7c0 (8 bytes)
                                                                                                      ^
<stdin>:1:104: note: possible intended match here
Libomptarget message: explicit extension not allowed: host address specified is 0x0000ffffd26ff7c0 (12 bytes), but device allocation maps to host at 0x0000ffffd26ff7c0 (8 bytes)

Your patch fixes most failures, except this one. Fails with and without LTO enabled. Could you please check?

I'm not testing on aarch64, but that test fails for me both before and after this patch. This patch is about padding generated for a partial struct mapping, which is irrelevant to that test, as far as I can tell.

OK, thanks! I'll take a look then.

jdenny added a comment.Jul 7 2023, 8:04 AM

OK, thanks! I'll take a look then.

If you want to fix target_derefence_array_pointrs.cpp, I suggest looking at its function zoo. It explicitly maps the pointer sa when mapping *sa is needed. Its mapping has memory aliasing between sa and sa->sa. I'm not sure what the test intended, and I'm not confident how the implementation is meant to handle those cases, so you might want to contact the test's author. (It would also be nice to fix the typos in the test name while you're there.)