This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][NFC] Simplify targetDataEnd conditions for CopyMember
ClosedPublic

Authored by jdenny on Jul 14 2021, 8:30 AM.

Details

Summary

targetDataEnd and targetDataBegin compute CopyMember/copy differently,
and I don't see why they should. This patch eliminates one of those
differences by making a simplifying NFC change to targetDataEnd.

The change is NFC as follows. The change only affects the case when
!UNIFIED_SHARED_MEMORY || HasCloseModifier. In that case, the
following points are always true:

  • The value of CopyMember is relevant later only if DelEntry = false.
  • DelEntry = false only if one of the following is true:
    • IsLast = false. In this case, it's always true that CopyMember = false = IsLast.
    • MEMBER_OF && !PTR_AND_OBJ is true. In this case, CopyMember = IsLast.
  • Thus, if CopyMember is relevant, CopyMember = IsLast.

Diff Detail

Event Timeline

jdenny created this revision.Jul 14 2021, 8:30 AM
jdenny requested review of this revision.Jul 14 2021, 8:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2021, 8:30 AM
Herald added a subscriber: sstefan1. · View Herald Transcript

I think we can simplify things even further. I have inlined my reasoning. Please double check in case I missed something.

openmp/libomptarget/src/omptarget.cpp
739–741

cond1: This condition is useless because... see below.

746–747

cond2: This is equivalent to cond1:

!(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin) is equivalent to
!(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) || TgtPtrBegin != HstPtrBegin.

TgtPtrBegin != HstPtrBegin if USM is not enabled OR if we have used the close modifier to allocate data on the device instead of host RAM, so:
TgtPtrBegin != HstPtrBegin is equivalent to
!(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) || HasCloseModifier==true.

It follows that cond2 is equivalent to:
!(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) || HasCloseModifier
which is cond1.

DelEntry, Always and CopyMember are relevant only if cond2 is true. So we can delete the if(cond1) above and leave only this bit:

if (IsLast)
  CopyMember = true;

But then, this would be equal to setting CopyMember = IsLast:

  • If IsLast is false, then CopyMember will be false (that's how it was declare a few lines above`.
  • If IsLast is true, CopyMember is set to true, which is the value of IsLast.

In all cases, CopyMember = IsLast.
So we can get rid of CopyMember entirely and leave the code like this:

if (ArgTypes[I] & OMP_TGT_MAPTYPE_FROM) {
  bool Always = ArgTypes[I] & OMP_TGT_MAPTYPE_ALWAYS;

  if ((DelEntry || Always || IsLast) &&
      !(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY &&
        TgtPtrBegin == HstPtrBegin)) {

Thanks for the review. I'd feel more comfortable with one small simplification at a time. That is, a subsequent patch (which I'm happy to write if you'd like) could make the further simplification you're proposing and adjust targetDataBegin to match. Does that seem reasonable?

openmp/libomptarget/src/omptarget.cpp
746–747

Looks right to me, but it's possible I don't know of some additional condition in which TgtPtrBegin == HstPtrBegin.

Assuming that, as you say, all of the following are equivalent, which is best to use in the code?

  • !(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin)
  • TgtPtrBegin != HstPtrBegin
  • !(PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) || HasCloseModifier

The second is most succinct, but the last is clearest in my opinion because it spells out the cases. Maybe we should use the second but assert that it's equivalent to the last?

jdenny updated this revision to Diff 364886.Aug 6 2021, 2:46 PM

Rebased.

I'm working on the suggested changes for a separate patch. However, seeing D104555's changes to related code in targetDataBegin, I wonder if there's another patch out there that's about to rewrite this code in targetDataEnd too.

grokos accepted this revision.Aug 6 2021, 3:15 PM

Rebased.

I'm working on the suggested changes for a separate patch. However, seeing D104555's changes to related code in targetDataBegin, I wonder if there's another patch out there that's about to rewrite this code in targetDataEnd too.

I asked the same question in D104555 and @tianshilei1992 said that there are no races to be fixed at the end of a target region, so I assume targetDataEnd won't get touched by the async patches.

This revision is now accepted and ready to land.Aug 6 2021, 3:15 PM
grokos added a comment.Aug 6 2021, 3:16 PM

Thanks for the review. I'd feel more comfortable with one small simplification at a time. That is, a subsequent patch (which I'm happy to write if you'd like) could make the further simplification you're proposing and adjust targetDataBegin to match. Does that seem reasonable?

Sure, no problem!

grokos added inline comments.Aug 6 2021, 3:21 PM
openmp/libomptarget/src/omptarget.cpp
746–747

The second is *almost* equivalent to the last with one exception: in the most unlikely coincidence ever, a device address may happen to be the same number as a host address (i.e. although we have no USM support, a call to e.g. cudaMalloc may return a device pointer which happens to have the same value as a host pointer returned by malloc). Super unlikely, but let's stick with the last condition.

jdenny added a comment.Aug 6 2021, 4:46 PM
openmp/libomptarget/src/omptarget.cpp
746–747

Ah, of course. Thanks.

What should happen if close is specified on an omp target enter data but not on the corresponding omp target exit data? (See test close_enter_exit.c.) To handle that case, don't we have to use the first version of the above condition in targetDataEnd?

I wonder how close can ever be meaningful when specified on an omp target exit data.

jdenny added inline comments.Aug 6 2021, 5:18 PM
openmp/libomptarget/src/omptarget.cpp
746–747

Another case where the first condition appears to be required is when omp_target_associate_ptr circumvents unified shared memory (as in the test unified_shared_memory/api.c). I think both targetDataEnd and targetDataBegin are wrong for this case.

jdenny added inline comments.Aug 6 2021, 5:29 PM
openmp/libomptarget/src/omptarget.cpp
746–747

How quickly I forgot your point. The first condition cannot be used to handle either of these situations because the addresses might happen to be the same even with discrete memory. We'll need a new condition.

jdenny added inline comments.Aug 11 2021, 1:08 PM
openmp/libomptarget/src/omptarget.cpp
746–747

These remaining issues are addressed by D107925, D107926, D107927, and D107928.