Page MenuHomePhabricator

[OpenMP][FIX] Remove shadow pointer map and introduce consistent locking
AcceptedPublic

Authored by jdoerfert on Apr 8 2022, 10:16 PM.

Details

Summary

The shadow pointer map was problematic as we scanned an entire list if
an entry had shadow pointers. The new scheme stores the shadow pointers
inside the entries. This allows easy access without any search. It also
helps us, but also makes it necessary, to define a consistent locking
scheme. The implicit locking of entries via TargetPointerResultTy makes
this pretty effortless, however one has to:

  1. Lock HDTTMap before locking an entry.
  2. Do not lock HDTTMap while holding an entry lock.
  3. Hold the entry lock to read or modify an entry.

The changes to submitData and retrieveData have been made to ensure 2
when the LIBOMPTARGET_INFO flag is used. Most everything else happens by
itself as TargetPointerResultTy acts as a lock_guard for the entry. It
is a little complicated when we deal with multiple entries, especially
as they can be equal. However, one can still follow the rules with
reasonable effort.

Diff Detail

Event Timeline

jdoerfert created this revision.Apr 8 2022, 10:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2022, 10:16 PM
jdoerfert requested review of this revision.Apr 8 2022, 10:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2022, 10:16 PM
Herald added a subscriber: sstefan1. · View Herald Transcript
jdoerfert updated this revision to Diff 421690.Apr 8 2022, 10:19 PM

Improve comment

tianshilei1992 added inline comments.Apr 13 2022, 9:40 AM
openmp/libomptarget/include/device.h
96

We don't use the order here, right? If so, unordered_map can usually get better performance.

jdoerfert added inline comments.Apr 13 2022, 11:03 AM
openmp/libomptarget/include/device.h
96

We do not use the order, correct. I can change it though I doubt it makes much of a difference.

jdoerfert added inline comments.Apr 13 2022, 12:53 PM
openmp/libomptarget/include/device.h
96

That comparison is not helpful. Our expected size is *probably* < 10.

tianshilei1992 added inline comments.Apr 13 2022, 1:04 PM
openmp/libomptarget/include/device.h
96

It also says:

TL;DR

Results: the unordered_map shows its superiority as soon as there is data in the map. The only time it exhibits worse performance than the ordered map is when the maps are empty.

The answer also shows the search speed is different, not just insertion speed.

Actually, based on all the uses of ShadowPtrInfos, it doesn't have to be a map at all because we never need the key-value relation. It can be simply an unordered_set<ShadowPtrInfoTy> hashed by ShadowPtrInfoTy::HstPtrAddr. The only place we need the "hash" feature is when adding an element.

The new scheme stores the shadow pointers inside the entries.

Makes a lot more sense now.

openmp/libomptarget/include/device.h
352

Please delete copy constructor and copy assign operator.

379

I don't get the intention of this function. Need documentation. It also seems buggy. Don't you need to unlock Entry before assign?

415

Do you know what this is? Add some documentation?

I'm afraid you are abusing existing mutex. Please be specific about what part of the Entry you are trying to protect with TPR lock/unlock.

openmp/libomptarget/include/device.h
376
void lock() const { States->UpdateMtx.lock(); }
void unlock() const { States->UpdateMtx.unlock(); }

the mutex is only intended to protect data transfer submission.

jdoerfert added inline comments.Apr 14 2022, 1:14 PM
openmp/libomptarget/include/device.h
352

Aren't they deleted anyway? I can make it explicit, sure.

376

The mutex is now used to protect the entire entry. I will update the comment.
While the data transfer is running the entry should not be looked at, similarly, while anything might modify the entry, the entry should not be looked at. Hence, one can only get the entry locked.

379

It's not buggy, IMHO. Assignment will swap the entry, which may or may not be locked. The new object goes out of scope immediately which will trigger the destructor. If there is an entry it will be unlocked, if there was none, all is already good.

This function is needed to give up ownership of an entry early, that is while the TPR is still live but effectively not needed anymore.

415

We do know what this is. And we should add some documentation. We should do so in a separate patch though.

ye-luo added inline comments.Apr 14 2022, 1:37 PM
openmp/libomptarget/include/device.h
352

I'm not sure if they are deleted by default. So helpful to make them deleted.

379

You are right. The new object will be responsible to call the unlock via the destructor. So this is correct.

415

Then do it separately. Probably also for PendingCtorsDtorsPerLibrary PendingCtorsDtors;

jdoerfert updated this revision to Diff 423309.Apr 17 2022, 1:06 PM
jdoerfert marked 2 inline comments as done.

Fix rebase issues, address comments

jdoerfert updated this revision to Diff 423314.Apr 17 2022, 2:38 PM

Use set instead of map

tianshilei1992 accepted this revision.Apr 18 2022, 1:43 PM

LGTM. Like @ye-luo suggested, we need to update the documents of lock and unlock as they are being used in a different way now.

This revision is now accepted and ready to land.Apr 18 2022, 1:43 PM