This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][FIX] Avoid races in the handling of to be deleted mapping entries
ClosedPublic

Authored by jdoerfert on Mar 5 2022, 1:51 PM.

Details

Summary

If we decided to delete a mapping entry we did not act on it right away
but first issued and waited for memory copies. In the meantime some
other thread might reuse the entry. While there was some logic to avoid
colliding on the actual "deletion" part, there were two races happening:

  1. The data transfer back of the thread deleting the entry and the data transfer back of the thread taking over the entry raced.
  2. The update to the shadow map happened regardless if the entry was actually reused by another thread which left the shadow map in a inconsistent state.

To fix both issues we will now update the shadow map and delete the
entry only if we are sure the thread is responsible for deletion, hence
no other thread took over the entry and reused it. We also wait for a
potential former data transfer from the device to finish before we issue
another one that would race with it.

Diff Detail

Event Timeline

jdoerfert created this revision.Mar 5 2022, 1:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2022, 1:51 PM
jdoerfert requested review of this revision.Mar 5 2022, 1:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2022, 1:51 PM
Herald added a subscriber: sstefan1. · View Herald Transcript
tianshilei1992 added inline comments.Mar 5 2022, 2:23 PM
openmp/libomptarget/src/device.cpp
370

Unfortunately, __kmpc_global_thread_num cannot give you a unique id. I did a simple test:

#include <cstdint>
#include <cstdio>
#include <thread>
#include <vector>

extern "C" int32_t __kmpc_global_thread_num(void *);

void foo() {
  std::chrono::seconds timespan(1);
  std::this_thread::sleep_for(timespan);
  int gtid = __kmpc_global_thread_num(nullptr);
  printf("gtid = %d.\n", gtid);
}

int main(int argc, char *argv[]) {
  std::vector<std::thread> pool(8);
  for (int i = 0; i < 8; ++i) {
    pool[i] = std::thread(foo);
  }

  for (std::thread &t : pool) {
    if (t.joinable()) {
      t.join();
    }
  }

  return 0;
}

It prints:

gtid = 0.
gtid = 9.
gtid = 10.
gtid = 9.
gtid = 10.
gtid = 9.
gtid = 10.
gtid = 9.

Since libomptarget is not forced to use OpenMP thread, we cannot use functions in libomp to tell thread id. std::thread::id should be used here I think.

jdoerfert added inline comments.Mar 5 2022, 2:26 PM
openmp/libomptarget/src/device.cpp
370

Interesting. I was expecting only openmp threads but you are right. Let's do std::trhead::id then.

jdoerfert updated this revision to Diff 413255.Mar 5 2022, 2:33 PM

Use std::thread::id

jdoerfert marked an inline comment as done.Mar 5 2022, 2:33 PM
tianshilei1992 accepted this revision.Mar 11 2022, 8:27 AM

LG. That also means we have to hold the lock longer than before, but for the correctness, we don't have choice.

This revision is now accepted and ready to land.Mar 11 2022, 8:27 AM
This revision was landed with ongoing or failed builds.Mar 28 2022, 8:34 PM
This revision was automatically updated to reflect the committed changes.
ronlieb added inline comments.
openmp/libomptarget/test/mapping/map_back_race.cpp
32

seeing sporadic failures on this test, in the amdgpu openmp buildbot. return code -11
i think we need a return 0; here

jdoerfert added inline comments.Mar 29 2022, 6:37 AM
openmp/libomptarget/test/mapping/map_back_race.cpp
32

main should return 0 by default, IIRC. return code -11 sounds more like we still have a bug -.-

jdoerfert added inline comments.Mar 29 2022, 6:47 AM
openmp/libomptarget/test/mapping/map_back_race.cpp
32

@ronlieb It seems we only fail for host (x86) offloading. One needs to check but I'd expect the error code -11 to be 139 (128 + 11 = 139), which would be a common segfault. given it's on the host we can probably use gdb to pinpoint it. Feel free to disable it for the host for now, that should make the bot green at least.