Asynchronous data movement can cause a data race on targets that support it.
Details can be found in [1]. This patch tries to fix the problem by attaching
an event to each entry of the data mapping table. The details are as follows.
For each issued data movement, a new event is generated and returned to libomptarget
by calling createEvent. The event will be attached to the corresponding mapping table
entry.
For each data mapping lookup, if no data movement is needed, the attached
event has to be inserted into the queue to guarantee that all following
operations in the queue are executed only after the event is fulfilled.
This design avoids synchronization on the host side.
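
The control flow described above can be sketched as follows. This is only a
host-side model, not the code in this patch: Queue, MapEntry, MappingTable and
mapData are hypothetical names, events are plain integers, and "no data
movement needed" is assumed to mean "the host pointer is already mapped".

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model: an event is just an integer id.
using Event = int;

// Hypothetical queue that records the operations pushed into it, in order.
struct Queue {
  std::vector<std::string> Ops;
  void copy() { Ops.push_back("copy"); }
  void recordEvent(Event E) { Ops.push_back("record " + std::to_string(E)); }
  void waitEvent(Event E) { Ops.push_back("wait " + std::to_string(E)); }
};

// Hypothetical mapping table entry carrying the attached event.
struct MapEntry {
  Event Ev = 0;
};

struct MappingTable {
  std::unordered_map<const void *, MapEntry> Entries;
  Event NextEvent = 1;

  void mapData(const void *HostPtr, Queue &Q) {
    auto It = Entries.find(HostPtr);
    if (It == Entries.end()) {
      // Data movement is issued: create a new event behind the copy and
      // attach it to the new mapping table entry.
      Event E = NextEvent++;
      Q.copy();
      Q.recordEvent(E);
      Entries[HostPtr].Ev = E;
      return;
    }
    // No data movement is needed: insert the attached event into this queue
    // so that later operations in the queue run only after it is fulfilled.
    Q.waitEvent(It->second.Ev);
  }
};
```

In this model, a second queue that looks up an already mapped pointer gets a
single "wait" operation instead of a host-side synchronization.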
Note that we are using CUDA terminology here. A similar mechanism is assumed to
be supported by other targets. Even if a target doesn't support it, it can
be easily implemented with the following fallback:
- Event can be any kind of flag that has at least two states, 0 and 1.
- waitEvent can directly busy-loop while Event is still 0.
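
A minimal sketch of such a fallback, assuming a host-visible atomic flag.
FallbackEvent and fulfillEvent are hypothetical names used for illustration,
and the waitEvent signature here is not the plugin interface:

```cpp
#include <atomic>
#include <thread>

// Hypothetical fallback event: a flag with two states,
// 0 (pending) and 1 (fulfilled).
struct FallbackEvent {
  std::atomic<int> State{0};
};

// Marks the event as fulfilled, e.g. once the data movement has completed.
inline void fulfillEvent(FallbackEvent &E) {
  E.State.store(1, std::memory_order_release);
}

// Busy-loops while the event is still 0.
inline void waitEvent(FallbackEvent &E) {
  while (E.State.load(std::memory_order_acquire) == 0)
    std::this_thread::yield();
}
```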
In my local testing, bug49334.cpp passes.
Reference:
[1] https://bugs.llvm.org/show_bug.cgi?id=49940
'barrier'? Not sure what dependency means here