As discussed in D105990, without this patch, targetDataEnd
determines whether to transfer data or delete a device mapping (as
opposed to assuming it's in shared memory) using two different
conditions, each of which is broken for some cases:
- !(UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin): The broken case is rare: the device and host might happen to use the same address for their mapped allocations. I don't know how to write a test that's likely to reveal this case, but this patch does fix it, as discussed below.
- !UNIFIED_SHARED_MEMORY || HasCloseModifier: There are at least two broken cases:
- The close modifier might have been specified on an omp target enter data but not the corresponding omp target exit data, which thus might falsely assume a mapping is in shared memory. The test unified_shared_memory/close_enter_exit.c already has a missing deletion as a result, and this patch adds a check for that. This patch also adds the new test close_member.c to reveal a missing transfer and deletion.
- Use of discrete memory might have been forced by omp_target_associate_ptr, as in the test unified_shared_memory/api.c. In the current targetDataEnd implementation, this condition turns out not be used for this case: because the reference count is infinite, a transfer is possible only with an always modifier, and this condition is never used in that case. To ensure it's never used for that case in the future, this patch adds the test unified_shared_memory/associate_ptr.c.
Fortunately, DeviceTy::getTgtPtrBegin already has a solution: it
reports whether the allocation was found in shared memory via the
variable IsHostPtr.
After this patch, HasCloseModifier is no longer used in
targetDataEnd, and I wonder if the close modifier is ever useful
on an omp target data end.
Just to be on the safe side for future patches, can you add a comment saying that DeviceTy::deallocTgtPtr no longer checks whether we use USM and it assumes that the caller has checked beforehand that HstPtrBegin has (or should have) corresponding storage on the device instead of shared memory?