- User Since
- Feb 16 2016, 1:22 PM (396 w, 6 d)
May 10 2023
I understand what you are trying to achieve here and indeed this patch does exactly what you want to. But my question is, do we really want to allow this behavior? The test case which was added to demonstrate what problem this patch fixes looks illegal from an OpenMP perspective. It is a case of "explicitly mapping more" of the same object, which is forbidden in OpenMP. I wasn't sure whether that restriction has been relaxed and it slipped under my radar, but the latest TR11 spec still says about map clauses (https://www.openmp.org/wp-content/uploads/openmp-TR11.pdf, pg. 159, lines 29-32):
Jan 25 2023
Jan 24 2023
Padding was added to libomptarget in order to ensure 8-byte struct members remain properly aligned, e.g. pointers, doubles etc. If you remove padding, then such a member may find itself in a non 8-aligned address, making its access result in a segfault.
Jan 27 2022
Added a check to catch the error when the user tries to map distinct sections of the same object using different pointers.
Jan 26 2022
Jan 24 2022
Changed the title because it was totally misleading.
Added test for use_device_ptr
Jan 23 2022
Dec 6 2021
Ping. I'm preparing some more ShadowPtrMap optimizations which I'd like to base on top of this patch. Is there any planned change or can we land the patch?
Nov 20 2021
Nov 10 2021
LGTM with two minor nits.
Nov 3 2021
Diff shows rtl.h and device.h have been deleted?
Oct 25 2021
We need to keep bases and offsets separate. Sometimes (e.g. in OpenCL) we need to manifest base pointers prior to launching a kernel. Even if you have mapped an object only partially, e.g. A[N:M], although the kernel is expected to access elements starting at address &A[N] and beyond, we still need to manifest the base of the array &A. In other cases, e.g. the COI API, need the begin address itself, i.e. &A[N] as the API operates on begin addresses, not bases. That's why we pass args and offsets as two separate entities so that each plugin can do what it needs. What you are trying to do with this patch is to revert https://reviews.llvm.org/D33028.
Sep 20 2021
LG. Such fixes can go in without a review :)
Sep 15 2021
Good catch, LGTM!
Sep 10 2021
I had another look at this. It seems the assertion is just fine, we would return SUCCESS anyway right after assert(), so we are now double-checking that SUCCESS is indeed the correct value to return.
LG, quite descriptive and covers everything related to OpenMP libraries. Waiting for others to comment as well.
Sep 7 2021
LGTM as well.
LG. This can be considered NFC, can you put the [NFC] tag in the title?
Sep 5 2021
Sep 1 2021
LG. One possible suggestion is that you leave the double dash (--) variant in some tests so that we can make sure both variants (e.g. both openmp-amdgcn-amd-amdhsa--gfx906 and openmp-amdgcn-amd-amdhsa-gfx906) are correctly parsed.
This solution is quite neat and much simpler than D102043, in line with what we want. LGTM, waiting for other people's opinions.
I think it's better to move on with these patches and then revisit D108569, which needs extensive changes anyway. It will be easier to work on that other patch if the fixes from this series have landed instead of trying to think about all corner cases all over again.
Aug 31 2021
Aug 28 2021
Aug 27 2021
I don't think it is at all possible to write a test emulating this rare case of TgtPtr in GPU memory having the same value as HstPtr in host memory, so this is good to go even without a test. LGTM.
LG, just update the documentation in device.h:lookupMapping().
Changing plugin interfaces will be very painful. All changes since 2019 were all to add new interfaces just to make sure not to break existing applications.
DelEntry is true only if IsLast is true.
Aug 26 2021
Aug 23 2021
Typos. LGTM, waiting for @ye-luo to comment on the patch.
Aug 20 2021
A few minor nits. Generally the patch is now in good shape, its logic is much simpler than the first iteration and it's easy to see that we don't risk running into deadlocks or further races. We haven't been very enthusiastic about alternative approaches; meanwhile there's a race waiting to be fixed, so I think we should move on with providing a solution. LGTM on my side, but since a lot of other people have reviewed the patch, let's wait until we hear them, too.
Aug 19 2021
Aug 13 2021
Aug 10 2021
Aug 6 2021
Jul 30 2021
Now that target nowait operations are handled via hidden helper tasks which make sure that dependencies are satisfied, these calls to __kmpc_omp_taskwait are not only redundant but also hurt performance as they serialize execution unconditionally. LGTM. Can you edit the patch description and explain exactly what you mean and why this patch is needed? Thanks!
Jul 23 2021
This patch looks good now! Let's commit it and then move on with D104418.
Jul 19 2021
Jul 15 2021
I think we can simplify things even further. I have inlined my reasoning. Please double check in case I missed something.
Jul 14 2021
Made the comment more detailed to explain why this patch is needed.
No, only when the PTR is deallocated is the entry removed from the map. But in env/base_ptr_ref_count.c the PTR is a global declare target pointer, so it stays in the map for the lifetime of the application.
Jul 13 2021
No. If the old entry exists in ShadowPtrMap then UpdateDevPtr will remain false, so no update will take place. This patch changes that condition: if either the entry is not in the map or it exists but contains stale data, an update should be performed.
What about other instances where we have access to the HostDataToTargetMap followed be data transfers, e.g. when removing a mapping and copy data back to the host? Shouldn't we apply the same logic there?
Closed by commit rGbb0166dc7279: [libomptarget] Update device pointer only if needed (authored by grokos).
Jul 12 2021
Jul 9 2021
Right, only the pointer member is updated, no other fields are modified.
So the expected behavior is what libomptarget currently does. These 2 patches do not change it, so they are good to go.
Thinking about it a bit more, no matter what the answer to my last question is, even the existing version of the library (without D105121 and D104924) will copy the attached object to the device. So these patches result in logically equivalent behavior. @jdenny I think you can proceed with committing them as patches that simplify the code. If we need to take extra care of the corner case in the above example, then we can prepare a new patch that fixes the bug.
Just like in D104924, the code after this patch is equivalent to the original one with one exception. The question was raised at our telecon a couple of weeks ago but no one was sure what the right answer is:
Jun 30 2021
I think this is good to go now. Thanks! Maybe you can wait for some of the initial reviewers to accept the patch as well or just go ahead and commit.
Jun 29 2021
The new code is equivalent to the old one but clearer and with one fewer lock. Checking the parent struct is a leftover from back when we had a different map interface.
Jun 23 2021
LGTM, thanks for the patch!
Jun 22 2021
Jun 21 2021
Jun 17 2021
Jun 8 2021
May 21 2021
May 20 2021
May 6 2021
Looks good (with one naming nit)
Apr 16 2021
Apr 15 2021
Apr 13 2021
Apr 6 2021
Change looks good, so it's accepted on my end. I'll let the other reviewers have a look and post their comments. Please do not commit until we have reached an agreement for all 4 patches together (D99551, D99552, D99553, D99612).