User Details
- User Since
- May 29 2015, 9:24 AM (408 w, 2 d)
Tue, Mar 21
Mon, Mar 6
Wed, Mar 1
Jan 25 2023
@kevinsala if the issue is with returning the host thread a pointer that cannot be used on the host - except in a map clause and a target region - then perhaps we should not return the locked pointer at all. The lock function could operate in a "side effect" mode, where the user asks the runtime to lock memory and the ROCr keeps track of it. Does this make sense?
Jan 24 2023
This test works correctly with the old plugin. A locked pointer is not a device pointer. It is a host pointer that has been pinned in memory and that has been made accessible by ROCr to the device agent that will be assigned the data transfers implementing the map clause. In abstract programming terms, the second feature is only an AMD one and it does not need to have a corresponding functionality on other targets.
Jan 17 2023
Jan 13 2023
[AMD Official Use Only - General]
Pushed a fix to the non-amdgpu build:
7928d9e12d47fcc226d0c6984e11f5f463670f4a
Tested on machine without ROCm installation or AMDGPU.
Jan 11 2023
[OpenMP][libomptarget][AMDGPU] Address comments, including fix in error handling in is_locked function.
Jan 10 2023
[OpenMP][libomptarget][AMDGPU] Addressed comments.
Jan 9 2023
I changed the old plugin interface for tgt_rtl_data_lock to return an error code. It now returns the lockedptr as function argument. Let me know if this is not what was called for.
Thanks for this extension!
Thanks for being patience with this patch update. I hit a problem with the is_locked function, when the prelocked pointer is passed to the map clause: the address offset calculation was based on the system-allocator (e.g., malloc) pointer but it would not work if the agentBasePointer (locked) was passed in. Fixed now.
[OpenMP][libomptarget][AMDGPU] Address comments and use prelocked pointers in mep clause implementation.
Jan 4 2023
[OpenMP][libomptarget][AMDGPU] Add lock/unlock to prevent races on Devices vector in plugin manager.
Jan 3 2023
[OpenMP][libomptarget][AMDGPU] Apply requested changes and merge against trunk.
Dec 15 2022
Dec 14 2022
Updated test with checks of a and b pointers.
Dec 13 2022
[OpenMP] Only rerun lit test for pinned memory, no need to rebuild.
[OpenMP] Add test for default allocator and move unlock call to appropriate site.
Dec 12 2022
ping
ping
Dec 5 2022
Apply comments
Dec 2 2022
Nov 30 2022
Removed llvm_omp_target_host_mem_alloc as a case for pinning.
Nov 29 2022
Pinning does not apply to device memory on AMDGPU target, only to host memory, as explained here:
https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/master/src/inc/hsa_ext_amd.h#L1526
LGTM but @ABataev should review/approve this
Aug 22 2022
Thanks for explaining. LGTM.
This looks good, but what happens when the user accidentally adds a nested parallel when this option is turned on? Do we get serial (correct) execution?
Apr 12 2022
LGTM, thanks for this.
Feb 18 2022
Feb 16 2022
Replace list of shadow pointer to be restored with simple synchronization.
Dec 17 2021
It looks like https://reviews.llvm.org/D115823 will take a bit longer to be merged. I will merge this in to fix the API error, as it can go in first.
Dec 16 2021
Add context.
LGTM
Dec 15 2021
Dec 14 2021
Dec 10 2021
Trying rebase again.
rebase onto main
Dec 9 2021
[OpenMP][NFC] Apply requested change: remove "nowait" from function name.
Add context (sorry about that!)
Dec 8 2021
[OpenMP] Add missing hsa declarations/definitions when building runtime without rocr (or hsa library) installed on the system
Dec 7 2021
@JonChesterfield thanks for the quick review. Would you mind merging this for me?
format
@JonChesterfield this looks more readable as a first step.
Abandoning this one for readability issues and moving to two separate patches.
First one:
https://reviews.llvm.org/D115267
Nov 17 2021
This is already a lot of code with parse+sema. I wonder if we should split the patch into two, i.e. 1. parse+sema; 2. code gen? @ABataev ?
It should simplify maintenance of the patch and allow time to extend the OpenMP IR builder.
Nov 3 2021
LGTM
Oct 25 2021
Oct 15 2021
Thanks for fixing this. It would be nice if you could add a comment where it is used at the beginning of parallel_51, stating what are the initial value of parallelLevel (kmpc_parallel_level) and those of kmpc_is_spmd_exec_mode in the two cases of generic and spmd modes (assuming generic spmd mode is handled as a spmd). Something like:
Oct 7 2021
Thanks for this. Do you think this could use a test?
I assume a test would be easier to write once this information is used somewhere, but it's hard to tell without context.
Sep 30 2021
LGTM
Sep 20 2021
Updated patch based on comments: add new map table entry field to track USM maps; use it to determine behavior in getTargetPointer and deallocTgtPtr; update var name and add explaining comments.
Sep 17 2021
Aug 26 2021
Thanks for uploading this.
Aug 23 2021
Update environment variable name to reflect comments and intended name
Jul 6 2021
I have very small comments. LGTM.
Jun 14 2021
@ABataev can you please merge this for me? I still have to ask for commit privileges.