Need to check if there are map types for the components before trying to
access them when trying to modify type mappings for combined partial
mappings.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This stops it from crashing Clang, but I'm not sure if it's fixing the underlying problem. When I compile and run this program it crashes in libomptarget. If you get rid of the float f[50] in the struct it works as expected.
$ clang++ -fopenmp -fopenmp-targets=nvptx64 test.cpp && ./a.out
#include <stdio.h> struct S2 { float f[50]; double *p; }; int main() { S2 s; #pragma omp target map(s.p) { s.p = nullptr; } }
Output from running with debugging
Libomptarget --> Device 0 is ready to use. Target CUDA RTL --> Load data from image 0x0000000000400a30 Target CUDA RTL --> CUDA module successfully loaded! Target CUDA RTL --> Entry point 0x0000000000000000 maps to __omp_offloading_fd02_267be70_main_l10 (0x000000000174b170) Target CUDA RTL --> Sending global device environment data 4 bytes Libomptarget --> Entry 0: Base=0x00007fffcca22ad8, Begin=0x00007fffcca22ba0, Size=8, Type=0x20 Libomptarget --> Entry 1: Base=0x00007fffcca22ad8, Begin=0x00007fffcca22ad8, Size=208, Type=0x1000000000203 Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffcca22ba0, Size=8)... Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007fffcca22ba0. Libomptarget --> findBucket: Size 8 is floored to 8. Libomptarget --> Cannot find a node in the FreeLists. Allocate on device. Libomptarget --> Node address 0x0000000001732d88, target pointer 0x00007f0506400000, size 8 Libomptarget --> Creating new map entry: HstBase=0x00007fffcca22ad8, HstBegin=0x00007fffcca22ba0, HstEnd=0x00007fffcca22ba8, TgtBegin=0x00007f0506400000 Libomptarget --> There are 8 bytes allocated at target address 0x00007f0506400000 - is new Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffcca22ad8, Size=208)... Libomptarget --> WARNING: Pointer is not mapped but section extends into already mapped data Libomptarget --> Mapping exists (implicit) with HstPtrBegin=0x00007fffcca22ad8, TgtPtrBegin=0x00007f05063fff38, Size=208, RefCount=1 Libomptarget --> There are 208 bytes allocated at target address 0x00007f05063fff38 - is not new Libomptarget --> DeviceTy::getMapEntry: requested entry found Libomptarget --> Moving 208 bytes (hst:0x00007fffcca22ad8) -> (tgt:0x00007f05063fff38) Target CUDA RTL --> Error when copying data from host to device. Pointers: host = 0x00007fffcca22ad8, device = 0x00007f05063fff38, size = 208 Target CUDA RTL --> CUDA error is: invalid argument Libomptarget --> Copying data to device failed. Libomptarget --> Call to targetDataBegin failed, abort target. Libomptarget --> Failed to process data before launching the kernel.
Yeah, I think there's some extra information looking at it again.
Target CUDA RTL --> Start initializing CUDA Libomptarget --> Successfully loaded library 'libomptarget.rtl.cuda.so'! Libomptarget --> Registering RTL libomptarget.rtl.cuda.so supporting 1 devices! Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'... Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so': libomptarget.rtl.aarch64.so: cannot open shared object file: No such file or directory! Libomptarget --> Loading library 'libomptarget.rtl.ve.so'... Libomptarget --> Unable to load library 'libomptarget.rtl.ve.so': libomptarget.rtl.ve.so: cannot open shared object file: No such file or directory! Libomptarget --> Loading library 'libomptarget.rtl.amdgpu.so'... Libomptarget --> Unable to load library 'libomptarget.rtl.amdgpu.so': libomptarget.rtl.amdgpu.so: cannot open shared object file: No such file or directory! Libomptarget --> RTLs loaded! Libomptarget --> Image 0x0000000000400a30 is NOT compatible with RTL libomptarget.rtl.x86_64.so! Libomptarget --> Image 0x0000000000400a30 is compatible with RTL libomptarget.rtl.cuda.so! Libomptarget --> RTL 0x00000000008334c0 has index 0! Libomptarget --> Registering image 0x0000000000400a30 with RTL libomptarget.rtl.cuda.so! Libomptarget --> Done registering entries! Libomptarget --> Call to omp_get_num_devices returning 1 Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices were found) Libomptarget --> Entering target region with entry point 0x00000000004009e0 and device Id -1 Libomptarget --> Checking whether device 0 is ready. Libomptarget --> Is the device 0 (local ID 0) initialized? 0 Target CUDA RTL --> Init requires flags to 1 Target CUDA RTL --> Getting device 0 Target CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC Target CUDA RTL --> Max CUDA blocks per grid 2147483647 exceeds the hard team limit 65536, capping at the hard limit Target CUDA RTL --> Using 1024 CUDA threads per block Target CUDA RTL --> Using warp size 32 Target CUDA RTL --> Device supports up to 65536 CUDA blocks and 1024 threads with a warp size of 32 Target CUDA RTL --> Default number of teams set according to library's default 128 Target CUDA RTL --> Default number of threads set according to library's default 128 Libomptarget --> Device 0 is ready to use. Target CUDA RTL --> Load data from image 0x0000000000400a30 Target CUDA RTL --> CUDA module successfully loaded! Target CUDA RTL --> Entry point 0x0000000000000000 maps to __omp_offloading_fd02_267be70_main_l10 (0x00000000011d0ef0) Target CUDA RTL --> Sending global device environment data 4 bytes Libomptarget --> Entry 0: Base=0x00007ffc7e77c2a8, Begin=0x00007ffc7e77c370, Size=8, Type=0x20 Libomptarget --> Entry 1: Base=0x00007ffc7e77c2a8, Begin=0x00007ffc7e77c2a8, Size=208, Type=0x1000000000203 Libomptarget --> Looking up mapping(HstPtrBegin=0x00007ffc7e77c370, Size=8)... Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007ffc7e77c370. Libomptarget --> findBucket: Size 8 is floored to 8. Libomptarget --> Cannot find a node in the FreeLists. Allocate on device. Libomptarget --> Node address 0x00000000011b8ba8, target pointer 0x00007fa792400000, size 8 Libomptarget --> Creating new map entry: HstBase=0x00007ffc7e77c2a8, HstBegin=0x00007ffc7e77c370, HstEnd=0x00007ffc7e77c378, TgtBegin=0x00007fa792400000 Libomptarget --> There are 8 bytes allocated at target address 0x00007fa792400000 - is new Libomptarget --> Looking up mapping(HstPtrBegin=0x00007ffc7e77c2a8, Size=208)... Libomptarget --> WARNING: Pointer is not mapped but section extends into already mapped data Libomptarget --> Mapping exists (implicit) with HstPtrBegin=0x00007ffc7e77c2a8, TgtPtrBegin=0x00007fa7923fff38, Size=208, RefCount=1 Libomptarget --> There are 208 bytes allocated at target address 0x00007fa7923fff38 - is not new Libomptarget --> DeviceTy::getMapEntry: requested entry found Libomptarget --> Moving 208 bytes (hst:0x00007ffc7e77c2a8) -> (tgt:0x00007fa7923fff38) Target CUDA RTL --> Error when copying data from host to device. Pointers: host = 0x00007ffc7e77c2a8, device = 0x00007fa7923fff38, size = 208 Target CUDA RTL --> CUDA error is: invalid argument Libomptarget --> Copying data to device failed. Libomptarget --> Call to targetDataBegin failed, abort target. Libomptarget --> Failed to process data before launching the kernel. Libomptarget error: run with env LIBOMPTARGET_INFO>1 to dump host-targetpointer maps Libomptarget fatal error 1: failure of target construct while offloading is mandatory Aborted (core dumped)
Could you try this patch:
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index ce8846140d4..854b7f3e830 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -9938,7 +9938,7 @@ void CGOpenMPRuntime::emitTargetCall( MappedVarSet.insert(CI->getCapturedVar()); else MappedVarSet.insert(nullptr); - if (CurInfo.BasePointers.empty()) + if (CurInfo.BasePointers.empty() && !PartialStruct.Base.isValid()) MEHandler.generateDefaultMapInfo(*CI, **RI, *CV, CurInfo); // Generate correct mapping for variables captured by reference in // lambdas. @@ -9947,7 +9947,7 @@ void CGOpenMPRuntime::emitTargetCall( CurInfo, LambdaPointers); } // We expect to have at least an element of information for this capture. - assert(!CurInfo.BasePointers.empty() && + assert((!CurInfo.BasePointers.empty() || PartialStruct.Base.isValid()) && "Non-existing map pointer for capture!"); assert(CurInfo.BasePointers.size() == CurInfo.Pointers.size() && CurInfo.BasePointers.size() == CurInfo.Sizes.size() &&
It stopped the crashing but I'm not sure it's working as intended. The following program doesn't crash, but it doesn't change the value of s.p. Looking at the debug output it doesn't seem to be mapping the whole struct.
struct S { float f[50]; double *p; }; int main() { S s; printf(%p\n", s.p); #pragma omp target map(tofrom:s.p) { s.p = nullptr; } printf(%p\n", s.p); }
0x7ffea8eec8f0 0x7ffea8eec8f0
Libomptarget --> Entry 0: Base=0x00007fffa3265500, Begin=0x00007fffa3265690, Size=8, Type=0x20 Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffa3265690, Size=8)... Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007fffa3265690. Libomptarget --> findBucket: Size 8 is floored to 8. Libomptarget --> Cannot find a node in the FreeLists. Allocate on device. Libomptarget --> Node address 0x00000000010511e8, target pointer 0x00007fc846400000, size 8 Libomptarget --> Creating new map entry: HstBase=0x00007fffa3265500, HstBegin=0x00007fffa3265690, HstEnd=0x00007fffa3265698, TgtBegin=0x00007fc846400000 Libomptarget --> There are 8 bytes allocated at target address 0x00007fc846400000 - is new Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffa3265690, Size=8)... Libomptarget --> Mapping exists with HstPtrBegin=0x00007fffa3265690, TgtPtrBegin=0x00007fc846400000, Size=8, RefCount=1 Libomptarget --> Obtained target argument 0x00007fc8463ffe70 from host pointer 0x00007fffa3265690
However if it's just a capture with only #pragma omp target then it seems to work.
0x7fff2b234ed0 (nil)
Also I'm assuming you meant for this to be applied to the master breanch, and not in addition to the previous one.
Yes, it was against master. Will check it tomorrow, looks like need to fix mapping flags. Also, the patch itself needs to be fixed.