This is an archive of the discontinued LLVM Phabricator instance.

[OPENMP]Fix PR48076: Check map types array before accessing its front.
ClosedPublic

Authored by ABataev on Nov 12 2020, 10:59 AM.

Details

Summary

Need to check if there are map types for the components before trying to
access them when trying to modify type mappings for combined partial
mappings.

Diff Detail

Event Timeline

ABataev created this revision.Nov 12 2020, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2020, 10:59 AM
ABataev requested review of this revision.Nov 12 2020, 10:59 AM

This stops it from crashing Clang, but I'm not sure if it's fixing the underlying problem. When I compile and run this program it crashes in libomptarget. If you get rid of the float f[50] in the struct it works as expected.

$ clang++ -fopenmp -fopenmp-targets=nvptx64 test.cpp && ./a.out

#include <stdio.h>

struct S2 {
    float f[50];
    double *p; 
};

int main() {
  S2 s;
#pragma omp target map(s.p)
  { s.p = nullptr; }
}

Output from running with debugging

Libomptarget --> Device 0 is ready to use.
Target CUDA RTL --> Load data from image 0x0000000000400a30
Target CUDA RTL --> CUDA module successfully loaded!
Target CUDA RTL --> Entry point 0x0000000000000000 maps to __omp_offloading_fd02_267be70_main_l10 (0x000000000174b170)
Target CUDA RTL --> Sending global device environment data 4 bytes
Libomptarget --> Entry  0: Base=0x00007fffcca22ad8, Begin=0x00007fffcca22ba0, Size=8, Type=0x20
Libomptarget --> Entry  1: Base=0x00007fffcca22ad8, Begin=0x00007fffcca22ad8, Size=208, Type=0x1000000000203
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffcca22ba0, Size=8)...
Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007fffcca22ba0.
Libomptarget --> findBucket: Size 8 is floored to 8.
Libomptarget --> Cannot find a node in the FreeLists. Allocate on device.
Libomptarget --> Node address 0x0000000001732d88, target pointer 0x00007f0506400000, size 8
Libomptarget --> Creating new map entry: HstBase=0x00007fffcca22ad8, HstBegin=0x00007fffcca22ba0, HstEnd=0x00007fffcca22ba8, TgtBegin=0x00007f0506400000
Libomptarget --> There are 8 bytes allocated at target address 0x00007f0506400000 - is new
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffcca22ad8, Size=208)...
Libomptarget --> WARNING: Pointer is not mapped but section extends into already mapped data
Libomptarget --> Mapping exists (implicit) with HstPtrBegin=0x00007fffcca22ad8, TgtPtrBegin=0x00007f05063fff38, Size=208, RefCount=1
Libomptarget --> There are 208 bytes allocated at target address 0x00007f05063fff38 - is not new
Libomptarget --> DeviceTy::getMapEntry: requested entry found
Libomptarget --> Moving 208 bytes (hst:0x00007fffcca22ad8) -> (tgt:0x00007f05063fff38)
Target CUDA RTL --> Error when copying data from host to device. Pointers: host = 0x00007fffcca22ad8, device = 0x00007f05063fff38, size = 208
Target CUDA RTL --> CUDA error is: invalid argument
Libomptarget --> Copying data to device failed.
Libomptarget --> Call to targetDataBegin failed, abort target.
Libomptarget --> Failed to process data before launching the kernel.

This stops it from crashing Clang, but I'm not sure if it's fixing the underlying problem. When I compile and run this program it crashes in libomptarget. If you get rid of the float f[50] in the struct it works as expected.

$ clang++ -fopenmp -fopenmp-targets=nvptx64 test.cpp && ./a.out

#include <stdio.h>

struct S2 {
    float f[50];
    double *p; 
};

int main() {
  S2 s;
#pragma omp target map(s.p)
  { s.p = nullptr; }
}

Output from running with debugging

Libomptarget --> Device 0 is ready to use.
Target CUDA RTL --> Load data from image 0x0000000000400a30
Target CUDA RTL --> CUDA module successfully loaded!
Target CUDA RTL --> Entry point 0x0000000000000000 maps to __omp_offloading_fd02_267be70_main_l10 (0x000000000174b170)
Target CUDA RTL --> Sending global device environment data 4 bytes
Libomptarget --> Entry  0: Base=0x00007fffcca22ad8, Begin=0x00007fffcca22ba0, Size=8, Type=0x20
Libomptarget --> Entry  1: Base=0x00007fffcca22ad8, Begin=0x00007fffcca22ad8, Size=208, Type=0x1000000000203
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffcca22ba0, Size=8)...
Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007fffcca22ba0.
Libomptarget --> findBucket: Size 8 is floored to 8.
Libomptarget --> Cannot find a node in the FreeLists. Allocate on device.
Libomptarget --> Node address 0x0000000001732d88, target pointer 0x00007f0506400000, size 8
Libomptarget --> Creating new map entry: HstBase=0x00007fffcca22ad8, HstBegin=0x00007fffcca22ba0, HstEnd=0x00007fffcca22ba8, TgtBegin=0x00007f0506400000
Libomptarget --> There are 8 bytes allocated at target address 0x00007f0506400000 - is new
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffcca22ad8, Size=208)...
Libomptarget --> WARNING: Pointer is not mapped but section extends into already mapped data
Libomptarget --> Mapping exists (implicit) with HstPtrBegin=0x00007fffcca22ad8, TgtPtrBegin=0x00007f05063fff38, Size=208, RefCount=1
Libomptarget --> There are 208 bytes allocated at target address 0x00007f05063fff38 - is not new
Libomptarget --> DeviceTy::getMapEntry: requested entry found
Libomptarget --> Moving 208 bytes (hst:0x00007fffcca22ad8) -> (tgt:0x00007f05063fff38)
Target CUDA RTL --> Error when copying data from host to device. Pointers: host = 0x00007fffcca22ad8, device = 0x00007f05063fff38, size = 208
Target CUDA RTL --> CUDA error is: invalid argument
Libomptarget --> Copying data to device failed.
Libomptarget --> Call to targetDataBegin failed, abort target.
Libomptarget --> Failed to process data before launching the kernel.

It is different problem, which should be addressed in the different patch.

jhuber6 accepted this revision.Nov 12 2020, 11:58 AM

LGTM then.

Any clue what the cause of the other problem is?

This revision is now accepted and ready to land.Nov 12 2020, 11:58 AM

LGTM then.

Any clue what the cause of the other problem is?

Will investigate it and prepare a new patch

This revision was landed with ongoing or failed builds.Nov 12 2020, 12:16 PM
This revision was automatically updated to reflect the committed changes.
ABataev added a comment.EditedNov 12 2020, 12:33 PM

LGTM then.

Any clue what the cause of the other problem is?

Could you try your test with the patch from D86119?

LGTM then.

Any clue what the cause of the other problem is?

Could you try your test with the patch from D86119?

Still crashes after applying D86119.

LGTM then.

Any clue what the cause of the other problem is?

Could you try your test with the patch from D86119?

Still crashes after applying D86119.

Could you provide a debug log?

LGTM then.

Any clue what the cause of the other problem is?

Could you try your test with the patch from D86119?

Still crashes after applying D86119.

Could you provide a debug log?

From Clang or Libomptarget? Libomptarget is the same as above.

LGTM then.

Any clue what the cause of the other problem is?

Could you try your test with the patch from D86119?

Still crashes after applying D86119.

Could you provide a debug log?

From Clang or Libomptarget? Libomptarget is the same as above.

From libomptarget. It must be different. Could you copy it here?

From libomptarget. It must be different. Could you copy it here?

Yeah, I think there's some extra information looking at it again.

Target CUDA RTL --> Start initializing CUDA
Libomptarget --> Successfully loaded library 'libomptarget.rtl.cuda.so'!
Libomptarget --> Registering RTL libomptarget.rtl.cuda.so supporting 1 devices!
Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so': libomptarget.rtl.aarch64.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.ve.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.ve.so': libomptarget.rtl.ve.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.amdgpu.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.amdgpu.so': libomptarget.rtl.amdgpu.so: cannot open shared object file: No such file or directory!
Libomptarget --> RTLs loaded!
Libomptarget --> Image 0x0000000000400a30 is NOT compatible with RTL libomptarget.rtl.x86_64.so!
Libomptarget --> Image 0x0000000000400a30 is compatible with RTL libomptarget.rtl.cuda.so!
Libomptarget --> RTL 0x00000000008334c0 has index 0!
Libomptarget --> Registering image 0x0000000000400a30 with RTL libomptarget.rtl.cuda.so!
Libomptarget --> Done registering entries!
Libomptarget --> Call to omp_get_num_devices returning 1
Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices were found)
Libomptarget --> Entering target region with entry point 0x00000000004009e0 and device Id -1
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 0
Target CUDA RTL --> Init requires flags to 1
Target CUDA RTL --> Getting device 0
Target CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
Target CUDA RTL --> Max CUDA blocks per grid 2147483647 exceeds the hard team limit 65536, capping at the hard limit
Target CUDA RTL --> Using 1024 CUDA threads per block
Target CUDA RTL --> Using warp size 32
Target CUDA RTL --> Device supports up to 65536 CUDA blocks and 1024 threads with a warp size of 32
Target CUDA RTL --> Default number of teams set according to library's default 128
Target CUDA RTL --> Default number of threads set according to library's default 128
Libomptarget --> Device 0 is ready to use.
Target CUDA RTL --> Load data from image 0x0000000000400a30
Target CUDA RTL --> CUDA module successfully loaded!
Target CUDA RTL --> Entry point 0x0000000000000000 maps to __omp_offloading_fd02_267be70_main_l10 (0x00000000011d0ef0)
Target CUDA RTL --> Sending global device environment data 4 bytes
Libomptarget --> Entry  0: Base=0x00007ffc7e77c2a8, Begin=0x00007ffc7e77c370, Size=8, Type=0x20
Libomptarget --> Entry  1: Base=0x00007ffc7e77c2a8, Begin=0x00007ffc7e77c2a8, Size=208, Type=0x1000000000203
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007ffc7e77c370, Size=8)...
Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007ffc7e77c370.
Libomptarget --> findBucket: Size 8 is floored to 8.
Libomptarget --> Cannot find a node in the FreeLists. Allocate on device.
Libomptarget --> Node address 0x00000000011b8ba8, target pointer 0x00007fa792400000, size 8
Libomptarget --> Creating new map entry: HstBase=0x00007ffc7e77c2a8, HstBegin=0x00007ffc7e77c370, HstEnd=0x00007ffc7e77c378, TgtBegin=0x00007fa792400000
Libomptarget --> There are 8 bytes allocated at target address 0x00007fa792400000 - is new
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007ffc7e77c2a8, Size=208)...
Libomptarget --> WARNING: Pointer is not mapped but section extends into already mapped data
Libomptarget --> Mapping exists (implicit) with HstPtrBegin=0x00007ffc7e77c2a8, TgtPtrBegin=0x00007fa7923fff38, Size=208, RefCount=1
Libomptarget --> There are 208 bytes allocated at target address 0x00007fa7923fff38 - is not new
Libomptarget --> DeviceTy::getMapEntry: requested entry found
Libomptarget --> Moving 208 bytes (hst:0x00007ffc7e77c2a8) -> (tgt:0x00007fa7923fff38)
Target CUDA RTL --> Error when copying data from host to device. Pointers: host = 0x00007ffc7e77c2a8, device = 0x00007fa7923fff38, size = 208
Target CUDA RTL --> CUDA error is: invalid argument
Libomptarget --> Copying data to device failed.
Libomptarget --> Call to targetDataBegin failed, abort target.
Libomptarget --> Failed to process data before launching the kernel.
Libomptarget error: run with env LIBOMPTARGET_INFO>1 to dump host-targetpointer maps
Libomptarget fatal error 1: failure of target construct while offloading is mandatory
Aborted (core dumped)

From libomptarget. It must be different. Could you copy it here?

Yeah, I think there's some extra information looking at it again.

Target CUDA RTL --> Start initializing CUDA
Libomptarget --> Successfully loaded library 'libomptarget.rtl.cuda.so'!
Libomptarget --> Registering RTL libomptarget.rtl.cuda.so supporting 1 devices!
Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so': libomptarget.rtl.aarch64.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.ve.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.ve.so': libomptarget.rtl.ve.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.amdgpu.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.amdgpu.so': libomptarget.rtl.amdgpu.so: cannot open shared object file: No such file or directory!
Libomptarget --> RTLs loaded!
Libomptarget --> Image 0x0000000000400a30 is NOT compatible with RTL libomptarget.rtl.x86_64.so!
Libomptarget --> Image 0x0000000000400a30 is compatible with RTL libomptarget.rtl.cuda.so!
Libomptarget --> RTL 0x00000000008334c0 has index 0!
Libomptarget --> Registering image 0x0000000000400a30 with RTL libomptarget.rtl.cuda.so!
Libomptarget --> Done registering entries!
Libomptarget --> Call to omp_get_num_devices returning 1
Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices were found)
Libomptarget --> Entering target region with entry point 0x00000000004009e0 and device Id -1
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 0
Target CUDA RTL --> Init requires flags to 1
Target CUDA RTL --> Getting device 0
Target CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
Target CUDA RTL --> Max CUDA blocks per grid 2147483647 exceeds the hard team limit 65536, capping at the hard limit
Target CUDA RTL --> Using 1024 CUDA threads per block
Target CUDA RTL --> Using warp size 32
Target CUDA RTL --> Device supports up to 65536 CUDA blocks and 1024 threads with a warp size of 32
Target CUDA RTL --> Default number of teams set according to library's default 128
Target CUDA RTL --> Default number of threads set according to library's default 128
Libomptarget --> Device 0 is ready to use.
Target CUDA RTL --> Load data from image 0x0000000000400a30
Target CUDA RTL --> CUDA module successfully loaded!
Target CUDA RTL --> Entry point 0x0000000000000000 maps to __omp_offloading_fd02_267be70_main_l10 (0x00000000011d0ef0)
Target CUDA RTL --> Sending global device environment data 4 bytes
Libomptarget --> Entry  0: Base=0x00007ffc7e77c2a8, Begin=0x00007ffc7e77c370, Size=8, Type=0x20
Libomptarget --> Entry  1: Base=0x00007ffc7e77c2a8, Begin=0x00007ffc7e77c2a8, Size=208, Type=0x1000000000203
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007ffc7e77c370, Size=8)...
Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007ffc7e77c370.
Libomptarget --> findBucket: Size 8 is floored to 8.
Libomptarget --> Cannot find a node in the FreeLists. Allocate on device.
Libomptarget --> Node address 0x00000000011b8ba8, target pointer 0x00007fa792400000, size 8
Libomptarget --> Creating new map entry: HstBase=0x00007ffc7e77c2a8, HstBegin=0x00007ffc7e77c370, HstEnd=0x00007ffc7e77c378, TgtBegin=0x00007fa792400000
Libomptarget --> There are 8 bytes allocated at target address 0x00007fa792400000 - is new
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007ffc7e77c2a8, Size=208)...
Libomptarget --> WARNING: Pointer is not mapped but section extends into already mapped data
Libomptarget --> Mapping exists (implicit) with HstPtrBegin=0x00007ffc7e77c2a8, TgtPtrBegin=0x00007fa7923fff38, Size=208, RefCount=1
Libomptarget --> There are 208 bytes allocated at target address 0x00007fa7923fff38 - is not new
Libomptarget --> DeviceTy::getMapEntry: requested entry found
Libomptarget --> Moving 208 bytes (hst:0x00007ffc7e77c2a8) -> (tgt:0x00007fa7923fff38)
Target CUDA RTL --> Error when copying data from host to device. Pointers: host = 0x00007ffc7e77c2a8, device = 0x00007fa7923fff38, size = 208
Target CUDA RTL --> CUDA error is: invalid argument
Libomptarget --> Copying data to device failed.
Libomptarget --> Call to targetDataBegin failed, abort target.
Libomptarget --> Failed to process data before launching the kernel.
Libomptarget error: run with env LIBOMPTARGET_INFO>1 to dump host-targetpointer maps
Libomptarget fatal error 1: failure of target construct while offloading is mandatory
Aborted (core dumped)

Could you try this patch:

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index ce8846140d4..854b7f3e830 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -9938,7 +9938,7 @@ void CGOpenMPRuntime::emitTargetCall(
           MappedVarSet.insert(CI->getCapturedVar());
         else
           MappedVarSet.insert(nullptr);
-        if (CurInfo.BasePointers.empty())
+        if (CurInfo.BasePointers.empty() && !PartialStruct.Base.isValid())
           MEHandler.generateDefaultMapInfo(*CI, **RI, *CV, CurInfo);
         // Generate correct mapping for variables captured by reference in
         // lambdas.
@@ -9947,7 +9947,7 @@ void CGOpenMPRuntime::emitTargetCall(
                                                   CurInfo, LambdaPointers);
       }
       // We expect to have at least an element of information for this capture.
-      assert(!CurInfo.BasePointers.empty() &&
+      assert((!CurInfo.BasePointers.empty() || PartialStruct.Base.isValid()) &&
              "Non-existing map pointer for capture!");
       assert(CurInfo.BasePointers.size() == CurInfo.Pointers.size() &&
              CurInfo.BasePointers.size() == CurInfo.Sizes.size() &&

Could you try this patch:

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index ce8846140d4..854b7f3e830 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -9938,7 +9938,7 @@ void CGOpenMPRuntime::emitTargetCall(
           MappedVarSet.insert(CI->getCapturedVar());
         else
           MappedVarSet.insert(nullptr);
-        if (CurInfo.BasePointers.empty())
+        if (CurInfo.BasePointers.empty() && !PartialStruct.Base.isValid())
           MEHandler.generateDefaultMapInfo(*CI, **RI, *CV, CurInfo);
         // Generate correct mapping for variables captured by reference in
         // lambdas.
@@ -9947,7 +9947,7 @@ void CGOpenMPRuntime::emitTargetCall(
                                                   CurInfo, LambdaPointers);
       }
       // We expect to have at least an element of information for this capture.
-      assert(!CurInfo.BasePointers.empty() &&
+      assert((!CurInfo.BasePointers.empty() || PartialStruct.Base.isValid()) &&
              "Non-existing map pointer for capture!");
       assert(CurInfo.BasePointers.size() == CurInfo.Pointers.size() &&
              CurInfo.BasePointers.size() == CurInfo.Sizes.size() &&

It stopped the crashing but I'm not sure it's working as intended. The following program doesn't crash, but it doesn't change the value of s.p. Looking at the debug output it doesn't seem to be mapping the whole struct.

struct S {
  float f[50];
  double *p;
};

int main() {
  S s;
  printf(%p\n", s.p);
#pragma omp target map(tofrom:s.p)
  { s.p = nullptr; }
  printf(%p\n", s.p);
}
0x7ffea8eec8f0
0x7ffea8eec8f0
Libomptarget --> Entry  0: Base=0x00007fffa3265500, Begin=0x00007fffa3265690, Size=8, Type=0x20
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffa3265690, Size=8)...
Libomptarget --> MemoryManagerTy::allocate: size 8 with host pointer 0x00007fffa3265690.
Libomptarget --> findBucket: Size 8 is floored to 8.
Libomptarget --> Cannot find a node in the FreeLists. Allocate on device.
Libomptarget --> Node address 0x00000000010511e8, target pointer 0x00007fc846400000, size 8
Libomptarget --> Creating new map entry: HstBase=0x00007fffa3265500, HstBegin=0x00007fffa3265690, HstEnd=0x00007fffa3265698, TgtBegin=0x00007fc846400000
Libomptarget --> There are 8 bytes allocated at target address 0x00007fc846400000 - is new
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffa3265690, Size=8)...
Libomptarget --> Mapping exists with HstPtrBegin=0x00007fffa3265690, TgtPtrBegin=0x00007fc846400000, Size=8, RefCount=1
Libomptarget --> Obtained target argument 0x00007fc8463ffe70 from host pointer 0x00007fffa3265690

However if it's just a capture with only #pragma omp target then it seems to work.

0x7fff2b234ed0
(nil)

Could you try this patch:

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index ce8846140d4..854b7f3e830 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -9938,7 +9938,7 @@ void CGOpenMPRuntime::emitTargetCall(
           MappedVarSet.insert(CI->getCapturedVar());
         else
           MappedVarSet.insert(nullptr);
-        if (CurInfo.BasePointers.empty())
+        if (CurInfo.BasePointers.empty() && !PartialStruct.Base.isValid())
           MEHandler.generateDefaultMapInfo(*CI, **RI, *CV, CurInfo);
         // Generate correct mapping for variables captured by reference in
         // lambdas.
@@ -9947,7 +9947,7 @@ void CGOpenMPRuntime::emitTargetCall(
                                                   CurInfo, LambdaPointers);
       }
       // We expect to have at least an element of information for this capture.
-      assert(!CurInfo.BasePointers.empty() &&
+      assert((!CurInfo.BasePointers.empty() || PartialStruct.Base.isValid()) &&
              "Non-existing map pointer for capture!");
       assert(CurInfo.BasePointers.size() == CurInfo.Pointers.size() &&
              CurInfo.BasePointers.size() == CurInfo.Sizes.size() &&

Also I'm assuming you meant for this to be applied to the master breanch, and not in addition to the previous one.

Could you try this patch:

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index ce8846140d4..854b7f3e830 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -9938,7 +9938,7 @@ void CGOpenMPRuntime::emitTargetCall(
           MappedVarSet.insert(CI->getCapturedVar());
         else
           MappedVarSet.insert(nullptr);
-        if (CurInfo.BasePointers.empty())
+        if (CurInfo.BasePointers.empty() && !PartialStruct.Base.isValid())
           MEHandler.generateDefaultMapInfo(*CI, **RI, *CV, CurInfo);
         // Generate correct mapping for variables captured by reference in
         // lambdas.
@@ -9947,7 +9947,7 @@ void CGOpenMPRuntime::emitTargetCall(
                                                   CurInfo, LambdaPointers);
       }
       // We expect to have at least an element of information for this capture.
-      assert(!CurInfo.BasePointers.empty() &&
+      assert((!CurInfo.BasePointers.empty() || PartialStruct.Base.isValid()) &&
              "Non-existing map pointer for capture!");
       assert(CurInfo.BasePointers.size() == CurInfo.Pointers.size() &&
              CurInfo.BasePointers.size() == CurInfo.Sizes.size() &&

Also I'm assuming you meant for this to be applied to the master breanch, and not in addition to the previous one.

Yes, it was against master. Will check it tomorrow, looks like need to fix mapping flags. Also, the patch itself needs to be fixed.