To leave the tgt_target_data_update interface unchanged while still
passing the extra information needed for a non-contiguous map item (offset,
count, and stride for each dimension), this patch overloads arg when
the map type is set to OMP_MAP_DESCRIPTOR. The original arg passes the
pointer information; the overloaded arg is instead an
array of descriptor_dim:
struct descriptor_dim { int64_t offset; int64_t count; int64_t stride; };
and the array length equals the number of dimensions. In addition, since
descriptor_dim already carries the count and stride information, we can
overload the arg_size parameter to hold the number of dimensions.
More details can be found here: https://github.com/chichunchen/openmp-50-design/blob/master/target_update_noncontiguous.pptx
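As a rough illustration of the descriptor layout, the sketch below fills a descriptor_dim array for a unit-stride section. The helper name fill_descriptors and its parameters are hypothetical and not part of the patch; it assumes index 0 refers to the innermost dimension and that strides are expressed in bytes.

```c
#include <assert.h>
#include <stdint.h>

struct descriptor_dim {
  int64_t offset; /* index of the first selected element in this dimension */
  int64_t count;  /* number of selected elements in this dimension */
  int64_t stride; /* distance in bytes between neighbouring elements */
};

/* Hypothetical helper (not part of the patch): fills dims[0..ndims-1] for a
 * unit-stride array section. Index 0 of every array refers to the innermost
 * (fastest varying) dimension. Returns the number of dimensions, which is
 * the value that would be passed in place of arg_size. */
static int64_t fill_descriptors(struct descriptor_dim *dims, int64_t ndims,
                                int64_t elem_size, const int64_t *extent,
                                const int64_t *lower, const int64_t *length) {
  assert(ndims >= 1 && elem_size > 0);
  int64_t stride = elem_size;
  for (int64_t d = 0; d < ndims; ++d) {
    assert(lower[d] >= 0 && lower[d] + length[d] <= extent[d]);
    dims[d].offset = lower[d];
    dims[d].count = length[d];
    dims[d].stride = stride;
    stride *= extent[d]; /* the next dimension steps over a whole row/plane */
  }
  return ndims;
}
```

For a section m[1:2][2:3] of an 8-byte-element array m[4][6], this would produce {offset 2, count 3, stride 8} for the innermost dimension and {offset 1, count 2, stride 48} for the outer one.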
Edit:
The runtime implementation I'm thinking of converts the non-contiguous data into several contiguous chunks.
For example:
int arr[3][3][3];
#pragma omp target update to(arr[1:2][1:2][0:2])
We can visualize the non-contiguous data as below (X is data we want to transfer, O is data we do not need to transfer):
Dim 0 = {Offset: 0, Count: 2, Stride: 4 bytes (int)}
XXO
Dim 1 = {Offset: 1, Count: 2, Stride: 12 bytes (4 * 3, since Dim 0 has 3 elements)}
OOO
XXO
XXO
Dim 2 = {Offset: 1, Count: 2, Stride: 36 bytes (12 * 3, since Dim 1 has 3 elements)}
OOO OOO OOO
OOO XXO XXO
OOO XXO XXO
From the visualization, we know that we want to transfer 4 contiguous chunks, and the runtime code could be something similar to:
// We expect this loop to transfer 4 contiguous chunks:
//   arr[1][1][0:2]
//   arr[1][2][0:2]
//   arr[2][1][0:2]
//   arr[2][2][0:2]
size = Dim[0].count * Dim[0].stride; // loop-invariant, so hoisted
for (int i = Dim[2].offset; i < Dim[2].offset + Dim[2].count; i++) {
  for (int j = Dim[1].offset; j < Dim[1].offset + Dim[1].count; j++) {
    ptr = base_ptr + Dim[2].stride * i + Dim[1].stride * j + Dim[0].stride * Dim[0].offset;
    transfer(ptr, size, /*flag or some other stuff...*/);
  }
}
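To sanity-check the chunking idea, here is a minimal host-only sketch of the same loop nest. It is not the runtime implementation: memcpy stands in for the real host-to-device transfer, the "device" buffer is assumed to mirror the host layout, and the names transfer and copy_section_3d are made up for illustration. Dim 0 is given a count of 2 so that each chunk covers the two elements selected by [0:2].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct descriptor_dim {
  int64_t offset; /* first selected index in this dimension */
  int64_t count;  /* number of selected indices */
  int64_t stride; /* bytes between neighbouring indices */
};

/* Stand-in for the real host-to-device transfer (assumption of this sketch):
 * a plain memcpy into a buffer with the same layout as the host array. */
static void transfer(char *dst, const char *src, int64_t size) {
  memcpy(dst, src, (size_t)size);
}

/* Copies a 3-D section described by dim[0..2] (dim[0] is the innermost
 * dimension) and returns the number of contiguous chunks transferred. */
static int copy_section_3d(char *dev_base, const char *host_base,
                           const struct descriptor_dim *dim) {
  assert(dim[0].count > 0 && dim[0].stride > 0);
  int chunks = 0;
  /* Every chunk has the same size, so compute it once. */
  int64_t size = dim[0].count * dim[0].stride;
  for (int64_t i = dim[2].offset; i < dim[2].offset + dim[2].count; ++i) {
    for (int64_t j = dim[1].offset; j < dim[1].offset + dim[1].count; ++j) {
      int64_t byte_off = dim[2].stride * i + dim[1].stride * j +
                         dim[0].stride * dim[0].offset;
      transfer(dev_base + byte_off, host_base + byte_off, size);
      ++chunks;
    }
  }
  return chunks;
}
```

Called with the descriptors {0, 2, 4}, {1, 2, 12}, {1, 2, 36} on an int arr[3][3][3] (assuming a 4-byte int), the function performs one transfer per (i, j) pair and leaves every other element of the destination untouched.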
For this design, we can support strides by just adding an extra dimension. For instance:
int arr[5][5][5];
#pragma omp target update to(arr[0:2:2][1:2:1][0:2:2])
Dim 0 = {offset: 0, count: 1, stride: 4 bytes (int) } // the extra dimension for supporting stride
XO
Dim 1 = {offset: 0, count: 3, stride: 8 bytes (4 * 2) }
XOXOX
Dim 2 = {offset: 0, count: 2, stride: 40 bytes (8 * 5) }
OOOOO
XOXOX
XOXOX
OOOOO
OOOOO
Dim 3 = {offset: 0, count: 2, stride: 200 bytes (40 * 5) }
...
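The extra-dimension idea can be tried out with a small dimension-agnostic walker. The sketch below is only an illustration under my own assumptions: copy_section is a hypothetical name, memcpy again stands in for the device transfer, and the example used to exercise it is a deliberately simpler 1-D section (int a[8] with a[0:3:2]) rather than the 3-D case above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct descriptor_dim {
  int64_t offset; /* first selected index in this dimension */
  int64_t count;  /* number of selected indices */
  int64_t stride; /* bytes between neighbouring indices */
};

/* Hypothetical walker: recurses over dimensions d, d-1, ..., 1 and transfers
 * one contiguous chunk per index combination once dimension 0 is reached.
 * memcpy stands in for the device transfer. Returns the number of chunks. */
static int copy_section(char *dev, const char *host,
                        const struct descriptor_dim *dim, int d,
                        int64_t byte_off) {
  assert(d >= 0);
  if (d == 0) {
    int64_t start = byte_off + dim[0].offset * dim[0].stride;
    memcpy(dev + start, host + start, (size_t)(dim[0].count * dim[0].stride));
    return 1;
  }
  int chunks = 0;
  for (int64_t i = dim[d].offset; i < dim[d].offset + dim[d].count; ++i)
    chunks += copy_section(dev, host, dim, d - 1, byte_off + i * dim[d].stride);
  return chunks;
}
```

For a[0:3:2] on an array of 4-byte ints, the descriptors would be Dim 0 = {0, 1, 4} (the extra dimension, one element per chunk) and Dim 1 = {0, 3, 8}; calling copy_section(dev, host, dim, 1, 0) then copies elements 0, 2, and 4 as three separate chunks.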