With the current implementation, the use_device_addr does not correctly use a pointer that was already mapped to the device and ends up in segmentation fault. The test showcases the situation.
Details
Diff Detail
Event Timeline
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | In my understanding of the spec. But in your patch description, it seems treating x differently from a scalar. I also applied your patch on main and got segfault because the x has a value of device address and x[0] fails. This should be the behavior of use_device_ptr instead of use_device_addr. |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 |
So does this mean that if I do something like this in the target data I should get different addresses for x: #pragma omp target data use_device_ptr(x) { fprintf(stderr, "x: %p\n", __LINE__, x); } #pragma omp target data use_device_addr(x) { fprintf(stderr, "x: %p\n", __LINE__, x); }
That's my fault x[0] was the wrong thing to use actually. |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | When you have an outer target data map(x), then two printf differ. If there is no outer map(x), two printf should be identical. |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 |
This is super helpful thank you! I'll make sure that happens. In the case when an outer target data exists, the print of the x which is under use_device_addr should print the same address as printing x on the host? |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | I need a correction. When outer map(x) exists, actually the address(not value) of x should be a device address, and the code cannot even print x. Printing &x should be fine. |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | In the context of the above comment, should &x on the device be an address I can verify, somehow, to make sure that it's correct or is it a completely new device address? So for example, should it be the same as when I do a use_device_ptr but print the &x in that case? (With the current master those two addresses are not the same.) I guess what I need is an example of using use_device_addr that actually does something meaningful because with the current main branch printing the &x of a use_device_addr(x) is nil. |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | When an outer map(x) is placed, &x does print something meaningful. When there is no map(x), I also got nil, I think this is a bug, &x should keep the host value. I cannot think of a useful example with use_device_addr(x) where x is a pointer. But x can be a scalar float. |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 |
For me, in the same scenario, it prints nil. Here's the full example to avoid any confusion: float *x = (float *) malloc(10*sizeof(float)); #pragma omp target data map(to:x[0:10]) { #pragma omp target data use_device_ptr(x) { fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000 } #pragma omp target data use_device_addr(x) { fprintf(stderr, "line %d x: %p\n", __LINE__, &x); // prints nil for me } } } Note that x has been mapped to the device in the following way: Libomptarget device 0 info: Host Ptr Target Ptr Libomptarget device 0 info: 0x00005578f14de020 0x00007f0bda400000 What should the printed address be though? Note that the above results have been obtained with current Clang/LLVM main not with this patch applied. If you apply this patch and run the above code but change the &x into a x you get: float *x = (float *) malloc(10*sizeof(float)); #pragma omp target data map(to:x[0:10]) { #pragma omp target data use_device_ptr(x) { fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000 } #pragma omp target data use_device_addr(x) { fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000 } } } | |
14 |
|
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | When I said map(x) I meant exactly #pragma omp target data map(to:x) which is different from #pragma omp target data map(to:x[0:10]). |
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp | ||
---|---|---|
14 | This is what should be used for that case: #pragma omp target data map(x[0:256]) #pragma omp target data use_device_addr(x[0:256]) Array sections are permitted for use_device_addr to handle cases like this. |
The outcome of this discussion resulted in a fix to the original program from use_device_addr(x) to use_device_addr(x[:count]).
In my understanding of the spec.
map(tofrom:x[0:256]) only maps the memory segment that x points to. x itself as a pointer scalar is not mapped.
use_device_addr(x) should fail to find the map of x scalar.
5.2 spec.
If the list item is not a mapped list item, it is assumed to be accessible on the target device.
To me, it seems just keep &x as it was, in this case &x remains a host address.
But in your patch description, it seems treating x differently from a scalar.
I also applied your patch on main and got segfault because the x has a value of device address and x[0] fails. This should be the behavior of use_device_ptr instead of use_device_addr.