This is an archive of the discontinued LLVM Phabricator instance.

[Clang][OpenMP] Fix use_device_addr
AbandonedPublic

Authored by doru1004 on Sep 12 2022, 6:44 AM.

Details

Summary

With the current implementation, the use_device_addr does not correctly use a pointer that was already mapped to the device and ends up in segmentation fault. The test showcases the situation.

Diff Detail

Event Timeline

doru1004 created this revision.Sep 12 2022, 6:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2022, 6:44 AM
doru1004 requested review of this revision.Sep 12 2022, 6:44 AM
Herald added a project: Restricted Project. · View Herald Transcript
doru1004 added a project: Restricted Project.
doru1004 updated this revision to Diff 459452.Sep 12 2022, 7:18 AM
doru1004 updated this revision to Diff 459475.Sep 12 2022, 8:49 AM
ye-luo added a subscriber: ye-luo.Sep 12 2022, 10:08 AM
ye-luo added inline comments.
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

In my understanding of the spec.
map(tofrom:x[0:256]) only maps the memory segment that x points to. x itself as a pointer scalar is not mapped.
use_device_addr(x) should fail to find the map of x scalar.
5.2 spec.
If the list item is not a mapped list item, it is assumed to be accessible on the target device.
To me, it seems just keep &x as it was, in this case &x remains a host address.

But in your patch description, it seems treating x differently from a scalar.

I also applied your patch on main and got segfault because the x has a value of device address and x[0] fails. This should be the behavior of use_device_ptr instead of use_device_addr.

doru1004 added inline comments.Sep 12 2022, 10:32 AM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

To me, it seems just keep &x as it was, in this case &x remains a host address.

So does this mean that if I do something like this in the target data I should get different addresses for x:

#pragma omp target data use_device_ptr(x)
{
    fprintf(stderr, "x: %p\n", __LINE__, x);
}

#pragma omp target data use_device_addr(x)
{
    fprintf(stderr, "x: %p\n", __LINE__, x);
}

I also applied your patch on main and got segfault because the x has a value of device address and x[0] fails.

That's my fault x[0] was the wrong thing to use actually.

ye-luo added inline comments.Sep 12 2022, 11:11 AM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

When you have an outer target data map(x), then two printf differ. If there is no outer map(x), two printf should be identical.

doru1004 added inline comments.Sep 12 2022, 11:23 AM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

When you have an outer target data map(x), then two printf differ. If there is no outer map(x), two printf should be identical.

This is super helpful thank you! I'll make sure that happens.

In the case when an outer target data exists, the print of the x which is under use_device_addr should print the same address as printing x on the host?

ye-luo added inline comments.Sep 12 2022, 11:28 AM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

I need a correction. When outer map(x) exists, actually the address(not value) of x should be a device address, and the code cannot even print x. Printing &x should be fine.

doru1004 added inline comments.Sep 12 2022, 2:05 PM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

In the context of the above comment, should &x on the device be an address I can verify, somehow, to make sure that it's correct or is it a completely new device address?

So for example, should it be the same as when I do a use_device_ptr but print the &x in that case? (With the current master those two addresses are not the same.)

I guess what I need is an example of using use_device_addr that actually does something meaningful because with the current main branch printing the &x of a use_device_addr(x) is nil.

ye-luo added inline comments.Sep 12 2022, 3:18 PM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

When an outer map(x) is placed, &x does print something meaningful.
I tried to access omp_get_mapped_ptr(&x, omp_get_default_device()) but got link time error about missing omp_get_mapped_ptr definition. It seems missing an implementation of this OpenMP API.

When there is no map(x), I also got nil, I think this is a bug, &x should keep the host value.

I cannot think of a useful example with use_device_addr(x) where x is a pointer. But x can be a scalar float.
and then call cublas gemm, the alpha/beta parameter can be &x.

doru1004 added inline comments.Sep 12 2022, 5:05 PM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

When an outer map(x) is placed, &x does print something meaningful.

For me, in the same scenario, it prints nil.

Here's the full example to avoid any confusion:

float *x = (float *) malloc(10*sizeof(float));

#pragma omp target data map(to:x[0:10])
{
        #pragma omp target data use_device_ptr(x)
        {
            fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000
        }

        #pragma omp target data use_device_addr(x)
        {
            fprintf(stderr, "line %d x: %p\n", __LINE__, &x); // prints nil for me
        }
    }
}

Note that x has been mapped to the device in the following way:

Libomptarget device 0 info: Host Ptr                         Target Ptr   
Libomptarget device 0 info: 0x00005578f14de020    0x00007f0bda400000

What should the printed address be though?

Note that the above results have been obtained with current Clang/LLVM main not with this patch applied.

If you apply this patch and run the above code but change the &x into a x you get:

float *x = (float *) malloc(10*sizeof(float));

#pragma omp target data map(to:x[0:10])
{
        #pragma omp target data use_device_ptr(x)
        {
            fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000
        }

        #pragma omp target data use_device_addr(x)
        {
            fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints address 0x7f0bda400000
        }
    }
}
15

When an outer map(x) is placed, &x does print something meaningful.
I tried to access omp_get_mapped_ptr(&x, omp_get_default_device()) but got link time error about missing omp_get_mapped_ptr definition. It seems missing an implementation of this OpenMP API.

When there is no map(x), I also got nil, I think this is a bug, &x should keep the host value.

I cannot think of a useful example with use_device_addr(x) where x is a pointer. But x can be a scalar float.
and then call cublas gemm, the alpha/beta parameter can be &x.

ye-luo added inline comments.Sep 12 2022, 5:18 PM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

When I said map(x) I meant exactly #pragma omp target data map(to:x) which is different from #pragma omp target data map(to:x[0:10]).
The former maps the pointer scalar, the latter maps to the memory segment that x points to.

dreachem added inline comments.
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

This is what should be used for that case:

#pragma omp target data map(x[0:256])
#pragma omp target data use_device_addr(x[0:256])

Array sections are permitted for use_device_addr to handle cases like this.

ye-luo added inline comments.Sep 12 2022, 8:53 PM
clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp
15

If the intention is to access the device ptr of the mapped memory segment, @dreachem is correct, just map the array. I was confused by the intention of this patch.

doru1004 abandoned this revision.Sep 21 2022, 7:13 AM

The outcome of this discussion resulted in a fix to the original program from use_device_addr(x) to use_device_addr(x[:count]).