This is an archive of the discontinued LLVM Phabricator instance.

[OpenCL] Set external linkage for block enqueue kernels
ClosedPublic

Authored by svenvh on Dec 10 2021, 7:37 AM.

Details

Summary

All kernels can be called from the host as per the SPIR_KERNEL calling
convention. As such, all kernels should have external linkage, but
block enqueue kernels were created with internal linkage.

Reported-by: Pedro Olsen Ferreira

Diff Detail

Event Timeline

svenvh created this revision.Dec 10 2021, 7:37 AM
svenvh requested review of this revision.Dec 10 2021, 7:37 AM
Anastasia accepted this revision.Dec 10 2021, 7:47 AM

LGTM! I agree according to the OpenCL programming model attributes internal spir_kernel together on the same function create a contradiction because kernel functions are expected to have global visibility.

Adding @AlexeySachkov who has reported the same issue earlier elsewhere.

This revision is now accepted and ready to land.Dec 10 2021, 7:47 AM
This revision was landed with ongoing or failed builds.Jan 12 2022, 5:31 AM
This revision was automatically updated to reflect the committed changes.

It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TU's may have the same name. Making such kernels external may cause duplicate symbols.

It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TU's may have the same name. Making such kernels external may cause duplicate symbols.

Potentially we should append the name of the translation unit to all kernel wrapper names for the enqueued blocks to resolve this? For example, global constructors stubs are using such a similar naming scheme taken from the translation unit.

But the kernel function in OpenCL has to be globally visible and many tools have been built with this assumption. Additionally, some toolchains might require the enqueued kernels to be globally visible as well in order to access them as an execution entry point.

It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TU's may have the same name. Making such kernels external may cause duplicate symbols.

Potentially we should append the name of the translation unit to all kernel wrapper names for the enqueued blocks to resolve this? For example, global constructors stubs are using such a similar naming scheme taken from the translation unit.

But the kernel function in OpenCL has to be globally visible and many tools have been built with this assumption. Additionally, some toolchains might require the enqueued kernels to be globally visible as well in order to access them as an execution entry point.

If we have to externalize such block kernels whose names are derived from variables with internal linkage, we may need a way to make the block kernel names unique.

One approach would be use MD5 hash of the file path and compile options, similar to https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/Driver.cpp#L2623

It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TU's may have the same name. Making such kernels external may cause duplicate symbols.

Potentially we should append the name of the translation unit to all kernel wrapper names for the enqueued blocks to resolve this? For example, global constructors stubs are using such a similar naming scheme taken from the translation unit.

But the kernel function in OpenCL has to be globally visible and many tools have been built with this assumption. Additionally, some toolchains might require the enqueued kernels to be globally visible as well in order to access them as an execution entry point.

If we have to externalize such block kernels whose names are derived from variables with internal linkage, we may need a way to make the block kernel names unique.

One approach would be use MD5 hash of the file path and compile options, similar to https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/Driver.cpp#L2623

Ok, however it might be enough to do what C++ handling does with global ctors. Should we file a bug for this for now?

It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TU's may have the same name. Making such kernels external may cause duplicate symbols.

Potentially we should append the name of the translation unit to all kernel wrapper names for the enqueued blocks to resolve this? For example, global constructors stubs are using such a similar naming scheme taken from the translation unit.

But the kernel function in OpenCL has to be globally visible and many tools have been built with this assumption. Additionally, some toolchains might require the enqueued kernels to be globally visible as well in order to access them as an execution entry point.

If we have to externalize such block kernels whose names are derived from variables with internal linkage, we may need a way to make the block kernel names unique.

One approach would be use MD5 hash of the file path and compile options, similar to https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/Driver.cpp#L2623

Ok, however it might be enough to do what C++ handling does with global ctors. Should we file a bug for this for now?

Do you mean the global ctors like this? https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGenCXX/global-init.cpp#L201

They do not have unique names across TU's. They work because they have internal linkage. However OpenCL kernels have external linkage.

I suggest opening a bug to track this. Thanks.

It is possible that block kernels are defined and invoked in static functions, therefore two block kernels in different TU's may have the same name. Making such kernels external may cause duplicate symbols.

Potentially we should append the name of the translation unit to all kernel wrapper names for the enqueued blocks to resolve this? For example, global constructors stubs are using such a similar naming scheme taken from the translation unit.

But the kernel function in OpenCL has to be globally visible and many tools have been built with this assumption. Additionally, some toolchains might require the enqueued kernels to be globally visible as well in order to access them as an execution entry point.

If we have to externalize such block kernels whose names are derived from variables with internal linkage, we may need a way to make the block kernel names unique.

One approach would be use MD5 hash of the file path and compile options, similar to https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/Driver.cpp#L2623

Ok, however it might be enough to do what C++ handling does with global ctors. Should we file a bug for this for now?

Do you mean the global ctors like this? https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGenCXX/global-init.cpp#L201

They do not have unique names across TU's. They work because they have internal linkage. However OpenCL kernels have external linkage.

I suggest opening a bug to track this. Thanks.

Sure: https://github.com/llvm/llvm-project/issues/53572. Feel free to add anything relevant.