Page MenuHomePhabricator

[Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables
Needs ReviewPublic

Authored by scott.linder on Apr 29 2019, 12:18 PM.



For AMDGPU the visibility of these symbols (OpenCL kernels, CUDA __global__ functions, and CUDA __device__ variables) must not be hidden, as we rely on them being available in the dynamic symbol table in the final DSO.

This patch implements this by considering language attributes as a source of explicit visibility, but rather than attributing any one visibility to them they are simply coerced to be a non-hidden visibility. This allows for the optimization of using protected visibility when these symbols are known to be dso_local.

This patch also adds diagnostics for explicitly setting a hidden visibility on these symbols.

I imagine there are a number of issues with the patch in its current state, but I wanted to get something implemented before reaching out to OpenCL/CUDA maintainers to see if this is a reasonable change. @Anastasia and @tra I wasn't certain if you would be good candidates to discuss this change, so please let me know if I need to keep looking.

Diff Detail

Event Timeline

scott.linder created this revision.Apr 29 2019, 12:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2019, 12:18 PM
tra added a comment.Apr 29 2019, 2:19 PM

A kernel functions in CUDA is actually two different functions. One is the real kernel we compile for the GPU, another is a host-side stub that launches the device-side kernel.

On device side both clang and nvcc currently silently ignore hidden visibility and force the kernel to always be visible:
This is needed because the kernel must be externally visible in the device-side executable for the host-side code to execute it.
The device-side executable and its symbols are isolated from the host DSO they are encapsulated in, so whether the hidden attribute is ignored or not on device side is independent of the visibility the kernel symbol gets on the host side.

On the host side there's no particular reason to give kernel (or, rather, its host-side stub) a special treatment. Setting hidden on it should be fine, if someone needs it for whatever reason.
Most likely users who may have applied hidden to a kernel would do so in order to avoid exposing kernel symbols outside of a DSO and the attribute will do that job just fine.

I think the warning is not going to buy anything for CUDA. The hidden attribute effectively applies to the host side only, where it should work correctly and where it is potentially useful. I'd rather not impose restrictions that are not necessary, even if it's just a warning.

Seems reasonable for OpenCL kernels. You might want to add an AST dump test to check that the visibility is being set correctly in case it's being printed in AST.

Okay. So it sounds like this should either be a device-only rule, with no warning in mixed-mode languages like CUDA, or we should take a different approach.

yaxunl added inline comments.Apr 30 2019, 3:24 PM

we also need this for __constant__ variables.

Anastasia added inline comments.May 2 2019, 6:09 AM

And what about __global and __constant program scope variables in OpenCL?