AMDGPU relies on global properties being set before setTargetProperties is called. Existing targets like MIPS which rely on setTargetProperties do not seem to rely on the current behavior, so this patch moves the call later in SetFunctionAttributes.
Details
Diff Detail
- Repository
- rC Clang
Event Timeline
It seems reasonable to me for target hooks to run after global hooks, but can I ask why AMDGPU specifically relies on this?
We want to ensure certain symbols have a meaningful visibility. For example, kernel symbols must not have hidden visibility. It's reasonable for the user to arrange for a kernel symbol to have either protected or default visibility, though, so we want our hook to be run after the global hooks have already calculated the global visibility.
We effectively consider the user explicitly specifying that a symbol is e.g. a kernel to also carry with it visibility information. We don't want to require the user to redundantly specify that a kernel is not hidden, when it is never meaningful for it to be hidden.
I understand, but if the user explicitly gives it hidden visibility, you should still diagnose that.
Also, shouldn't you just handle this by treating the kernel attribute as a source of explicit visibility at the Sema/AST level?
I agree that we should diagnose it, and I can update the patch accordingly, but I'm unsure how to go about emitting a diagnostic from this callback. As far as doing this at the AST level, this was my original approach in https://reviews.llvm.org/D53153, however this is really more of an AMDGPU implementation detail. I don't think it is necessarily the case that every OpenCL and Cuda implementation wants/needs require these symbols not have hidden visibility.
If we can involve the target in the AST linkage calculations, or agree that in general the kernel specifier should affect the visibility in this way, along with the __device__ specifier on a variable and the __global__ specifier on a function for Cuda, then moving this up to the AST level makes sense to me.
I suspect that other OpenCL and CUDA implementations don't care at all about symbol visibility for device-side code generation, and giving kernel functions default visibility seems like the right thing to do for the (relatively few) things at the AST level that are sensitive to that, like template visibility. Would you mind reaching out to other implementors about that?
This patch seems fine to me regardless.
Yes, I can certainly identify who would be interested in terms of OpenCL and Cuda and work on moving this up to the AST.
If you don't object to this patch then is it reasonable for me to submit it? It will get us the required behavior for AMDGPU while I work on the more general solution.
@rjmccall Would you expect similar conflicts in explicit visibility to result in diagnostics? For example, marking a static variable with an explicit visibility attribute doesn't warn, instead the explicit visibility attribute is silently ignored. GCC 7.3 complains with warning: ‘__visibility__’ attribute ignored [-Wattributes]
@rjmccall I'm not sure if this is the right place to continue discussing this, but I don't have a patch I am happy with and I would rather not post something half-baked.
Currently for AMDGPU we have the behavior that the user can set the visibility of these symbols with explicit attributes. If we consider the kernel attribute itself as an explicit visibility declaration how do we support this flexibility when we will have effectively mandated a single visibility that the user cannot interact with? Even if we are OK with mandating something like default visibility, we do not currently support preemptible symbols so protected is the optimal visibility. This may not be true of other targets, and it may not even be true of AMDGPU in the future, so hardcoding the visibility of kernel symbols to anything doesn't seem correct. Is something like "not-hidden" reasonable?
It seems to fine just forbid hidden. Again, I suspect other targets do not care because they are not using a standard dynamic loader to load the code containing kernel functions.