Differential D40275
[CUDA] Report "unsupported VLA" errors only on device side.
Authored by tra on Nov 20 2017, 4:34 PM.
Details

This fixes erroneously reported CUDA compilation errors in host-side code during device-side compilation. I've also restricted OpenMP-specific checks to trigger only if we're compiling with OpenMP enabled.
Event Timeline

Comment Actions
I'm not sure what exactly you or @rjmccall have in mind. Specifically, what is the "generic case" that CUDA checks should be added to? Could you give me an example?

Comment Actions
And please add a regression test, which is apparently missing, for the case where a VLA is NOT diagnosed in CUDA mode.

Comment Actions
Hmm. We do have test/SemaCUDA/vla.cu, which should have triggered the error. Let me see why it didn't happen.

Comment Actions
Not supporting VLAs is a property of the target we are compiling for; see the newly added Context.getTargetInfo().isVLASupported(). So neither CUDA nor OpenMP is a special case in general; rather, the targeted architecture doesn't support the feature. What is a special case, though, is that both CUDA and OpenMP analyze the complete host code again, and we need to suppress the diagnostic if the VLA is encountered in host code that is never codegen'd for the device. For OpenMP, this special case is encoded in shouldDiagnoseTargetSupportFromOpenMP (a horrible name; suggestions welcome!), and I think you should add a similar check for CUDA.
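The test/SemaCUDA/vla.cu test mentioned above is a clang -verify test. A hedged sketch of what such a test looks like follows; the RUN line and the diagnostic wording here are approximate illustrations, not the exact in-tree file.

```cuda
// Illustrative sketch of a -verify test along the lines of
// test/SemaCUDA/vla.cu; RUN line and diagnostic text are approximate.
// RUN: %clang_cc1 -fcuda-is-device -fsyntax-only -verify %s

__attribute__((device)) void device_fn(int n) {
  int x[n]; // expected-error {{variable-length array}}
}

// Host code is re-parsed during device compilation too, but a VLA here
// must NOT be diagnosed: it is never codegen'd for the device.
__attribute__((host)) void host_fn(int n) {
  int y[n]; // no diagnostic expected
}
```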
Comment Actions
Interesting. I suspect that we'll end up dealing with this problem for OpenMP as well (in the future, for OpenMP v5). In the next version (for which a draft is available here: http://www.openmp.org/wp-content/uploads/openmp-TR6.pdf), we'll have "implicit declare target" functions, whereby we generate target code based on the locally-defined subset of the transitive closure of the call graph starting from target regions.

Comment Actions
Updated to partially address @rjmccall's comments.
Comment Actions
We've been contemplating treating all functions as __host__ __device__ by default. After all, most source code is target-agnostic. Currently a lot of templated code must be __host__ __device__ in order to be usable, and this is a major obstacle to making the standard library and other template libraries (somewhat) usable on device. Alas, making __host__ __device__ the default would be a major departure from CUDA semantics.

Comment Actions
@rjmccall: are you OK with this approach? If VLA is not supported by the target, CUDA is handled as a special case so it can emit a deferred diagnostic, OpenMP reports an error only if shouldDiagnoseTargetSupportFromOpenMP() allows it, and everything else does so unconditionally.

Comment Actions
Sorry for the delay; I took Thanksgiving week off. Yes, I think this patch is fine now, thanks.
Comment Actions
I'm fine if @rjmccall is fine, but it looks like I need to accept this revision because I requested changes in the past?