This is an archive of the discontinued LLVM Phabricator instance.

Create a frontend flag to disable CUDA cross-target call checks
ClosedPublic

Authored by eliben on Apr 15 2015, 12:25 PM.

Details

Summary

For CUDA source, Sema checks that the targets of call expressions make sense (e.g. a host function can't call a device function).

Adding a flag that lets us skip this check. Motivation: for source-to-source translation tools that have to accept code that's not strictly kosher CUDA but is still accepted by nvcc. The source-to-source translation tool can then fix the code and leave calls that are semantically valid for the actual compilation stage.

Diff Detail

Repository
rL LLVM

Event Timeline

eliben updated this revision to Diff 23794.Apr 15 2015, 12:25 PM
eliben retitled this revision from to Create a frontend flag to disable CUDA cross-target call checks.
eliben updated this object.
eliben edited the test plan for this revision. (Show Details)
eliben added reviewers: jpienaar, tra, rnk.
eliben added a subscriber: Unknown Object (MLST).
tra added inline comments.Apr 15 2015, 12:47 PM
lib/Sema/SemaCUDA.cpp
63–69 ↗(On Diff #23794)

Do we really need to disable the check completely?
Can it be limited to calls from host-device functions only?
What's expected to happen if we do let host->device or device->host call through?
Do we expect an error further down in the pipeline if/when we get to generate the code for the call?
Or will we generate valid code which would cause runtime error if the call were to happen?
In either case this seems to need a test case to make sure we do a sensible thing for the call that is not expected to work.

eliben added inline comments.Apr 15 2015, 1:08 PM
lib/Sema/SemaCUDA.cpp
63–69 ↗(On Diff #23794)

Do we expect an error further down in the pipeline if/when we get to generate the code for the call?

Yes, that sums it up, pretty much. This flag, as far as I'm concerned, exists for the purpose of source-rewriting tools. There's no codegen involved, and no IR being emitted. In our current CUDA pipeline, real problems in user code will fail in subsequents steps of the pipeline.

tra accepted this revision.Apr 15 2015, 1:13 PM
tra edited edge metadata.

LGTM.
On a side note, do you still need fcuda_allow_host_calls_from_host_device ?

This revision is now accepted and ready to land.Apr 15 2015, 1:13 PM
In D9036#156680, @tra wrote:

LGTM.
On a side note, do you still need fcuda_allow_host_calls_from_host_device ?

Yes, unfortunately. This one is used by the compilation step itself, where Clang then emits a warning (instead of an error). This mimics nvcc's behavior. Some host device functions are only really called on the host side, but the compiler has no way of proving it statically.

rnk edited edge metadata.Apr 15 2015, 1:58 PM

The normal way we do this is to create a default-error warning, and then you can say -Wno-my-warning and disable the error. This also suppresses the error in system headers, which might make sense if you can't parse lots of CUDA headers yet, and just want to compile such impossible calls down to 'unreachable'. See for example -W[no-]c++11-narrowing.

I don't think it's very intuitive that -W flags impact errors, but this is the way we've done things for one-off diagnostics like this and I'd rather stay consistent with that until we come up with a better interface later. What do you think?

rnk accepted this revision.Apr 15 2015, 2:54 PM
rnk edited edge metadata.

As discussed, this flag actually affects overload resolution results, so it should probably be a language option as it is now. Maybe we shouldn't be having this target check affect overload results, but that's another big issue that we don't need to run down right this instant.

lgtm

In D9036#156719, @rnk wrote:

As discussed, this flag actually affects overload resolution results, so it should probably be a language option as it is now. Maybe we shouldn't be having this target check affect overload results, but that's another big issue that we don't need to run down right this instant.

lgtm

Right. In the long run, I think that letting it affect overload resolution is actually beneficial if we want to consider proper overloading based on target attributes.

This revision was automatically updated to reflect the committed changes.