Page MenuHomePhabricator

CUDA host device code with two code paths

Authored by jpienaar on Nov 30 2014, 9:45 AM.



Allow CUDA host device functions with two code paths using CUDA_ARCH to differentiate between code path being compiled.

For example,

__host__ __device__ void host_device_function(void) {
#ifdef __CUDA_ARCH__

Diff Detail


Event Timeline

jpienaar updated this revision to Diff 16752.Nov 30 2014, 9:45 AM
jpienaar retitled this revision from to CUDA host device code with two code paths.
jpienaar updated this object.
jpienaar edited the test plan for this revision. (Show Details)
jpienaar added reviewers: pcc, eliben.
jpienaar added a subscriber: Unknown Object (MLST).
rnk added a subscriber: rnk.Dec 1 2014, 10:00 AM

Can you remind me what the CUDA compilation model is currently? My memory was that the clang driver was eventually going to launch two -cc1 actions, one for device and one for host, presumably with different flags. I would expect that lib/Frontend/InitPreprocessor.cpp would define this macro when targeting the device.

If we're doing a single compilation with a fat object approach, we may need to do something weird to get this right. =/

I think your memory is correct (at least thats what I thought too). And
yes, the macro would be defined externally when targeting the device. At
that point we can remove this check for macro definition as we'd then be
able to check the flags directly. So I do see this as a temporary solution
which disrupts as little as possible.

Is the concern here with how the test is written? (i.e., the test
explicitly sets this macro which will in future be set by the compiler
itself). In which case it could be changed to

host device void hd1(void) {
#ifdef CUDA_ARCH

hd1h(); // expected-error {{no matching function}}


hd1d(); // expected-error {{no matching function}}


hd1g<<<1, 1>>>(); // expected-error {{reference to __global__

function 'hd1g' in host device function}}

pcc edited edge metadata.Dec 1 2014, 11:11 AM

There is already a flag for this, -fcuda-is-device. We should make that flag a LangOption and use it for this check.

rnk added a comment.Dec 1 2014, 11:30 AM
In D6457#7, @pcc wrote:

There is already a flag for this, -fcuda-is-device. We should make that flag a LangOption and use it for this check.

Right, this seems like the correct approach. Is there concern that it would be too disruptive at this stage to define __CUDA_ARCH__?

jpienaar updated this revision to Diff 16862.Dec 3 2014, 7:02 AM
jpienaar updated this object.
jpienaar edited edge metadata.

Creating CUDAIsDevice as language options (using fcuda-is-device frontend option). Using this flag instead of definition of CUDA_ARCH to determine whether host/device compilation is occurring.

Thanks, that's a good idea. Is this close to what you in mind?

rnk added inline comments.Dec 3 2014, 10:59 AM
160 ↗(On Diff #16862)

I think you just want regular LANGOPT, given the description of BENIGN_LANGOPT:

// BENIGN_LANGOPT: for options that don't affect the construction of the AST in
//     any way (that is, the value can be different between an implicit module
//     and the user of that module).
1381 ↗(On Diff #16862)

"... the NVPTX backend." maybe?

873–878 ↗(On Diff #16862)

I guess this definition is intended to satisfy targeting hypothetical non-NVPTX targets from CUDA. OK.

91 ↗(On Diff #16862)

The predominant style in clang for eliding text from a quotation is to use "text [...] text". I'm not trying to be pedantic, this actually threw me off when I read it. :)

I guess this applies equally to the CUDA reference quotes above.

jpienaar updated this revision to Diff 16876.Dec 3 2014, 11:19 AM

Made recommended changes.

rnk accepted this revision.Dec 3 2014, 11:23 AM
rnk added a reviewer: rnk.


This revision is now accepted and ready to land.Dec 3 2014, 11:23 AM

Thanks (and the comment about text elision was helpful :) ). I don't have commit access, could you assist me in committing this? Thanks

rnk closed this revision.Dec 3 2014, 1:54 PM
rnk updated this revision to Diff 16890.

Closed by commit rL223271 (authored by @rnk).