This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Improved target attribute-based overloading.
ClosedPublic

Authored by tra on Oct 19 2016, 5:23 PM.

Details

Summary

Current behavior:

  • host device (HD) functions are considered to be redeclarations of __host__ (H) of __device__ (D) functions with same signature.
  • Target attributes are not taken into account during selection of function template during explicit instantiation and specialization.

Issues:
a) It's possible for a H or D function to inherit HD attributes from a HD declaration which results in those functions being silently treated as HD in the rest of the code which leads to clang accepting the code instead of diagnosing it as an error.
b) If we have definitions of HD and a H or D function, compiler complains about redefinition of the same function, which is misleading.
c) It is impossible to disambiguate across target-overloaded function templates during explicit instantiation/specialization.

Changes in this patch:
a) treat HD functions as overloads and add Sema checks to explicitly diagnose attempts to overload HD functions with H or D ones.
b) Require matching target attributes for explicit function template instantiation/specialization and narrow list of candidates to templates with the same target. Diagnose rejected candidates.

The changes (a) and (b) can be split into separate patches, but both must be committed simultaneously as
(a) alone further breaks function template instantiation/specialization when target attributes are involved and (b) is half-broken until (a) is in place and prevents HD attributes merging into functions with H or D attributes.

Open issues:

  • It's not clear how to handle explicit specialization of constexpr function templates:
    • Implicit target attributes of constexpr functions and templates change depending on whether -fno-cuda-host-device-constexpr is in effect.
    • C++11 [dcl.constexpr]p1: An explicit specialization of a constexpr function can differ from the template declaration with respect to the constexpr specifier.

One idea is to only match explicitly written target attributes when we choose candidate templates. This makes it easier to tell which template we instantiate based only on what's in the source we compile.

Diff Detail

Repository
rL LLVM

Event Timeline

tra updated this revision to Diff 75238.Oct 19 2016, 5:23 PM
tra retitled this revision from to [CUDA] Improved target attribute-based overloading..
tra updated this object.
tra added reviewers: jlebar, rsmith.
tra added a subscriber: cfe-commits.
jlebar added inline comments.Oct 21 2016, 4:03 PM
lib/Sema/SemaCUDA.cpp
87 ↗(On Diff #75238)

Checking ForceCUDAHostDeviceDepth here is...yeah. Especially because the other overload of this function doesn't do it.

I know you get rid of it in the next patch, but how much work would it be to get rid of it here? It definitely makes this patch harder to check.

790 ↗(On Diff #75238)

Nit, can we assert that we're in cuda mode?

791 ↗(On Diff #75238)

If this is just a Decl* or NamedDecl*, can we write out the type?

793 ↗(On Diff #75238)

Please add a comment explaining why we ignore template specializations.

lib/Sema/SemaTemplate.cpp
7044 ↗(On Diff #75238)

Can we have a better comment here? (And, can we expand upon it in D25845?)

8117 ↗(On Diff #75238)

Can we just pass D here (and thus not write the new overload of IdentifyCUDATarget)?

test/SemaCUDA/function-template-overload.cu
34 ↗(On Diff #75238)

"though non-template functions are OK"?

44 ↗(On Diff #75238)
This comment has been deleted.
44 ↗(On Diff #75238)

Please ignore the above comment; it is not correct, but I cannot delete it or edit it in phab. :-/

44 ↗(On Diff #75238)

(Wow, it also didn't submit the comment I was referring to. How broken.)

tra updated this revision to Diff 75515.Oct 21 2016, 5:22 PM
tra marked 5 inline comments as done.

addressed jlebar's comments.

lib/Sema/SemaCUDA.cpp
87 ↗(On Diff #75238)

OK. Moved the check to the place where I use it.

791 ↗(On Diff #75238)

I'm not sure what exactly you'd like to see. Diags will print out the line and target info for both sides.
Could you give me example of existing diagnostics that would be similar to what you want?

793 ↗(On Diff #75238)

That's another check that's no longer needed as the only template instantiations/specializations that end up in the overload set are those that were instantiated from templates with matching target.

lib/Sema/SemaTemplate.cpp
8117 ↗(On Diff #75238)

Nope. D is a Declarator which is very different form FunctionDecl.
It carries info about what we've parsed, but we didn't construct a FunctionDecl from it yet.

tra updated this revision to Diff 75516.Oct 21 2016, 5:24 PM

removed pragma check from IdentifyCUDATarget for real.

jlebar accepted this revision.Oct 21 2016, 6:52 PM
jlebar edited edge metadata.
jlebar added inline comments.
lib/Sema/SemaCUDA.cpp
87 ↗(On Diff #75238)

Thank you -- I appreciate that. :)

791 ↗(On Diff #75238)

Sorry, I just meant for you to change "auto" to the actual type.

793 ↗(On Diff #75238)

Even better. :)

lib/Sema/SemaTemplate.cpp
7047 ↗(On Diff #75516)

what is "it"? Do you mean "the function signature", or something else?

This revision is now accepted and ready to land.Oct 21 2016, 6:52 PM
tra updated this revision to Diff 75652.Oct 24 2016, 3:14 PM
tra edited edge metadata.
tra marked an inline comment as done.

Addressed remaining nits.

tra added inline comments.Oct 24 2016, 3:15 PM
lib/Sema/SemaCUDA.cpp
791 ↗(On Diff #75238)

Ah! That would be NamedDecl * Done..

lib/Sema/SemaTemplate.cpp
7047 ↗(On Diff #75516)

it = "target attributes"

tra updated this revision to Diff 75816.Oct 25 2016, 4:50 PM
  • handle using declarations found in the overload set we check.

Is it possible to write a testcase for the using-declaration change?

tra updated this revision to Diff 80522.Dec 6 2016, 6:16 PM

Removed HD overloading checks for using declarations.
'using' exposes number of issues with the way we handle overloading of HD functions vs H/D.
The issues will be addressed in a separate patch.

jlebar added a comment.Dec 6 2016, 8:01 PM

If you would like me to have another look at this, is it possible to make an interdiff of your changes between this and the last version I reviewed? phab's interdiff is useless because it straddles a rebase.

jlebar added a comment.Dec 6 2016, 8:01 PM

(Alternatively I'm happy not to have another look if you don't think you need it.)

tra added a comment.Dec 7 2016, 10:11 AM

If you would like me to have another look at this, is it possible to make an interdiff of your changes between this and the last version I reviewed? phab's interdiff is useless because it straddles a rebase.

I've reverted to 75652 which is the revision you've approved with a new if() to limit our overloading checks to FunctionDecls only.
It's a the very end of SemaCUDA.cpp (line 864):
https://reviews.llvm.org/differential/changeset/?ref=590934%2F556303&whitespace=ignore-most

I've reverted to 75652

Did you forget to arc diff?

Anyway if it's just that if statement, lgtm.

This revision was automatically updated to reflect the committed changes.
cfe/trunk/test/SemaCUDA/function-template-overload.cu