This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Ignore implicit target attributes during function template instantiation.
ClosedPublic

Authored by tra on Oct 20 2016, 1:51 PM.

Details

Summary

Some functions and templates are treated as __host__ __device__ even when they don't have explicitly specified target attributes.
What's worse, this treatment may change depending on command line options (-fno-cuda-host-device-constexpr) or #pragma clang force_cuda_host_device.

Combined with strict checking for matching function target that comes with D25809, it makes it hard to write code which would explicitly instantiate or specialize some functions regardless of pragmas or command line options in effect.

This patch changes the way we match target attributes of base template vs attributes used in explicit instantiation or specialization so that only explicitly specified attributes are considered.

Diff Detail

Repository
rL LLVM

Event Timeline

tra updated this revision to Diff 75348.Oct 20 2016, 1:51 PM
tra retitled this revision from to [CUDA] Ignore implicit target attributes during function template instantiation. .
tra updated this object.
tra added reviewers: jlebar, rsmith.
tra added a subscriber: cfe-commits.
jlebar added inline comments.Oct 21 2016, 11:49 AM
test/SemaCUDA/function-template-overload.cu
62 ↗(On Diff #75348)

Oh, I didn't know you could specify a count like this. I have a bunch of tests to fix now!

79 ↗(On Diff #75348)

D25809's patch description says:

[New behavior]: Require matching target attributes for explicit function template instantiation/specialization and narrow [the] list of candidates to templates with the same target.

Based on this, I do not see why most of the "these should work" candidates work.

Rather than explain here, can you please add a comment to the test?

(At some point someone is probably going to have to write down and make sense of all of the semantics we have invented. It's going to be a pain either way, but at least we can preserve our reasoning in code rather than phab comments.)

jlebar edited edge metadata.Oct 21 2016, 1:39 PM

This patch changes the way we match target attributes of base template vs attributes used in explicit instantiation or specialization so that only explicitly specified attributes are considered.

Another question about this: When we have something inside of the force-host-device pragma, the HD attributes it gets are implicit.

This means that the only way to specialize something inside one of these pragmas is to use the pragma around the specialization? That...seems weird?

tra updated this revision to Diff 75482.Oct 21 2016, 1:54 PM
tra edited edge metadata.

Added a comment explaining expected constexpr function template matching behavior.

tra added a comment.Oct 21 2016, 2:02 PM

This patch changes the way we match target attributes of base template vs attributes used in explicit instantiation or specialization so that only explicitly specified attributes are considered.

Another question about this: When we have something inside of the force-host-device pragma, the HD attributes it gets are implicit.

.. and therefore ignored for the purposes of matching specialization to template.

This means that the only way to specialize something inside one of these pragmas is to use the pragma around the specialization? That...seems weird?

The whole purpose of this patch is to *ignore* implicit attributes added by compiler and only consider what user has written in the source code. Specialization will work exactly the same regardless of whether template or its specialization are within the pragma. In all cases we'll only consider attributes explicitly specified by the user. I've used constexpr template to illustrate the case where compiler adds implicit HD attributes to the template, but the logic will apply to the specialization, too -- if attributes are not written, they are not considered when we consider which template to match.

To close the loop, we talked about this IRL, and I agree this is the most sane option we can come up with. The user-facing principle is that the signatures of function template specializations must match. We consider CUDA attributes to be part of the function's signature. Therefore the target attributes should match.

Now the question is, should the *implicit plus explicit* or *only explicit* target attributes match? To me, implicit+explicit makes more sense on face. But the problem is, we need to allow C++ code like

constexpr template<typename T> void foo(T);
template<> void foo(int);

We implicitly make constexpr functions HD. The template specialization is not constexpr, so is not implicitly HD. But we cannot disallow this code, because it's valid C++. That effectively constrains our choice above to *only explicit* target attributes.

jlebar added inline comments.Oct 21 2016, 3:38 PM
include/clang/Sema/Sema.h
9396 ↗(On Diff #75482)

We usually call them "target attributes", not "HD attributes"?

lib/Sema/SemaCUDA.cpp
97 ↗(On Diff #75482)

hasAttr?

900 ↗(On Diff #75482)

I don't see where this function is declared?

lib/Sema/SemaDeclAttr.cpp
5628 ↗(On Diff #75482)

Is this a functional change in this patch? We don't check for duplicates of most other attributes, so I'm not sure why it should matter with these. If it does matter, we should have comments...

lib/Sema/SemaTemplate.cpp
7046 ↗(On Diff #75482)

http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html :)

At least please do

IdentifyCUDATarget(Specialization, /* IgnoreImplicitTargetAttrs = */ true);
7168 ↗(On Diff #75482)

I am unsure of whether or why we need this anymore, since I don't see this declared in any header, and also since I thought we weren't merging attrs anymore?

8119 ↗(On Diff #75482)

Same here wrt bool param.

tra updated this revision to Diff 75511.Oct 21 2016, 4:53 PM
tra marked 5 inline comments as done.

addressed jlebar's comments.

tra added inline comments.Oct 21 2016, 4:54 PM
include/clang/Sema/Sema.h
9396 ↗(On Diff #75482)

Implementation only ignores H and D attributes. Global is never explicit (AFAICT) and we never ignore InvalidTarget even if it's implicit.

lib/Sema/SemaDeclAttr.cpp
5628 ↗(On Diff #75482)

This function copies over attributes from template to instantiation, so when we already had explicit attributes we would end up with another one which has potential to mess with our attempt to ignore implicit attributes. getAttr() would return the first attribute it finds and if it happens to be implicit one, explicit one would be ignored.

Plus, it was rather weird to see duplicate attributes hanging off FunctionDecls in AST.

lib/Sema/SemaTemplate.cpp
7168 ↗(On Diff #75482)

It's gone as it's not needed any more. It used to be needed when I disabled copying of target attributes from the base template.

jlebar added inline comments.Oct 21 2016, 5:01 PM
lib/Sema/SemaDeclAttr.cpp
5628 ↗(On Diff #75482)

This function copies over attributes from template to instantiation

It does? I thought this just handled seeing an attribute in the source?

I am all for having a pretty AST, but

  • if the user explicitly puts two __host__ attributes on the function, clearly we don't care about having the attribute twice on the function, and alternatively
  • if the user explicitly puts a __host__ attribute on the function and we also somehow got an implicit attr there, this check seems wrong, since it will let the implicit one win over the explicit one?
lib/Sema/SemaTemplate.cpp
7046 ↗(On Diff #75482)

Please put the equals sign. Without it, who knows whether the following call is sync or async:

make_call(/* async */ false);

Also using the equals sign lets a clang-tidy check ensure that we're using the right name.

8119 ↗(On Diff #75482)

Same here.

tra updated this revision to Diff 75812.Oct 25 2016, 4:45 PM
  • Instead of relying on the first attribute we find, check all matching ones.
  • Specializations inherit their target attributes from their base template only. Their effective target always matches that of the template and is no longer affected by whether specialization differens from template in its constexpr-ness.

Doesn't look like we changed any testcases when we changed this behavior? The change in behavior may be unobservable, but even still it seems worthwhile to have tests that check that we're not doing any of the alternatives we considered.

lib/Sema/SemaCUDA.cpp
99 ↗(On Diff #75812)

Is this early return necessary?

107 ↗(On Diff #75812)

Could we write this function as

return llvm::any_of(D->getAttrs(), [&](Attr *Attribute) {
  return isa<A>(Attribute) && (!IgnoreImplicitAttr || !Attribute->isImplicit()) });
893 ↗(On Diff #75812)

Perhaps ToFD, FromFD would be better names than "new" and "old".

896 ↗(On Diff #75812)

This comment is confusing because this function's implementation doesn't have anything to do with templates, but the first sentence says we're copying from a template to "FD".

In addition to cleaning this up, maybe we should move the comment into Sema.h?

912 ↗(On Diff #75812)

Could this be written as

template<typename AttrTy>
static void copyAttrIfPresent(FunctionDecl *NewFD, FunctionDecl *OldFD) {
  if (A *Attribute = OldFD->getAttr<A>()) {
    ...
  }
}

void Sema::inheritCUDATargetAttrs(...) {
  copyAttrIfPresent<CUDAGlobalAttr>(NewFD, OldFD);
  ...
}

? In particular I am not sure why we need this particular control flow in this function.

lib/Sema/SemaDecl.cpp
8372 ↗(On Diff #75812)

"from their templates" "in *the* foo() call below".

Although because "from their templates" is kind of confusing, I might rewrite to be singular:

Instead, a specialization inherits its target attributes from its template ...
lib/Sema/SemaTemplate.cpp
7048 ↗(On Diff #75812)

Missing some articles:

Target attributes are part of the cuda function signature, so the deduced template's cuda target must match XXX [1].  Given that regular template deduction [2] does not take target attributes into account, we reject candidates here that have a different target.

[1] I am not sure what XXX should be. The deduced template's cuda target must match what, exactly?

[2] What is "regular template deduction"?

7171 ↗(On Diff #75812)

Suggest

A function template specialization inherits the target attributes of its template.  (We require the attributes explicitly in the code to match, but a template may have implicit attributes by virtue e.g. of being constexpr, and it passes these implicit attributes on to its specializations.)
7173 ↗(On Diff #75812)

Isn't this backwards? The function is to, from? If this is a bug, can we add a test to catch this?

tra updated this revision to Diff 80677.Dec 7 2016, 4:02 PM
tra marked 4 inline comments as done.

Addressed Justin's comments.

lib/Sema/SemaCUDA.cpp
99 ↗(On Diff #75812)

Yes. Otherwise D->getAttrs() will trigger assert(hasAttrs) if we don't have any attributes.

107 ↗(On Diff #75812)

Done.

896 ↗(On Diff #75812)

I've changed second argument type to const FunctionTemplateDecl* to reflect my intent a bit better.
Moved the comment to Sema.h.

lib/Sema/SemaTemplate.cpp
7048 ↗(On Diff #75812)

[1] it must match the target of its template.
[2] "C++ template deduction"
Rephrased the comment based on IRL conversation w/ jlebar@.

7173 ↗(On Diff #75812)

The arguments have different types now, one of them const&, so it should be unambiguous what's input and what's output.
As for order of inputs/outputs, my mental model is result=input as in memcpy(dest, src). Style guide does not seem to say anything on order of input/output arguments.

jlebar accepted this revision.Dec 8 2016, 10:09 AM
jlebar edited edge metadata.
jlebar added inline comments.
lib/Sema/SemaDecl.cpp
8415 ↗(On Diff #80677)

"the Foo() call below", or "in Foo() below". (I guess "Foo()" is like a proper name. :)

lib/Sema/SemaTemplate.cpp
7048 ↗(On Diff #75812)

It's not showing a rephrased comment above; maybe you forgot to commit before arc diff?

test/SemaCUDA/function-template-overload.cu
73 ↗(On Diff #80677)

"the implicit HD attributes the compiler gives" :)

74 ↗(On Diff #80677)

comma before "so" (conjunction joining independent clauses)

This revision is now accepted and ready to land.Dec 8 2016, 10:09 AM
tra updated this revision to Diff 80783.Dec 8 2016, 10:28 AM
tra edited edge metadata.
tra marked 3 inline comments as done.

Fixed comments.

tra marked an inline comment as done.Dec 8 2016, 10:30 AM
tra added inline comments.
lib/Sema/SemaTemplate.cpp
7048 ↗(On Diff #75812)

Argh. I've apparently deleted corrected version after I've showed it to you. :-/ Fixed now.

This revision was automatically updated to reflect the committed changes.
tra marked an inline comment as done.
RKSimon added inline comments.
cfe/trunk/lib/Sema/SemaDecl.cpp
8416

@tra Sorry for touching such an old ticket - but cppcheck is warning that we are using a boolean result in a bitwise expression - I'm assuming this should be:

if (getLangOpts().CUDA && !isFunctionTemplateSpecialization)
tra marked an inline comment as done.Aug 12 2019, 11:16 AM
tra added inline comments.
cfe/trunk/lib/Sema/SemaDecl.cpp
8416

Good catch. I'll fix it.

tra marked an inline comment as done.Aug 23 2019, 9:24 AM
tra added inline comments.
cfe/trunk/lib/Sema/SemaDecl.cpp
8416

Fixed in rL369777.