This implements a proposal by Doug Gregor on cfe-dev a couple of years ago, to allow the compiler to emit strong symbols for explicit template instantiations. Doug's spec, which I have tried to follow can be found here: http://lists.llvm.org/pipermail/cfe-dev/attachments/20111203/5e1c6c35/attachment.html. I usually work over in LLVM-land and have only contributed the occasional bug fix to clang, so please let me know how this patch can be improved. I'm also not fully done testing it yet (only did so on the toy example I included as a test case), but I wanted to get this out there to get feedback.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
include/clang/Basic/Attr.td | ||
---|---|---|
1669 | This is not a GCC attribute, so it should not be spelled as such. Since this only applies to C++ code, I would recommend a C++11 spelling (in the clang namespace). If you think this is something C++03 and earlier code would really benefit from, then you could also add a GNU-style spelling. | |
1671 | Please, no undocumented attributes. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2744 | Would this make more sense as an option in warn_attribute_wrong_decl_type instead? (there's an err_ version as well if you wish to keep this an error). Also, please ensure that the attribute is quoted in the diagnostic -- it makes things less confusing for the user. | |
lib/AST/ASTContext.cpp | ||
8741 | I think this would be easier to read (and not have to hoist a declaration out of the switch) as: if (const auto *CTSD = dyn_cast<>()) { if (CTSD->hasAttr<>()) return GVA_StrongExternal; } return GVA_StrongODR; | |
8876 | Similar suggestion here. | |
lib/Sema/SemaDeclAttr.cpp | ||
5374 | The declaration does not need to be hoisted here. | |
5378 | Why is this required as part of the feature design? It seems restrictive. | |
5385 | Is this something that can be handled by mergeDeclAttribute()? I'm not certain how that interplays with templates specifically, but usually we do this sort of logic within a Sema::mergeFooAttr() function. |
Thanks for the quick review.
include/clang/Basic/Attr.td | ||
---|---|---|
1669 | No, this is C++11 only. Will change the spelling. | |
1671 | Will document. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2744 | Ok, so should I add an "explicit template instantiations" option to that err? | |
lib/AST/ASTContext.cpp | ||
8741 | Ok. | |
lib/Sema/SemaDeclAttr.cpp | ||
5378 | This was part of Doug's original Spec, so I implemented it:
I think that makes a decent amount of sense, since you really want to avoid the case where some translation units don't see the extern template declaration. | |
5385 | Hmm, I'm not sure, the goal of this is to ensure that all declarations and definitions of this extern template have the attribute set. It's not really merging per se. Though I suppose it could be made to fit in that framework. |
include/clang/Basic/Attr.td | ||
---|---|---|
1669 | It doesn't appear like the grammar allows attributes to be specified on explicit template definitions. I'll change the spelling to GNU: error: an attribute list cannot appear here template struct foo<int> [[clang::unique_instantiation]]; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [[clang::unique_instantiation]] error: an attribute list cannot appear here template struct [[clang::unique_instantiation]] foo<int>; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: an attribute list cannot appear here template [[clang::unique_instantiation]] struct foo<int>; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: an attribute list cannot appear here [[clang::unique_instantiation]] template struct foo<int>; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. FileCheck error: '-' is empty. |
Address review comments. I had to add a special case to checkNewAttributesAfterDef if we want to use attribute merging for explicit template instantiations, because the Microsoft ABI allows adding dll attributes to the explicit template definition, but not the declaration (which clang considers to be the record's definition).
Thoughts on allowing this attribute to be specified on the templated class itself, with the intention of never allowing any implicit instantiation? As an example, consider SymbolTableListTraits in LLVM. It can only ever be used as an explicit instantiation (because the full implementation is not available).
Rebased, added support for unique_instantiation on explicit function templates and fix the case where one record is embedded in another and the outer is explicitly instantiated. Add testcases for all of these.
Ok, I have tested this more extensively now. I'm happy with the results. Here's a patch that applies this to LLVM for example: https://gist.github.com/Keno/79b08a4b187c4d950dd0
Before:
$llvm-objdump -weak-bind libLLVM-3.8svn.dylib | wc -l 300
After:
$llvm-objdump -weak-bind libLLVM-3.8svn.dylib | wc -l 15
with none of those being LLVM symbols.
lib/CodeGen/CGVTables.cpp | ||
---|---|---|
750 | Looks like some debugging code accidentally made it in. | |
lib/Sema/SemaDecl.cpp | ||
2252 | This function has some formatting issues. Also, I feel like some of the code duplication could be removed. if (old == new) return; Loc = Old->hasAttr ? new->getLocStart() : new->getAttr->getLocation(); S.Diag(Loc, diag::blah); S.Diag(Old->getLocStart(0, blah); | |
2400 | Isn't this a bit broad? I do agree that we don't want to give incorrect warnings, but it seems like this may inadvertently silence correct warnings as well. If my gut feeling is wrong (which it might be), it would be good to have some test cases confirming that this doesn't silence correct diagnostics. | |
lib/Sema/SemaDeclAttr.cpp | ||
5392 | The formatting of this line seems a bit weird -- did clang-format do this? | |
lib/Sema/SemaTemplate.cpp | ||
8207 | Instead of hasAttr<> followed by getAttr<>, better to just call getAttr<> and handle null. | |
8218 | Same comment here. | |
8227 | Formatting; I would recommend running clang-format over the patch to catch these sort of things. |
include/clang/Basic/Attr.td | ||
---|---|---|
1670 | This one should be handled in ClangAttrEmitter.cpp; I can see it being used far more frequently than ExpectedExplicitInstantiation, and all the machinery should already be in place to handle it (in CalculateDiagnostic()). | |
include/clang/Basic/AttrDocs.td | ||
2175 | In the unlikely event the user gets this wrong (lol), is there a way to diagnose this on the backend? The wording here makes it sound like it's possible for this to cause silent miscompiles, which would be unfortunate. |
include/clang/Basic/AttrDocs.td | ||
---|---|---|
2175 | I don't think there is anything that can be really done here. I believe the static linker will link this just fine (there'll be one weak and one strong symbol). The difficulty of course arises with dynamic linkers that won't bother looking for any weak symbols to merge with (at least the OS X one doesn't). I suppose it's not really that different from missing a weak attribute on some declarations but not others. | |
lib/CodeGen/CGVTables.cpp | ||
750 | Thanks! | |
lib/Sema/SemaDecl.cpp | ||
2400 | As far as I know, this is only called from mergeDeclAttributes, which didn't use to be called on these before this patch. I also don't think any attributes but dllexport or unique_instantiation apply to these declarations and those two both need special processing. |
include/clang/Basic/AttrDocs.td | ||
---|---|---|
2175 |
That's unfortunate, but I think you may be right. Blech. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2745 | Please quote 'unique_instantiation' in the diagnostic (here and in the subsequent uses). Instead of "something", let's use "a declaration"? "a explicit template" should be "an explicit template". Also, diagnostics are not complete sentences, so should not be capitalized or end with a full stop. Actually, how about this for a new wording: "'unique_instantiation' attribute only applies to an explicit template declaration or instantiation" | |
2749 | Should say: the 'unique_instantiation' attribute. Also, I'm not keen on "a particular" as it's very non-specific. I wonder if we need err_unique_instantiation_no_declaration and err_unique_instantiation_not_previous to be separate diagnostics? Aren't they both effectively saying, "a 'unique_instantiation' attribute must be specified for all declarations and definitions of the explicit template instantiation" | |
lib/AST/ASTContext.cpp | ||
8730 | Elide braces | |
lib/Sema/SemaDecl.cpp | ||
2259 | hasAttr<> followed by getAttr<> that can be simplified. | |
2400 |
Okay, thanks! | |
lib/Sema/SemaTemplate.cpp | ||
8224 | Elide braces |
include/clang/Basic/Attr.td | ||
---|---|---|
1670 | I hadn't tried before, but looking at this, I'd say they're broken in clang anyway and I'm gonna say fixing that is outside the scope of this revision: $ cat test.cpp template <typename T> T n; extern template int n<int>; template int n<int>; int main() { return n<int>; } $ ./usr/bin/clang++ -std=c++14 -c test.cpp $ nm test.o 0000000000000000 T main U _Z1nIiE $ nm test.o.gcc6 [snip] 00000000060098c u _Z1nIiE | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2749 | They are checking for two different conditions in the spec. One requires that all explicit template instantiations with this attribute have a declaration, the other that every declaration/definition has the attribute. | |
test/CodeGenCXX/unique-instantiation.cpp | ||
2 | No, removing. |
include/clang/Basic/Attr.td | ||
---|---|---|
1670 | They work ok, clang just thinks that it's a declaration of a variable template. Try this: template <typename T> T n = T(); extern template int n<int>; template int n<int>; |
include/clang/Basic/Attr.td | ||
---|---|---|
1670 | I see. I'll look into adding support for it. Can you explain why my example doesn't work? GCC seems to allow this. |
include/clang/Basic/DiagnosticSemaKinds.td | ||
---|---|---|
2749 |
Okay, I see now what this diagnostic is attempting to convey. I think it should read: def err_unique_instantiation_no_declaration : Error< "'unique_instantiation' attribute on an explicit instantiation requires a previous explicit instantiation declaration">; | |
test/SemaCXX/unique-instantiations.cpp | ||
6 | Please spell the entire diagnostic out in expected-error (applies to entire file). | |
24 | Missing tests for correct usage of the attribute. Missing tests of the attribute diagnosing when given arguments. |
Address review feedback regarding diagnostic wording/expand tests to full text of diagnostic.
Modulo the question you and David are discussing about variable templates (for which I don't have an answer handy), I just have a few small testing nits.
test/SemaCXX/unique-instantiations.cpp | ||
---|---|---|
25 |
It is, but those are just tests for code generation, which are all expected to succeed semantically. We generally expect tests in Sema (and related folders) that demonstrate successful cases as well as failure cases for features. | |
29 | The test with unique_instantiation(16) is sufficient, no need for the "Hello World" test as well. |
include/clang/Basic/AttrDocs.td | ||
---|---|---|
2317–2336 | Instead of using all-caps for emphasis in "ONE" and "MUST", why not bold the text instead? It would catch the eye a little faster. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2744–2745 | I can't seem to see any users of err_unique_instantiation_wrong_decl. | |
lib/AST/ASTContext.cpp | ||
8741–8744 | There are a lot of commas here, making it a little hard to read the justification. Perhaps something along the lines of:
| |
lib/CodeGen/CGVTables.cpp | ||
748–751 | What if the key function is not contained within a record which is marked as UniqueInstantiationAttr but the key function itself is marked UniqueInstantiationAttr? |
include/clang/Basic/AttrDocs.td | ||
---|---|---|
2317–2336 | Sure! I assume, this follows standard RST conventions where ** is bold? | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2744–2745 | Yes, it was replaced by adding a case to err_attribute_wrong_decl_type above. Will remove. | |
lib/AST/ASTContext.cpp | ||
8741–8744 | Sounds good. | |
lib/CodeGen/CGVTables.cpp | ||
748–751 | I don't know, I'd be inclined to say the unique instantiation does not apply to the vtable unless it's explicit on the class, but I'd be open to differing opinions. |
lib/AST/ASTContext.cpp | ||
---|---|---|
8726–8735 | Function templates can contain class templates which contain functions. I think this code will return false if the outermost function template is the only one annotated with UniqueInstantiationAttr. |
lib/AST/ASTContext.cpp | ||
---|---|---|
8726–8735 | So you're concerned about this case: template < typename T > auto foo( T x ) { class inner { T y; public: virtual ~inner() {} inner(T y) : y(y) {} T getY() { return y; } }; return inner{x}.getY(); } extern template __attribute__((unique_instantiation)) auto foo<int>(int); template __attribute__((unique_instantiation)) auto foo<int>(int); Right now the inner class's vtable and method is linkonce_odr. If you think it should be strong I'll look into it. |
Address David's concern about inner classes. David also suggested on IRC to propagate
the unique instantiation attribute down rather than walking the context chain to
check for the attribute. I'll try that out, but wanted to put up a known-good version
with all fixed first.
Propagate unique instantiation attribute to children rather than later trying to look through the DeclContexts to see if we're contained in one that has the attribute.
I came across a situation again where this would be useful to have. I know this was approved, but looking it looks like I wanted @majnemer to have another look.
I found a bug in this which I need to fix, but while I'm at it, in doing more testing on this, I came across the following corner case:
template <typename T> struct static_separate_template { typedef T element; static T *a_static_field; }; extern template struct __attribute__((unique_instantiation)) static_separate_template<int>; template struct __attribute__((unique_instantiation)) static_separate_template<int>; extern template struct __attribute__((unique_instantiation)) static_separate_template<char>; template struct __attribute__((unique_instantiation)) static_separate_template<char>; template <typename T> typename static_separate_template<T>::element *static_separate_template<T>::a_static_field = nullptr; template int * __attribute__((unique_instantiation)) static_separate_template<int>::a_static_field; template char * static_separate_template<char>::a_static_field; // expected-error{{'unique_instantiation' attribute must be specified for all declarations and definitions of this explicit template instantiation}}
How should this be handled? I indicated my inclination in the comment, but I'm open to alternative suggestions.
I meant
template __attribute__((unique_instantiation)) int * static_separate_template<int>::a_static_field;
of course, though we probably need a better diagnostic for the other spelling (which applies the attribute to the static_separate_template<int>). I'll look into adding a diagnostic.
I think this attribute is poorly named. Explicit instantiation definitions are *already* required to be globally unique; see [temp.spec]/5.1:
"For a given template and a given set of template-arguments, an explicit instantiation definition shall appear at most once in a program"
The actual meaning of the attribute is that an explicit instantiation declaration will appear before any use that might trigger implicit instantiation. To that end, something like __attribute__((declared_before_all_uses)) might be clearer.
include/clang/Basic/AttrDocs.td | ||
---|---|---|
2317–2336 | Yes, this documentation allows arbitrary RST markup. But I don't see the need for strong emphasis here. | |
2321–2323 | The primary description of the attribute should specify what it means, not the consequences of that meaning. In this case, the meaning is that the programmer is guaranteeing to the compiler that an explicit instantiation precedes all implicit instantiations, and the consequence is that we can use strong symbols. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2749 | This seems like an unnecessarily-strict rule to me. I don't see a reason to disallow: // in header extern template struct __attribute__((whatever_its_called)) X<int>; // in source file template struct X<int>; There doesn't appear to be any safety or correctness concern here. In general, it seems like the attribute is only promising that the explicit instantiation declaration precedes any other instantiation, so the only necessary rule would seem to be: if the attribute is present on any explicit instantiation, the first point of instantiation must be an explicit instantiation declaration that has the attribute. | |
lib/AST/ASTContext.cpp | ||
8789 | revert this change | |
lib/Sema/SemaDecl.cpp | ||
2396–2397 | I assume you mean "explicit instantiation" rather than "explicit template" here (2x). | |
2398–2399 | I have no idea what you mean by this. Did you mean "associated with the explicit instantiation declaration" or something like that? | |
lib/Sema/SemaDeclAttr.cpp | ||
5400 | Why is this rule only applied to explicit instantiations of class templates, and not to function or variable templates? | |
lib/Sema/SemaDeclCXX.cpp | ||
5666–5685 | I don't think this is the right way to handle this case. Note that an explicit instantiation declaration/definition for a class is only an explicit instantiation of those members that are defined at the point where the explicit instantiation appears. It seems like the most consistent behavior is probably to add the attributes to the defined class members as part of the process of explicitly instantiating them, in InstantiateClassMembers. | |
lib/Sema/SemaTemplate.cpp | ||
8063–8066 | Do we need this to be here as a special case, duplicated for each kind of decl that can have these attributes, rather than handling it using the normal attribute-handling mechanism in SemaDeclAttr? If so, why? |
Thanks for the review! Sorry, it's been a while since I wrote this code, so I'm not fluent in all the details, but I've replied below. I am not particularly partial to the name, so whatever you feel is best is ok with me.
include/clang/Basic/AttrDocs.td | ||
---|---|---|
2321–2323 | Ok, sounds good. | |
include/clang/Basic/DiagnosticSemaKinds.td | ||
2749 | While I didn't write the rules (just implemented them from the original proposal), I like the requirement from the point of view of making sure that the explicit instantiation declaration doesn't accidentally get deleted. Those declarations are somewhat odd in that they can be deleted without any semantic loss in functionality (just larger code size/more work for the linker) in the absence of this attribute. However, with this attribute, deleting the declaration would cause a subtle, perhaps hard to track down semantic change, so I like requiring people to be explicit. | |
lib/Sema/SemaDecl.cpp | ||
2398–2399 | Yes, I believe that is what I meant. It's been a while since I wrote this code, but I believe that all I'm trying to say here is that explicit instantiation declarations/definitions are allowed to add attributes after a definition, even though that is not normally allowed. | |
lib/Sema/SemaDeclAttr.cpp | ||
5400 | I don't remember precisely, but I believe (at least when I wrote the patch) that those prior explicit instantiation declarations did not leave an AST node that I could check for. | |
lib/Sema/SemaDeclCXX.cpp | ||
5666–5685 | Ok, I'll try that. | |
lib/Sema/SemaTemplate.cpp | ||
8063–8066 | IIRC, the problem was that the new attributes get added to the same node, so we lose the information of whether the attribute was present on the previous declaration. |
So what prevents from emitting these as "strong" symbols without any attribute? We should have the guarantee that we have only one such copy in the program.
This is not a GCC attribute, so it should not be spelled as such. Since this only applies to C++ code, I would recommend a C++11 spelling (in the clang namespace). If you think this is something C++03 and earlier code would really benefit from, then you could also add a GNU-style spelling.