This is an archive of the discontinued LLVM Phabricator instance.

[RFC][clang] Add attribute-like keywords
AbandonedPublic

Authored by rsandifo-arm on Nov 30 2022, 9:26 AM.

Details

Summary

https://reviews.llvm.org/D127762 adds Clang support for some new SME features. These features allow code to attach semantic information to certain types and declarations. In the current SME ACLE specification, this information is conveyed using GNU-style attributes. (Since the information is semantic, it clearly couldn't be conveyed by standard attributes.)

Although GNU-style attributes are in practice often used to carry semantic information, there is a big drawback to this approach: compilers that don't understand the attributes will emit a warning rather than an error. This is true of both Clang and GCC, and is compounded by the fact that warnings for unknown standard attributes and unknown GNU attributes are conflated into a single warning option. (Arguably an error would have
been more appropriate for GNU attributes, given its historically broad use. But no more than a warning is warranted for standard attributes.)

A slightly hacky, old-school compromise would be to have the compiler automatically define a preprocessor macro that expands to a GNU-style attribute. This still retains the advantage of using an existing mechanism (GNU attributes) to convey the information, but has the additional advantage that older compilers will reject the code. However, it can tend to make diagnostic messages less readable.

A more robust approach would be to add new keywords. This should give a better user experience than the two alternatives described above. However, it would mean having to define new parsing rules that are specific to the SME extension. Two drawbacks with this approach are:

  1. The parsing rules would need to be kept up-to-date as the language evolves.
  1. If other extensions (besides SME) need to define similar parsing rules for their information, there's a risk that different sets of rules will handle corner cases differently, which loses the consistency that the standard attribute syntax brings.

I think the risk of (2) is significant. There are quite a few existing extensions (including AArch64 ones) that use GNU attributes to describe semantics, and if all those extensions had used keywords instead, it seems unlikely that they would have agreed on common rules.

If the SME information were not semantic, standard attributes would be the obvious and ideal way of conveying it. It's also possible to envisage a version of the standard that defined a syntactic gloss that could be added to an attribute to indicate that the attribute cannot be ignored. An implementation would then be required to issue a diagnostic for glossed attributes that it didn't understand.

Using these hypothetical glossed attributes for semantic information would have two big benefits: (a) the standard would define all the necessary production rules and (b) the standard would define exactly what each (glossed) attribute appertains to.

The idea of this patch is get these same two benefits using keywords. It adds the concept of “attribute-like keywords” that can appear exactly where a standard attribute can appear. No exceptions either way are allowed: there are no situations in which one of the two can appear and the other can't.

This makes the parsing rules a very simple extension to the standard. They adapt mechanically as the standard evolves.

Also, the keywords are defined to appertain to exactly the same thing that an attribute would appertain to. Again, no exceptions are allowed. For example, there is no “sliding” of attribute-like keywords from declarations to types.

This does mean that the attribute-like keywords are accepted for parsing purposes in many more places than necessary for SME. However, the compiler would need to reject the keywords in those positions whatever happens, and treating them as wannabe attributes seems no worse than the alternative. In some cases it might even be better. For example, __arm_streaming would make conceptual sense as a statement attribute, so someone who takes a “try-it-and-see” approach might write:

__arm_streaming { …block-of-code…; }

In fact, we did consider supporting this originally. The reason for rejecting it was that it was too difficult to implement, rather than because it didn't make conceptual sense.

The patch is quite large. However, it should be a one-off, up-front cost. Adding extra attribute-like keywords would just require (lexing) updates to TokenKinds.def and Tokens.h. There should be no need for changes in the parser itself.

One slight disadvantage of this approach is that it isn't possible to use #pragma clang attribute with the keywords. Perhaps we could add that in future, if it turns out to be useful.

The approach taken in the patch is that the keywords can be used with any language version. If standard attributes were added in language version Y, the keyword rules for version X<Y are the same as they were for version Y (to the extent possible). Any extensions beyond Y are handled in the same way for both keywords and attributes. This ensures that C++17 (for example) is not treated differently from versions that have yet to be defined.

The SME attributes do not take arguments, but in principle it would be possible to define keywords that do take arguments. Support for this could be added later, when it's needed.

Some notes on the implementation:

  • The patch emits errors rather than warnings for diagnostics that relate to keywords.
  • Where possible, the patch drops “attribute” from the diagnostics relating to keywords.
  • One exception to the previous point is that warnings about C++ extensions do still mention attributes. The use there seemed OK since the diagnostics are noting a change in the production rules.
  • Although the patch updates warn_attribute_wrong_decl_type_str, warn_attribute_wrong_decl_type, and warn_attribute_wrong_decl_type, only the error forms of these strings are used for keywords.
  • I couldn't trigger the warnings in checkUnusedDeclAttributes(), even for existing attributes. An assert on the warnings caused no failures in the testsuite. I think in practice all standard attributes would be diagnosed before this.
  • The patch drops a call to standardAttributesAllowed in ParseFunctionDeclarator. This is because MaybeParseCXX11Attributes checks the same thing itself, where appropriate.
  • A lot of the error messages relating to the SME keywords refer to K&R functions, even for C++. It's pretty easy to fix that, but it's logically separate work. A lot of the expected errors therefore end on “only applies to”. I can go back and tighten this once the K&R thing is fixed.
  • The new tests are based on c2x-attributes.c and cxx0x-attributes.cpp. The C++ test also incorporates a version of cxx11-base-spec-attributes.cpp. The FIXMEs are carried across from the originals.

Diff Detail

Event Timeline

rsandifo-arm created this revision.Nov 30 2022, 9:26 AM
rsandifo-arm requested review of this revision.Nov 30 2022, 9:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2022, 9:26 AM

Reupload with proper linting. No change in code.

Sorry for the noise!

rsandifo-arm edited the summary of this revision. (Show Details)Jan 4 2023, 2:44 AM

Ping. If this approach is acceptable, I think we could use it for future ACLE features too (rather than the GNU attributes that we've used previously).

Roping in Erich as the attributes code owner. In general, I think this is going in roughly the right direction. One concern I have is that the distinction between a keyword spelling for an attribute is now a bit harder. We have keyword attributes and we have attribute-like keywords, and it's hard to know when it's appropriate to use one over another. From what I can tell, it seems that the biggest distinction between the two is automatic parsing or not and diagnostic wording (please correct me if I'm missing something!). If I'm correct, then I don't think we want an additional attribute syntax enumeration for this (it's still AS_Keyword), but we should surface this via Attr.td/tablegen so that the Keyword spelling can opt into the extra generality (or perhaps we want the default to be all keyword attributes get the general behavior and we make a few existing attributes opt out).

clang/include/clang/Basic/AttributeCommonInfo.h
55

I think it would help to describe how this differs from AS_Keyword and AS_ContextSensitiveKeyword as I would expect these to have considerable overlap.

clang/include/clang/Basic/TokenKinds.h
105–107

This strikes me as something that should be handled via tablegen -- the spelling used in Attr.td should show what kind of attribute spelling it is (regular keyword or not).

One thing that might also help is to split this into a few stages. 1) Add the new general functionality, 2) Replacing the existing implementation of keyword attributes with the new functionality where possible, 3) Add new attributes using the new functionality. It seems to me that we should be able to replace a bunch of our existing keyword attributes with the newer generalized approach (I'm thinking about ones like the calling convention attributes specifically, but we've got others as well), and that will help us prove the design as well as simplify the compiler implementation. WDYT?

One thing that might also help is to split this into a few stages. 1) Add the new general functionality, 2) Replacing the existing implementation of keyword attributes with the new functionality where possible, 3) Add new attributes using the new functionality. It seems to me that we should be able to replace a bunch of our existing keyword attributes with the newer generalized approach (I'm thinking about ones like the calling convention attributes specifically, but we've got others as well), and that will help us prove the design as well as simplify the compiler implementation. WDYT?

I think this is an absolute necessity for me. I also find myself wondering if the 3 'keyword' types of attributes hold their own weight individually.

Thanks @aaron.ballman and @erichkeane for the comments. I'll split it into stages and use tablegen more, like you say.

One thing that might also help is to split this into a few stages. 1) Add the new general functionality, 2) Replacing the existing implementation of keyword attributes with the new functionality where possible, 3) Add new attributes using the new functionality. It seems to me that we should be able to replace a bunch of our existing keyword attributes with the newer generalized approach (I'm thinking about ones like the calling convention attributes specifically, but we've got others as well), and that will help us prove the design as well as simplify the compiler implementation. WDYT?

I've tried to go through the existing keyword attributes, but unfortunately, I don't think any of them match the new scheme exactly:

__declspec

I don't think anyone was suggesting we try to convert this :-) but for completeness:

int __declspec(align(16)) a1; // OK, but standard attribute rules would say this appertains to the type
int a2 __declspec();          // Not valid, unlike for standard decl attributes
__forceinline
int __forceinline a1() { return 1; } // OK, but standard attribute rules would say this appertains to the type
int a2 __forceinline() { return 1; } // Not valid, but would be for standard decl attributes
Calling convention keywords

These are generally DeclOrType attributes, with attributes sliding from the decl to the type where necessary.

__stdcall int a1();   // OK, but appertains to the decl and relies on sliding behaviour
int a2 __stdcall();   // Not valid, but is another place to put standard decl attributes
int a3() __stdcall;   // Not valid, but is another place to put standard type attributes
int (__stdcall a4)(); // OK, but standard attributes aren't allowed in this position
extern int (*const __stdcall volatile a5) (); // OK, but standard attributes wouldn't be allowed here
Nullability keywords

Like the calling-convention keywords, the nullability keywords appertain to types but are allowed in some (but not all) positions that would normally appertain to decls:

using ptr = int *;
_Nullable ptr a1; // OK, but appertains to the decl and relies on sliding behaviour
ptr a2 _Nullable; // Not valid, but is another place to put standard decl attributes
extern int *const _Nullable volatile a3; // OK, but a standard attribute wouldn't be allowed here

The same distinction applies to Microsoft pointer attributes.

Would it be OK just to use the new scheme for new attributes?

rsandifo-arm abandoned this revision.Aug 7 2023, 3:55 AM

Superceded by D148700