This is an archive of the discontinued LLVM Phabricator instance.

Support parsing double square-bracket attributes in ObjC
ClosedPublic

Authored by aaron.ballman on Dec 23 2017, 8:07 AM.

Details

Summary

One plausible use for the new double square-bracket attribute support in C are attributes appertaining to Objective-C constructs. This patch adds parsing support for such attributes and exposes a handful of ObjC attributes under the new syntax.

Diff Detail

Event Timeline

aaron.ballman created this revision.Dec 23 2017, 8:07 AM

This patch depends on D41317 for the modifications to the Clang<> attribute spelling.

This appears to add attributes on some Obj-C constructs as an on-by-default feature in Obj-C++11 onwards; it needs review by folks more heavily involved in the Obj-C language design and direction to confirm that this makes sense and (ideally) to update the Obj-C documentation to cover this new feature.

To me, the grammar changes you're proposing look appropriate.

This appears to add attributes on some Obj-C constructs as an on-by-default feature in Obj-C++11 onwards; it needs review by folks more heavily involved in the Obj-C language design and direction to confirm that this makes sense and (ideally) to update the Obj-C documentation to cover this new feature.

Agreed -- adding some reviewers for this language extension.

Post-holiday ping.

rjmccall added inline comments.Jan 2 2018, 7:29 PM
include/clang/Basic/Attr.td
239

I have no objection to allowing ObjC attributes to be spelled in [[clang::objc_whatever]] style. We can debate giving them a more appropriate standard name in the objc namespace at a later time — or even the primary namespace, if we really want to flex our Somewhat Major Language muscles.

I feel like this IncludeC is not the best name for this tablegen property. Perhaps AllowInC?

Also, a random positional 1 argument is pretty obscure TableGen design. Do we really expect to be making very many attributes that intentionally disallow C2x? What would these be, just C++-specific attributes without C analogues? It doesn't seem important to prohibit those at the parsing level rather than at the semantic level, since, after all, we are still allowing the GNU-style spelling, and these would still qualified with `clang::`.

I would suggest standardizing in the opposite direction: instead of forcing attributes to opt in to being allowed in C, we should make attributes that we really don't want to allow in the C2x [[clang::]] namespace be explicit about it. If there are a lot of C++-specific attributes like that, we can make a ClangCXXOnly subclass.

aaron.ballman added inline comments.Jan 3 2018, 11:35 AM
include/clang/Basic/Attr.td
239

I have no objection to allowing ObjC attributes to be spelled in [[clang::objc_whatever]] style. We can debate giving them a more appropriate standard name in the objc namespace at a later time — or even the primary namespace, if we really want to flex our Somewhat Major Language muscles.

Thanks -- are you okay with where the attributes are allowed in the syntax? I tried to follow the position they're allowed in C and C++ as closely as I could, but having confirmation would be nice.

As for putting the attributes in the primary namespace, that would be reasonable for using the attributes in ObjC or ObjC++ but not so much for using the same attributes in a pure C or C++ compilation.

I feel like this IncludeC is not the best name for this tablegen property. Perhaps AllowInC?

I'm fine with that name. I'll change it in D41317 when I commit that.

Also, a random positional 1 argument is pretty obscure TableGen design. Do we really expect to be making very many attributes that intentionally disallow C2x? What would these be, just C++-specific attributes without C analogues?

I agree -- I was toying with using a named entity rather than a numeric literal, but I wanted to see how the design shook out in practice once I had a better feel for how many attributes are impacted. I'm glad you recommend going the opposite direction as that was my ultimate goal. :-) Basically, my initial approach is to disallow everything in C and then start enabling the attributes that make sense. At some point, I believe we'll have more attributes in both language modes than not, and then I plan to reverse the meaning of the flag so that an attribute has to opt-out rather than opt-in. I don't expect we'll have many attributes that are disallowed in C2x. I think they'll fall into two categories: C++ attributes that don't make sense in C and attributes that are governed by some other body.

It doesn't seem important to prohibit those at the parsing level rather than at the semantic level, since, after all, we are still allowing the GNU-style spelling, and these would still qualified with clang::.

I think it's important for C targets when -fdouble-square-bracket-attributes is not enabled, as the attributes cannot syntactically appear in those positions.

I would suggest standardizing in the opposite direction

I suspect I will ultimately get there. I chose this approach to be more conservative about what we expose.

rjmccall added inline comments.Jan 3 2018, 3:01 PM
include/clang/Basic/Attr.td
239

Thanks -- are you okay with where the attributes are allowed in the syntax? I tried to follow the position they're allowed in C and C++ as closely as I could, but having confirmation would be nice.

I'll leave comments on the various places. For the most part, no, I think these are the wrong places; where we allow attributes in ObjC is actually pretty carefully thought-out already, and it's better to follow the places where we parse GNU attributes than to try to imitate the C grammar.

As for putting the attributes in the primary namespace, that would be reasonable for using the attributes in ObjC or ObjC++ but not so much for using the same attributes in a pure C or C++ compilation.

Yes, please ignore that. I was just idly thinking about what we would do if we were adopting this feature as a more standard thing, i.e. leaving the implementation namespace. I think we'd want to rename some of the attributes for consistency, and then I think we'd just put them in the objc:: namespace. None of this affects what we're doing now.

I think it's important for C targets when -fdouble-square-bracket-attributes is not enabled, as the attributes cannot syntactically appear in those positions.

Okay, that's fair.

I suspect I will ultimately get there. I chose this approach to be more conservative about what we expose.

Sure, if this is really just for staging, I won't worry about accidentally leaving attributes out.

lib/Parse/ParseObjc.cpp
231

Okay, so this is allowing attributes after @interface but before the class name, e.g. @interface [[attr]] X. I don't think we actually want this; we really want attributes on classes to be written prior to the @keyword, and I don't think that's spelling-specific in any way. Frankly, I see this is an awkward C-ism that the ObjC grammar just doesn't really require.

726

So this is specifically allowing @property [[attr]] (copy) id x;, whereas today we require you to write @property (copy) [[attr]] id x; or something like it. Again, I don't think we actually want to allow attributes in this position.

1388

So, this is not actually reliably after the method name, it's just after the first selector chunk but before its colon. This is definitely not an acceptable place to write attributes. They should be allowed here if this is a nullary selector, i.e. there is no colon; the right place to do that is just a few lines below this, where you can see we parse GNU attributes and then finish the parse.

1434

ObjC parameter syntax is really its own weird thing. I think this is the right place to allow attributes, after the parenthesized type but before the parameter-variable name (which is optional). And of course we also allow them within the parentheses, but hopefully that just falls out from parsing the type.

1456

So, this is wrong because it's potentially conflated with the end of the method, which we parse below when the tokens run out. You'll see that we allow GNU-style attributes there, and we should allow [[attrs]] there as well.

2041

Like with @interface, I think we don't want to allow attributes here.

aaron.ballman added inline comments.Jan 9 2018, 7:51 AM
include/clang/Basic/Attr.td
239

I'll leave comments on the various places. For the most part, no, I think these are the wrong places; where we allow attributes in ObjC is actually pretty carefully thought-out already, and it's better to follow the places where we parse GNU attributes than to try to imitate the C grammar.

The C grammar was also carefully thought-out to consistently map the location of the attribute syntax in source code to the entity the attribute appertains to. The basic rule is: the attribute always appertains to what's immediately to the left unless the attribute is at the start of the line, in which case it applies to each declaration in the decl group. This logic was taken from C++ attributes where it's proven very valuable over the past seven or so years. If we follow the GNU syntactic placement, that means this rule of thumb doesn't apply in Objective-C and it negates one of the most powerful aspects of the language feature: syntactic consistency (which is something GNU-style attributes absolutely lacks, to everyone's great annoyance).

Can you explain the rationale about why the GNU-style placement is a better approach to the C- and C++-style placement? I think that the syntactic location of the attribute syntax should remain consistent between ObjC constructs and C constructs because both constructs are likely to be mixed together in the same code base. I think we hurt the usability of the language feature by being inconsistent in this fashion (which was one of the reasons we standardized attributes in the first place).

rjmccall added inline comments.Jan 9 2018, 10:35 AM
include/clang/Basic/Attr.td
239

The ObjC grammar is overall quite different from the C grammar. There are no decl groups. Tags like @interface are always declaration introducers, rather than part of the type grammar. Types are not incidentally declared/defined in the type-specifier of a possible object or function declaration. More subjectively, nobody looks at the primary Objective-C declarations like @interface or @protocol and thinks they look like anything else from C or C++, which means consistency is not really an issue.

C++ puts class attributes after the class-key because there's literally no other place to put attributes that wouldn't be conflated with an attribute on the object if an object were being simultaneously declared. The C grammar paints them into a corner. We don't have that specific problem in ObjC because, as mentioned, ObjC declarations are standalone, not wedged into the object/function declaration grammar.

C++ seems to be inconsistent about where it allows keywords when it does have a proper declaration-introducer. For example, the grammar for a namespace declaration is 'namespace' attribute-specifier-seq? identifier?, presumably following the lead of class. But the grammar for an alias declaration is 'using' identifier attribute-specifier-seq? '=' defining-type-id, not 'using' attribute-specifier-seq? identifier '=' defining-type-id. And of course there are the context-sensitive class attributes (e.g. final) that are written in a special position, which is allowable under the grammar because they only need to apply to definitions.

I understand that the GNU/Microsoft/Borland/whatever attribute positions often seem ad hoc and that the C++ committee tried to improve on that when standardizing attributes. In contrast, I think the ObjC community is satisfied with where attributes are written in ObjC declarations today. When I said that those positions were carefully thought-out, I mean that we've already talked a lot about where attributes should go, and the current positions are the result of those conversations. That process seems to have been a success; the only complaints I can remember ever getting about attribute parsing have been related to where C requires them, not ObjC. In general, ObjC pushes attributes to the beginning or (in the case of methods) the end of the declaration, which people see as a good thing: attributes can get quite large, so piling them up on the fringes of the declaration ensures that they don't interfere with reading the more immediately important parts of the declaration. While C/C++ did not have a choice about where to allow attributes on class declarations, objectively it is very awkward to put attributes immediately after the tag because it separates the two most important things in the declaration: the declaration introducer and the name being declared. That is not something we want to imitate.

I am not just speaking for myself here. This is where we would like the attributes to appear.

aaron.ballman added inline comments.Jan 13 2018, 9:06 AM
include/clang/Basic/Attr.td
239

Thank you for the helpful explanation, I appreciate it.

Do you have issue with where C is placing these attributes for constructs that are supported by C, or are those also fine for use in ObjC? Basically, I'm wondering whether you would want this syntax at all in ObjC, or whether you're okay with the C placement for C constructs and the ObjC placement for ObjC constructs?

I'm a bit concerned about this scenario and would like your thoughts:

@interface I
-(void)Foo;
@end

@implementation I
-(void)Foo [[clang::minsize]] { // OK, appertains to the function declaration

}

void bar(void) [[clang::minsize]] { // Error, minsize does not appertain to the function type

}
@end
rjmccall added inline comments.Jan 22 2018, 7:45 AM
include/clang/Basic/Attr.td
239

We should just use the C rules for C constructs in ObjC mode, even if they're intermingled with ObjC constructs. I see your point about the perceived inconsistency, but I don't see a good resolution besides acceptance. Allowing new locations for C attributes would be a mistake.

Updates based on review feedback.

aaron.ballman marked 5 inline comments as done.Jan 29 2018, 1:18 PM
aaron.ballman added inline comments.
lib/Parse/ParseObjc.cpp
1434

Within the parens is tricky -- if the attribute appears immediately after the type, then it appertain to the *type* and not the *declaration*.

-(void)Test: (int) [[foo]] i  to:(int [[bar]]) j from: ([[baz]] int) k;

I would expect foo to appertain to i, bar to appertain to the type int, and baz to be ill-formed, which is what will fall out from parsing the type.

rjmccall added inline comments.Jan 29 2018, 1:48 PM
lib/Parse/ParseObjc.cpp
1434

As you say, the position at the start of a type-specifier-seq normally applies to the declared entity and so cannot have attributes in an abstract declarator. The parens production is meant to resemble a cast, which is normally just '(' type-id ')', but this is also part a declaration of a variable. I don't think it's unnatural to allow declaration attributes after the l-paren, and in practice that's a common place where ObjC programmers write parameter attributes today, but sure, we can conservatively start with only allowing declaration attributes after the r-paren.

Added the remaining ObjC, NS, and CF attributes. I think that's the last of the Apple-related attributes (I plan to continue to work on the other attributes as well, but in separate patches). @rjmccall, do you see any remaining issues?

rjmccall accepted this revision.Feb 11 2018, 11:08 PM

No, this LGTM.

This revision is now accepted and ready to land.Feb 11 2018, 11:08 PM
aaron.ballman closed this revision.Feb 12 2018, 5:40 AM

Thanks for the detailed review! I've commit in r324890.