This is an archive of the discontinued LLVM Phabricator instance.

Introduce an 'external_source_symbol' attribute that describes the origin and the nature of a declaration
ClosedPublic

Authored by arphaman on Feb 10 2017, 4:48 AM.

Details

Summary

This patch adds an external_source_symbol attribute to Clang. This attribute specifies that a declaration originates
from an external source and describes the nature of that source.

This attribute is useful for mixed-language projects or project that use auto-generated code. For instance, Xcode can use this attribute to provide a correct 'jump-to-definition' feature. For a concrete example, consider a protocol that's defined in a Swift file:

@objc public protocol SwiftProtocol {
  func method()
}

This protocol can be used from Objective-C code by including a header file that was generated by the Swift compiler. The declarations in that header can use the external_source_symbol attribute to make Clang aware of the fact that SwiftProtocol actually originates from a Swift module:

__attribute__((external_source_symbol(language=Swift,defined_in="module")))
@protocol SwiftProtocol
@required
- (void) method;
@end

Consequently, when 'jump-to-definition' is performed at a location that references SwiftProtocol, Xcode can jump to the original definition in the Swift source file rather than jumping to the Objective-C declaration in the auto-generated header file.

his attribute uses custom parsing to specify a varying number of clauses such as 'language', 'defined_in' or 'generated_declaration'. The 'language' clause specifies the source language from which the declaration originates. The 'defined_in' clause specifies the name of the source container in which the declaration was defined (the exact definition of source container is language-specific, e.g. Swift's source containers are modules, so 'defined_in' should specify the Swift module name). The 'generated_declaration' is a flag that specifies that whether this declaration was automatically generated by some tool or not.

Diff Detail

Repository
rL LLVM

Event Timeline

arphaman created this revision.Feb 10 2017, 4:48 AM
arphaman edited the summary of this revision. (Show Details)
arphaman edited the summary of this revision. (Show Details)
arphaman edited the summary of this revision. (Show Details)
aaron.ballman requested changes to this revision.Feb 10 2017, 11:27 AM
aaron.ballman added inline comments.
include/clang/Basic/Attr.td
532

Is this supported by GCC? If not, it should be under the clang namespace instead of the gnu namespace. If it is supported by GCC, then you should just use a single GCC spelling.

include/clang/Basic/AttrDocs.td
1007

Would this hold something like a file name? If so, I worry about conflicts between the comma separator and a file name -- you might want to pick a separator that can't be used in a file name (like | or :).

1015

Are these clauses parsed in a strict order? If so, you may want to mention that the order matters (or doesn't).

Also, the code in SemaDeclAttr.cpp implies that some of these are optional. It should be made clear which (if any) arguments are optional.

include/clang/Basic/DiagnosticParseKinds.td
863

I would drop the e.g. and instead describe what's really expected: an identifier. The e.g. muddies the water because it suggests there's a list of supported languages, and it shows something that looks kind of like a string when it isn't one. Something like: expected an identifier representing the source language or some such?

lib/Parse/ParseDecl.cpp
1141

You may want to assert this.

lib/Sema/SemaDeclAttr.cpp
2414

You should also diagnose if there's more than 3 arguments, no?

2418

This isn't really the right diagnostic for a mismatched attribute subject. It should be using warn_attribute_wrong_decl_type instead so that the user is more clear on why the attribute is ignored.

2428

Rather than rely on the implicit conversion to bool, I think this is a case where it should be tested against nullptr explicitly.

test/Parser/attr-external-source-symbol-cxx11.cpp
7

Please put this directly below the RUN line.

test/Parser/attr-external-source-symbol.m
27

I think you can currently get away with writing external_source_symbol(generated_declaration, generated_declaration, generated_declaration, defined_in="module")) and other odd combinations.

There should be additional parser tests for totally crazy parsing scenarios. You may want to consider running a fuzzer to generate some of them.

test/Sema/attr-external-source-symbol.c
6

Should also have a sema test for when there are 2 args, and 4 args.

There should also be a test under Misc that also checks that the args are properly lowered into the AST in the correct way with differing argument orders. e.g, external_source_symbol(generated_declaration, language=Swift, defined_in="module") vs external_source_symbol(language=Swift, generated_declaration, defined_in="module") vs external_source_symbol(defined_in="module", language=Swift, generated_declaration), etc.

This revision now requires changes to proceed.Feb 10 2017, 11:27 AM
arphaman updated this revision to Diff 88222.Feb 13 2017, 10:25 AM
arphaman edited edge metadata.
arphaman marked 9 inline comments as done.

Thanks for the feedback! I've addressed the majority of your comments.

include/clang/Basic/AttrDocs.td
1007

It could potentially include a filename, yes.
I don't quite follow your concerns though.. If a comma is in a string literal then it's wrapped in quotes. Wouldn't that be enough to distinguish the comma separator token from the comma in a filename?

lib/Sema/SemaDeclAttr.cpp
2414

I think an assert would be more appropriate since I only use up to 3 arguments when creating the attribute, so I wouldn't be able to test the diagnostic.

aaron.ballman added inline comments.Feb 21 2017, 5:53 AM
include/clang/Basic/AttrDocs.td
1005

This being an identifier makes me wonder about languages that aren't a single token. For instance, how do you specify Objective-C or Objective-C++? What about C++? Visual Basic .NET?

Perhaps this should also be a string literal.

1007

You're correct, I had a brain fart. :-)

include/clang/Basic/DiagnosticSemaKinds.td
2713

I'm not too keen on this entry, because I'm not certain how many users really know what a "named declaration" is ("don't all declarations have names?" is a valid question for this to raise). However, I don't know of a better term to use, so it may be fine.

include/clang/Parse/Parser.h
146

Can we handle identifiers that have spaces in the name? I'm thinking about the case where Ident_defined_in is something like "My Long File Name With Spaces On Windows.cpp"

We should have something like that as a test to make sure it's handled properly. Same with odd values for Ident_language, like 'c++' or 'Objective-C', etc.

lib/Parse/ParseDecl.cpp
1090

expectAndConsume() instead of manually diagnosing?

1160

As an interesting case: if the attribute has two defined_in arguments and the first one is invalid, they will not get a diagnostic about the second one being a duplicate. e.g., __attribute__((external_source_symbol(defined_in = "1234"udl, defined_in = "1234")))

lib/Parse/ParseDeclCXX.cpp
3818–3819

I don't really like hard-coding a list of attributes like this. I think you should handle clang-namespaced attributes with a separate helper function.

lib/Sema/SemaDeclAttr.cpp
2414

An assert is fine by me, but please add a string literal after the assert to explain why it fails.

2420

This is incorrect -- you need to update the enumeration at the end of AttributeList.h and use an enumerator rather than a magic literal value.

test/Parser/attr-external-source-symbol.m
27

The test is still missing the weird parsing situations (like failing to use an = for arguments, etc). Basically, you should look at your custom parsing code and make sure each case you want to diagnose is actually being diagnosed (so we don't accidentally regress if we ever refactor the parsing code).

arphaman updated this revision to Diff 89298.Feb 21 2017, 4:17 PM
arphaman marked 8 inline comments as done.

I've addressed Aaron's comments and made the language a string literal.

include/clang/Basic/AttrDocs.td
1005

Good point, I've changed it to a string literal.

aaron.ballman added inline comments.Feb 27 2017, 3:44 PM
lib/Parse/ParseDecl.cpp
1161

Add a string literal to the assert?

lib/Parse/ParseDeclCXX.cpp
3818–3819

This still isn't quite what I was looking for -- the helper function I was talking about was a replacement for ParseGNUAttributeArgs(). I'm sorry if I was unclear.

I think we should have a ParseClangAttributeArgs() that handles your custom parsing; it can then call through to ParseAttributeArgsCommon() for common argument handling in the same way ParseGNUAttributeArgs() does. So the predicate would be else if (ScopeName && ScopeName->getName() == "clang").

arphaman updated this revision to Diff 90042.Feb 28 2017, 8:31 AM
arphaman marked 2 inline comments as done.

Use ParseClangAttributeArgs and add a string to an assert.

lib/Parse/ParseDeclCXX.cpp
3818–3819

I see, I added ParseClangAttributeArgs in the updated patch.

aaron.ballman accepted this revision.Feb 28 2017, 11:54 AM

LGTM, despite a long-ish comment.

lib/Parse/ParseDeclCXX.cpp
3828–3835

This code is equivalent to the old code, which is good. However, it points out what I think may be a latent bug -- for attributes in the clang namespace that are also standard attributes, I'm not certain we should be generating parse errors for the clang-namespaced spelling. This only impacts [[fallthrough]] vs [[clang::fallthrough]], according to the implementation of IsBuiltInOrStandardCXX11Attribute().

All of that is to say: I'm not certain we need to hoist NumArgs to get to the call to IsBuiltInOrStandardCXX11Attribute() below, because attributes in the clang namespace are not built-in or standard attributes.

This requires a bit further exploration, but I won't have time to do that for a few weeks. I don't want to hold your patch up for that, so I think this is fine for now -- I can modify the code later when I have the chance to finish my exploration.

This revision is now accepted and ready to land.Feb 28 2017, 11:54 AM
arphaman added inline comments.Mar 1 2017, 10:17 AM
lib/Parse/ParseDeclCXX.cpp
3828–3835

Thanks!

I agree with the comment about NumArgs, I wasn't sure if we need it for clang:: attributes as well.

This revision was automatically updated to reflect the committed changes.