This adds the skeleton of the YAML Compiler for APINotes. This change
only adds the YAML IO model for the API Notes along with a new testing
tool apinotes-test which can be used to verify that can round trip the
YAML content properly. It provides the basis for the future work which
will add a binary serialization and deserialization format to the data
model.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Seems like this patch handles already existing attributes in Clang, so this might not be affected by the concerns brought up here by James:
http://lists.llvm.org/pipermail/cfe-dev/2020-October/066996.html
clang/include/clang/APINotes/APINotesYAMLCompiler.h | ||
---|---|---|
17 | It would be useful to have more documentation here. | |
clang/include/clang/APINotes/Types.h | ||
2 | So we are mapping existing attributes here, I am missing this to be documented. Why do we support only these attributes? | |
26 | This seems a bit redundant to Attrs.td. I'd prefer to have the structure of an attribute defined only in one place. Why can't we directly reuse the types in the generated Attrs.inc? In this case class EnumExtensibilityAttr : public InheritableAttr { public: enum Kind { Closed, Open }; could perfectly fit, wouldn't it? | |
33 | Should this be rather SwiftNewTypeKind? Since swift_wrapper is deprecated according to the docs: https://clang.llvm.org/docs/AttributeReference.html#swift-newtype | |
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
241 | Why are these classes in a unnamed namespace? I'd expect them to be in the header under the clang::api_notes namespace, so users of the APINotesYAMLCompiler could use these as the structured form of the YAML content. Isn't that the point of the whole stuff? | |
440 | Hmm, why do we need "none"? Can't we interpret the non-existence as "none"? | |
622 | I think the stream (llvm::outs) should not be written in stone. And why not llvm::errs (we dump to errs usually) ? Could it be a parameter? | |
clang/test/APINotes/Inputs/Frameworks/Simple.framework/Headers/Simple.apinotes | ||
2 | I am pretty sure this does not provide a full coverage. What about e.g Functions? I'd like to see them tested as well. | |
clang/test/APINotes/yaml-roundtrip.test | ||
5 | Why do we have this difference? This seems odd and as a superfluous complexity. | |
clang/tools/apinotes-test/APINotesTest.cpp | ||
23 | auto -> std::string |
Thanks for the feedback!
clang/include/clang/APINotes/Types.h | ||
---|---|---|
2 | These are the ones that are currently needed and can be safely merged. I really wouldn't want to extend the support to all the attributes in the initial attempt to merge the functionality as it is something which is actually in production already. However, once the functionality is in a shared state, it is much easier to revise and co-evolve other consumers. | |
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
241 | There will be follow up types that provide structured access to the data. These types are strictly for the serialization to and from YAML via YAML::IO. | |
440 | At the very least we need it for compatibility - this is already a shipping feature. However, nullability is also not completely annotated. So, there is some benefit in tracking the explicit none vs missing. | |
622 | Sure, that is reasonable, I should be able to add a llvm::raw_ostream & parameter. | |
clang/test/APINotes/Inputs/Frameworks/Simple.framework/Headers/Simple.apinotes | ||
2 | Correct, it isn't meant to provide full coverage at all. It is merely meant to be enough to ensure that we can load the YAML and process it. The full coverage will come with the follow up patches as they will require filling out more of the infrastructure. I am trying to keep the patch at a reasonable size. I can add additional test cases if you feel strongly that they should be added now, but, I do worry about the size of the patch ballooning. | |
clang/test/APINotes/yaml-roundtrip.test | ||
5 | The difference is needed for compatibility. Richard wanted the fully spelt out names, but that requires backwards compatibility, so we need the difference here. |
as it is something which is actually in production already
...
At the very least we need it for compatibility - this is already a shipping feature.
I value that you guys have a prototype on a fork that is being used in production. But, upstreaming and these reviews give us the chance to evolve and to create an implementation that serves well the whole Clang community. I am having a hard time to accept "this is how it is implemented in our fork" as a technical argument. Besides, I am not sure how could the Clang community benefit about being backward compatible with a specialized fork and thus making superfluous complications. This feature could be really valuable, but I'd like to have it landed in a high quality form which serves the whole community (and the static analyzer developers). I think that a simple copy-paste of the fork will not do it.
clang/include/clang/APINotes/Types.h | ||
---|---|---|
2 | I understand that an initial implementation may not support all attributes. What's more, an evolved implementation may not do that either. Including Attrs.inc does not put any requirement to support all attributes, but we could reuse the types and enums there. This way we could avoid the need to mirror the types in Attrs.inc. | |
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
241 | Sounds good. Could you please describe in a nutshell which are the following steps for the upstreaming? I mean it could make the review easier if we could see a kind of road-map for the upcoming patches. | |
440 | Optional<EnumExtensibilityAttr::Kind> ? | |
clang/test/APINotes/yaml-roundtrip.test | ||
5 | I have concerns about making things more complicated just to be compatible with a downstream fork. |
Adding @NoQ as a reviewer, he might have some valuable comments, as he is very keen on this feature.
I'm not against evolving the implementation. However, compatibility with production code is important. I don't think that it is possible to ignore that, especially when the impact is non-trivial (it is not a small localized use). The large scale usage is valuable in terms of demonstrating the value of the functionality, so it is valuable to clang community. I am not suggesting just a copy from the Swift implementation, and do care about it being a high quality implementation. But that cannot come at the cost of compatibility with the existing usage.
clang/include/clang/APINotes/Types.h | ||
---|---|---|
2 | From that description, I think I am not necessarily following what you are suggesting. Could you be a bit more concrete in what you are suggesting in terms of the reuse? | |
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
241 | I am still digesting the complete implementation, which is extremely large. Currently, my thought is to split up it up with this change which only deals with the YAML aspect, then follow that up with the processing that is required:
And then the work to associate the attributes from the deserialized form to the AST. I am trying to split this up into smaller chunks so that it can be adequately reviewed rather than just be a dump of a feature and then have to rework it to be something that makes sense in the upstream tree. I also feel uncomfortable with the current version in Swift being merged as I can't seem to understand the testing aspect well enough, which is why I introduced the apinotes-test-tool so that we can be sure that the pieces can be tested. | |
440 | That representation could work, let me see if I can get YAML::IO to provide something which would be compatible. | |
clang/test/APINotes/yaml-roundtrip.test | ||
5 | The compatibility is extremely important unfortunately. We could go with just the terse version for the time being if you are strongly opposed to the two spellings. |
I am having a hard time to accept "this is how it is implemented in our fork" as a technical argument. Besides, I am not sure how could the Clang community benefit about being backward compatible with a specialized fork and thus making superfluous complications.
Being compatible with Apple's SDKs is quite important for some part of the community.
This is similar to Clang implementing GCC and MSVC attributes, for example. Some might call them specialized attributes -- they are non-standard and are certainly not used by every C++ user -- but there exists a sizeable population of downstream consumers that rely on them. Apple's SDKs are in the same class.
This feature could be really valuable, but I'd like to have it landed in a high quality form which serves the whole community (and the static analyzer developers). I think that a simple copy-paste of the fork will not do it.
I completely agree that the feature should be useful for static analysis. Do compatibility aliases hurt these goals?
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
---|---|---|
440 | The representation is already Optional<EnumExtensibilityAttr::Kind>, but we need to map the user providing none back to a llvm::None so this will still need to be listed as is. |
clang/include/clang/APINotes/Types.h | ||
---|---|---|
26 | The none-case here is not the same as missing - it tracks the explicitly audited case. I suppose we can change the internal enum case from None to Audited if you like. |
Fair enough. But I don't think that Clang developers just copied the implementation from GCC or from MSVC.
I accept to keep the interface of the feature (i.e. the YAML format) as it is in the fork, okay. But, I'd like to have maximum flexibility from the submitter by willing to change the details of the implementation. I don't think that the implementation in the fork is necessarily the best realization, we can do better, after all that's the purpose of this review, isn't it?
This feature could be really valuable, but I'd like to have it landed in a high quality form which serves the whole community (and the static analyzer developers). I think that a simple copy-paste of the fork will not do it.
I completely agree that the feature should be useful for static analysis. Do compatibility aliases hurt these goals?
No. But they make the tests and the whole system more complicated, IMHO.
clang/include/clang/APINotes/Types.h | ||
---|---|---|
2 | I was thinking about to simply This way the class clang::EnumExtensibilityAttr and many other attribute classes are immediately available for your use. And you don't have to define very similar classes here again (unless I misunderstand something). | |
26 | I am not sure I can follow. Could you please elaborate what is an "explicit y audited" case? | |
clang/test/APINotes/yaml-roundtrip.test | ||
5 | I'd rather stick with the terse version if possible (do we have to pursue rsmith about this?) If this is really a hassle then let's forget about it. |
Fair enough. But I don't think that Clang developers just copied the implementation from GCC or from MSVC.
No (nor we could copy the implementation for a number of reasons). However, we did have to be bug-for-bug compatible sometimes for compatibility.
I'd like to have maximum flexibility from the submitter by willing to change the details of the implementation
I don't think that is a fair expectation - there are plenty of times where this is technical disagreement on functionality, and it doesn't get immediately resolved. A high quality implementation doesn't require that everything be changed immediately, nor does it mean that everything must be completely flexible. There are trade-offs, and the goal is to strike a balance between simplicity and the needs of the project.
clang/include/clang/APINotes/Types.h | ||
---|---|---|
26 | There are three states consider: enum Unaudited { }; enum __attribute__((__enum_extensibility__(open))) Open { }; enum __attribute__((__enum_extensibility__(closed))) Closed { }; The optionality of the value indicates whether the value is present or not in the APINotes, not the tri-state nature of the attribute. |
Yes, absolutely, flexibility is needed from both sides (submitter and reviewer). It is our common interest to get an extensible implementation upstreamed. I hope I didn't push it too hard, I didn't mean any offence.
clang/include/clang/APINotes/Types.h | ||
---|---|---|
26 | Ok. Thanks for the explanation. But how could we describe then flag_enum? As a possible extension in the future (not in this patch). enum __attribute__((enum_extensibility(open),flag_enum)) OpenFlagEnum { D0 = 1 << 0, D1 = 1 << 1 }; Or here enum __attribute__((flag_enum)) Foo; should Foo have an Audited or None Kind here? If we were to add support to flag_enum would you create a new enum class for that here? Note that I am asking all these questions because I am planning to extend and add more attributes in the future once the whole APINotes stuff is landed. |
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
---|---|---|
33 | Could you please clang-format the code? The Lint: Pre-merge checks are all over the code and makes the review a bit harder. |
clang/include/clang/APINotes/Types.h | ||
---|---|---|
26 | The questions are reasonable - and we should understand what is missing and what can be done and what cannot be done. I think that the use of the metadata is extremely important. The question is, are there any users who care about recovering that information? IMO, we should try to represent the input as faithfully as possible, even if that means that we need to replicate the enumerations. However, if the consumers do not care about the inferred state vs the written state, I don't think that it is an immediate concern if we cannot recover that from the representation, we can simplify and extend as and when needed, especially when we have the feature merged (I think that it is easier once merged only because this is a large feature, and there is a large user base that is using this, which complicates things. Really, this functionality should have been merged a long time ago, but that is both my own opinion and past), and hopefully in many cases we can do it during review time. The tricky bit to remember that there is an additional state that is implicitly being added here, and optionality should reference the contents of the APINotes, not the state in the declaration that is being mutated. Is there a good place to write this down? I am willing to add comments explaining this so that the next person (ha, no, not the next person, but me, because I *will* forget) is not tripped up on this nuance. |
clang/lib/APINotes/APINotesYAMLCompiler.cpp | ||
---|---|---|
33 | Sure, though, I do think that some of the extra whitespace makes it easier to read. |
clang/include/clang/APINotes/Types.h | ||
---|---|---|
26 | Thanks for clarifying this. It was not clear for me that these types were to represent the YAML file contents. It makes much more sense now.
I think a file comment right after the license could do it. |
Looks good to me now! Thanks for addressing my concerns.
clang/tools/apinotes-test/APINotesTest.cpp | ||
---|---|---|
16 | We still have: clang-format: please reformat the code. (Mabe you already know, but if you use arc then all these formatting issues are handled automatically and it will upload a linted patch to Phabricator.) |
Thanks for the reviews! I'll reformat the rest of the code before merging. Im going to be working on merging this downstream first before merging this upstream, so it will take a couple of days still.
Looks like the test doesn't pass on Windows: http://45.33.8.238/win/27342/step_7.txt
Please take a look, and revert if it takes a while to fix.
Here's the output of the diff on that Win bot:
thakis@thakis6-w MINGW64 /c/src/llvm-project (merge) $ diff --strip-trailing-cr out/gn/obj/clang/test/APINotes/Output/yaml-roundtrip.test.tmp.result clang/test/APINotes/Inputs/Frameworks/Simple.framework/Headers/Simple.apinotes 1d0 < --- 8c7 < Nullability: Nonnull --- > Nullability: N 14c13 < Nullability: Optional --- > Nullability: O 20c19 < Nullability: Unspecified --- > Nullability: U 26c25 < Nullability: Unspecified --- > Nullability: S 29,30c28 < Nullability: Unspecified < ... --- > Nullability: Scalar
@thakis, thanks for the heads up, I've disabled the test on Windows - I should've disabled it on Windows in the first place.
It would be useful to have more documentation here.