This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Add the `annotate_type` attribute
ClosedPublic

Authored by mboehme on Oct 11 2021, 7:13 AM.

Details

Summary

This is an analog to the annotate attribute but for types. The intent is to allow adding arbitrary annotations to types for use in static analysis tools.

For details, see this RFC:

https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2/61378

Diff Detail

Event Timeline

mboehme created this revision.Oct 11 2021, 7:13 AM
mboehme requested review of this revision.Oct 11 2021, 7:13 AM
mboehme updated this revision to Diff 419642.Mar 31 2022, 11:52 PM

Uploading newest version.

Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 11:52 PM
jkorous added a subscriber: jkorous.Apr 4 2022, 1:00 PM
mboehme updated this revision to Diff 421159.Apr 7 2022, 4:41 AM

Add documentation and a unit test.

mboehme edited the summary of this revision. (Show Details)Apr 7 2022, 4:44 AM
xazax.hun added inline comments.Apr 7 2022, 9:00 AM
clang/include/clang/Basic/AttrDocs.td
6427

Do we want to mention that it is not actually part of the type system (i.e., annotated and unannotated types are considered the same)?

6433

Missing closing paren for the second annotate.

clang/lib/AST/TypePrinter.cpp
1711

Would it make sense to print something without the arguments? I wonder which behavior would be less confusing.

mboehme updated this revision to Diff 421506.Apr 8 2022, 6:03 AM

Documentation: Added discussion of type system implications and fixed syntax
error in example.

mboehme marked 2 inline comments as done.Apr 8 2022, 6:13 AM
mboehme added inline comments.
clang/include/clang/Basic/AttrDocs.td
6427

Good point -- I've added some relevant parts from the RFC below.

6433

Oops -- thanks for catching. Fixed.

clang/lib/AST/TypePrinter.cpp
1711

TBH I'm not sure. Is TypePrinter used only to produce debug output or is it required that the output of TypePrinter will parse again correctly?

The reason I'm asking is because annotate_type has a mandatory first argument; if we need the output to be parseable, we would have to print some dummy string, e.g. [[clang::annotate_type("arguments_omitted")]]. This seems a bit strange, but maybe it's still worth doing. OTOH, if the output is used only for debugging, I guess we could print something like [[clang::annotate_type(...)]], which doesn't parse but by that very nature makes it clear that we don't have all the information here.

I guess both of these options could have limited use -- they would at least make it clear that there is an annotation on the type, though without the arguments, we can't know what the actual annotation is.

Would appreciate some more input on the wider context here.

aaron.ballman added inline comments.Apr 8 2022, 7:19 AM
clang/include/clang/Basic/AttrDocs.td
6449–6451

It seems to me that because we found reasons for annotate to be lowered to LLVM IR, we'll find reasons to lower annotate_type eventually. I worry that if we document this, we'll never be able to modify the attribute to output information to LLVM IR because people will rely on it not doing so and be broken by a change. Are we sure we want to commit to this as part of the design?

clang/lib/AST/TypePrinter.cpp
1711

Yeah, the trouble is that you need a TypeLoc in order to get back to the actual attribute information that stores the arguments. But the type printer is printing types not specific instances of types at the location they're used.

The goal with the pretty printers is to print something back that the user actually wrote, but it's not always possible and so this is sort of a best-effort.

clang/lib/Sema/SemaAttr.cpp
396–408

This seems an awful lot like doing DefaultFunctionArrayLValueConversion() here -- can you call that to do the heavy lifting?

Oh, I see we're just shuffling code around. Feel free to ignore or make as an NFC change if you'd prefer.

clang/lib/Sema/SemaType.cpp
8148

Just to avoid re-using a type name as a declaration.

8163–8165
clang/test/SemaCXX/annotate-type.cpp
3

Can you also add some test coverage for applying the attribute to a declaration instead of a type or not giving it any arguments? Also, should test arguments which are not a constant expression.

clang/unittests/AST/AttrTest.cpp
41

Rather than doing the AST tests this way, you could add an -ast-dump to clang\test\AST and test the AST more directly. Coverage is roughly the same, but the AST test may be easier to read.

Other AST things to test: more complicated types that have the attribute to ensure it shows up in the correct places and can be written on deduced types.

int [[attr]] * [[attr]] ident[10] [[attr]];
void (* [[attr]] fp)(void);
__auto_type [[attr]] oooh = 12; // in C mode

struct S { int mem; };
int [[attr]] S::* [[attr]] oye = &S::mem; // in C++ mode

Also, you should add a release note about the new attribute and check the Precommit CI pipeline failure out:

Failed Tests (1):

Clang :: CodeGenCXX/annotate-type.cpp
mboehme updated this revision to Diff 421930.Apr 11 2022, 7:54 AM
mboehme marked 2 inline comments as done.

Various changes in response to review comments.

mboehme marked 4 inline comments as done.EditedApr 11 2022, 8:10 AM

Also, you should add a release note about the new attribute

Done.

and check the Precommit CI pipeline failure out:

Failed Tests (1):

Clang :: CodeGenCXX/annotate-type.cpp

Fixed.

clang/include/clang/Basic/AttrDocs.td
6449–6451

Good point. I've done what I think you're implying, namely that we should remove the comments saying that the attribute won't have any any effect on the LLVM IR?

clang/lib/AST/TypePrinter.cpp
1711

So what's your position -- should I not print the attribute at all (which is what I'm doing right now) or should I print it as something like [[clang::annotate_type(...)]]?

clang/lib/Sema/SemaAttr.cpp
396–408

By "NFC change", you mean I could submit this separately as an NFC change?

Since this change is already part of this patch, I think I'd prefer to just keep it in here, if that's OK?

clang/test/SemaCXX/annotate-type.cpp
3

I've added tests as you suggested, though I put most of them in Sema/annotate-type.c, as they're not specific to C++.

Thanks for prompting me to do this -- the tests caused me to notice and fix a number of bugs.

I haven't managed to diagnose the case when the attribute appears at the beginning of the declaration. It seems to me that, at the point where I've added the check, this case is indistinguishable from an attribute that appears on the type. In both cases, the TAL is TAL_DeclSpec, and the attribute is contained in DeclSpec::getAttributes(). This is because Parser::ParseSimpleDeclaration forwards the declaration attributes to the DeclSpec:

https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseDecl.cpp#L1851

I believe if I wanted to prohibit this case, I would need to add a check to Parser::ParseStatementOrDeclaration and prohibit any occurrences of annotate_type there. However, this seems the wrong place to do this because it doesn't contain any specific processing for other attributes.

I've noticed that Clang also doesn't prohibit other type attributes (even ones with C++ 11 syntax) from being applied to declarations. For example: https://godbolt.org/z/Yj1zWY7nn

How do you think I should proceed here? I think the underlying issue is that Clang doesn't always distinguish cleanly between declaration attributes and type attributes, and fixing this might be nontrivial.

clang/unittests/AST/AttrTest.cpp
41

Rather than doing the AST tests this way, you could add an -ast-dump to clang\test\AST and test the AST more directly.

I think a problem with this might be that TypePrinter can't print the arguments of the attribute, and I want to test those?

Other AST things to test: more complicated types that have the attribute to ensure it shows up in the correct places and can be written on deduced types.

Done. I've also introduced a test helper AssertAnnotatedAs to keep the tests as compact as I can.

aaron.ballman added inline comments.Apr 18 2022, 9:07 AM
clang/include/clang/Basic/AttrDocs.td
6449–6451

Heh, yes, you interpreted my concerns correctly, thank you!

clang/lib/AST/TypePrinter.cpp
1711

I think we need to print *something* here because the type printer is used by QualType::getAsStringInternal() and QualType::print() which are used to produce diagnostics when giving a QualType argument. So I would print [[clang::annotate]] as a stop-gap with a FIXME comment that it would be nice to print the arguments here some day.

clang/lib/Parse/ParseDecl.cpp
3183

I think this is a FIXME situation -- our policy is to diagnose attributes that are being ignored, and it seems pretty important to (someday, not part of this patch) diagnose these. Especially the lifetime bound attribute given that it's intended to help harden code.

clang/lib/Sema/SemaAttr.cpp
396–408

By "NFC change", you mean I could submit this separately as an NFC change?

Yup!

Since this change is already part of this patch, I think I'd prefer to just keep it in here, if that's OK?

That's fine by me. If you intend to keep it as it is though, you should probably add a FIXME comment about potentially using the other function instead of doing this the hard way.

clang/lib/Sema/SemaType.cpp
8152

Wha?

Do we really get into ProcessDeclAttribute() for something like:

template <typename Ty> void func();
void foo() {
  func<int [[clang::annotate_type()]]>();
}

(My intuition was that this needs to be diagnosed here -- see HandleVectorSizeAttr() -- but I could be remembering type attribute processing somewhat wrong.)

clang/test/Sema/annotate-type.c
24

I'd also like to see an example like:

int *[[clang::annotate_type("second arg is a problem", int)]] zN;

to demonstrate that we don't hit the assert added in HandleAnnotateTypeAttr().

clang/test/SemaCXX/annotate-type.cpp
3

How do you think I should proceed here? I think the underlying issue is that Clang doesn't always distinguish cleanly between declaration attributes and type attributes, and fixing this might be nontrivial.

This is a general issue with attribute processing. I would imagine that SemaDeclAttr.cpp should be able to diagnose that case when the attribute only applies to types and not declarations, but it'd take some investigation for me to be sure.

Because this issue isn't new to your situation, I'd recommend filing an issue about the general problem and then we can solve that later.

34

Another thing we should probably test is parameter pack expansion that's supported by the declaration version of this. e.g.,

template <int... Is> void variadic_nttp() {
  [[clang::annotate("D", Is...)]] int val1;    // This works today
  int [[clang::annotate_type(D, Is...)]] val2; // I'd expect this to work as well
}

int main() {
  variadic_nttp<1, 2, 3>();
}

If it doesn't work yet, I think it's fine to do the implementation work in a follow-up, but we should document the difference so user's aren't caught by surprise.

clang/unittests/AST/AttrTest.cpp
41

I think a problem with this might be that TypePrinter can't print the arguments of the attribute, and I want to test those?

That's a good point. :-/

xbolva00 added inline comments.
clang/test/Sema/annotate-type.c
7

Is it possible to typedef int [[clang::annotate_type("bar")]] mymagicint_t ?

mboehme updated this revision to Diff 423856.Apr 20 2022, 2:56 AM
mboehme marked 2 inline comments as done.

Changes in response to review comments.

mboehme marked 5 inline comments as done.Apr 20 2022, 4:32 AM
mboehme added inline comments.
clang/lib/AST/TypePrinter.cpp
1711

Makes sense -- done!

clang/lib/Parse/ParseDecl.cpp
3183

Maybe I should clarify what's happening here: We _do_ produce diagnostics (errors) for AT_LifetimeBound and AT_AnyX86NoCfCheck, both before and after this patch. (For these attributes, the code falls through to the Diag(PA.getLoc(), diag::err_attribute_not_type_attr) call below.)

Before this patch, we would reject (produce errors for) _all_ C++11 attributes here. Now, we only reject non-type attributes, and in addition, we also reject AT_LifetimeBound and AT_AnyX86NoCfCheck (even though they are type attributes) because we historically haven't allowed them to be used in this way. There are tests for this behavior, so it seemed important to preserve it.

clang/lib/Sema/SemaAttr.cpp
396–408

That's fine by me. If you intend to keep it as it is though, you should probably add a FIXME comment about potentially using the other function instead of doing this the hard way.

Makes sense -- I've done the latter.

clang/lib/Sema/SemaType.cpp
8152

Ah -- true, we don't get into ProcessDeclAttribute() for the example you gave. (I have added this example to the tests.)

So I've realized that I do need to emit the diagnostic here to make sure that I emit it in all cases (and not just those seen by ProcessDeclAttribute(). To prevent ProcessDeclAttribute() from emitting a duplicate diagnostic in the cases where it sees the attribute, I've set the HasCustomParsing flag for the attribute. This seems appropriate anyhow.

clang/test/Sema/annotate-type.c
24

Done (below).

I've also added another test case where the second argument is an identifier, not a keyword.

mboehme added inline comments.Apr 20 2022, 4:32 AM
clang/test/Sema/annotate-type.c
7

Yes -- I've added a test.

clang/test/SemaCXX/annotate-type.cpp
3

I've done some more investigation myself, and I think I've come up with a solution; actually, two potential solutions. I have draft patches for both of these; I wanted to run these by you here first, so I haven't opened a bug yet.

I'd appreciate your input on how you'd prefer me to proceed with this. I do think it makes sense to do this work now because otherwise, people will start putting annotate_type in places where it doesn't belong, and then we'll have to keep supporting that in the future.

My preferred solution

https://reviews.llvm.org/D124081

This adds a function DiagnoseCXX11NonDeclAttrs() and calls it in all places where we should reject attributes that can't be put on declarations. (As part of this work, I noticed that gsl::suppress should be a DeclOrStmtAttr, not just a StmtAttr.)

Advantages:

  • Not very invasive.

Disadvantages:

  • Need to make sure we call DiagnoseCXX11NonDeclAttrs() everywhere that non-declaration attributes should be rejected. (Not sure if I've identified all of those places yet?)

Alternative solution

https://reviews.llvm.org/D124083

This adds an appertainsTo parameter to ParseCXX11Attributes() and other "parse attributes" functions that call it. This parameter specifies whether the attributes in the place where they're currently being parsed appertain to a declaration, statement or type. If ParseCXX11Attributes() encounters an attribute that is not compatible with appertainsTo, it outputs a diagnostic.

Advantages:

  • Every call to ParseCXX11Attributes() _has_ to specify appertainsTo -- so there's no risk of missing some case where we have to output diagnostics

Disadvantages:

  • This change is _much_ more invasive.
  • It's not always clear what value to specify for appertainsTo. The poster child for this is Parser::ParseStatementOrDeclaration(): As the name says, we don't yet know whether we're parsing a statement or a declaration. As a result, we need to be able to specify AppertainsToUnknown and would need to add additional logic deeper in the call stack to produce diagnostics once we know for sure whether we're dealing with a statement or declaration. (This isn't yet implemented.) This makes the solution less elegant that it initially seems.
  • There's a tension between this new functionality and the existing ParsedAttr::diagnoseAppertainsTo(). However, we unfortunately can't extend the existing ParsedAttr::diagnoseAppertainsTo() because it is called from ProcessDeclAttribute() and (despite the name) that function is also processes some legitimate type attributes.
34

I've added a test (and set the AcceptsExprPack flag so the attribute does actually accept parameter packs, just like annotate).

aaron.ballman added inline comments.Apr 28 2022, 5:19 AM
clang/lib/Parse/ParseDecl.cpp
3183

Oh derp, that's my mistake, I read the != logic wrong. Thank you for the clarification!

clang/test/SemaCXX/annotate-type.cpp
3

I do think it makes sense to do this work now because otherwise, people will start putting annotate_type in places where it doesn't belong, and then we'll have to keep supporting that in the future.

Thank you for working on this now, I agree that'd be good to avoid.

My concern with both of those approaches is that they impact *parsing* rather than *semantics* and that seems wrong. As far as the parser is concerned, attributes really only exist as one particular form of syntax or another, but it shouldn't care about whether an attribute is allowed to appertain to a particular thing or not. e.g., an attribute list can either appear somewhere in the syntax or it can't, but what particular attributes are specified do not matter at parsing time. Parsing determine what the attribute is associated with (a declaration, a type, a statement, etc). After we've finished parsing the attributes in the list, we convert the attributes into their semantic form and that's the time at which we care about whether the attribute may be written on the thing it appertains to.

I think we handle all of the parsing correctly today already, but the issue is on the semantic side of things. An attribute at the start of a line can only be one of two things: a statement attribute or a declaration attribute, it can never be a type attribute. So what I think should happen is that ProcessDeclAttribute() needs to get smarter when it's given an attribute that can only appertain to a type. I don't think we should have to touch the parser at all, especially because you can specify multiple attributes in one list, as in:

void func() {
  [[decl_attr, type_attr]] int foo;
}

and we'd want to diagnose type_attr but still allow decl_attr.

mboehme marked 5 inline comments as done.Apr 28 2022, 8:14 AM
mboehme added inline comments.
clang/test/SemaCXX/annotate-type.cpp
3

I think we handle all of the parsing correctly today already

I'm not sure about this. For example, take a look at Parser::ParseSimpleDeclaration, which does this at the end:

DS.takeAttributesFrom(Attrs);

Here, DS is a ParsingDeclSpec, and Attrs are the attributes that were seen on the declaration (I believe).

If my understanding of the code is correct (which it very well may not be), this is putting the attributes that were seen on the declaration in the same place (DeclSpec::Attrs) that will later receive the attributes seen on the decl-specifier-seq. Once we've put them there, it is no longer possible to tell whether those attributes were seen on the declaration (where type attributes are not allowed) or on the decl-specifier-seq (where type attributes _are_ allowed).

(This is why https://reviews.llvm.org/D124081 adds a DiagnoseCXX11NonDeclAttrs(Attrs) just before the DS.takeAttributesFrom(Attrs) call -- because this is the last point at which we haven't lumped declaration and decl-specifier-seq attributes together.)

ProcessDeclAttribute(), when called through Sema::ProcessDeclAttributes() (which I think is the relevant call stack), only sees attributes after they have been put in DeclSpec::Attrs. (It therefore sees, and ignores, all of the type attributes that were put on the decl-specifier-seq.) So I think it isn't possible to tell there whether we put a type attribute on a declaration.

It might be possible to fix this by giving DeclSpec two lists of attributes, one containing the attributes for the declaration, and another containing the attributes for the decl-specifier-seq itself. However, this is a pretty invasive change, as everything that consumes attributes from the DeclSpec potentially needs to be changed (I tried and shied away from it).

Am I misunderstanding something about the code? If not, what are your thoughts on how best to proceed?

aaron.ballman added inline comments.Apr 28 2022, 8:39 AM
clang/test/SemaCXX/annotate-type.cpp
3

Am I misunderstanding something about the code? If not, what are your thoughts on how best to proceed?

I think your understanding is correct and I was imprecise. I meant that we generally parse the [[]] in all the expected locations in the parser. However, I think you're correct that the bits which determine what chunks in a declaration may need to be updated. That code largely exists because GNU-style attributes "slide" sometimes to other chunks, but that's not the case for [[]]-style attributes. It sounds like we are likely lacking expressivity for the [[]] type attributes in the hookup between what we parsed and when we form the semantic attribute because we need to convert the parsed type into a real type.

mboehme added inline comments.Apr 29 2022, 6:26 AM
clang/test/SemaCXX/annotate-type.cpp
3

It sounds like we are likely lacking expressivity for the [[]] type attributes in the hookup between what we parsed and when we form the semantic attribute because we need to convert the parsed type into a real type.

OK, good, sounds like we're on the same page!

I'll look at what we can do to retain the information of whether an attribute was added to the declaration or the decl-specifier-seq, so that Sema can then issue the correct diagnostics. I'll look at changing DeclSpec so that it retains not one, but two lists of attributes, for the declaration and the decl-specifier-seq respectively. There are likely to be some devils in the details -- DeclSpec allows direct write access to its Attrs member variable through the non-const version of getAttributes(), and there are quite a few places in the code that call this. Does this seem like the right direction to you in principle though?

Apart from all of this, are there any outstanding issues that you see with this change? It does of course make sense to prohibit type attributes on declarations before I land this change, but I wanted to check whether there's anything else that would need to happen here.

aaron.ballman added subscribers: rsmith, erichkeane.

Adding in @erichkeane just in case he spots anything I've missed.

clang/test/SemaCXX/annotate-type.cpp
3

Does this seem like the right direction to you in principle though?

Maybe. In terms of the standards concepts, a decl-specifier-seq can have attributes associated with it (https://eel.is/c++draft/dcl.dcl#dcl.spec.general-1.sentence-2) and so from that perspective, a DeclSpec seems like it should have only one list of attributes (those for the entire sequence of declaration specifiers). However, GNU-style attributes "slide" around to other things, so it's possible that the behavior in ParseSimpleDeclaration() is correct for that particular syntax but incorrect for [[]] style attributes. So, depending on which devilish details are learned while working on this, it may turn out that the best approach is a secondary list, or it may turn out that we want to add a new takeAll variant that takes only the correct attributes while leaving others. (It may be useful to summon @rsmith for opinions on the right approach to take here.)

Apart from all of this, are there any outstanding issues that you see with this change? It does of course make sense to prohibit type attributes on declarations before I land this change, but I wanted to check whether there's anything else that would need to happen here.

Nothing else is jumping out at me as missing or needing additional work aside from some formatting nits. I'm going to add another reviewer just to make sure I've not missed something, but I think this is about ready to go.

clang/unittests/AST/AttrTest.cpp
89

The formatting looks a bit off for this in terms of indentation.

159

Formatting looks a bit off here in terms of the indentation for this.

I don't really know how useful this ends up being, these attributes (since they are part of AttributedType end up disappearing pretty quickly/easily. Anything that causes canonicalization will cause these to go away, they don't apply to references to these types, etc. But I otherwise don't have concerns.

HOWEVER, have you implemented the lifetime static analysis that you're wanting to do on top of this already? If you haven't, you might just find that your implementation/idea here won't work, and that you've already committed sufficiently to a design that doesn't work for it.

clang/lib/AST/TypePrinter.cpp
1714

My opinion (though not attached to it) would be clang::annotate_type(...) or, clang::annotate_type(<unavailable>) or something like that.

xbolva00 added a comment.EditedApr 29 2022, 12:54 PM

The intent is to allow adding arbitrary annotations to types for use in static analysis tools

This patch should not land until we see some real use cases to justify new hundreds of lines of code. We should more careful and take maintenance cost really into account and not merge every experimental cool research extensions automatically.

Please answer points in “ Contributing Extensions to Clang “
https://clang.llvm.org/get_involved.html

mboehme updated this revision to Diff 426977.May 4 2022, 6:00 AM

Rebased to a more recent base change.

mboehme added inline comments.May 4 2022, 6:19 AM
clang/test/SemaCXX/annotate-type.cpp
3

Thanks for the input!

I've looked at this more closely, and I've concluded that adding a second list of attributes to the DeclSpec isn't really the right way to go. For one thing, as you say, the attributes we're discussing don't really belong on the decl-specifier-seq.

But I _have_ come up with an alternative solution that I like better than the alternatives I had considered so far and that, as you suggest, emits the diagnostics from Sema, not the Parser. The solution is not just more concise than the other options I had explored but also just feels like the right way of doing things.

Taking a look at the standard, attributes in the location in question (at the beginning of a simple-declaration) have the same semantics as attributes that appear after a declarator-id, and we already have a location to store the latter, namely Declarator::Attr. It therefore seems the right thing to do also put attributes from the simple-declaration in Declarator::Attr. We can then take advantage of the fact that there is existing code already in place to deal with attributes in this location.

The other part of the solution is to make ProcessDeclAttribute warn about C++11 attributes that don’t “slide” to the DeclSpec (except, of course, if ProcessDeclAttribute is actually being called for attributes on a DeclSpec or Declarator).

Here’s a draft change that implements this:

https://reviews.llvm.org/D124919

(The change description contains more details on the attribute semantics from the standard that I alluded to above.)

The change still contains a few TODOs, but all of the hard bits are done. Specifically, all of the tests pass too. I feel pretty good about this change, but before I work on it further to get it ready for review, I wanted to get your reaction: Do you agree that this is the right general direction?

mboehme updated this revision to Diff 426990.May 4 2022, 6:49 AM

Changes in response to review comments.

mboehme marked 2 inline comments as done.May 4 2022, 6:52 AM
mboehme added inline comments.
clang/unittests/AST/AttrTest.cpp
89

Fixed.

159

Fixed.

mboehme marked 3 inline comments as done.May 4 2022, 7:04 AM

I don't really know how useful this ends up being, these attributes (since they are part of AttributedType end up disappearing pretty quickly/easily. Anything that causes canonicalization will cause these to go away,

This is true, but I also think we don't have much choice in the matter if we don't want these attributes to affect the C++ type system. We discuss this at length in the RFC:

https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2/61378#rationale-5

I go into more depth in this comment:

https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2/61378/4?u=martinboehme

However, it is possible for a static analysis to propagate the annotations through typedefs, template arguments and such, and we do so in the lifetime static analysis (see also below).

they don't apply to references to these types, etc.

Not sure what you mean here -- do you mean that the annotations don't propagate through typedefs? As noted, it's possible for a static analysis tool to perform this propagation, and the discussion here is again relevant:

https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2/61378/4?u=martinboehme

But I otherwise don't have concerns.

HOWEVER, have you implemented the lifetime static analysis that you're wanting to do on top of this already

Yes, we have. We are able to propagate lifetime annotations through non-trivial C++ constructs, including template arguments. Based on what our current prototype can do, we're confident these annotations will do everything we need them to. We also believe they are general enough to be useful for other types of static analysis.

clang/lib/AST/TypePrinter.cpp
1714

Good idea -- this makes it more obvious that the arguments were omitted. Changed!

mboehme marked an inline comment as done.May 4 2022, 7:16 AM

This patch should not land until we see some real use cases to justify new hundreds of lines of code. We should more careful and take maintenance cost really into account and not merge every experimental cool research extensions automatically.

I assume you've seen the RFC for the primary use case (Rust-style lifetime annotations for C++) that motivates this change?

There's a bit of a chicken-and-egg problem here: We obviously can't submit code for the lifetime analysis tooling before the type annotations it needs are supported. We're conscious that it's not desirable to add features merely for the purpose of an experimental tool that may never be stabilized. This is why we've consciously steered away from adding attributes that are specific to our use case and are instead proposing a general-purpose type annotation attribute.

Please answer points in “ Contributing Extensions to Clang “
https://clang.llvm.org/get_involved.html

This makes sense. I'll add an appendix answering these points to the RFC for this change. I probably won't get round to doing this today, but I did want to let you know that I've seen your comment. I'll ping this comment again when I've added the appendix to the RFC.

aaron.ballman added inline comments.
clang/test/SemaCXX/annotate-type.cpp
3

The change still contains a few TODOs, but all of the hard bits are done. Specifically, all of the tests pass too. I feel pretty good about this change, but before I work on it further to get it ready for review, I wanted to get your reaction: Do you agree that this is the right general direction?

Thank you for your continued work on this, I *really* appreciate it! I think your newest approach is the one I'm happiest with, though there are still parts of it I hope we can find ways to improve. My biggest concern is with having to sprinkle ExtractDefiniteDeclAttrs() in the "correct" places, as that's likely to be fragile. If there was a way we only had to do that once instead of 8 times, I'd be much happier. My worry is mostly with someone forgetting to call it when they should or thinking they need to call it when modeling new code on existing code.

Aside from that, I think the approach you've got is quite reasonable, though I'd be curious if @rsmith or @rjmccall have additional thoughts or concerns that I might not be seeing yet.

mboehme added inline comments.May 5 2022, 5:27 AM
clang/test/SemaCXX/annotate-type.cpp
3

My biggest concern is with having to sprinkle ExtractDefiniteDeclAttrs() in the "correct" places, as that's likely to be fragile. If there was a way we only had to do that once instead of 8 times, I'd be much happier. My worry is mostly with someone forgetting to call it when they should or thinking they need to call it when modeling new code on existing code.

Agree. I would also prefer it if we could only do this in once place -- but I can't really come up with a good way to do this.

The underlying reason for why we're doing this in multiple places is that there are multiple places in the C++ grammar where an attribute-specifier-seq can occur that appertains to a declaration (e.g. in a simple-declaration, parameter-declaration, exception-declaration, and so on). These are parsed in different places in the code, and so we also need to add ExtractDefiniteDeclAttrs() in those different places. If a future change to the language adds another type of declaration that can contain an attribute-specifier-seq, we'll have to add a new ExtractDefiniteDeclAttrs() there as well -- but that's really just a part of saying that we need to write code that parses that entire type of declaration correctly.

Unfortunately, I don't really see what we could do to make sure people don't get this wrong. The motivation behind my previous attempt in https://reviews.llvm.org/D124083 was to try and make errors impossible by making callers of ParseCXX11Attributes (and related functions) specify what entity (type, statement, or declaration) the attributes appertain to in the current position. Unfortunately, this can't be done consistently, as we don't actually always know what entity the attributes should appertain to (e.g. in Parser::ParseStatementOrDeclaration), and the approach has other downsides, as we've discussed before.

Aside from that, I think the approach you've got is quite reasonable, though I'd be curious if @rsmith or @rjmccall have additional thoughts or concerns that I might not be seeing yet.

I'd like to hear their input too! Besides your mentioning them here, is there anything else we need to do to make sure they see this? (Should I reach out to @rsmith internally?)

By the way, https://reviews.llvm.org/D124919 currently depends on this change (https://reviews.llvm.org/D111548), so that we can show the effect that it has on the tests for annotate_type, but if we agree that https://reviews.llvm.org/D124919 is what we want to proceed with, I would plan to submit it first so that we have the "protections" it provides in place before adding the new attribute.

The intent is to allow adding arbitrary annotations to types for use in static analysis tools

This patch should not land until we see some real use cases to justify new hundreds of lines of code. We should more careful and take maintenance cost really into account and not merge every experimental cool research extensions automatically.

Please answer points in “ Contributing Extensions to Clang “
https://clang.llvm.org/get_involved.html

Sorry, I had missed this. The justification is from the original RFC on adding lifetime annotation checking to the clang static analyzer (https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2). I think this code has utility as a general-purpose mechanism in the same manner that the existing declaration annotation attribute does. As attribute code owner, I think this meets the bar for a contribution as an extension.

mboehme updated this revision to Diff 428015.May 9 2022, 2:35 AM

Move some tests here that I originally added to https://reviews.llvm.org/D124919.

rsmith added inline comments.May 9 2022, 3:50 PM
clang/test/SemaCXX/annotate-type.cpp
3

I think the right approach here is that ParseDeclaration, ParseExternalDeclaration, and friends should only be given an attribute list that includes declaration attributes (that is, C++11 attribute syntax attributes), and that list should not be muddled into the list of decl specifier attributes.

As far as I can see, that is in fact already the case with only two exceptions;

  1. ParseExportDeclaration parses MS attributes before recursing into ParseExternalDeclaration. That looks like it's simply a bug as far as I can tell, and that call can be removed. MS attributes will be parsed as part of the decl specifier sequence as needed and don't need to be parsed as declaration attributes.
  2. ParseStatementOrDeclarationAfterAttributes is called after parsing an initial sequence of GNU attributes that might be part of the decl-specifier-seq, or might precede a label. In this case, we should simply have two attribute lists: the C++11-syntax declaration attributes, and the GNU decl-specifier-seq attributes that we've effectively tentatively parsed past.

All other code paths into ParseExternalDeclaration / ParseDeclaration / ParseSimpleDeclaration parse only C++11 attribute syntax before getting there.

With those two exceptions changed, we should be able to keep the declaration attributes entirely separate from the decl-specifier-seq attributes. We can then carefully re-implement the "sliding" of declaration attributes onto the type in the cases where we want to do so for compatibility.

mboehme added inline comments.May 10 2022, 4:20 AM
clang/test/SemaCXX/annotate-type.cpp
3

I think the right approach here is that ParseDeclaration, ParseExternalDeclaration, and friends should only be given an attribute list that includes declaration attributes (that is, C++11 attribute syntax attributes), and that list should not be muddled into the list of decl specifier attributes.

I think this makes sense, but I'm not sure how we would do this in the specific case of ParseStatementOrDeclarationAfterAttributes -- see discussion below.

Also, I wanted to make sure we're on the same page about a specific point: For backwards compatibility, we need to continue to "slide" some C++11 attributes to the DeclSpec, namely those type attributes for which we have allowed this so far. Do you agree?

  1. ParseExportDeclaration parses MS attributes before recursing into ParseExternalDeclaration. That looks like it's simply a bug as far as I can tell, and that call can be removed. MS attributes will be parsed as part of the decl specifier sequence as needed and don't need to be parsed as declaration attributes

That makes sense -- and indeed none of the other callers of ParseExternalDeclaration parse MS attributes before calling it.

I've tried removing the two calls to MaybeParseMicrosoftAttributes in ParseExportDeclaration, and all of the tests still pass.

I'm not sure though if this may produce new errors on some invalid code that we used to allow, namely those cases in ParseExternalDeclaration that call through to ParseDeclaration rather than ParseDeclOrFunctionDefInternal. From looking at the code, one erroneous code sample that I expected the current code to allow is export __declspec(deprecated) namespace foo {}, but it appears that we don't (https://godbolt.org/z/b4sE8E8GG). So maybe there's no concern here?

If we don't think there are any issues with this, I would prepare a small patch that removes these two calls to MaybeParseMicrosoftAttributes. (I think this can and should be a separate patch as it's otherwise independent of the wider issue we're discussing here.)

  1. ParseStatementOrDeclarationAfterAttributes is called after parsing an initial sequence of GNU attributes that might be part of the decl-specifier-seq, or might precede a label. In this case, we should simply have two attribute lists: the C++11-syntax declaration attributes, and the GNU decl-specifier-seq attributes that we've effectively tentatively parsed past.

To make sure I understand: What you're saying is that ParseStatementOrDeclarationAfterAttributes should take two ParsedAttributes arguments, one for the C++11 attributes and one for the GNU attributes?

What I'm not quite clear on is how ParseStatementOrDeclarationAfterAttributes should handle these two attribute lists further. There are two places in this function that call through to ParseDeclaration and I think you're saying that you'd like ParseDeclaration to only take a list of C++11 attributes. But what should ParseStatementOrDeclarationAfterAttributes then do with the GNU attributes -- it can't simply drop them on the floor?

Or does this mean that ParseDeclaration also needs to take two ParsedAttributes arguments (and similarly for any other Parse* functions it passes the attributes on to)? This is certainly doable, but I'm not sure that we get much benefit from it. As noted at the beginning of this comment, some "legacy" C++11 type attributes will also need to "slide" to the DeclSpec for backwards compatibility, so we can't simply put all the C++11 attributes on the declaration (which would be the attraction of having separate lists). At that point, it doesn't seem too harmful to me to have a single ParsedAttributes argument containing all of the attributes (C++11 and GNU) that were specified at the beginning of the declaration.

With those two exceptions changed, we should be able to keep the declaration attributes entirely separate from the decl-specifier-seq attributes. We can then carefully re-implement the "sliding" of declaration attributes onto the type in the cases where we want to do so for compatibility.

This is essentially what https://reviews.llvm.org/D124919 does. To emphasize this (and simplify the code slightly at the same time), I've replaced the previous ExtractDefiniteDeclAttrs() with a function called SlideAttrsToDeclSpec() that moves the "sliding" attributes from the original ParsedAttributes directly to the DeclSpec. Is this along the lines of what you were thinking of?

rsmith added inline comments.May 10 2022, 12:57 PM
clang/test/SemaCXX/annotate-type.cpp
3

For backwards compatibility, we need to continue to "slide" some C++11 attributes to the DeclSpec, namely those type attributes for which we have allowed this so far. Do you agree?

I think we should allow this in the short term, but aim to remove it after we've been warning on it for a release or two. I'd even be OK with the warning being an error by default when it lands.

From looking at the code, one erroneous code sample that I expected the current code to allow is export __declspec(deprecated) namespace foo {}, but it appears that we don't (https://godbolt.org/z/b4sE8E8GG). So maybe there's no concern here?

MS attributes are [...] attributes, so the testcase would be export [blah] namespace foo {}, which we do currently erroneously accept. Still, I think that's highly unlikely to be a concern. Our support for C++20 modules is new, experimental, and incomplete; there's no need to worry about backwards-incompatibility here.

To make sure I understand: What you're saying is that ParseStatementOrDeclarationAfterAttributes should take two ParsedAttributes arguments, one for the C++11 attributes and one for the GNU attributes?

I would phrase that as: one for declaration attributes, and one for decl-specifier attributes (put another way: one parameter for declaration attributes, and one parameter for the portion of a decl-specifier-seq that we parsed while looking for a label, which happens to be only GNU attributes). But yes.

What I'm not quite clear on is how ParseStatementOrDeclarationAfterAttributes should handle these two attribute lists further. There are two places in this function that call through to ParseDeclaration and I think you're saying that you'd like ParseDeclaration to only take a list of C++11 attributes. But what should ParseStatementOrDeclarationAfterAttributes then do with the GNU attributes -- it can't simply drop them on the floor?

Or does this mean that ParseDeclaration also needs to take two ParsedAttributes arguments (and similarly for any other Parse* functions it passes the attributes on to)? This is certainly doable, but I'm not sure that we get much benefit from it. As noted at the beginning of this comment, some "legacy" C++11 type attributes will also need to "slide" to the DeclSpec for backwards compatibility, so we can't simply put all the C++11 attributes on the declaration (which would be the attraction of having separate lists).

Yes, we should propagate the two lists into ParseDeclaration and ParseSimpleDeclaration. That should be the end of it: ParseSimpleDeclaration forms a ParsingDeclSpec, which can be handed the decl-specifier attributes. The other cases that ParseDeclaration can reach should all prohibit decl-specifier attributes.

At that point, it doesn't seem too harmful to me to have a single ParsedAttributes argument containing all of the attributes (C++11 and GNU) that were specified at the beginning of the declaration.

I think it will be best for ongoing maintenance if we model the grammar in the parser as it is -- C++11 declaration attributes are a separate list that's not part of the decl-specifier-seq, and I expect we will have emergent bugs and confusion if we try to combine them. Attribute handling in declaration parsing is already made confusing because we don't clearly distinguish between these two lists, leading to bugs like parsing MS attributes in ParseExportDeclaration.

With those two exceptions changed, we should be able to keep the declaration attributes entirely separate from the decl-specifier-seq attributes. We can then carefully re-implement the "sliding" of declaration attributes onto the type in the cases where we want to do so for compatibility.

This is essentially what https://reviews.llvm.org/D124919 does. To emphasize this (and simplify the code slightly at the same time), I've replaced the previous ExtractDefiniteDeclAttrs() with a function called SlideAttrsToDeclSpec() that moves the "sliding" attributes from the original ParsedAttributes directly to the DeclSpec. Is this along the lines of what you were thinking of?

Approximately. I also think we should avoid moving attributes around between attribute lists entirely, and instead preserve the distinct attribute lists until we reach the point where we can apply them to a declaration in Sema. I would note that the approach in that patch seems like it'll reorder attributes, and we have attributes where the order matters (eg, [[clang::enable_if(a, "")]] void f [[clang::enable_if(b, "")]]()). Also, moving attributes around makes it hard to properly handle source locations / ranges for attributes.

In particular: I would make the ParsingDeclarator have a pointer to the declaration attributes in addition to its list of attributes on the declarator-id. Then, when we apply those attributes to the declaration, pull them from both places (declaration attributes then declarator-id attributes, to preserve source order, skipping "slided" type attributes in the former list). And when we form the type for the declarator, also apply any attributes from the declaration's attribute list that should be "slided" on the type.

mboehme added inline comments.May 20 2022, 4:39 AM
clang/test/SemaCXX/annotate-type.cpp
3

(Oops -- only just realized that I wrote up this comment in draft form last week but never sent it.)

Thank you for the really in-depth reply!

For backwards compatibility, we need to continue to "slide" some C++11 attributes to the DeclSpec, namely those type attributes for which we have allowed this so far. Do you agree?

I think we should allow this in the short term, but aim to remove it after we've been warning on it for a release or two. I'd even be OK with the warning being an error by default when it lands.

And there would be a command-line flag that would allow "downgrading" the error to a warning?

From looking at the code, one erroneous code sample that I expected the current code to allow is export __declspec(deprecated) namespace foo {}, but it appears that we don't (https://godbolt.org/z/b4sE8E8GG). So maybe there's no concern here?

MS attributes are [...] attributes,

Oops *facepalm* A closer look at ParseMicrosoftAttributes could have told me that. I didn't even know of these until now TBH.

so the testcase would be export [blah] namespace foo {}, which we do currently erroneously accept. Still, I think that's highly unlikely to be a concern. Our support for C++20 modules is new, experimental, and incomplete; there's no need to worry about backwards-incompatibility here.

Got it. I'll submit a separate small patch that removes those two calls to MaybeParseMicrosoftAttributes then.

To make sure I understand: What you're saying is that ParseStatementOrDeclarationAfterAttributes should take two ParsedAttributes arguments, one for the C++11 attributes and one for the GNU attributes?

I would phrase that as: one for declaration attributes, and one for decl-specifier attributes (put another way: one parameter for declaration attributes, and one parameter for the portion of a decl-specifier-seq that we parsed while looking for a label, which happens to be only GNU attributes). But yes.

Got it, that makes sense.

[snip]

This is essentially what https://reviews.llvm.org/D124919 does. To emphasize this (and simplify the code slightly at the same time), I've replaced the previous ExtractDefiniteDeclAttrs() with a function called SlideAttrsToDeclSpec() that moves the "sliding" attributes from the original ParsedAttributes directly to the DeclSpec. Is this along the lines of what you were thinking of?

Approximately. I also think we should avoid moving attributes around between attribute lists entirely, and instead preserve the distinct attribute lists until we reach the point where we can apply them to a declaration in Sema. I would note that the approach in that patch seems like it'll reorder attributes, and we have attributes where the order matters (eg, [[clang::enable_if(a, "")]] void f [[clang::enable_if(b, "")]]()). Also, moving attributes around makes it hard to properly handle source locations / ranges for attributes.

Thanks for pointing out the importance of ordering. I did notice some ordering issues while working on https://reviews.llvm.org/D124919 (see comments in SlideAttrsToDeclSpec() if you're interested), but this drives home the point that we need to be careful about ordering.

In particular: I would make the ParsingDeclarator have a pointer to the declaration attributes in addition to its list of attributes on the declarator-id. Then, when we apply those attributes to the declaration, pull them from both places (declaration attributes then declarator-id attributes, to preserve source order, skipping "slided" type attributes in the former list). And when we form the type for the declarator, also apply any attributes from the declaration's attribute list that should be "slided" on the type.

Thanks, I think I can see how that's going to work.

I'll attempt to implement this, which will likely take a few days, and will report back when I have something that works and that I'd like to get feedback on. I may also have some followup questions if I run into unanticipated problems (which is more likely than not, I assume).

mboehme added inline comments.May 20 2022, 4:49 AM
clang/test/SemaCXX/annotate-type.cpp
3

@rsmith I have a draft implementation of the idea we discussed above here:

https://reviews.llvm.org/D126061

(All tests pass.)

I'd appreciate your feedback. Does this look like what you had in mind?

There's one main point I'd like to discuss. I've chosen to store the declaration attributes in a new field on the Declarator, so we now have Declarator::Attrs and Declarator::DeclarationAttrs (and I believe this is what we had discussed before). This makes sense from a semantic point of view because attributes in both positions have the same meaning.

However there's one point I'm not completely satisfied with: Because a single declaration can have multiple declarators, we need to make sure that we preserve the declaration attributes when we reuse a Declarator for multiple declarators in a single declaration. This gives rise to the slightly unfortunate Declarator::clearExceptDeclarationAttrs().

All in all, I feel the implementation would be cleaner if we put the declaration attributes on the DeclSpec instead of the Declarator, even if the declaration attributes are semantically less related to the DeclSpec than the Declarator. What do you think?

mboehme added inline comments.May 20 2022, 12:12 PM
clang/test/SemaCXX/annotate-type.cpp
3

PS Another option would be to introduce a Declaration class and put the declaration attributes on that. The Declarator would hold a reference to the Declaration.

This solution would represent the actual entities in the grammar best, but I wonder if it's worth introducing a Declarator class for this purpose alone. There might also be potential for confusion because we already have a Decl class in the AST.

mboehme updated this revision to Diff 435524.Jun 9 2022, 6:18 AM

Rebased to a newer base revision.

Is there anything else that needs to happen on this change before it's ready to land?

There is some breakage in the pre-merge checks, but it looks as if it's pre-existing breakage. See the comment-only change here that looks as if it exhibits similar breakage:

https://reviews.llvm.org/D127494

clang/test/SemaCXX/annotate-type.cpp
3

Some notes for those following along:

@rsmith and @aaron.ballman are both in the process of reviewing https://reviews.llvm.org/D126061 and have stated that they think this is the right direction.

I also wanted to clarify how I want to proceed with this change and https://reviews.llvm.org/D126061. I believe I originally said that I wanted to land https://reviews.llvm.org/D126061 first, then this change afterwards.

However, I've now changed my mind about this. We want annotate_type to be available in https://reviews.llvm.org/D126061 because annotate_type is the first type attribute that doesn't exhibit the legacy "sliding" behavior. Without it, https://reviews.llvm.org/D126061 can't introduce some of the new behavior it introduces.

So my plan is now to wait until both this change and https://reviews.llvm.org/D126061 are ready to land, and then land them in sequence (this change first, then https://reviews.llvm.org/D126061).

I really only had one salient question (and a tiny nit), so I think this is almost good to go.

clang/include/clang/Basic/AttrDocs.td
6510

(Since this is for C as well as C++)

clang/lib/Parse/ParseDecl.cpp
3187–3193

Should this be done as part of D126061 instead?

clang/test/SemaCXX/annotate-type.cpp
3

So my plan is now to wait until both this change and https://reviews.llvm.org/D126061 are ready to land, and then land them in sequence (this change first, then https://reviews.llvm.org/D126061).

That plan is fine by me.

mboehme updated this revision to Diff 436713.Jun 14 2022, 2:26 AM

Change attribute documentation so it's not specific to C++.

mboehme updated this revision to Diff 436732.Jun 14 2022, 4:01 AM
mboehme marked 2 inline comments as done.

Instead of allowing all type attributes on the decl-specifier-seq, allow only
annotate_type for now. The more general change will be done in
https://reviews.llvm.org/D126061.

mboehme marked an inline comment as done.Jun 14 2022, 4:15 AM
mboehme added inline comments.
clang/include/clang/Basic/AttrDocs.td
6510

Good point -- done.

clang/lib/Parse/ParseDecl.cpp
3187–3193

Thanks for raising this question!

In this patch, I had originally allowed all type attributes to be placed on the decl-specifier-seq because a) I needed this to be true for annotate_type so that I could test the attribute in this position, and b) there didn't appear to be any harm in allowing any type attribute instead of just annotate_type.

However, subsequent discussion on https://reviews.llvm.org/D126061 has made it clear that not all type attributes should actually be allowed on the decl-specifier-seq -- at least not initially, since they won't currently work correctly in that context (see extensive discussion on that other patch). So I've made the code here more restrictive to allow only annotate_type on the decl-specifier-seq (which is all that I need here).

https://reviews.llvm.org/D126061 now contains the change that allows type attributes on the decl-specifier-seq more generally, which is fine because it also contains additional code to disable this behavior again for the specific attributes that need it.

Given that I'm planning to land this patch and https://reviews.llvm.org/D126061 at the same time, it's not _super_ critical where we make this change, but a) the way it's done now makes the patches more logically cohesive, and b) there's always a chance that a patch may need to be reverted, and having the right changes together in one patch will be important then.

Thanks again for bringing this to my attention!

aaron.ballman accepted this revision.Jun 14 2022, 5:32 AM

LGTM, thank you for this! I did find one small formatting issue in the release notes, but you can fix that up when landing.

clang/docs/ReleaseNotes.rst
365
This revision is now accepted and ready to land.Jun 14 2022, 5:32 AM
mboehme updated this revision to Diff 436764.Jun 14 2022, 6:22 AM
mboehme marked an inline comment as done.

Fixed syntax error in ReleaseNotes.rst.

mboehme marked an inline comment as done.Jun 14 2022, 6:23 AM
mboehme added inline comments.
clang/docs/ReleaseNotes.rst
365

Thanks for catching -- fixed. (I'm too used to Markdown syntax...)

This revision was landed with ongoing or failed builds.Jun 15 2022, 12:47 AM
This revision was automatically updated to reflect the committed changes.
mboehme marked an inline comment as done.