This is an archive of the discontinued LLVM Phabricator instance.

[Annotation] Allow parameter pack expansions and initializer lists in annotate attribute
ClosedPublic

Authored by steffenlarsen on Nov 23 2021, 7:01 AM.

Details

Summary

These changes make the Clang parser recognize expression parameter pack expansion and initializer lists in attribute arguments. Because expression parameter pack expansion requires additional handling while creating and instantiating templates, the support for them must be explicitly supported through the AcceptsExprPack flag.

Handling expression pack expansions may require a delay to when the arguments of an attribute are correctly populated. To this end, attributes that are set to accept these - through setting the AcceptsExprPack flag - will automatically have an additional variadic expression argument member named DelayedArgs. This member is not exposed the same way other arguments are but is set through the new CreateWithDelayedArgs creator function generated for applicable attributes.

To illustrate how to implement support for expression pack expansion support, clang::annotate is made to support pack expansions. This is done by making handleAnnotationAttr delay setting the actual attribute arguments until after template instantiation if it was unable to populate the arguments due to dependencies in the parsed expressions.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
This revision now requires changes to proceed.Dec 1 2021, 11:40 AM

I think the safest bet is to be conservative and not allow mixing packs with variadics, and not allowing multiple packs. We should be able to diagnose that situation from ClangAttrEmitter.cpp to help attribute authors out. However, it'd be worth adding a FIXME comment to that diagnostic logic asking whether we want to relax the behavior at some point. But if you feel strongly that we should support these situations initially, I wouldn't be opposed.

From a practical perspective, the only difference between a list of ExprArgument and VariadicExprArgument is the enforced arity.... I would also consider seeing if we can just generalize the pack logic to work for BOTH cases. I'd be all for limiting the pack/ExprArgument list to be the 'last' argument to simplify it.

steffenlarsen added inline comments.Dec 2 2021, 2:35 AM
clang/include/clang/Basic/Attr.td
544–546

Having ParmExpansionArgsSupport in Attr would allow attributes to populate with parameter packs across argument boundaries. It seems more permissive, but frankly I have no attachment to it, so I am okay with tying it to Argument instead.

Of the mentioned options, I agree #2 and #3 are the preferred paths. For simplicity I prefer #2, though I will have to figure out how best to add a check on individual arguments, which @erichkeane's suggestion to limit the attributes to only allowing parameter packs in variadic arguments as the last argument would help with. On the other hand, the population logic for most of the relevant attributes are explicitly defined in clang/lib/Sema/SemaDeclAttr.cpp (which will have to be changed with #3) so being more permissive - like the current version of the patch is - gives more freedom to the attributes.

Does the tablegen design also seem likely to work when we want to introduce type packs instead of value packs?

I am not sure about how tablegen would take it, but the parser doesn't currently parse more than a single type argument, so I don't see a need for allowing type packs at the moment.

I think the safest bet is to be conservative and not allow mixing packs with variadics, and not allowing multiple packs. We should be able to diagnose that situation from ClangAttrEmitter.cpp to help attribute authors out. However, it'd be worth adding a FIXME comment to that diagnostic logic asking whether we want to relax the behavior at some point. But if you feel strongly that we should support these situations initially, I wouldn't be opposed.

If we limit packs to VariadicExprArgument - like annotate does in the current state of this patch - having multiple packs in the same argument is simple to handle. Both are represented as Expr and the variadic argument will simply contain two expressions until the templates are instantiated and the packs are expanded. This is also a feature I would like to keep, though it should be possible to work around the limitation if we conclude it can't stay.

clang/lib/Parse/ParseDecl.cpp
447

Do you have a test for something that isn't a pack followed by an ellipsis? What is the behavior there? I'm also concerned that there doesn't seem to be anything here that handles complex-pack expansions (like folds), can you test that as well?

Good question! I would expect it to fail in kind during template instantiation, or earlier if it is not an expression, but I will test it out once we converge on a new solution.

Also, why is this 'if' not up on line 429 (that is, outside of this 'else'?). I believe ParseAssignmentExpression is going to muck-up the tokens, so I'm not particularly sure what is going to happen here?

I originally tried having the ActOnPackExpansion earlier, but it did indeed cause trouble. However, if my understanding of the previous two branches is correct, they are:

  1. Type argument. This could be consuming the ellipsis as well, though I suspect it needs another path given that it does not seem to produce any expressions. Note however that it currently seems to stop argument parsing after reading the first type (confirmed by the FIXME) so parameter pack logic for this branch would probably be more suitable once it actually allows attributes to take multiple types.
  2. Identifier argument. Correct me if I am wrong, but I don't think this can be populated by a parameter pack.

I could add the diagnostic to these if ellipsis is found, but if my assumptions are correct it doesn't make much sense to expand these branches beyond that.

clang/test/Parser/cxx0x-attributes.cpp
261

It makes more sense for the new test cases to get expressions rather than types and since Ts wasn't used for its contents it did not seem like an intrusive change.

steffenlarsen added inline comments.Dec 2 2021, 2:40 AM
clang/lib/Parse/ParseDecl.cpp
447

I'm also concerned that there doesn't seem to be anything here that handles complex-pack expansions (like folds), can you test that as well?

Sorry, I forgot to address this. I have not tested it, but I would expect complex-pack expansions like folds to be at expression-level. I will make sure to test this as well.

aaron.ballman added inline comments.Dec 2 2021, 8:34 AM
clang/include/clang/Basic/Attr.td
544–546

Having ParmExpansionArgsSupport in Attr would allow attributes to populate with parameter packs across argument boundaries.

That's what I want to avoid. :-D I would prefer that the arguments have to opt into that behavior because the alternative could get ambiguous. I'd rather we do some extra work to explicitly support that situation once we have a specific attribute in mind for it.

... which @erichkeane's suggestion to limit the attributes to only allowing parameter packs in variadic arguments as the last argument would help with.

I agree with that suggestion, btw.

I am not sure about how tablegen would take it, but the parser doesn't currently parse more than a single type argument, so I don't see a need for allowing type packs at the moment.

Fair point, and yet another reason not to allow type packs initially. :-)

If we limit packs to VariadicExprArgument - like annotate does in the current state of this patch - having multiple packs in the same argument is simple to handle. Both are represented as Expr and the variadic argument will simply contain two expressions until the templates are instantiated and the packs are expanded.

Hmm, let's make sure we're on the same page. The situations I think we should avoid are:

// Mixing variadics and packs.
let Args = [VariadicExprArgument<"VarArgs">, VariadicExprArgument<"Pack", /*AllowPack*/1>];

// Multiple packs.
let Args = [VariadicExprArgument<"FirstPack", /*AllowPack*/1>, VariadicExprArgument<"SecondPack", /*AllowPack*/1>];
erichkeane added inline comments.Dec 2 2021, 8:40 AM
clang/include/clang/Basic/Attr.td
544–546

I agree, BOTH of those should be prohibited. Any amount of mixing of VariadicExprArgument (or mixing of any of the variadics IMO) need to be illegal. Otherwise the ambiguous nature of it is untenable.

While I like the possibility of
let Args = [ExprArgument<"Arg1">, ExprArgument<"Arg2">];

ALSO accepting a pack, I think we end up getting a little too 'hacky' with how to store this in the non-instantiated version.

steffenlarsen edited the summary of this revision. (Show Details)

Thank you both for the reviews! Consensus seems to be having support for pack expansion at argument level for now and let more complicated logic be added when there is an actual need. I have applied these changes as I understood them and have added/adjusted the requested test cases. Please have a look and let me know what you think. 😄

That's what I want to avoid. :-D I would prefer that the arguments have to opt into that behavior because the alternative could get ambiguous. I'd rather we do some extra work to explicitly support that situation once we have a specific attribute in mind for it.

I'm okay with that! I have made the changes to only allow it in the last argument if it is marked as supporting pack expansion. Note that it assumes that there are no other variadic parameters except the last one, as suggested.

Hmm, let's make sure we're on the same page. The situations I think we should avoid are:

// Mixing variadics and packs.
let Args = [VariadicExprArgument<"VarArgs">, VariadicExprArgument<"Pack", /*AllowPack*/1>];

// Multiple packs.
let Args = [VariadicExprArgument<"FirstPack", /*AllowPack*/1>, VariadicExprArgument<"SecondPack", /*AllowPack*/1>];

Oh, I see what you mean. Yeah I agree for sure. I think the only exceptions we currently have to this is omp attributes, but given that they are pragmas they should be nowhere close to this.

erichkeane added inline comments.Dec 6 2021, 8:51 AM
clang/include/clang/Basic/DiagnosticParseKinds.td
724

I don't really see why this is required? I would think the 722 would be all you would need.

clang/lib/Parse/ParseDecl.cpp
441

I don't think this should be diagnosed here, and I don't think it is 'right'. I think the ClangAttrEmitter should ensure that the VariadicExprArgument needs to be the 'last' thing, but I think that REALLY means "supports a pack anywhere inside of it".

See my test examples below, I don't think this parsing is sufficient for that.

clang/test/Parser/cxx0x-attributes.cpp
269

what about:
void foz [[clang::annotate("D", Is)]] ();

I would expect that to error.

Another test I'd like to see:

void foz[[clang::annotate("E", 1, 2, 3, Is...)]]

Also, I don't see why if THAT works, that:
void foz[[clang::annotate("E", 1, Is..., 2, 3)]]

shouldn't be allowed as well.

clang/utils/TableGen/ClangAttrEmitter.cpp
1166

The rule of 'only the last argument is allowed to support a pack' should be in the attribute emitter.

steffenlarsen added inline comments.Dec 7 2021, 12:51 AM
clang/include/clang/Basic/DiagnosticParseKinds.td
724

Intention was to make a distinction between the two cases, since we had the granularity to do so. I.e. clang::annotate would never use 722 as it explicitly states that the attribute doesn't support pack expansion (in the last argument) which is untrue for clang::annotate, but if a user was to do pack expansion on the first argument of clang::annotate they would get this error telling them that the given argument cannot accept the pack expansion.

clang/lib/Parse/ParseDecl.cpp
441

That is also the intention here. All it checks is that we are in or beyond the last argument of the attribute. For example, clang::annotate will return 2 from attributeNumberOfArguments as VariadicExprArgument only counts as a single argument. Thus, any pack expansion expressions parsed after the first will be accepted. I do agree though that the error message may be a little confusing for users.

I will add the suggested tests and rethink the diagnostics.

The new revision does the following:

  1. Revisits the diagnostics. Now parameter packs will cause the same error on attributes that do not support parameter packs in arguments, while attributes that support it will cause an error if parameter packs are used in the incorrect argument, specifying which argument should be the earliest for it.
  2. Adds additional tests to ensure that variadic arguments supporting packs also allow pack expansion intermingled with other arguments.
  3. Adds assertions to attribute generation to check that attributes only ever support packs in the last variadic argument and that there are no other variadic arguments if they do.
steffenlarsen added inline comments.Dec 7 2021, 4:35 AM
clang/test/Parser/cxx0x-attributes.cpp
269

what about:
void foz [[clang::annotate("D", Is)]] ();

I would expect that to error.

So would I, but it seems that the parser and annotate is more than happy to consume this example, even in the current state of Clang: https://godbolt.org/z/rx64EKeeE
I am not sure as to why, but seeing as it is a preexisting bug I don't know if it makes sense to include it in the scope of this patch.

Another test I'd like to see:

void foz[[clang::annotate("E", 1, 2, 3, Is...)]]

Also, I don't see why if THAT works, that:
void foz[[clang::annotate("E", 1, Is..., 2, 3)]]

shouldn't be allowed as well.

They seem to work as expected. I have added these cases.

clang/utils/TableGen/ClangAttrEmitter.cpp
1166

The rule of 'only the last argument is allowed to support a pack' should be in the attribute emitter.

I have added some assertions around the area where attributes are generate, as that carries more surrounding information. Is that approximately what you had in mind?

erichkeane added inline comments.Dec 7 2021, 5:56 AM
clang/lib/Parse/ParseDecl.cpp
438

So I was thinking about this overnight... I wonder if this logic is inverted here? Aaron can better answer, but I wonder if we should be instead detecting when we are on the 'last' parameter and it is one of these VariadicExprArgument things (that accept a pack), then dropping the parser into a loop of:

while (Tok.is(tok::comma)){

Tok.Consume();
ArgExpr = CorrectTypos(ParseAssignmentExpr(...));

}
// finally, consume the closing paren.

WDYT?

steffenlarsen added inline comments.Dec 8 2021, 2:50 AM
clang/lib/Parse/ParseDecl.cpp
438

The parsing of attribute arguments is already this lenient. It does not actually care how many arguments the attribute says it takes (unless it takes a type and then it will only currently supply _one_ argument). It simply consumes as many expressions it can before reaching the end parenthesis, then leaves the attributes to consume them in clang/lib/Sema/SemaDeclAttr.cpp.
As a result we do not need any additional work here to handle variadic arguments, but it also means that we have to introduce the restrictions that we can only allow packs in the last argument and allow only that argument to be variadic, as otherwise we have no way of knowing when and where the packs pass between argument boundaries.

erichkeane added inline comments.Dec 8 2021, 6:09 AM
clang/lib/Parse/ParseDecl.cpp
438

My point is, I don't think THIS function should be handling the ellipsis, the expression parsing functions we send the parsing of each component off to should do that.

We unfortunately cannot move the entirety of this parsing section to do that, since 'identifier'/'type', and 'Function' argument types end up needing special handling, but it would be nice if our 'expr' types would all be able to do this, and I suggested earlier that we could start with the VariadicExprArgument to make this an easier patch that didn't have to deal with the issues that come with that.

That said, it would be nice if we could get CLOSE to that.

steffenlarsen added inline comments.Dec 9 2021, 4:32 AM
clang/lib/Parse/ParseDecl.cpp
438

Sorry. I am still not sure I am following. Do you mean the consumption of pack expressions should be moved into the expression parsing so other places (not just attributes) would accept it? If so, I am not opposed to the idea though I have no idea which expressions these would be nor the full implications of such a change. For these we at least have the isolation of having to explicitly support the packs in the given attributes.

erichkeane added inline comments.Dec 9 2021, 6:07 AM
clang/lib/Parse/ParseDecl.cpp
438

I'm saying the other expression parsers should ALREADY properly handle packs, and we should just use those.

steffenlarsen added inline comments.Dec 9 2021, 6:43 AM
clang/lib/Parse/ParseDecl.cpp
438

I think I see what you mean. There is the Parser::ParseExpressionList which may fit the bill, though it seems to also accept initializer lists. But since it is an expression the attributes can decide if it is a valid expression for them, so maybe that is not a problem?

erichkeane added inline comments.Dec 9 2021, 6:45 AM
clang/lib/Parse/ParseDecl.cpp
438

Hmm... I don't have a good idea of that, Aaron has better visibility into that sort of thing.

Part of the advantage to doing it the way I suggest is that it ends up being a single call (that is, the parser should parse the commas!), so we should only have to consume the closing paren.

aaron.ballman added inline comments.Dec 15 2021, 7:25 AM
clang/lib/Parse/ParseDecl.cpp
438

I think we want something more like Parser::ParseSimpleExpressionList() but... that doesn't handle packs, so those appear to be handled special: https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseExpr.cpp#L3059. But one of these two functions seems to be along the correct lines of what we want (with modification).

FWIW, I like the sound of Erich's approach better than what's in the current patch. I'd rather not try to specially handle the ellipsis so much as recognizing when we expect an arbitrary list of variadic expression arguments as the last "parameter" for an attribute.

I agree that reusing existing parser logic for this would be preferable, but as @aaron.ballman mentions Parser::ParseSimpleExpressionList does not give us the parameter pack expansion parsing we need. We could potentially unify it with the logic from https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseExpr.cpp#L3059 but on the other hand there could be potential benefits to using Parser::ParseExpressionList instead.

The primary difference between the behavior of Parser::ParseExpressionList and what we are looking for is that it will allow initializer lists as arguments. Since most (if not all) attributes check the validity of their expression arguments, they would likely fail accordingly if they get an initializer list, likely specifying they expected an argument of type X. It may be a blessing in disguise to parse these arguments as well as we are more permissive w.r.t. attribute arguments. Please let me know if there is anything fundamentally wrong with this line of logic. One concern I have is that we currently run Actions.CorrectDelayedTyposInExpr on the expression right after parsing it, then we run pack expansion, whereas Parser::ParseExpressionList only runs the former action in failure case. I am unsure whether or not that might pose a problem.

One problem is that if we only use Parser::ParseExpressionList to parse the arguments of the last argument (if it is variadic) then we get the odd behavior that the initializer lists will only be permitted in variadic arguments, which arguably is a little obscure. However, upon further investigation the only case that stops us from using this for all expression arguments are when the attribute has either VariadicIdentifierArgument or VariadicParamOrParamIdxArgument as the first argument. If we assume that these variadic expressions are limited to the last argument as well (as they are variadic) leaving them as the only argument of the attributes we are parsing here, then we can parse expressions as we did before if we are parsing one of these arguments or switch to using e.g. Parser::ParseExpressionList for all expressions of the attribute (even non-variadic) if not. Then we would just have to iterate over the produced expressions to make sure no parameter pack expansions occur before the last (variadic) argument of the parsed attribute.

Summary of suggested changes:
Split attribute parsing into 3: First case parsing a type (currently exists). Second case keeping the current logic for parsing identifiers and single assignment-expressions if the attribute has a single argument of type VariadicIdentifierArgument or VariadicParamOrParamIdxArgument. Third case, applicable if none of the former ran, parses all expression parameters in one using Parser::ParseExpressionList and then checks that parameter packs only occur in the last variadic argument (if it allows it). Side effect is it allows initializer lists in the third case, but attributes are likely to complain about unexpected expression types if these are passed.

I have made the changes suggested in my previous comment. This makes significantly more changes to the parsing of attribute arguments as the old path was needed for attributes that allow both expressions and types. It also required some new controlling arguments for ParseExpressionList.

Because these changes also allow intializer lists in attribute arguments I have added a test case showing that clang::annotate parses these but will not accept them.

@erichkeane & @aaron.ballman - Is this in line with what you were thinking?

Thanks for this, I think it's shaping up well!

clang/include/clang/Basic/DiagnosticParseKinds.td
723
725
clang/lib/Parse/ParseDecl.cpp
429–431

I'm a bit surprised this logic moved -- are there no times when we'd need it within the new else clause?

452

Coding style nits.

clang/lib/Parse/ParseExpr.cpp
3360 ↗(On Diff #399334)

or something along those lines (note, the logic has reversed meaning). "Fail immediately" doesn't quite convey the behavior to me because we fail immediately either way, it's more about whether we attempt to skip tokens to get to the end of the expression list. I'm not strongly tied to the name I suggested either, btw.

3374–3375 ↗(On Diff #399334)

Not that I think this is a bad change, but it seems unrelated to the patch. Can you explain why you made the change?

clang/test/Parser/cxx0x-attributes.cpp
261

Non-type templates are different from type templates, so I don't really want to lose the test coverage for that (if nothing else, it ensures the parser doesn't explode when it encounters one). It's fine to add a variadic_nttp() function or something along those lines.

Thank you, @aaron.ballman ! I will update the revision in a moment with some of the suggested changes.

clang/lib/Parse/ParseDecl.cpp
429–431

This is done because ParseExpressionList wouldn't know what to do with identifiers, so it keeps the old logic. This means that attributes that consist of a variadic identifier argument cannot parse packs and initializer lists. It shouldn't be a problem for now, but it does kill some generality.

An alternative is to split the individual expression parsing from ParseExpressionList and use that in here in a similar way to how it parsed before. Would that be preferred?

clang/lib/Parse/ParseExpr.cpp
3360 ↗(On Diff #399334)

"Fail immediately" doesn't quite convey the behavior to me because we fail immediately either way

Am I misunderstanding FailUntil or is the intention of SkipUntil(tok::comma, tok::r_paren, StopBeforeMatch) to either stop at the next "," or ")"? If I am not misunderstanding, I believe it will continue parsing expressions after comma if SkipUntil stopped before it. As such I don't think FailImmediatelyOnInvalidExpr is inherently wrong, but I agree that skipping the skipping isn't very well conveyed by the name as is.

3374–3375 ↗(On Diff #399334)

Admittedly I am not sure why it is needed, but in the previous version of the parsing for attribute parameters it does the typo-correction immediately after parsing the expression and if this isn't added a handful of tests fail.

Applied suggestions.

steffenlarsen marked 5 inline comments as done.Jan 20 2022, 3:30 AM
aaron.ballman added inline comments.Jan 20 2022, 8:11 AM
clang/lib/Parse/ParseDecl.cpp
429–431

This is done because ParseExpressionList wouldn't know what to do with identifiers, so it keeps the old logic.

Ah, I think I confused myself into thinking {a, b, c} would be using a DeclRefExpr for each of the identifiers, but this code path is for identifiers that are *not* expressions. Got it.

An alternative is to split the individual expression parsing from ParseExpressionList and use that in here in a similar way to how it parsed before. Would that be preferred?

I think this is okay for now.

437

I think it'd be good to move this comment up closer to the else on line 461 so that it's clear what the else clause covers (and more clear that this is the common case).

clang/lib/Parse/ParseExpr.cpp
3360 ↗(On Diff #399334)

You're correct about the behavior (I had missed that we stopped on a comma as well as closing paren, I was thinking we were stopping on the closing curly brace). So the parameter name isn't inherently wrong, just seems a bit unclear to me. I don't insist on a change, but if someone has a better name for the parameter, I won't be sad either.

3374–3375 ↗(On Diff #399334)

Ah, thanks for the info! So this is basically doing the same work as what's done on line 3410 in the late failure case -- I wonder if this should actually be tied to the FailImmediatelyOnInvalidExpr parameter instead of having its own parameter?

Moved comment.

steffenlarsen marked an inline comment as done.Jan 21 2022, 2:55 AM
steffenlarsen added inline comments.
clang/lib/Parse/ParseDecl.cpp
437

I completely agree. It has been moved to right after the else.

clang/lib/Parse/ParseExpr.cpp
3374–3375 ↗(On Diff #399334)

They could be unified in the current version, but I don't think there is an actual relation between them.

aaron.ballman added inline comments.Jan 25 2022, 11:49 AM
clang/lib/Parse/ParseDecl.cpp
452

We don't typically use top-level const, so that should go as well.

clang/lib/Parse/ParseExpr.cpp
3374–3375 ↗(On Diff #399334)

Do we fail tests if this isn't conditionalized at all (we always do the early typo correction)? (I'd like to avoid adding the new parameter because I'm not certain how a caller would determine whether they do or do not want that behavior. If we need the parameter, then so be it, but it seems like this is something generally useful.)

clang/test/Parser/cxx0x-attributes.cpp
261

You can back out the whitespace changes for an ever-so-slightly smaller patch. :-D

277–278

Shouldn't these cases also give the // expected-error {{pack expansion in 'annotate' may only be applied to the last argument}} diagnostic?

steffenlarsen marked an inline comment as done.

Removing top-level const and adjusting style.

steffenlarsen marked 2 inline comments as done.Jan 26 2022, 3:02 AM
steffenlarsen added inline comments.
clang/lib/Parse/ParseDecl.cpp
452

Completely missed that. My bad!

clang/lib/Parse/ParseExpr.cpp
3374–3375 ↗(On Diff #399334)

11 tests fail if we don't make it conditional. The most telling one is probably SemaCXX/typo-correction-delayed.cpp which will have more errors if we do typo correction early, so it seems that the delay is intentional.

clang/test/Parser/cxx0x-attributes.cpp
261

Absolutely. Clang-format was very insistent, but since it isn't added by this patch I shouldn't worry. 😄

277–278

These should preferably pass. They are part of the "last" argument as annotate has one non-variadic argument and one variadic argument. I agree though that the error message is a little vague. Originally the error did specify that it had to be "argument %1 or later" to try and convey this intent, but maybe you have an idea for a clearer error message?

steffenlarsen retitled this revision from [Annotation] Allow parameter pack expansions in annotate attribute to [Annotation] Allow parameter pack expansions and initializer lists in annotate attribute.
steffenlarsen edited the summary of this revision. (Show Details)

Upon offline sync with @aaron.ballman and @erichkeane I have changed the strategy of this patch to lift the restriction of pack expansion not spanning argument boundaries. This is implemented in clang::annotate by delaying arguments to after template-instantiation if any of the arguments are value dependent.

Took a look, it doesn't seem that this reflects the discussion we had yesterday.

clang/include/clang/AST/Attr.h
51 ↗(On Diff #403578)

What is this supposed to be for? We should be able to tell if we weren't able to instantiate the expressions based on whether the expr-list has stuff in it, and if it is dependent.

clang/include/clang/Basic/Attr.td
208

I'm pretty against this 'fake', we should be putting this into the TableGen for every attribute that has AcceptsExprPack.

778

See above.

clang/lib/Parse/ParseDecl.cpp
461

So this all doesn't seem right either. The algorithm here is essentially:

if (AcceptsArgPack) {

ParseExpressionList(); // plus error handling

// Loop through args to see what matches correctly, if the non-VariadicExprList arguments are non-dependent, fill those in, and create as normal, else keep in the materialized collection.
}

clang/lib/Sema/SemaDeclAttr.cpp
4189 ↗(On Diff #403578)

I think this behavior (adding an attribute with a dependent expression list) should be happening automatically.

In reality, whatever is doing the parsing or potentially-partial-instantiation should be checking the 'StringLiteral' value as a part of how it would without this being dependent.

That is: In the annotate case, the ONLY argument that we actually care about being dependent or not is the 'first' one. When we do the parsing/instantiation, we need to detect if the non-variadic parameters can be 'filled', then we can put the rest into the VariadicExpr list.

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
193

So I don't think this is the right approach (teaching SemaDeclAttr about the delayed args). After substitution here, we should be using these to fill in the 'normal' arguments, then just passing these to the normal 'handleAnnotateAttr'.

I agree with the comments left by @erichkeane regarding some still needed tweaks to the approach.

clang/include/clang/Basic/Attr.td
208

Yeah, Fake is intended to be used for "internal details" and means that we won't pretty print out that data from the AST node. These arguments aren't really fake in that way -- they're things the user wrote in their source code.

clang/lib/Parse/ParseDecl.cpp
424–427

(Might need to re-flow comments to 80 col.) I don't think this is a FIXME so much as a "this just doesn't work like that" situation.

clang/lib/Parse/ParseExpr.cpp
3374–3375 ↗(On Diff #399334)

Good to know, thank you for checking! I agree, it sounds like the delay is intentional then. I think it's okay to leave the parameter as you have it, but it is unfortunate that we've made the interface this much more complicated.

clang/test/Parser/cxx0x-attributes.cpp
277

We should also be adding test coverage for things like redeclarations, partial specializations, etc to make sure that the behavior there is correct. Those would be Sema tests rather than Parser tests, however.

clang/utils/TableGen/ClangAttrEmitter.cpp
2340–2348

Instead of making this an assert, I think we should use PrintFatalError() (this is how you effectively emit a diagnostic when compiling a .td file).

steffenlarsen edited the summary of this revision. (Show Details)
steffenlarsen marked 7 inline comments as done and an inline comment as not done.Jan 31 2022, 2:24 AM
steffenlarsen added inline comments.
clang/include/clang/AST/Attr.h
51 ↗(On Diff #403578)

You're right, it is redundant. I have removed it.

clang/include/clang/Basic/Attr.td
208

I see. I have changed it to automatically generate a variadic expression "argument" named DelayedArgs if an attribute has AcceptsExprPack set.

clang/lib/Parse/ParseDecl.cpp
461

I am not convinced this is the right place for that. These changes are simply to allow the additional parsing of packs and similar. It does not care much for the different arguments of the attribute, beyond whether or not it accepts types or identifiers as well (see the other cases) so having it make decisions about how the attributes should populate their arguments seems out of line with the current structure.

steffenlarsen added inline comments.Jan 31 2022, 2:24 AM
clang/lib/Parse/ParseDecl.cpp
424–427

I think it makes sense to have it be a FIXME because in theory it could be possible to have expression parameter packs expanded in an identifier list as it accepts expressions. I have reworded it slightly. Do you think this is better?

clang/lib/Sema/SemaDeclAttr.cpp
4189 ↗(On Diff #403578)

I have changed it so it will only delay the arguments if the the first argument is dependent. Is this in line with what you were thinking? Since we still need to create the attribute I am not sure how we would do this automatically.

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
193

With the new structure I have simplified this part a little. SemaDeclAttr still knows about the delayed args but solely to create the attribute, while the logic for creating the attribute with the "normal" arguments has been separated and made part of Sema so it is reachable from this function.

clang/test/Parser/cxx0x-attributes.cpp
277

Good point. I will add these tests ASAP.

clang/utils/TableGen/ClangAttrEmitter.cpp
2340–2348

I did not know! Thanks. It has been changed. 😄

I'm down to 1 little thing in how we handle the attribute itself, and I think @aaron.ballman understands the parsing section better than I do...

clang/lib/Parse/ParseDecl.cpp
385

So does this mean that our pack-parsing code is allowing identifiers? I really expected this patch to enforce via the clang-attr-emitter that ONLY the 'expr' types were allowed. So you'd have:

if (attributeAcceptsExprPack(AttrName)) {
... <where this ONLY accepts an expression pack>
} else if (Tok.is(tok::identifier)) {
...
}...

clang/lib/Sema/SemaDeclAttr.cpp
4179 ↗(On Diff #404440)

Does this work with a partial specialization? I'd like to see some tests that ensure we work in that case (both where the partial specialization sets the string literal correctly, and where it doesnt).

4202 ↗(On Diff #404440)

if AllArgs.size() == 0, this is an error case. Annotate requires at least 1 parameter.

4203 ↗(On Diff #404440)

I would like @aaron.ballman to comment on this, but I think we probably want this case to be covered in the top of HandleDeclAttr, which would mean in the 'not all values filled' case, we skip the 'handleAnnotateAttr'.

WDYT Aaron? The downside is that this function couldn't check a 'partially filled in' attribute, but it would make us that much closer to this flag being a very simple thing to opt into.

steffenlarsen marked 2 inline comments as done.Jan 31 2022, 6:52 AM
steffenlarsen added inline comments.
clang/lib/Parse/ParseDecl.cpp
385

This will only consume the identifier if the argument is an identifier argument (variadic or non-variadic). As you say, clang-attr-emitter will make sure that no attribute with AcceptsExprPack has any identifier arguments, so if there is an identifier here it will not be able to consume it and then it will continue on to try and parse it as an expression and fail.

That said, the second part of the following parsing branches should probably also check attributeHasIdentifierArg(*AttrName), so I will add that shortly.

steffenlarsen added inline comments.Jan 31 2022, 7:37 AM
clang/lib/Parse/ParseDecl.cpp
385

That said, the second part of the following parsing branches should probably also check attributeHasIdentifierArg(*AttrName), so I will add that shortly.

Correction: Parsing any identifiers after this assumes there is a variadic identifier argument. Effectively it assumes that all non-variadic identifiers in an attribute are at the start of an attribute arguments, in which case they'll be parsed here. This is a little lackluster as it could simply allow identifiers in other places if there are variadic identifier arguments in the attribute, but I don't think fixing it should be in the scope of this patch.

Removed unnecessary test for arguments.

steffenlarsen added inline comments.Jan 31 2022, 7:56 AM
clang/lib/Sema/SemaDeclAttr.cpp
4202 ↗(On Diff #404440)

I removed the check but thinking about it again I think it would be better to reintroduce it and let HandleAnnotateAttr call checkAtLeastNumArgs. That way it will also complain about missing arguments after template instantiation, e.g. if a lone pack expansion argument evaluates to an empty expression list.

Adds check and diagnostic for no arguments in annotate attribute.

steffenlarsen added inline comments.Jan 31 2022, 8:46 AM
clang/lib/Sema/SemaDeclAttr.cpp
4202 ↗(On Diff #404440)

I have reintroduced the check for AllArgs.size() here. This means if AllArgs is empty it will continue to Sema::HandleAnnotateAttr which will check that there are at least one element in AllArgs and fail otherwise with a useful diagnostic. This diagnostic will also be issued if an empty parameter pack is given and that is the only argument of the annotation. Note that checkAtLeastNumArgs is a member of ParsedAttr which isn't available in Sema::HandleAnnotateAttr, so it replicates the check. It could potentially be moved, but given the size I don't think it's necessary in this patch.

I also added some tests for checking that the diagnostic is issued in the cases (pack expansion and explicitly empty).

@erichkeane - What do you think?

erichkeane added inline comments.Jan 31 2022, 8:50 AM
clang/lib/Sema/SemaDeclAttr.cpp
4202 ↗(On Diff #404440)

Hmm... I would expect this to diagnose even in the 'delayed args' case as well. Otherwise:

template<typename T>
[[annotate()]]

is not an error, right?

steffenlarsen added inline comments.Jan 31 2022, 9:02 AM
clang/lib/Sema/SemaDeclAttr.cpp
4202 ↗(On Diff #404440)

Both "delayed args" and immediate instantiation go through the diagnostics check. The check is intentionally in Sema::HandleAnnotateAttr as either this or template instantiation (if delayed args) goes through it.

template<typename T> [[annotate()]] ...

would fail because it does not have any arguments which will be identified here as "delayed args" will be skipped.

template<int... Is> [[annotate(Is...)]] ...

will fail once instantiated with no template arguments as the instantiating function will pass an empty list of expressions to Sema::HandleAnnotationAttr which will check and fail with the diagnostic. The latter is tested and I can add the former as a case while adding the other requested tests.

erichkeane added inline comments.Jan 31 2022, 9:04 AM
clang/lib/Sema/SemaDeclAttr.cpp
4202 ↗(On Diff #404440)

Can you make sure both of those have a test? I missed that the 'no args' case would go right to HandleAnnotateAttr.

Added tests for redeclarations and template specialization using clang::annoate with packs.

steffenlarsen marked 3 inline comments as done.Feb 1 2022, 5:28 AM
steffenlarsen added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
4179 ↗(On Diff #404440)

I have added such tests in clang/test/SemaTemplate/attributes.cpp. Let me know if they cover the scenario you were thinking about.

4202 ↗(On Diff #404440)

For sure. It has been added.

clang/test/Parser/cxx0x-attributes.cpp
277

I have added such tests in clang/test/SemaTemplate/attributes.cpp. Are these along the line of what you were thinking?

I think I'm happy, though there are the opens for Aaron (deciding whether the 'create as delayed' should happen without entry into 'handleAnnotateAttr', and whether the 'parsing' looks correct, despite not being the way we discussed), so you'll need to wait on him.

I think this is heading in the right direction! I've got a few more comments and questions though.

clang/include/clang/Basic/Attr.td
545

You should probably add a release note about this new feature.

790

You should definitely add a release note about this new behavior for annotate.

791

Le sigh. We should update the documentation for this new feature, but we have no documentation for the feature *to* update. (This is not your problem to solve in this patch.)

clang/lib/Parse/ParseDecl.cpp
424–427

Maybe we're thinking about identifier lists differently. We only have two attributes that use those (cpu_specific and cpu_dispatch) and in both cases (and all cases I would expect), what's being received is effectively a list of enumerators (not enumerators in the C or C++ sense) that could not be mixed with expressions. Expressions would go through sema and do all the usual lookup work to turn them into a value, but these are not real objects and so the lookup would fail for them. e.g., we're not going to be able to support something like: [[clang::cpu_specific(generic, pentium, Ts..., atom)]]. So I don't think there's anything here to be fixed (I prefer my comment formulation as that makes it more clear).

clang/lib/Sema/SemaDeclAttr.cpp
4179–4182 ↗(On Diff #404889)

Please double-check that the call to checkCommonAttributeFeatures() from ProcessDeclAttribute() doesn't already handle this for you. If it does, then I think this can be replaced with assert(!AllArgs.empty() && "expected at least one argument");

4203 ↗(On Diff #404440)

Do you mean ProcessDeclAttribute()? I don't think we should have attribute-specific logic in there, but are you thinking of something more general than that (I'm not seeing how the suggestion makes the flag easier to opt into)?

clang/test/SemaTemplate/attributes.cpp
230–239

I was thrown by the CHECK and CHECK-NEXT mixtures here because I couldn't see that the inherited attributes were or were not also getting the expanded pack values. You should make sure to include all of the lines with CHECK-NEXT so we see a full picture of the AST dump (this file is not super consistent about it).

erichkeane added inline comments.Feb 3 2022, 6:09 AM
clang/lib/Sema/SemaDeclAttr.cpp
4203 ↗(On Diff #404440)

Ah, yes, thats what I mean. The question for ME is whether there should be a generic "this supports variadic pack, so check to see if all the named non-expr arguments are fill-in-able. If not, do the 'delayed' version.

This would mean that HandleAnnotateAttr NEVER sees the "CreateWithDelayedArgs" case.

aaron.ballman added inline comments.Feb 3 2022, 6:13 AM
clang/lib/Sema/SemaDeclAttr.cpp
4203 ↗(On Diff #404440)

Thanks for clarifying -- yes, I think that would be preferable if it works out in a clean, generic way. I'd be fine with tablegen emitting something else (if necessary) to help generalize it.

Adjusted for comments and introduced common handling for creating attributes with delayed arguments.

steffenlarsen marked 3 inline comments as done.Feb 3 2022, 4:13 PM
steffenlarsen added inline comments.
clang/lib/Parse/ParseDecl.cpp
424–427

I see what you mean. I have applied your wording instead.

clang/lib/Sema/SemaDeclAttr.cpp
4179–4182 ↗(On Diff #404889)

It does go through checkCommonAttributeFeatures but (as of the recent changes) it will skip size-checks if arguments are delayed as a pack expansion could potentially populate the seemingly missing expressions after template instantiation for some attribute.
For annotate we could also have a pack as the only expression, which would then evaluate to an empty list of expressions. Since this path is also taken by instantiateDependentAnnotationAttr if there are delayed args. In reality it is only really needed after template instantiations, given as you said checkCommonAttributeFeatures will do the checking in the other case, but I personally think it is cleaner to have it here. If you disagree I will move it into instantiateDependentAnnotationAttr instead and add the assert.

4203 ↗(On Diff #404440)

handleAnnotateAttr is now oblivious to the concept of "delayed arguments". Instead tablegen generates a common handle function (handleAttrWithDelayedArgs) which will, based on the given ParsedAttr that supports delayed arguments, create and add the corresponding attribute with delayed arguments by calling the corresponding CreateWithDelayedArgs. The need for delaying arguments is decided as described in MustDelayAttributeArguments.

Is this approximately what you were thinking?

clang/test/SemaTemplate/attributes.cpp
230–239

I have done this. It is a little more verbose, but I agree with the full-picture sentiment.

1 question, otherwise happy.

clang/lib/Sema/SemaDeclAttr.cpp
8153 ↗(On Diff #405820)

This still doesn't work if the VariadicExprArgument isn't last, right? Do we ensure that is the case in clang-attr-emitter?

steffenlarsen marked 2 inline comments as done.Feb 4 2022, 6:22 AM
steffenlarsen added inline comments.
clang/lib/Sema/SemaDeclAttr.cpp
8153 ↗(On Diff #405820)

Good point. I don't think there's a check as there are select few attributes that use multiple variadic (OMPDeclareSimdDecl and OMPDeclareVariant are the only ones, I think.)

Since I don't think it's safe to check for all, should I make a check similar to the one for type/identifier arguments in attributes marked AcceptsExprPack? Would that suffice?

erichkeane added inline comments.Feb 4 2022, 6:25 AM
clang/lib/Sema/SemaDeclAttr.cpp
8153 ↗(On Diff #405820)

I'm fine rejecting a case that has anything besides expression-arguments(and ones create-able from expression arguments) and limited-to-only-1-must-be-last variadic-expr-list in ClangAttrEmitter.

I believe we discussed that at one point, but I didn't see it here.

Added check restricting variadic expression arguments to the last argument of attributes set to support parameter packs.
Added release notes.

steffenlarsen added inline comments.Feb 4 2022, 7:40 AM
clang/include/clang/Basic/Attr.td
545

I have added a release note about this under "Internal API Changes". Let me know if you think the wording needs changes.

790

I have added a release note about this under "Attribute Changes in Clang". As with the other note, let me know if wording needs changes.

clang/lib/Sema/SemaDeclAttr.cpp
8153 ↗(On Diff #405820)

I have added a check to ClangAttrEmitter next to the check ensuring that the relevant attributes don't have identifier or type arguments.

Added missing isVariadicExprArgument function.

Added a check to prevent duplicate "empty" constructors, i.e. if there are only optional arguments such as a single variadic argument.

aaron.ballman added inline comments.Feb 7 2022, 5:55 AM
clang/include/clang/Sema/ParsedAttr.h
52–57 ↗(On Diff #406373)

Er, this means the bit packed part of the structure can no longer fit in a 32-bit allocation unit (we went from needing 30 bits to needing 35 bits). That's unfortunate. I don't see any other bits to reasonably steal from.

I took a look to see whether I thought this would have a significant performance impact. We definitely keep arrays of these things around and walk over the array, so I expect some performance loss from this change. But it seems like the sort of thing likely to fall out in the noise rather than be on the hot path, so I think this is fine. However, if we see a noticeable performance regression, we should keep this change in mind to see if there's something more we can do to keep us within a 32-bit allocation unit.

113 ↗(On Diff #406373)

I don't think this API is needed. isArgExpr() already exists on the interface and I'm not certain how this differs. (If this API is needed, we need a far better name for it because I'd have no idea how to distinguish isExprArg() and isArgExpr().)

clang/include/clang/Sema/Sema.h
4383–4385 ↗(On Diff #406373)

This name makes it sound like far more general of an API than it is. This function is specific to checking string literal arguments for an attribute. But then why does checkStringLiteralArgumentAttr() not suffice? It does the same thing, but with more bells and whistles.

clang/lib/Sema/SemaDeclAttr.cpp
4179–4182 ↗(On Diff #404889)

Ah, I think what caught me off-guard is that I was expecting checkCommonAttributeFeatures() to be called again from template instantiation time. Then there are less (possibly even no more) dependent arguments, so it can do more automated checking. However, I see that Sema::InstantiateAttrs() is crying out for refactoring to make that more useful, so it's outside the scope of this patch (what you're doing here is fine). Thanks!

4203 ↗(On Diff #404440)

This is more along the lines of what I had in mind, yes.

4185 ↗(On Diff #406373)

This is a slight loss of functionality. We used to recommend identifiers be converted to string literals under the old interface. I don't think this refactoring is necessary any longer, right?

clang/utils/TableGen/ClangAttrEmitter.cpp
2293

The comment doesn't seem grammatical and may need some expounding.

Recent changes:

  • Replaces isExprArg with isArgMemberExprHolder which returns true if the "argument member" can hold one or more expressions.
  • Renames checkStringLiteralExpr to checkASCIIStringLiteralExpr to better convey the intended semantics.
  • Reverts changes to HandleAnnotateAttr and removes HandleAnnotationAttr, moving final-args checks of expressions into instantiateDependentAnnotationAttr.
  • Removes nonsensical comment.
steffenlarsen added inline comments.Feb 7 2022, 8:11 AM
clang/include/clang/Sema/ParsedAttr.h
113 ↗(On Diff #406373)

There is an important distinction between the two, specifically that isArgExpr checks if the parsed argument at the index is an expression, while isExprArg checks if the specified argument in the attribute is an "ExprArgument", i.e. one checks post-parsing the other reflects on the attribute's definition.

That said, I agree that with that naming there's little distinction between the two. I have therefore removed isExprArg and replaced it with isArgMemberExprHolder which is true if the "argument member" (referring to the argument defined in Attr.td) is able to hold an expression (i.e. it is either a ExprArgument or a VaridicExprArgument.) Functionally there is little change to it, as the only place we use the check is after checking if the expression is variadic, but it fit better with the actual intent of the check.

clang/include/clang/Sema/Sema.h
4383–4385 ↗(On Diff #406373)

The problem is that checkStringLiteralArgumentAttr() requires the arguments to be extracted from the parsed attribute, which we do not have after template instantiation, so the split was for generality. However, as you mentioned in another comment, the generality is not that needed anymore so handleAnnotateAttr has been reverted to the old version.

That said, this is still needed for doing post-initalization check if arguments have been delayed, so I have renamed it to checkASCIIStringLiteralExpr which hopefully conveys the intent a little better?

clang/lib/Sema/SemaDeclAttr.cpp
4185 ↗(On Diff #406373)

You're absolutely right. There is actually no reason for this refactoring anymore, except for maybe some minor common code between this and delayed argument instantiation, but it is so little now that if we want to keep the hint (which I agree we should) then there's little reason not to revert the changes to handleAnnotateAttr. So that's what I've done! Similarly, handleAnnotateAttr is no longer needed so it has been removed and the excess logic has been moved to instantiateDependentAnnotationAttr. Thanks for pointing this out!

clang/utils/TableGen/ClangAttrEmitter.cpp
2293

I don't think a comment makes much sense here anyway, so I've removed it. This seems like fairly common practice for these kind of functions. If you disagree I can reintroduce it and expand on it.

I still have some naming issues that need to be corrected, but functionally I think this is good to go. I presume you don't have commit privileges @steffenlarsen? If that's accurate, what name and email address would you like me to use for patch attribution after you've fixed the last few bits?

clang/include/clang/Sema/ParsedAttr.h
113 ↗(On Diff #406373)

Thanks for the clarification!

I think a better way to express this is by naming it isParamExpr() and adding comments along the lines of:

/// Returns true if the specified parameter index for this attribute in Attr.td is an ExprArgument
/// or VariadicExprArgument, or a subclass thereof; returns false otherwise.

(Note, the local style here is to use /// instead of //, so I'm matching that as well. You probably need to flow the comments to 80 cols as well.)

clang/include/clang/Sema/Sema.h
4383–4385 ↗(On Diff #406373)

The new name is still a problem -- it sounds like a generic interface, but the diagnostic it emits is specific to attributes. That's going to be a maintenance issue.

I would make this an overload of checkStringLiteralArgumentAttr() instead of giving it a new name.

Recent changes:

  • Renamed isArgMemberExprHolder to isParamExpr and added the comment suggested by @aaron.ballman .
  • Made checkASCIIStringLiteralExpr an overload of checkStringLiteralArgumentAttr.

Thanks, @aaron.ballman ! I have applied the new suggestions.

I presume you don't have commit privileges @steffenlarsen? If that's accurate, what name and email address would you like me to use for patch attribution after you've fixed the last few bits?

That is correct. Name is "Steffen Larsen" and email is "steffen.larsen@intel.com". Thanks a ton. 😄

clang/include/clang/Sema/ParsedAttr.h
113 ↗(On Diff #406373)

isParamExpr is fine by me! I have also added the suggested comment. Thanks! (Also, good eye on the // vs ///)

clang/include/clang/Sema/Sema.h
4383–4385 ↗(On Diff #406373)

Overloading checkStringLiteralArgumentAttr is a good shout. I have done that.

aaron.ballman accepted this revision.Feb 8 2022, 9:52 AM

LGTM! Thank you for all the good discussion and work on this! I'll commit on your behalf once the precommit CI pipeline comes back green.

This revision is now accepted and ready to land.Feb 8 2022, 9:52 AM
aaron.ballman closed this revision.Feb 8 2022, 10:39 AM

I've commit on your behalf in ead1690d31f815c00fdd2bc23db4766191bbeabc, thank you!