This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Remove unnecessary Attr.isArgIdent checks.
ClosedPublic

Authored by fhahn on Jan 5 2021, 8:35 AM.

Details

Summary

The MatrixType, ExtVectorType, VectorSize and AddressSpace attributes
have arguments defined as ExprArguments in Attr.td. So their arguments
should never be ArgIdents and the logic to handle this case can be
removed.

The logic has been replaced by an assertion to ensure the arguments
are always ArgExpressions

Diff Detail

Event Timeline

fhahn requested review of this revision.Jan 5 2021, 8:35 AM
fhahn created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2021, 8:35 AM
fhahn updated this revision to Diff 314632.Jan 5 2021, 8:41 AM

Also remove Attr.isArgIdent check from AddressSpaceTypeAttribute

fhahn edited the summary of this revision. (Show Details)Jan 5 2021, 8:43 AM

I'm not sure how well Attr.td's constraints are enforced on type attributes, as these often happen before parsing is completely done. I'd imagine this code was put into place at least the 1st time for good reason, but I'm curious as to why we wouldn't have tests that cover that (or, as you assert, it could simply be that this is simply dead code).

I'm generally OK with this (the asserts are unnecessary), but would like @aaron.ballman to double check my expectations here.

clang/lib/Sema/SemaType.cpp
7642–7643

I think the asserts aren't necessary, isArgExpr does a ArgsUnion::get<Expr*>, which already asserts.

fhahn updated this revision to Diff 314635.Jan 5 2021, 8:50 AM

I'm not sure how well Attr.td's constraints are enforced on type attributes, as these often happen before parsing is completely done. I'd imagine this code was put into place at least the 1st time for good reason, but I'm curious as to why we wouldn't have tests that cover that (or, as you assert, it could simply be that this is simply dead code).

I'm generally OK with this (the asserts are unnecessary), but would like @aaron.ballman to double check my expectations here.

It would be great if we could get confirmation!

I tried a few different things to construct matrix_type attributes with ArgIdents, but failed. The patch also adjusts the code for a bunch of attributes. So if there are indeed cases where ArgIdents can show up, we will get some examples for unit tests.

fhahn added a comment.Jan 5 2021, 8:51 AM
clang/lib/Sema/SemaType.cpp
7642–7643

Good point, I removed the assertions. It's even more compact now :)

RKSimon added a subscriber: RKSimon.Jan 5 2021, 8:55 AM
RKSimon added inline comments.
clang/lib/Sema/SemaType.cpp
7661–7662

(style) SizeExpr

7934–7935

these comments seem unnecessary now?

I'm not sure how well Attr.td's constraints are enforced on type attributes, as these often happen before parsing is completely done. I'd imagine this code was put into place at least the 1st time for good reason, but I'm curious as to why we wouldn't have tests that cover that (or, as you assert, it could simply be that this is simply dead code).

I'm generally OK with this (the asserts are unnecessary), but would like @aaron.ballman to double check my expectations here.

It would be great if we could get confirmation!

I tried a few different things to construct matrix_type attributes with ArgIdents, but failed. The patch also adjusts the code for a bunch of attributes. So if there are indeed cases where ArgIdents can show up, we will get some examples for unit tests.

The comment claims: "// Special case where the argument is a template id.". I would expect one of the following to hit that:
https://godbolt.org/z/znYW1s

fhahn added a comment.Jan 5 2021, 9:01 AM

I tried a few different things to construct matrix_type attributes with ArgIdents, but failed. The patch also adjusts the code for a bunch of attributes. So if there are indeed cases where ArgIdents can show up, we will get some examples for unit tests.

The comment claims: "// Special case where the argument is a template id.". I would expect one of the following to hit that:
https://godbolt.org/z/znYW1s

I tried this snippet with the patch (and also an added assert(!Attr.isArgIdent(0)); to HandleVectorSizeAttr just to be sure). Compiled as expected, no crash.

erichkeane accepted this revision.Jan 5 2021, 9:04 AM

I tried a few different things to construct matrix_type attributes with ArgIdents, but failed. The patch also adjusts the code for a bunch of attributes. So if there are indeed cases where ArgIdents can show up, we will get some examples for unit tests.

The comment claims: "// Special case where the argument is a template id.". I would expect one of the following to hit that:
https://godbolt.org/z/znYW1s

I tried this snippet with the patch (and also an added assert(!Attr.isArgIdent(0)); to HandleVectorSizeAttr just to be sure). Compiled as expected, no crash.

Hmm... then I believe that you are correct that this is dead code. I can't come up with any other template-ids that would cause this to work differently.

I'll accept, but please give @aaron.ballman a day to take a look before submitting.

This revision is now accepted and ready to land.Jan 5 2021, 9:04 AM
fhahn updated this revision to Diff 314640.Jan 5 2021, 9:08 AM

Address Simon's comments, thanks!

I'm not sure how well Attr.td's constraints are enforced on type attributes, as these often happen before parsing is completely done.

Type (and statement) attributes do not have the same automatic error checking that declaration attributes enjoy (yet), so perhaps this was misguided error handling code.

I'd imagine this code was put into place at least the 1st time for good reason, but I'm curious as to why we wouldn't have tests that cover that (or, as you assert, it could simply be that this is simply dead code).

I think it's dead code. In ParseAttributeArgsCommon() in ParseDecl.cpp, we check to see whether the attribute definition from Attr.td specifies an identifier argument or not. If it doesn't (which these don't), then it parses the identifier as an expression instead. So I don't think there's a way to hit these code paths, even in template cases.

I'm generally OK with this (the asserts are unnecessary), but would like @aaron.ballman to double check my expectations here.

My expectation is that it's dead code, but I've been surprised before. :-D

Without bothering to look it up, I would guess that the attribute-parsing code used to generically handle the ambiguity between identifier expressions and identifier attribute arguments by just always parsing simple identifiers as identifier arguments, making it Sema's responsibility to turn that back into an expression. At some point, the parser was made sensitive to the actual attribute being parsed, but we never bothered to simplify Sema. At any rate, the parser does now know exactly which argument of which attribute it's parsing, so there's zero reason for it to force this complexity on Sema anymore; if we find a case that parses identifier arguments, we should fix it in the parser to parse an expression instead.

Without bothering to look it up, I would guess that the attribute-parsing code used to generically handle the ambiguity between identifier expressions and identifier attribute arguments by just always parsing simple identifiers as identifier arguments, making it Sema's responsibility to turn that back into an expression. At some point, the parser was made sensitive to the actual attribute being parsed, but we never bothered to simplify Sema. At any rate, the parser does now know exactly which argument of which attribute it's parsing, so there's zero reason for it to force this complexity on Sema anymore; if we find a case that parses identifier arguments, we should fix it in the parser to parse an expression instead.

I don't think it will be quite that trivial (switching all identifiers to be parsed as expressions instead). For instance, attributes that take enumeration arguments can parse those arguments as identifiers, but those identifiers will never be found by usual expression lookup because they don't really exist. That said, any attribute that currently accepts an identifier because it really wants to use that identifier in an expression in Sema should update the attribute argument clauses in Attr.td and make the corresponding changes in SemaDeclAttr.cpp/SemaStmt.cpp/SemaType.cpp as appropriate.

erichkeane added a comment.EditedJan 6 2021, 6:15 AM

Without bothering to look it up, I would guess that the attribute-parsing code used to generically handle the ambiguity between identifier expressions and identifier attribute arguments by just always parsing simple identifiers as identifier arguments, making it Sema's responsibility to turn that back into an expression. At some point, the parser was made sensitive to the actual attribute being parsed, but we never bothered to simplify Sema. At any rate, the parser does now know exactly which argument of which attribute it's parsing, so there's zero reason for it to force this complexity on Sema anymore; if we find a case that parses identifier arguments, we should fix it in the parser to parse an expression instead.

I don't think it will be quite that trivial (switching all identifiers to be parsed as expressions instead). For instance, attributes that take enumeration arguments can parse those arguments as identifiers, but those identifiers will never be found by usual expression lookup because they don't really exist. That said, any attribute that currently accepts an identifier because it really wants to use that identifier in an expression in Sema should update the attribute argument clauses in Attr.td and make the corresponding changes in SemaDeclAttr.cpp/SemaStmt.cpp/SemaType.cpp as appropriate.

Right, there is an Identifier type (and collection type) in Attr.td that at least a handful of attributes are using intentionally. It is typically a case where GCC accepted an identifier, and it isn't always something that is look-up-able via normal C++ mechanisms (such as in the case of cpu_dispatch, where the identifiers are CPU names).

EDIT:
That said, I believe @rjmccall is correct, I think based on my run through history (and knowing a bit about it from past digging) that the vector_size attribute was added before attribute parsing did nearly as much as it did (I think it even predates Attr.td).

I suspect the rest of the attributes (Particularly Matrix and Address Space!) simply copy/pasted that section and edited it to work, despite not having a reproducer that hit it.

fhahn added a comment.Jan 6 2021, 8:10 AM

Without bothering to look it up, I would guess that the attribute-parsing code used to generically handle the ambiguity between identifier expressions and identifier attribute arguments by just always parsing simple identifiers as identifier arguments, making it Sema's responsibility to turn that back into an expression. At some point, the parser was made sensitive to the actual attribute being parsed, but we never bothered to simplify Sema. At any rate, the parser does now know exactly which argument of which attribute it's parsing, so there's zero reason for it to force this complexity on Sema anymore; if we find a case that parses identifier arguments, we should fix it in the parser to parse an expression instead.

I don't think it will be quite that trivial (switching all identifiers to be parsed as expressions instead). For instance, attributes that take enumeration arguments can parse those arguments as identifiers, but those identifiers will never be found by usual expression lookup because they don't really exist. That said, any attribute that currently accepts an identifier because it really wants to use that identifier in an expression in Sema should update the attribute argument clauses in Attr.td and make the corresponding changes in SemaDeclAttr.cpp/SemaStmt.cpp/SemaType.cpp as appropriate.

Right, there is an Identifier type (and collection type) in Attr.td that at least a handful of attributes are using intentionally. It is typically a case where GCC accepted an identifier, and it isn't always something that is look-up-able via normal C++ mechanisms (such as in the case of cpu_dispatch, where the identifiers are CPU names).

EDIT:
That said, I believe @rjmccall is correct, I think based on my run through history (and knowing a bit about it from past digging) that the vector_size attribute was added before attribute parsing did nearly as much as it did (I think it even predates Attr.td).

I suspect the rest of the attributes (Particularly Matrix and Address Space!) simply copy/pasted that section and edited it to work, despite not having a reproducer that hit it.

For the matrix_type attribute, I originally just copied & adjusted the ext_vector_type version.

Thanks for all the comments, I'll go ahead and land this now. We should know soon enough if there's anything out in the wild that can pass an ArgIdent.

This revision was landed with ongoing or failed builds.Jan 6 2021, 10:02 AM
This revision was automatically updated to reflect the committed changes.

Without bothering to look it up, I would guess that the attribute-parsing code used to generically handle the ambiguity between identifier expressions and identifier attribute arguments by just always parsing simple identifiers as identifier arguments, making it Sema's responsibility to turn that back into an expression. At some point, the parser was made sensitive to the actual attribute being parsed, but we never bothered to simplify Sema. At any rate, the parser does now know exactly which argument of which attribute it's parsing, so there's zero reason for it to force this complexity on Sema anymore; if we find a case that parses identifier arguments, we should fix it in the parser to parse an expression instead.

I don't think it will be quite that trivial (switching all identifiers to be parsed as expressions instead). For instance, attributes that take enumeration arguments can parse those arguments as identifiers, but those identifiers will never be found by usual expression lookup because they don't really exist. That said, any attribute that currently accepts an identifier because it really wants to use that identifier in an expression in Sema should update the attribute argument clauses in Attr.td and make the corresponding changes in SemaDeclAttr.cpp/SemaStmt.cpp/SemaType.cpp as appropriate.

You're exactly right about how it's necessary to parse certain attributes. Just to be clear, though, I'm not proposing any change here; I'm describing the change that already happened, years ago, by way of trying to identify why this particular bit of code was once necessary and thus why it's likely unnecessary now. So I think you will find that those changes to Attr.td are in fact already in place.