This is an archive of the discontinued LLVM Phabricator instance.

[Attr] Support _attribute__ ((fallthrough))
ClosedPublic

Authored by Nathan-Huckleberry on Jul 16 2019, 4:05 PM.

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 16 2019, 4:05 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
  • Fixed formatting
xbolva00 added inline comments.Jul 16 2019, 4:32 PM
clang/lib/Parse/ParseTentative.cpp
2131 ↗(On Diff #210215)

Negative size() ?

Did you mean “== 0”? Not sure if “empty()” exists there..

clang/lib/Sema/AnalysisBasedWarnings.cpp
1278–1279

Make it unconditional (see my patch)

Thanks, I think this is fine solution for now.

Probably not ideal (@aaron.ballman mentioned the ideal solution - rewrite the parser), but “suboptimal” parser should not stop any progress in this area.

xbolva00 added inline comments.Jul 16 2019, 4:51 PM
clang/lib/Parse/ParseTentative.cpp
2146 ↗(On Diff #210215)

Is this “cheap” in terms of compile time?

efriedma added inline comments.Jul 16 2019, 5:09 PM
clang/lib/Parse/ParseStmt.cpp
104

If we're going to generally support statement attributes, it should be possible to apply them to non-null statements. (Even if there aren't any valid attributes right now, that's likely to change in the future.) Or is that not possible to implement for some reason?

rsmith added inline comments.Jul 16 2019, 11:33 PM
clang/lib/Parse/ParseTentative.cpp
2130 ↗(On Diff #210215)

It's not correct in general to call arbitrary parsing actions in a tentatively-parsed region; they might annotate or otherwise mess with the token stream in undesirable ways, or produce diagnostics, etc. If we keep this tentative parsing formulation, you'll need to instead recognize the __attribute__ token then manually skip its attribute list.

2146 ↗(On Diff #210215)

No; this is not a reasonable thing to do for every block-scope statement or declaration that we parse.

We should do the same thing that we do for all other kinds of attribute: parse them unconditionally then reject them in the contexts where they should not be permitted.

Nathan-Huckleberry added a comment.EditedJul 17 2019, 10:14 AM

The main problem that we have is that the __attribute__ token always causes the parser to read the line as a declaration. Then the declaration parser handles reading the attributes list.

This case demonstrates the problem:

void foo() {
  __attribute__((address_space(0))) *x;
}

If the attribute token is read beforehand this code just becomes *x and is parsed as a dereference of an undefined variable when it should actually be a declaration.

Maybe the best solution is to pull the attributes parsing out to ParseStatementOrDeclaration then pass those attributes through to following functions.
When the determination is made whether a line is a declaration or a statement the attributes list can be looked at to determine if the attributes are statement or declaration attributes and continue accordingly. Maybe throwing a warning if mixing of declaration and statement attributes occur.

Does that sound like a good solution?

Yes, that seems reasonable.

The main problem that we have is that the __attribute__ token always causes the parser to read the line as a declaration. Then the declaration parser handles reading the attributes list.

This case demonstrates the problem:

void foo() {
  __attribute__((address_space(0))) *x;
}

If the attribute token is read beforehand this code just becomes *x and is parsed as a dereference of an undefined variable when it should actually be a declaration.

Maybe the best solution is to pull the attributes parsing out to ParseStatementOrDeclaration then pass those attributes through to following functions.
When the determination is made whether a line is a declaration or a statement the attributes list can be looked at to determine if the attributes are statement or declaration attributes and continue accordingly. Maybe throwing a warning if mixing of declaration and statement attributes occur.

Does that sound like a good solution?

Please see the discussion in https://reviews.llvm.org/D63299#inline-564887 -- if I understand your approach properly, this approach leads to spooky action at a distance, where the name of the attribute dictates whether something is a statement or a declaration. I'm not comfortable with that. There's no reason we could not have an attribute that is both a statement attribute and a declaration attribute, and how would *that* parse?

I think this requires some deeper surgery in the parsing logic so that attributes do not impact whether something is treated as a declaration or a statement. The parser should parse attributes first, keep them around for a while, and attach them to either the decl or the statement at a later point.

clang/lib/Parse/ParseStmt.cpp
104

We already generally support statement attributes. See ProcessStmtAttribute() in SemaStmtAttr.cpp. They definitely need to apply to non-null statements.

Nathan-Huckleberry added a comment.EditedJul 18 2019, 11:34 AM

The main problem that we have is that the __attribute__ token always causes the parser to read the line as a declaration. Then the declaration parser handles reading the attributes list.

This case demonstrates the problem:

void foo() {
  __attribute__((address_space(0))) *x;
}

If the attribute token is read beforehand this code just becomes *x and is parsed as a dereference of an undefined variable when it should actually be a declaration.

Maybe the best solution is to pull the attributes parsing out to ParseStatementOrDeclaration then pass those attributes through to following functions.
When the determination is made whether a line is a declaration or a statement the attributes list can be looked at to determine if the attributes are statement or declaration attributes and continue accordingly. Maybe throwing a warning if mixing of declaration and statement attributes occur.

Does that sound like a good solution?

Please see the discussion in https://reviews.llvm.org/D63299#inline-564887 -- if I understand your approach properly, this approach leads to spooky action at a distance, where the name of the attribute dictates whether something is a statement or a declaration. I'm not comfortable with that. There's no reason we could not have an attribute that is both a statement attribute and a declaration attribute, and how would *that* parse?

I think this requires some deeper surgery in the parsing logic so that attributes do not impact whether something is treated as a declaration or a statement. The parser should parse attributes first, keep them around for a while, and attach them to either the decl or the statement at a later point.

Any idea how to fix the problem in the code sample I gave? The addition of the attributes token causes code to be read as a declaration and without the token it is read as a unary operator.

The main problem that we have is that the __attribute__ token always causes the parser to read the line as a declaration. Then the declaration parser handles reading the attributes list.

This case demonstrates the problem:

void foo() {
  __attribute__((address_space(0))) *x;
}

If the attribute token is read beforehand this code just becomes *x and is parsed as a dereference of an undefined variable when it should actually be a declaration.

Maybe the best solution is to pull the attributes parsing out to ParseStatementOrDeclaration then pass those attributes through to following functions.
When the determination is made whether a line is a declaration or a statement the attributes list can be looked at to determine if the attributes are statement or declaration attributes and continue accordingly. Maybe throwing a warning if mixing of declaration and statement attributes occur.

Does that sound like a good solution?

Please see the discussion in https://reviews.llvm.org/D63299#inline-564887 -- if I understand your approach properly, this approach leads to spooky action at a distance, where the name of the attribute dictates whether something is a statement or a declaration. I'm not comfortable with that. There's no reason we could not have an attribute that is both a statement attribute and a declaration attribute, and how would *that* parse?

I think this requires some deeper surgery in the parsing logic so that attributes do not impact whether something is treated as a declaration or a statement. The parser should parse attributes first, keep them around for a while, and attach them to either the decl or the statement at a later point.

Any idea how to fix the problem in the code sample I gave? The addition of the attributes token causes code to be read as a declaration and without the token it is read as a unary operator.

I would start by trying to have Parser::ParseStatementOrDeclarationAfterAttributes() parse the GNU-style attributes if the __attribute__ token is encountered, adding to the Attrs, and then retrying the loop.

Nathan-Huckleberry added a comment.EditedJul 18 2019, 3:58 PM
void foo() {
  __attribute__((address_space(0))) *x;
  *y;
}

If the attributes are parsed then function body looks like this to the parser:

{
  *x; //this one has attributes now
  *y;
{

The first line should be a valid declaration and the second like should be a dereference of an uninitialized variable. If the attributes token is discarded before parsing the rest of the line the only way to differentiate these is by looking at the attributes added to them.

An alternative may be parse the attributes list and immediately try to parse as a declaration then if that parsing fails attempt to parse as something else. Although this approach also has the scary implication of things that are supposed to be declarations getting reparsed as something entirely different.

void foo() {
  __attribute__((address_space(0))) *x;
  *y;
}

If the attributes are parsed then function body looks like this to the parser:

{
  *x; //this one has attributes now
  *y;
{

The first line should be a valid declaration and the second like should be a dereference of an uninitialized variable. If the attributes token is discarded before parsing the rest of the line the only way to differentiate these is by looking at the attributes added to them.

An alternative may be parse the attributes list and immediately try to parse as a declaration then if that parsing fails attempt to parse as something else. Although this approach also has the scary implication of things that are supposed to be declarations getting reparsed as something entirely different.

The issue is that a leading GNU-style attribute is not sufficient information to determine whether we're parsing a declaration or a statement; it shouldn't always be treated as a decl-specifier. I spoke with a GCC dev about how they handle this, and effectively, they parse the attributes first then attempt to parse a declaration; if that fails, they fall back to parsing a statement. I think the way forward for us that should be similar is to parse the attributes first and then wait until we see a decl-specifier before determining whether we want to parse a declaration or a statement, and attach the attributes after we've figured out which production we have. @rsmith may have even better approaches in mind, but we're definitely agreed that we should not parse statement/decl based on attribute identity. I would hope we could find a way to avoid lots of re-parsing work if we can (even to the point of perhaps breaking the address_space case because implicit int is pretty horrible to rely on in the first place; it depends on whether breaking that will break a lot of code or not).

ojeda added a subscriber: ojeda.Jul 21 2019, 12:52 AM
void foo() {
  __attribute__((address_space(0))) *x;
  *y;
}

If the attributes are parsed then function body looks like this to the parser:

{
  *x; //this one has attributes now
  *y;
{

The first line should be a valid declaration and the second like should be a dereference of an uninitialized variable. If the attributes token is discarded before parsing the rest of the line the only way to differentiate these is by looking at the attributes added to them.

An alternative may be parse the attributes list and immediately try to parse as a declaration then if that parsing fails attempt to parse as something else. Although this approach also has the scary implication of things that are supposed to be declarations getting reparsed as something entirely different.

The issue is that a leading GNU-style attribute is not sufficient information to determine whether we're parsing a declaration or a statement; it shouldn't always be treated as a decl-specifier. I spoke with a GCC dev about how they handle this, and effectively, they parse the attributes first then attempt to parse a declaration; if that fails, they fall back to parsing a statement. I think the way forward for us that should be similar is to parse the attributes first and then wait until we see a decl-specifier before determining whether we want to parse a declaration or a statement, and attach the attributes after we've figured out which production we have. @rsmith may have even better approaches in mind, but we're definitely agreed that we should not parse statement/decl based on attribute identity. I would hope we could find a way to avoid lots of re-parsing work if we can (even to the point of perhaps breaking the address_space case because implicit int is pretty horrible to rely on in the first place; it depends on whether breaking that will break a lot of code or not).

@xbolva00's patch https://reviews.llvm.org/D63260?id=204583 essentially does that already. I'm not sure how to continue on this patch, several solutions have been suggested, but I'm not sure which to implement.

they parse the attributes first then attempt to parse a declaration; if that fails, they fall back to parsing a statement

Well, I don’t think this reparsing is ideal in terms of compile time either.

If we really care about attributes on implicit ints: I don’t think that parsing according to attribute name is so bad solution - if only “possible” issue is same attr. name for stmt and decl for some future attribute - let’s talk with GCC devs and make a deal about it.

they parse the attributes first then attempt to parse a declaration; if that fails, they fall back to parsing a statement

Well, I don’t think this reparsing is ideal in terms of compile time either.

If we really care about attributes on implicit ints: I don’t think that parsing according to attribute name is so bad solution - if only “possible” issue is same attr. name for stmt and decl for some future attribute - let’s talk with GCC devs and make a deal about it.

Richard and I spoke about that offline last week and both agree that changing parsing behavior according to the attribute name is not acceptable behavior for the parser.

I agree that parsing according to attribute name/type is not a good solution.

It sounds like we have narrowed it down to two choices:
Do we want to follow the gcc method of parsing once and falling back if parsing fails?
Do we want to parse attributes first and then wait until we see a decl-specifier (breaking the implicit int case)?

I agree that parsing according to attribute name/type is not a good solution.

It sounds like we have narrowed it down to two choices:
Do we want to follow the gcc method of parsing once and falling back if parsing fails?
Do we want to parse attributes first and then wait until we see a decl-specifier (breaking the implicit int case)?

I don't think so. A GCC attribute is a decl-specifier, so should trigger implicit-int in the languages that have it.

Option 1: teach the statement/declaration disambiguation code that an initial GNU attribute does not resolve the ambiguity and that it needs to disambiguate past one.

Option 2: parse the attributes and then call the disambiguation code and tell it that we've already consumed a decl-specifier.

  • Rework attribute parsing
  • Formatting fixes
Harbormaster completed remote builds in B35708: Diff 212014.
aaron.ballman added inline comments.Jul 28 2019, 10:15 AM
clang/lib/Parse/ParseStmt.cpp
103

Spurious newline?

217

Elide braces.

clang/lib/Sema/AnalysisBasedWarnings.cpp
1278–1279

What is special about C99 here?

xbolva00 added inline comments.Jul 28 2019, 10:19 AM
clang/lib/Sema/AnalysisBasedWarnings.cpp
1278–1279

I think we talked about it in previous patch and decided to drop ‘if’ and always run this code.

I had already noted it here, probably it was just missed yet.

xbolva00 added inline comments.Jul 28 2019, 11:13 AM
clang/lib/Sema/AnalysisBasedWarnings.cpp
1228

This code is not tested?

I think you can use tests from my older patch.

  • Fix test, formatting and conditional check
Nathan-Huckleberry marked 6 inline comments as done.Jul 29 2019, 10:11 AM

Thanks! Better and better:)

Change in clang/test/Index/blocks.c seems not ideal, or is it okay? @aaron.ballman

xbolva00 added inline comments.Jul 29 2019, 10:34 AM
clang/lib/Parse/ParseStmt.cpp
233

Since you know that tok is kw_attr, I think you can use 'ParseGNUAttributes'.

  • Fix test case spacing
  • Allow decl-specifier source location to propagate to decl parsing
  • Remove changes from accidentally formatted files

+1, looks good

aaron.ballman added inline comments.Aug 5 2019, 11:31 AM
clang/lib/Parse/ParseDecl.cpp
1767 ↗(On Diff #213407)

Should this also be passed DeclSpecStart?

clang/lib/Parse/ParseStmt.cpp
156

I think you can use GNUAttributeLoc.isValid() instead of using the extra local variable.

233

Agreed, you don't need to use the Maybe check here.

clang/test/SemaCXX/warn-unused-label-error.cpp
23

This change in diagnostics makes me very happy!

Nathan-Huckleberry marked 6 inline comments as done.Aug 5 2019, 12:40 PM
  • Remove 'maybe', remove boolean and fix other call to ParseSimpleDeclaration
aaron.ballman accepted this revision.Aug 6 2019, 5:01 AM

This LGTM, but you should hold off a bit to commit -- @rsmith, do you have any concerns with this approach?

This revision is now accepted and ready to land.Aug 6 2019, 5:01 AM
rsmith accepted this revision.Aug 6 2019, 9:56 PM
xbolva00 accepted this revision.Aug 15 2019, 11:39 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 20 2019, 10:16 AM

Compilation fails with this patch
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/22936/steps/bootstrap%20clang/logs/stdio

[  8%] Built target LLVMMC
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:541:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
        default:
        ^
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:541:2: note: insert '__attribute__((fallthrough));' to silence this warning
        default:
        ^
        __attribute__((fallthrough)); 
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:541:2: note: insert 'break;' to avoid fall-through
        default:
        ^
        break; 
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:737:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
        default:
        ^
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:737:2: note: insert '__attribute__((fallthrough));' to silence this warning
        default:
        ^
        __attribute__((fallthrough)); 
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:737:2: note: insert 'break;' to avoid fall-through
        default:
        ^
        break; 
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:1639:3: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
                default:                /* things that break a sequence */
                ^
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:1639:3: note: insert '__attribute__((fallthrough));' to silence this warning
                default:                /* things that break a sequence */
                ^
                __attribute__((fallthrough)); 
/b/sanitizer-x86_64-linux/build/llvm/lib/Support/regcomp.c:1639:3: note: insert 'break;' to avoid fall-through
                default:                /* things that break a sequence */
                ^
                break; 
3 errors generated.
lib/Support/CMakeFiles/LLVMSupport.dir/build.make:2558: recipe for target 'lib/Support/CMakeFiles/LLVMSupport.dir/regcomp.c.o' failed
make[3]: *** [lib/Support/CMakeFiles/LLVMSupport.dir/regcomp.c.o] Error 1
CMakeFiles/Makefile2:1206: recipe for target 'lib/Support/CMakeFiles/LLVMSupport.dir/all' failed
make[2]: *** [lib/Support/CMakeFiles/LLVMSupport.dir/all] Error 2
CMakeFiles/Makefile2:22547: recipe for target 'tools/gold/CMakeFiles/LLVMgold.dir/rule' failed
make[1]: *** [tools/gold/CMakeFiles/LLVMgold.dir/rule] Error 2
Makefile:6905: recipe for target 'LLVMgold' failed
make: *** [LLVMgold] Error 2
+ echo @@@STEP_FAILURE@@@
+ check_64bit 1 sanitizer
+ CONDITION=1
+ SANITIZER=sanitizer
@@@STEP_FAILURE@@@
+ '[' 1 == 1 ']'
+ echo @@@BUILD_STEP 64-bit check-sanitizer@@@


--------------------------------------------------------------------------------
started: Wed Aug 21 01:01:43 2019
ended: Wed Aug 21 01:17:17 2019
duration: 15 mins, 34 secs