This is an archive of the discontinued LLVM Phabricator instance.

Added a better diagnostic when using the delete operator with lambdas
ClosedPublic

Authored by Rakete1111 on Aug 5 2017, 8:10 AM.

Details

Summary

This adds a new error for missing parentheses around lambdas in delete operators.

int main() {
  delete []() { return new int(); }();
}

This will result in:

test.cpp:2:3: error: '[]' after delete interpreted as 'delete[]'
  delete []() { return new int(); }();
  ^~~~~~~~~
test.cpp:2:9: note: add parentheses around the lambda
  delete []() { return new int(); }();
        ^
        (                          )

Diff Detail

Repository
rL LLVM

Event Timeline

Rakete1111 created this revision.Aug 5 2017, 8:10 AM
Rakete1111 edited the summary of this revision. (Show Details)Aug 5 2017, 8:10 AM

Rebased and friendly ping.

Rakete1111 added a project: Restricted Project.

Used better diagnostic id.

Updated error message, added a FixItHint + a rebase and friendly ping :)

Rakete1111 edited the summary of this revision. (Show Details)Dec 1 2017, 11:59 AM
Rakete1111 updated this revision to Diff 128668.Jan 4 2018, 4:05 PM

Rebased + friendly 2018 ping :)

rsmith added inline comments.Jan 5 2018, 11:20 AM
lib/Parse/ParseExprCXX.cpp
2906–2912 ↗(On Diff #128668)

This seems error-prone. Given:

for (item *p = first, *oldp = p; p; p = p->next, delete [] oldp, oldp = p) {
  /*...*/;
}

... we'll decide the delete-expression is followed by a lambda. Likewise for cases like:

delete [] f->getPtr([] {
  return blah;
});

Our goal here should be a fast heuristic (don't use lookahead except when you're already confident you have a lambda) and zero false positives. How about these heuristics instead:

Assume that the delete [] is actually delete followed by a lambda if either:

  • The next token is { or <, or
  • The next token is ( and either
    • the following token is ) or
    • the following tokens are a type specifier followed by an identifier

This should have no false positives, and only has false negatives if a lambda has an unnamed parameter or a parameter with a non-trivial parameter type. (For the last condition, we could try to tentatively parse an entire parameter and see if it has a name, which would handle all cases except an expression/declaration ambiguity in the parameter declaration, but that seems like overkill to me. This is already performing more lookahead than I'd like, but delete [] expressions are rare enough that using two lookahead tokens for an improved error message seems OK.)

Rakete1111 added inline comments.Jan 5 2018, 11:56 AM
lib/Parse/ParseExprCXX.cpp
2906–2912 ↗(On Diff #128668)

Ah, I see. I had no idea that lookaheads were that expensive. Thanks for finding a better algorithm than me :)

Addressed review comments :)

Did you forget to upload the updated patch? This looks unchanged compared to the prior version.

Did you forget to upload the updated patch? This looks unchanged compared to the prior version.

Whoops, that's true. Sorry...

@rsmith Which type specifiers should I test for? just T? Or also T*, T&, ...? Now I'm just checking for an identifier, but anything else would make the lookahead more complicated I think. Any ideas?

@rsmith Should I only check for an identifier, like only "int" and nothing else? Because parsing a full type specifier might be a bit expensive, no?

rsmith added inline comments.Jun 21 2018, 2:42 PM
lib/Parse/ParseExprCXX.cpp
2907–2909 ↗(On Diff #138335)

Please don't request the additional lookahead tokens until you've checked Next. (Maybe factor out some parts of the check into a lambda so you can return early?)

2919 ↗(On Diff #138335)

TryParseLambdaExpression will always commit to parsing as a lambda-expression, because the first two tokens are []. Since you've already determined that we definitely have a *lambda-expression* here, you may as well just call ParseLambdaExpression directly.

2925 ↗(On Diff #138335)

This offset of -1 isn't right: when we have a fix-it pointing at a character, insertions are inserted before that character. Given delete[]{}, this would insert the ( between delet and e[], rather than between delete and [].

2929 ↗(On Diff #138335)

Instead of getLocWithOffset here, use Lexer::getLocForEndOfToken; getLocWIthOffset(1) will return the wrong location if the closing brace token's length is greater than 1 (which can happen in the presence of escaped newlines).

2935–2936 ↗(On Diff #138335)

Our coding style puts the else on the same line as the }.

However, we also generally don't like using else when the if block returns. And in this case I'd reverse the sense of the earlier if:

if (Lambda.isInvalid())
  return ExprError();

Addressed review comments :)

Rakete1111 marked 7 inline comments as done.Jun 28 2018, 1:49 AM

Rebased + friendly ping :)

Thanks!

lib/Parse/ParseExprCXX.cpp
2956 ↗(On Diff #154882)

I don't think this lambda is useful: since you're not caching the result of GetLookAheadToken(3) between calls, you may as well call it directly below.

2961–2962 ↗(On Diff #154882)

Use GetLookAheadToken(3).isOneOf(tok::r_paren, tok::identifier) here.

2971–2980 ↗(On Diff #154882)

Because we can be very confident the fix is correct, and we're recovering as if the fix were applied, you can combine the err_ diagnostic and the note_ diagnostic into one. That way, tools like clang -fixit will apply your fixit.

2972 ↗(On Diff #154882)

If you want the location of the ] token for the end of the range here, it'd be best to save NextToken().getLocation() prior to calling ParseLambdaExpression(). Adding 1 to the location of the [ token isn't necessarily the right location (for instance, if there's whitespace between the [ and ], or if the left square bracket is spelled as <:).

Rakete1111 edited the summary of this revision. (Show Details)

Addressed review comments.

Rakete1111 marked 4 inline comments as done.Jul 13 2018, 2:55 AM

Rebased + friendly ping

This looks good, with some minor changes. Please add more test coverage, though, specifically:

  • test all four forms of lambda that we recognize after delete
  • add a test that the FixItHint is correct (maybe in test/FixIt/fixit-cxx0x.cpp, which checks that applying the fixes results in working code)
lib/Parse/ParseExprCXX.cpp
2956 ↗(On Diff #157611)

This doesn't seem quite right now. []() is not recognized as a lambda unless it's followed by an identifier. I think we want (and sorry if this is reverting to what you had before switching to isOneOf):

if (Next.isOneOf(tok::l_brace, tok::less) ||
    (Next.is(tok::l_paren) &&
     (GetLookAheadToken(3).is(tok::r_paren) ||
      (GetLookAheadToken(3).is(tok::identifier) && GetLookAheadToken(4).is(tok::r_paren))))) {

That is, we're matching:

  • []{
  • []<
  • []()
  • [](identifier identifier
2969 ↗(On Diff #157611)

The 1 here should be a 0. Given

delete []{return new int;}();

this will insert the ) after the final ; instead of before it.

2978 ↗(On Diff #157611)

No need for an else here, since the if block terminates by return.

Addressed review comments.

Note that clang doesn't support the fourth kind of lambda yet ([]<>), because D36527 hasn't been merged yet, so I didn't add a test case for that one.

Rakete1111 updated this revision to Diff 159832.Aug 8 2018, 5:24 PM

Rebase + friendly ping :)

Rebase and friendly ping :)

Rebase + friendly ping :).

Rakete1111 updated this revision to Diff 198142.May 4 2019, 8:21 AM

friendly ping :)

rsmith accepted this revision.May 4 2019, 6:35 PM

Thanks! Some minor nits, please feel free to commit once they're addressed.

Note that clang doesn't support the fourth kind of lambda yet ([]<>), because D36527 hasn't been merged yet, so I didn't add a test case for that one.

We support that now, so please add a test for that :)

clang/include/clang/Basic/DiagnosticParseKinds.td
109 ↗(On Diff #198142)

Please add to this: "; add parentheses to treat this as a lambda-expression" or similar.

clang/lib/Parse/ParseExprCXX.cpp
2996 ↗(On Diff #198142)

RightBracketLock -> RSquareLoc

(Our convention is to use Loc for "location" and to avoid the word "bracket" because it means different things in different English dialects -- usually [] in US English and usually () in UK English.)

2997 ↗(On Diff #198142)

if -> that
parenthesis -> parentheses

3014 ↗(On Diff #198142)

Maybe we should do this before producing the diagnostic so that we can suggest inserting the ) after the complete expression? (But I don't have a strong preference either way.)

This revision is now accepted and ready to land.May 4 2019, 6:35 PM

@rsmith One last question: The fixit diagnostic seems to be inconsistent with the rest?

main.cpp:2:3: error: '[]' after delete interpreted as 'delete[]'
  delete[] { return new int; }
  ^~~~~~~~
        (                     )

Shouldn't the ^~~~~~~ be starting at the []?

Is this also okay?

main.cpp:2:9: warning: lambda expressions are incompatible with C++98 [-Wc++98-compat]
  delete[] { return new int; }
        ^
main.cpp:2:3: error: '[]' after delete interpreted as 'delete[]'; add parentheses to treat this as a lambda-expression
  delete[] { return new int; }
  ^~~~~~~~
        (                     )

@rsmith One last question: The fixit diagnostic seems to be inconsistent with the rest?

main.cpp:2:3: error: '[]' after delete interpreted as 'delete[]'
  delete[] { return new int; }
  ^~~~~~~~
        (                     )

Shouldn't the ^~~~~~~ be starting at the []?

Pointing the caret at the [ might be clearer, but I think that's fine too.

Is this also okay?

main.cpp:2:9: warning: lambda expressions are incompatible with C++98 [-Wc++98-compat]
  delete[] { return new int; }
        ^
main.cpp:2:3: error: '[]' after delete interpreted as 'delete[]'; add parentheses to treat this as a lambda-expression
  delete[] { return new int; }
  ^~~~~~~~
        (                     )

Hmm. It would be better to produce the error first, but that would make it more difficult to get the fixit hint right. I suppose in the case where we've already decided we're going to diagnose, we could perform a tentative parse and skip to the } of the lambda.

Rakete1111 marked 3 inline comments as done.May 14 2019, 2:19 AM

we could perform a tentative parse and skip to the } of the lambda.

How should I do this? Do I just skip to the next }, or also take into account any additional scopes? Also does this mean that I skip and then revert, because that seems pretty expensive?

  • Addressed review comments

How should I do this? Do I just skip to the next }, or also take into account any additional scopes? Also does this mean that I skip and then revert, because that seems pretty expensive?

It would be a little expensive, yes, but we'd only be doing it on codepaths where we're producing an error -- for an ill-formed program, it's OK to take more time in order to produce a better diagnostic. Skipping to the next } won't work, because SkipUntil will skip over pairs of brace-balanced tokens (so you'll skip past the } you're looking for), but skipping until the next { and then skipping to the } after it should work.

clang/lib/Parse/ParseExprCXX.cpp
2996 ↗(On Diff #198142)

Lock -> Loc. There's no k in "location" =)

Rakete1111 marked 2 inline comments as done.EditedMay 14 2019, 1:41 PM

How should I do this? Do I just skip to the next }, or also take into account any additional scopes? Also does this mean that I skip and then revert, because that seems pretty expensive?

It would be a little expensive, yes, but we'd only be doing it on codepaths where we're producing an error -- for an ill-formed program, it's OK to take more time in order to produce a better diagnostic. Skipping to the next } won't work, because SkipUntil will skip over pairs of brace-balanced tokens (so you'll skip past the } you're looking for), but skipping until the next { and then skipping to the } after it should work.

Hmm wouldn't this interact badly with {} in initializers?

[]<int = {0}> {};
[](int = {0}) {};
clang/lib/Parse/ParseExprCXX.cpp
2996 ↗(On Diff #198142)

No idea how that k managed to sneak in :)

Rakete1111 marked an inline comment as done.

Nevermind, seems to be working fine even with.

How should I do this? Do I just skip to the next }, or also take into account any additional scopes? Also does this mean that I skip and then revert, because that seems pretty expensive?

It would be a little expensive, yes, but we'd only be doing it on codepaths where we're producing an error -- for an ill-formed program, it's OK to take more time in order to produce a better diagnostic. Skipping to the next } won't work, because SkipUntil will skip over pairs of brace-balanced tokens (so you'll skip past the } you're looking for), but skipping until the next { and then skipping to the } after it should work.

Hmm wouldn't this interact badly with {} in initializers?

[]<int = {0}> {};
[](int = {0}) {};

Ouch. The first of these two seems like it won't work, since SkipUntil has no idea the angles are brackets. =( Getting this right is ... really hairy. Eg:

[]<bool = a<b>{} // end of lambda here?
                >(){} // or here?

(The answer depends on whether a is a template.) And the same problem shows up in the trailing-return-type and the requires-clause.

How about this: SkipUntil an { or <. If you find a < first, don't attach a FixItHint to the diagnostic since we can't easily tell where to suggest putting the ).

  • Fix crash with invalid postfix expr
  • Don't emit FixIt when using <
  • Use TryConsumeToken
rsmith accepted this revision.May 16 2019, 11:37 AM

LGTM, thank you! :)

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2019, 8:05 AM