Page MenuHomePhabricator

Added a better diagnostic when using the delete operator with lambdas
Needs ReviewPublic

Authored by Rakete1111 on Aug 5 2017, 8:10 AM.

Details

Reviewers
rsmith
Summary

This adds a new error for missing parentheses around lambdas in delete operators.

int main() {
  delete []() { return new int(); }();
}

This will result in:

test.cpp:2:3: error: '[]' after delete interpreted as 'delete[]'
  delete []() { return new int(); }();
  ^~~~~~~~~
test.cpp:2:9: note: add parentheses around the lambda
  delete []() { return new int(); }();
        ^
        (                          )

Diff Detail

Event Timeline

Rakete1111 created this revision.Aug 5 2017, 8:10 AM
Rakete1111 edited the summary of this revision. (Show Details)

Rebased and friendly ping.

Rakete1111 added a project: Restricted Project.

Used better diagnostic id.

Updated error message, added a FixItHint + a rebase and friendly ping :)

Rakete1111 edited the summary of this revision. (Show Details)Dec 1 2017, 11:59 AM
Rakete1111 updated this revision to Diff 128668.Jan 4 2018, 4:05 PM

Rebased + friendly 2018 ping :)

rsmith added inline comments.Jan 5 2018, 11:20 AM
lib/Parse/ParseExprCXX.cpp
3017–3018

This seems error-prone. Given:

for (item *p = first, *oldp = p; p; p = p->next, delete [] oldp, oldp = p) {
  /*...*/;
}

... we'll decide the delete-expression is followed by a lambda. Likewise for cases like:

delete [] f->getPtr([] {
  return blah;
});

Our goal here should be a fast heuristic (don't use lookahead except when you're already confident you have a lambda) and zero false positives. How about these heuristics instead:

Assume that the delete [] is actually delete followed by a lambda if either:

  • The next token is { or <, or
  • The next token is ( and either
    • the following token is ) or
    • the following tokens are a type specifier followed by an identifier

This should have no false positives, and only has false negatives if a lambda has an unnamed parameter or a parameter with a non-trivial parameter type. (For the last condition, we could try to tentatively parse an entire parameter and see if it has a name, which would handle all cases except an expression/declaration ambiguity in the parameter declaration, but that seems like overkill to me. This is already performing more lookahead than I'd like, but delete [] expressions are rare enough that using two lookahead tokens for an improved error message seems OK.)

Rakete1111 added inline comments.Jan 5 2018, 11:56 AM
lib/Parse/ParseExprCXX.cpp
3017–3018

Ah, I see. I had no idea that lookaheads were that expensive. Thanks for finding a better algorithm than me :)

Addressed review comments :)

Did you forget to upload the updated patch? This looks unchanged compared to the prior version.

Did you forget to upload the updated patch? This looks unchanged compared to the prior version.

Whoops, that's true. Sorry...

@rsmith Which type specifiers should I test for? just T? Or also T*, T&, ...? Now I'm just checking for an identifier, but anything else would make the lookahead more complicated I think. Any ideas?

@rsmith Should I only check for an identifier, like only "int" and nothing else? Because parsing a full type specifier might be a bit expensive, no?

rsmith added inline comments.Jun 21 2018, 2:42 PM
lib/Parse/ParseExprCXX.cpp
3017–3018

Please don't request the additional lookahead tokens until you've checked Next. (Maybe factor out some parts of the check into a lambda so you can return early?)

3029

TryParseLambdaExpression will always commit to parsing as a lambda-expression, because the first two tokens are []. Since you've already determined that we definitely have a *lambda-expression* here, you may as well just call ParseLambdaExpression directly.

3035

This offset of -1 isn't right: when we have a fix-it pointing at a character, insertions are inserted before that character. Given delete[]{}, this would insert the ( between delet and e[], rather than between delete and [].

3039

Instead of getLocWithOffset here, use Lexer::getLocForEndOfToken; getLocWIthOffset(1) will return the wrong location if the closing brace token's length is greater than 1 (which can happen in the presence of escaped newlines).

3045–3046

Our coding style puts the else on the same line as the }.

However, we also generally don't like using else when the if block returns. And in this case I'd reverse the sense of the earlier if:

if (Lambda.isInvalid())
  return ExprError();

Addressed review comments :)

Rakete1111 marked 7 inline comments as done.Jun 28 2018, 1:49 AM

Rebased + friendly ping :)

Thanks!

lib/Parse/ParseExprCXX.cpp
3018

I don't think this lambda is useful: since you're not caching the result of GetLookAheadToken(3) between calls, you may as well call it directly below.

3023–3024

Use GetLookAheadToken(3).isOneOf(tok::r_paren, tok::identifier) here.

3033–3042

Because we can be very confident the fix is correct, and we're recovering as if the fix were applied, you can combine the err_ diagnostic and the note_ diagnostic into one. That way, tools like clang -fixit will apply your fixit.

3034

If you want the location of the ] token for the end of the range here, it'd be best to save NextToken().getLocation() prior to calling ParseLambdaExpression(). Adding 1 to the location of the [ token isn't necessarily the right location (for instance, if there's whitespace between the [ and ], or if the left square bracket is spelled as <:).

Rakete1111 edited the summary of this revision. (Show Details)

Addressed review comments.

Rakete1111 marked 4 inline comments as done.Jul 13 2018, 2:55 AM

Rebased + friendly ping

This looks good, with some minor changes. Please add more test coverage, though, specifically:

  • test all four forms of lambda that we recognize after delete
  • add a test that the FixItHint is correct (maybe in test/FixIt/fixit-cxx0x.cpp, which checks that applying the fixes results in working code)
lib/Parse/ParseExprCXX.cpp
3023

This doesn't seem quite right now. []() is not recognized as a lambda unless it's followed by an identifier. I think we want (and sorry if this is reverting to what you had before switching to isOneOf):

if (Next.isOneOf(tok::l_brace, tok::less) ||
    (Next.is(tok::l_paren) &&
     (GetLookAheadToken(3).is(tok::r_paren) ||
      (GetLookAheadToken(3).is(tok::identifier) && GetLookAheadToken(4).is(tok::r_paren))))) {

That is, we're matching:

  • []{
  • []<
  • []()
  • [](identifier identifier
3036

The 1 here should be a 0. Given

delete []{return new int;}();

this will insert the ) after the final ; instead of before it.

3045

No need for an else here, since the if block terminates by return.

Addressed review comments.

Note that clang doesn't support the fourth kind of lambda yet ([]<>), because D36527 hasn't been merged yet, so I didn't add a test case for that one.

Rakete1111 updated this revision to Diff 159832.Aug 8 2018, 5:24 PM

Rebase + friendly ping :)

Rebase and friendly ping :)