This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Don't detect C++11 attribute specifiers as ObjC
ClosedPublic

Authored by benhamilton on Feb 28 2018, 1:39 PM.

Details

Summary

Previously, clang-format would detect C++11 and C++17 attribute
specifiers like the following as Objective-C method invocations:

[[noreturn]];
[[clang::fallthrough]];
[[noreturn, deprecated("so sorry")]];
[[using gsl: suppress("type")]];

To fix this, I ported part of the logic from
tools/clang/lib/Parse/ParseTentative.cpp into TokenAnnotator.cpp so we
can explicitly parse and identify C++11 attribute specifiers.

This allows the guessLanguage() and getStyle() APIs to correctly
guess files containing the C++11 attribute specifiers as C++,
not Objective-C.

Test Plan: New tests added. Ran tests with:

make -j12 FormatTests && ./tools/clang/unittests/Format/FormatTests

Diff Detail

Event Timeline

benhamilton created this revision.Feb 28 2018, 1:39 PM
aaron.ballman added inline comments.
lib/Format/TokenAnnotator.cpp
346

C can use this syntax as well with -fdouble-square-bracket-attributes, which would be good to also support.

benhamilton added inline comments.Feb 28 2018, 1:55 PM
lib/Format/TokenAnnotator.cpp
346

Ah, good to know. As far as I know, clang-format doesn't actually distinguish between C and C++ (or Objective-C and Objective-C++).

In this case, isCpp() returns true for all of C, C++, Objective-C, and Objective-C++:

https://github.com/llvm-mirror/clang/blob/master/include/clang/Format/Format.h#L1229

jolesiak requested changes to this revision.Mar 5 2018, 8:30 AM
jolesiak added inline comments.
lib/Format/TokenAnnotator.cpp
323

I feel like it would be clearer (and more consistent) if we used const& here and just make a copy inside function body.
I see no benefit of passing nullptr based on usage presented in this patch.

325

For consistency reasons I wouldn't use one line ifs. Let's keep existing format with line break and without braces.

337

Is second part of this condition necessary?

345

Pointer should be to const type, but same as above, I feel like const& would be a better choice.

351

How about:

bool NamespaceAllowed;
if (AttributeToken->startsSequence(tok::kw_using, tok::identifier,
                                   tok::colon)) {
  // C++17 '[[using namespace: foo, bar(baz, blech)]]'
  AttributeToken = AttributeToken->Next->Next->Next;
  NamespaceAllowed = false;
} else {
  // C++11 '[[namespace::foo, namespace::bar(baz, blech)]]'
  NamespaceAllowed = true;
}
while (AttributeToken) {
  AttributeToken =
      parseCpp11Attribute(AttributeToken, NamespaceAllowed);
  if (!AttributeToken || AttributeToken->isNot(tok::comma)) break;
  AttributeToken = AttributeToken->Next;
}
This revision now requires changes to proceed.Mar 5 2018, 8:30 AM
benhamilton marked 7 inline comments as done.

Fixes from @jolesiak

Thanks very much for the code review. I fixed those issues.

lib/Format/TokenAnnotator.cpp
323

Fixed.

325

Fixed. (My mistake was using git-clang-format without passing --style llvm — for some reason that's applying a different style.)

337

Not needed. Removed.

345

Fixed.

351

Fixed.

djasper added inline comments.Mar 5 2018, 10:39 AM
lib/Format/TokenAnnotator.cpp
352

Just join this if with the previous one.

357

Replace lines 355-365 with:

bool IsUsingNamespace = AttrTok && AttrTok->startsSequence( ... );
if (IsUsingNamespace)
  AttrTok = AttrTok->Next->Next->Next;
374

Why not replace these three lines with:

return AttrTok && AttrTok->startsSequence(..);
397

This seems weird. I think if we return true, we should have consumed everything up to the closing r_square.

unittests/Format/FormatTest.cpp
11972

You are not adding any test (AFAICT) that tests that you have correctly set TT_AttributeSpecifier or that you are making any formatting decisions based on it. If you can't test it, remove that part of this patch.

krasimir added inline comments.Mar 5 2018, 10:43 AM
lib/Format/TokenAnnotator.cpp
323

Please inline this into the other function: it's used only once and its name and arguments aren't clear outside of that context.

unittests/Format/FormatTest.cpp
11972

This will also require adding tests about not messing up the formatting inside/around attribute specifiers too much. At least take these examples and copy them to the C++/ObjC formatting unittests.

benhamilton marked 5 inline comments as done.
benhamilton added inline comments.Mar 5 2018, 2:55 PM
lib/Format/TokenAnnotator.cpp
323

OK, made this a lambda.

352

Fixed.

357

Fixed.

374

Fixed.

397

That's super useful feedback, I wasn't sure what the contract of this method was.

Fixed to consume everything including the closing r_square. (I actually had to consume that too, or parsing was left with an un-parsed value and parsing would fail.)

unittests/Format/FormatTest.cpp
6068–6069

I couldn't figure out why this breaks before the [ in [[unused]], when the identical test using __attribute__((unused)) above does *not* break before the __attribute__.

I ran with -debug and the two seem fairly similar. Can anyone help shed light on this?

With __attribute__((unused))

P8067

With [[unused]]

P8068

11972

Added formatting tests and logic to use this.

Renamed to TT_AttributeSquare to match TT_AttributeParen.

11972

I don't think we have any tests for square-bracket attribute specifier formatting (only __attribute__((foo))) today. My guess is it's pretty broken.

I've added more tests.

krasimir added inline comments.Mar 7 2018, 2:41 AM
unittests/Format/FormatTest.cpp
6068–6069

The block of a-s appears to be longer here than in the __attribute__ example. That might be a reason.

I made it longer so the two lines (attribute((foo)) vs. [[foo]]) had
the same length.

If I shorten it much more, everything goes on one line.

djasper added inline comments.Mar 8 2018, 3:55 PM
lib/Format/TokenAnnotator.cpp
329

Can you make this:

// C++17 '[[using <namespace>: foo, bar(baz, blech)]]'

To make clear that we are not looking for kw_namespace here?

332

No braces.

336

Do you actually need to put the return type here? I would have thought that it can be deduced as you pass back a const FormatToken* from a codepath and nullptr from all the others.

342

No braces.

350

Sorry that I have missed this before, I thought this was in a different file. Can you try to avoid iterating trying to count or match parentheses inside any of parseSquare/parseParen/parseAngle. We avoided that AFAICT for everything else and I don't think it's necessary here. Especially as you are not actually moving the token position forward, it's too easy to create a quadratic algorithm here.

Also: Do you actually have a test case for the the parentheses case? This thing could use a lot more comments...

366

No braces for single statement ifs. Don't use "else" after "return".

399

No braces for single-statement ifs.

400

What happens if you don't assign this type here? Which formatting decision is based on it?

unittests/Format/FormatTest.cpp
6064

If this is meant to contrast a TT_AttributeColon from a different colon, that doesn't work. "::" is it's own token type coloncolon.

benhamilton marked 9 inline comments as done.

Refactor to avoid matching open and close parens. Fix @djasper comments.

Great, I refactored it to avoid parsing the matching parens, so there's no more danger of O(N^2) parsing behavior.

I had to add a new InCpp11AttributeSpecifier boolean to Context so when parsing the inner [] we didn't treat it as an ObjC message send or an array subscript.

lib/Format/TokenAnnotator.cpp
329

Ah, good call. Done.

332

Done.

336

Looks like we don't. Fixed.

342

Done.

350

Fixed. There is a test case for the parentheses case.

I definitely see from your other reviews that it's important to structure code the way you describe.

In this scenario, we technically have to deal with all of these (http://en.cppreference.com/w/cpp/language/attributes):

void f() {
  int y[3];
  int i [[cats::meow([[]])]]; // OK
  int i [[cats::meow(foo:bar)]]; // OK
  int i [[cats::meow(foo::bar)]]; // OK
}

I was able to fudge it and assume nobody will actually use attribute parameters containing colons in the arguments, which allowed me to avoid looking for closing parens.

366

Fixed.

399

Fixed.

400

Added a comment to explain:

// Remember that this is a [[using ns: foo]] C++ attribute, so we don't                                                                                                                     
// add a space before the colon (unlike other colons).

There is a formatting test for this.

unittests/Format/FormatTest.cpp
6064

It's not, I just wanted to make sure we had a test with a namespace.

djasper accepted this revision.Mar 12 2018, 8:34 AM

Ok, looks good.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 12 2018, 8:45 AM
This revision was automatically updated to reflect the committed changes.