This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Adjust braced list detection (try 2)
ClosedPublic

Authored by galenelias on May 11 2023, 3:02 PM.

Details

Summary

This is a retry of https://reviews.llvm.org/D114583, which was backed
out for regressions. This fixes
https://github.com/llvm/llvm-project/issues/33891 and
https://github.com/llvm/llvm-project/issues/52911.

Clang Format is detecting a nested scope followed by another open brace
as a braced initializer list due to incorrectly thinking it's matching a
braced initializer at the end of a constructor initializer list which is
followed by the body open brace.

Unfortunately, UnwrappedLineParser isn't doing a very detailed parse, so
it's not super straightforward to distinguish these cases given the
current structure of calculateBraceTypes. My current hypothesis is that
these can be disambiguated by looking at the token preceding the
l_brace, as initializer list parameters will be preceded by an
identifier, but a scope block generally will not (barring the MACRO
wildcard).

To this end, I am adding tracking of the previous token to the LBraceStack
to help scope this particular case.

TokenAnnotatorTests cherry picked from https://reviews.llvm.org/D150452
where @sstwcw also fixed this issue simultaneously.

Diff Detail

Event Timeline

galenelias created this revision.May 11 2023, 3:02 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptMay 11 2023, 3:03 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
galenelias requested review of this revision.May 11 2023, 3:03 PM

Looks like @sstwcw also submitted a fix for the same issue, with a bit of a different approach: https://reviews.llvm.org/D150452

Shouldn't it have token annotator tests?

clang/lib/Format/UnwrappedLineParser.cpp
495–496
579–580

@HazardyKnusperkeks, thanks for the feedback. I added the TokenAnnotatorTests from @sstwcw's review. I also updated the design to use a single stack instead of the two parallel stacks.

sstwcw added a comment.EditedMay 12 2023, 6:32 PM

I would suggest adding a link to the revision on the issue thread for your future bug fixes. So people like me don't try to fix what others have fixed.

To this end, I am tracking the previous token type in a stack parallel
to LBraceStack to help scope this particular case.

Maybe keep the description up to date with the code.

galenelias edited the summary of this revision. (Show Details)May 12 2023, 9:02 PM

Please add the full stop to the code comment(s), and then mark review comments as done.

You should also give credit to @sstwcw in the commit message/differential description for the token annotator tests.

I would suggest adding a link to the revision on the issue thread for your future bug fixes. So people like me don't try to fix what others have fixed.

That's correct, you could even assign the issue to you (or ask to be done so), to show that someone is working on it.

clang/unittests/Format/TokenAnnotatorTest.cpp
44

To be consistent, and to get a better output when the expect fails.

galenelias edited the summary of this revision. (Show Details)

Addressed remaining review feedback.

galenelias marked 3 inline comments as done.May 15 2023, 10:11 AM

Thanks for the review and feedback. Sorry about the unfortunate timing between @sstwcw and my fix - we submitted our fixes less than 24 hours apart. I didn't think there would be another simultaneous fix for this 5 year old bug, so didn't even think to link to the review in the GitHub issue. I will definitely do this for any future contributions.

This revision is now accepted and ready to land.May 15 2023, 1:07 PM

Thanks @HazardyKnusperkeks! I don't have commit access, so will need someone to land this for me.

Thanks @HazardyKnusperkeks! I don't have commit access, so will need someone to land this for me.

We'll wait a bit, if someone might have a comment. And (at least I) need name and email for the commit.

We'll wait a bit, if someone might have a comment. And (at least I) need name and email for the commit.

Name: Galen Elias
Email: galenelias at gmail dot com

This patch also fixes https://github.com/llvm/llvm-project/issues/52911 (which is probably a duplicate anyway)

clang/lib/Format/UnwrappedLineParser.cpp
583–584
llvm-project/clang/lib/Format/UnwrappedLineParser.cpp:583:71: warning: '&&' within '||' [-Wlogical-op-parentheses]
          NextTok->is(tok::l_brace) && LBraceStack.back().PrevTok &&
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
llvm-project/clang/lib/Format/UnwrappedLineParser.cpp:583:71: note: place parentheses around the '&&' expression to silence this warning
          NextTok->is(tok::l_brace) && LBraceStack.back().PrevTok &&
                                                                  ^
          (

1 warning generated.

We'll wait a bit, if someone might have a comment. And (at least I) need name and email for the commit.

Name: Galen Elias
Email: galenelias at gmail dot com

Can you put your authorship in the 'John Doe <jdoe@llvm.org>' format?

This patch also fixes https://github.com/llvm/llvm-project/issues/52911 (which is probably a duplicate anyway)

@galenelias Can you update the summary?

clang/unittests/Format/TokenAnnotatorTest.cpp
1806–1807
galenelias edited the summary of this revision. (Show Details)

Thanks for the additional review comments. Hopefully everything if fixed.

Can you put your authorship in the 'John Doe <jdoe@llvm.org>' format?

Galen Elias <galenelias@gmail.com>

I'm not sure if there is somewhere I should be putting that in the summary or diff itself, or just here in the comments?

galenelias marked 2 inline comments as done.May 22 2023, 2:08 PM
owenpan accepted this revision.May 22 2023, 8:24 PM

Galen Elias <galenelias@gmail.com>

I'm not sure if there is somewhere I should be putting that in the summary or diff itself, or just here in the comments?

Just here in the comments would be enough.

This seems to cause a regression. See https://github.com/llvm/llvm-project/issues/64134. @galenelias any idea?

I will take a look. The logic I added is trying to distinguish { } { by looking at the token prior to the first l_brace, and checking if it's an identifier to quantify this as a braced list brace. In this case, it's actually a >, so the code decides it's a scope brace instead.

I could easily patch it to also allow for a >, but don't want this to devolve into a long trail of hacks either. Let me think about it a bit more, and I'll come up with a forward fix.