This is an archive of the discontinued LLVM Phabricator instance.

clang-tidy: modernize-use-equals-default avoid adding redundant semicolons
ClosedPublic

Authored by poelmanc on Nov 12 2019, 1:54 PM.

Details

Summary

modernize-use-equals-default replaces default constructors/destructors with = default;. When the optional semicolon after a member function is present, this results in two consecutive semicolons.

This patch checks to see if the next non-comment token after the code to be replaced is a semicolon, and if so offers a replacement of = default rather than = default;.

This patch adds trailing comments and semicolons to about 5 existing tests.

Diff Detail

Event Timeline

poelmanc created this revision.Nov 12 2019, 1:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2019, 1:54 PM

Please mention this improvement in the release notes as well :)

clang-tools-extra/clang-tidy/modernize/UseEqualsDefaultCheck.cpp
307

There should be utility code in clang-tidy/util/ that deals with custom lexing, that should be able to do the job.

poelmanc marked an inline comment as done.Nov 13 2019, 8:09 AM
poelmanc added inline comments.
clang-tools-extra/clang-tidy/modernize/UseEqualsDefaultCheck.cpp
307

Thanks for the quick feedback! I looked there and couldn't find a utility function that searched (a) forward rather than backwards, and (b) for any token not of a given list of types. Here's a new helper function that could be placed in LexerUtils.h if you think it's generally useful:

// Analogous to findNextAnyTokenKind, finds next token not of
// the given set of TokenKinds. Useful for skipping comments.
template <typename TokenKind, typename... TokenKinds>
Optional<Token> findNextTokenSkippingKind(SourceLocation Start,
                                          const SourceManager &SM,
                                          const LangOptions &LangOpts,
                                          TokenKind TK, TokenKinds... TKs) {
  Optional<Token> CurrentToken;
  do {
    Optional<Token> CurrentToken = Lexer::findNextToken(Start, SM, LangOpts);
  } while (CurrentToken && CurrentToken.isOneOf(TK, TKs...));
  return CurrentToken;
}

Then the do loop below can be simplified to:

Optional<Token> Token = findNextTokenSkippingKind(
        Body->getSourceRange().getEnd().getLocWithOffset(1),
        Result.Context->getSourceManager(), Result.Context->getLangOpts(),
        tok::comment);
JonasToth added inline comments.Nov 13 2019, 8:25 AM
clang-tools-extra/clang-tidy/modernize/UseEqualsDefaultCheck.cpp
307

There is findNextAnyTokenKind, not sure if it is actually practical to use in this case.

You can implement a skipping function as well. I think that is useful in other contexts (I recall that I had to do similiar skipping already at some point, not sure where it was :( )

mgehre removed a subscriber: mgehre.Nov 13 2019, 9:30 AM
poelmanc updated this revision to Diff 229161.Nov 13 2019, 12:47 PM
poelmanc edited the summary of this revision. (Show Details)

Update to add and use new findNextTokenSkippingComments and findNextTokenSkippingKind utility functions. Since the former calls the latter with just one token type, this required generalizing Token::isOneOf() to work with 1-to-N token kinds versus requiring 2-to-N.

Also added a release note.

To me the change to Token::isOneOf() is an improvement (less code and more flexibility when calling it from variadic templates), but if the team prefers not to modify Token::isOneOf(), we could eliminate the findNextTokenSkippingKind utility function and instead implement findNextTokenSkippingComments directly.

Are you using the lexer-util functions? Maybe i am just to tired, but they seem to be unused.

clang/include/clang/Lex/Token.h
101 ↗(On Diff #229161)

i would be rather against it. it is logically wrong as well, asl isOneOf() implies a choice.

poelmanc updated this revision to Diff 229202.Nov 13 2019, 4:40 PM

Change to add just one helper function findNextTokenSkippingComments to LexerUtils.h, requiring no change to Token::isOneOf, and properly call it from UseEqualsDefaultCheck.cpp.

Are you using the lexer-util functions? Maybe i am just to tired, but they seem to be unused.

Sorry, that was not you! I was experimenting with the git show command to generate these patches and failed to check the results closely. The correct calling code should be in the updated diff.

poelmanc marked 2 inline comments as done.Nov 13 2019, 4:41 PM

Should clang::format::cleanupAroundReplacements() handle this instead?

Should clang::format::cleanupAroundReplacements() handle this instead?

Of yes. Totally forgot that. That would probably the most elegant way :)

poelmanc added a comment.EditedNov 14 2019, 9:00 AM

Should clang::format::cleanupAroundReplacements() handle this instead?

Of yes. Totally forgot that. That would probably the most elegant way :)

Interesting, so is the advantage of that approach that any fixit Replacement or Insertion that ends with a semicolon would have it removed if a semicolon already immediately follows it? That makes sense - one less thing for individual check developers to worry about.

I can take a look at doing that.

Should clang::format::cleanupAroundReplacements() handle this instead?

Of yes. Totally forgot that. That would probably the most elegant way :)

Interesting, so is the advantage of that approach that any fixit Replacement or Insertion that ends with a semicolon would have it removed if a semicolon already immediately follows it? That makes sense - one less thing for individual check developers to worry about.

I can take a look at doing that.

Thinking about that: The cleanup should be called in general after clang-tidy runs. So it is weird, that it does not catch it.

! In D70144#1745583, @JonasToth wrote:

! In D70144#1745532, @malcolm.parsons wrote:

Should clang::format::cleanupAroundReplacements() handle this instead?

My initial attempt did not go well. I thought perhaps adding:

cleanupLeft(Line->First, tok::semi, tok::semi);

to clang/lib/Format/Format.cpp:1491 might do the trick. Stepping through that in the debugger shows that cleanupPair iterates over tokens on affected lines over the affected range. But after the newly added default token and subsequent semi token comes a nullptr - I could not see how to peek past the default; to see what else is on the line.

There's a comment admitting that this needs some architectural work:

// FIXME: in the current implementation the granularity of affected range
// is an annotated line. However, this is not sufficient. Furthermore,
// redundant code introduced by replacements does not necessarily
// intercept with ranges of replacements that result in the redundancy.
// To determine if some redundant code is actually introduced by
// replacements(e.g. deletions), we need to come up with a more
// sophisticated way of computing affected ranges.

Agreed. Even if we could see all the tokens on a line, it seems the test case I added to this patch at clang-tidy/checkers/modernize-use-equals-default.cpp:50, where the redundant semicolon is on the next line, could never be addressed given the current architecture. Changing the TokenAnalyzer::analyze() interface would affect JavaScriptRequoter, ObjCHeaderStyleGuesser, JavaScriptImportSorter, Formatter, and others.

Thoughts on how to proceed?

Hmm. I think this is fine, even though its not perfect.
@aaron.ballman wdyt?

Just a quick update, I made some progress addressing the architectural limitation of AnnotatedLines and evaluate(). I changed the AnnotatedLines constructor to also take a Line *PrevAnnotatedLine argument and set Line->First->Prev to the Last of the previous line and Line->Last->Next to the First of the next line. So now the Tokens remain connected across lines, so individual TokenAnalyzer subclasses can easily peek ahead and behind the current changed Line if they wish. Places that previously iterated e.g. while(Tok) had to be changed to iterate while(Tok && Tok != Line->Last->Next) - I probably haven't found all of those places yet.

The good news is, with that architectural change my prior addition of:

cleanupLeft(Line->First, tok::semi, tok::semi);

successfully removed redundant semicolons after default, even when the next semicolon was on the next line and/or separated by comments!

The bad news is this change also removed consecutive semicolons in e.g.:

for(<possible stuff
    here> ;; <maybe more here>)

which breaks some existing tests. I believe @MyDeveloperDay had thoughts on how to avoid replacing those meaningful consecutive semicolons? We have only tokens rather than an AST. We could search backwards for a for token, but I don't know how far backwards we would want to search to determine whether this particular ;; is valid.

Hmm. I think this is fine, even though its not perfect.
@aaron.ballman wdyt?

The attached patch is nice and small: a 5-line generally useful utility function plus a few lines to call it.

This revision is now accepted and ready to land.Nov 20 2019, 7:09 AM
This revision was automatically updated to reflect the committed changes.