Page MenuHomePhabricator

Clang-tidy fix removals removing all non-blank text from a line should remove the line
Needs ReviewPublic

Authored by poelmanc on Tue, Oct 8, 8:02 PM.

Details

Summary

Clang-tidy can ignore most formatting concerns as formatting is clang-format's job. However, clang-format treats blank lines as intentional separators, so clang-tidy fixes that remove all text from lines and leave newly blank lines can prevent clang-format from properly fixing formatting. For example, applying readability-redundant-member-init to:

struct F10 {
  F10() :
    f(),
    g()
  {}
  S f, g;
};

results in:

struct F10 {
  F10()


  {}
  S f, g;
};

which may get converted to:

struct F10 {
  F10()


= default;
  S f, g;
};

The newly blank lines prevent clang-format from fixing this to the desired:

struct F10 {
  F10() = default;
  S f, g;
};

A few individual fixes take steps to reduce some newly blank lines, e.g. RedundantControlFlowCheck.cpp calls Lexer::findLocationAfterToken(.../*SkipTrailingWhitespaceAndNewLine=*/true ) to chomp trailing newlines after redundant continue; or break; statements. But relying on each check to individually address the problem:

  1. does not check whether the whole line is blank before removing newlines, as that can only be known once all Replacements on the line are known
  2. can fail to notice when multiple removals combine to generate a blank line
  3. forces more work for each check writer
  4. results in inconsistency between checks regarding removals creating newly blank lines

For example:

F11::F11()
: a()
{}

is fixed by readability-redundant-member-init with multiple independent removals to remove the "a()" and the ":", which then leaves the line blank.

This patch adds a function removeNewlyBlankLines to Format.cpp that looks for sequences of Replacement removals that leave a previously non-blank line blank and, where found, adds a Replacement to remove the whole line. removeNewlyBlankLines is called from cleanupAroundReplacements.

The patch adds the above and other tests to readability-redundant-member-init.cpp, and adds tests to readability-redundant-control-flow.cpp and readability-redundant-declaration.cpp to demonstrate its effectiveness in avoiding newly blank lines anywhere.

Two previously private/static helper functions are exposed to assist in the implementation: AST/CommentLexer's skipNewline and AST/CommentParser's isWhitespace(StringRef).

Diff Detail

Event Timeline

poelmanc created this revision.Tue, Oct 8, 8:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptTue, Oct 8, 8:02 PM
Eugene.Zelenko added a subscriber: gribozavr.
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

You may be interested to also look on PR43583 related to readability-redundant-member-init.

Sorry, I'm going to give a drive-by comment (which you can ignore), mainly by because you mentioned clang-format.

This seems like a good idea as obviously it solves this problem, however, isn't rather like trying to fix it after the fact?

what if (I'm sure there isn't anything currently doing this so maybe a moot point), someone wrote a clang-tidy checker to say insert newlines between things, sort of imagine something like https://bugs.llvm.org/show_bug.cgi?id=42767 where the person wanted to add missing newlines based on some semantic rule rather than the more traditional way of handling it in clang-format.

This code change kind of says doing anything like replacing something with a newline would be stripped away, to me it feels like this removal of extra white space needs to be handled at the point the replacement is created and not on all the final replacements where the context is lost. (perhaps you already considered that)

Just my 2c worth. but I do think its good to remove the newlines in this case so thank you for that.

gribozavr requested changes to this revision.Wed, Oct 9, 1:42 AM
gribozavr added a reviewer: gribozavr.

+1 to what MyDeveloperDay said. The infrastructure can't know whether the newlines are intentional or not. Checks that edit the source code should be improved to delete newlines where they become unnecessary. I don't think we can accept the patch that changes how we apply edits.

This revision now requires changes to proceed.Wed, Oct 9, 1:42 AM
poelmanc updated this revision to Diff 224089.EditedWed, Oct 9, 11:56 AM
poelmanc retitled this revision from Clang-tidy fixes should avoid creating new blank lines to Clang-tidy fix removals removing all non-blank text from a line should remove the line.
poelmanc edited the summary of this revision. (Show Details)

Thanks for the prompt review and feedback! MyDeveloperDay, your "drive-by" comments are quite welcome as the goal is to make clang-tidy and clang-format work better in tandem. You are correct that my prior version could have thwarted someone's chang-tidy checker that adds whitespace, which was an oversight on my part. I've updated the patch to change isWhitespace( RemovalText ) to RemovalText.isBlank(). Now this only applies to Removals that remove all non-blank text from a line. I also updated the Title and Summary accordingly, adding the example that most confounded me of ": a()\n", which readability-redundant-member-init makes blank with a combination of two separate removals: one for the ":" and one for the "a()".

I understand and appreciate the feedback:

  • this removal of extra white space needs to be handled at the point the replacement is created and not on all the final replacements where the context is lost
  • Checks that edit the source code should be improved to delete newlines where they become unnecessary

Indeed I first tried fixing this entirely within readability-redundant-member-init. One problem was that when a line becomes blank due to two separate removals, code that peeks ahead or backwards to see if the line is blank doesn't know about other removals that will occur. Perhaps this can be fixed if the check() function has access to other removals that have already been created? (See also Note 1.) As I broadened my search I found code in readability-redundant-control-flow that heuristically tried to remove blank lines, but it failed to examine the entire line and thus removed newlines from the ends of lines that were not actually blank. (See the test case I added to readability-redundant-control-flow.cpp, where /* NOTE */ continue;\n had the newline following continue; removed, leaving the comment attached to the next line.) And I found that readability-redundant-declaration did not attempt to remove blank lines, and its test cases specifically added comments to all removed declarations so that in its test cases, none of the resulting lines were blank. (See the test case I added to readability-redundant-declaration.cpp.)

Digging into these examples led me to the exact opposite conclusion: that the "newly blank line" problem needs to be solved consistently at a higher-level where all removals can be examined jointly, rather than trying to address it within each check.

I guess here's the high-level question: should all removals that remove all non-blank text from a line also delete the line?

  • If yes: consider reviewing the patch
  • If mostly: consider adding this as a top-level clang-tidy option that's off by default
  • Otherwise: if it's possible to address this solely within each of readability-redundant-member-init, readability-redundant-control-flow, readability-redundant-declaration, etc., I can revisit that original approach. I'd appreciate pointers on how to access prior Removals on the same line from within check(), and any guesses regarding the Windows line-ending issue of Note 1.

Thanks!

Note 1: When applied to a file with Windows-style 2-character line endings, adding a Removal to readability-redundant-member-init whose Length was set to include the two-character newline would delete only the first character of the newline, leaving a \r. Then when I increased the removal length by 1, it would delete both newline characters and the first character of the line after the newline, which would be extremely undesirable. Updating removals at the top-level of ClangTidy.cpp did not have this problem. I was never able to track down where/why this error occurred, though it's surely fixable.

You may be interested to also look on PR43583 related to readability-redundant-member-init.

Thanks Eugene, I'm having trouble finding that. https://reviews.llvm.org/D43583 seems related to MIPS instructions rather than readability-redundant-member-init. Could you please post a link? Thanks.

@poelmanc thank you for the very detailed explanation to what was probably my very naive point, I have to say with more information I am inclined to agree with you that perhaps you need to see all the replacements as a whole to make reasonable assumptions. But I appreciate you making the change to accommodate my concern.

As I'm not over here in clang-tidy land very often I will leave you to the others (and they give very detailed reviews), so I'll wish you good luck and if it helps make clang-tidy and clang-format work better together then you definitely have my vote!

clang-tools-extra/clang-tidy/ClangTidy.cpp
27 ↗(On Diff #224089)

Nit: I'm guessing you'd drop these comments later

143 ↗(On Diff #224089)

so If I understand correctly, if the ReplcementText is empty we are assuming we've removed it, as opposed to if someone was adding a blank line the ReplacementText would be all whitespace as we might remove it by mistake.. sound reasonable

poelmanc marked an inline comment as done.Wed, Oct 9, 2:38 PM
poelmanc added inline comments.
clang-tools-extra/clang-tidy/ClangTidy.cpp
27 ↗(On Diff #224089)

Thanks, will do!

143 ↗(On Diff #224089)

Exactly. A check's check() function can create a clang::FixItHint as an Insertion, Removal, or Replacement. Later FixItHints get processed, sorted, and merged into an ordered set of clang::tooling::Replacement. Each Replacement has a file Offset, Length, and ReplacementText. So a Removal FixItHint generally ends up as a Replacement whose ReplacementText is the empty string.

You may be interested to also look on PR43583 related to readability-redundant-member-init.

Thanks Eugene, I'm having trouble finding that. https://reviews.llvm.org/D43583 seems related to MIPS instructions rather than readability-redundant-member-init. Could you please post a link? Thanks.

PRs are bug reports. You can access them like this: https://llvm.org/PR43583

You may be interested to also look on PR43583 related to readability-redundant-member-init.

Thanks Eugene, I'm having trouble finding that. https://reviews.llvm.org/D43583 seems related to MIPS instructions rather than readability-redundant-member-init. Could you please post a link? Thanks.

PRs are bug reports. You can access them like this: https://llvm.org/PR43583

Or direct link to LLVM Bugzilla (https://bugs.llvm.org/show_bug.cgi?id=<Number>). Ideally Phabricator should recognize PR prefixes in same way as it recognize D.

I guess here's the high-level question: should all removals that remove all non-blank text from a line also delete the line?

I see your point, but as https://llvm.org/PR43583 shows, we have a much larger problem: textual replacements don't compose. So, whatever we do, it will only be hacky workarounds for specific high-value cases.

About how exactly to do it. It would be preferred if checkers that already know that they are deleting a full line, just deleted the newline characters. However, for other cases, I feel like the place where this patch implements newline deletion is not ideal. It implements newline deletion while applying all fixes. Applying fixes is not the only client of fixit information. Probably it is one of the least-important ones, to be honest. I don't know who will run clang-tidy and just trust it to apply whatever fixes it wants. More likely, the user will want to see a preview of findings and fixes, and fixit application will be done by some other tool. Therefore, implementing newline deletion in the code that applies all fixes is not going to be helpful for most workflows.

Therefore, I think the most appropriate place to do this cleanup is cleanupAroundReplacements. What do you think?

You may be interested to also look on PR43583 related to readability-redundant-member-init.

Thanks Eugene, I'm having trouble finding that. https://reviews.llvm.org/D43583 seems related to MIPS instructions rather than readability-redundant-member-init. Could you please post a link? Thanks.

PRs are bug reports. You can access them like this: https://llvm.org/PR43583

Or direct link to LLVM Bugzilla (https://bugs.llvm.org/show_bug.cgi?id=<Number>). Ideally Phabricator should recognize PR prefixes in same way as it recognize D.

Thanks @Eugene.Zelenko and @jonathanmeier. https://llvm.org/PR43583 now explains to me perfectly why my initial attempts to remove blank lines solely within readability-redundant-member-init failed: the final removal of ":" that in some cases left the line blank was not made in readability-redundant-member-init but higher up in the cleanup stage.

I guess here's the high-level question: should all removals that remove all non-blank text from a line also delete the line?

I see your point, but as https://llvm.org/PR43583 shows, we have a much larger problem: textual replacements don't compose. So, whatever we do, it will only be hacky workarounds for specific high-value cases.

About how exactly to do it. It would be preferred if checkers that already know that they are deleting a full line, just deleted the newline characters. However, for other cases, I feel like the place where this patch implements newline deletion is not ideal. It implements newline deletion while applying all fixes. Applying fixes is not the only client of fixit information. Probably it is one of the least-important ones, to be honest. I don't know who will run clang-tidy and just trust it to apply whatever fixes it wants. More likely, the user will want to see a preview of findings and fixes, and fixit application will be done by some other tool. Therefore, implementing newline deletion in the code that applies all fixes is not going to be helpful for most workflows.

Therefore, I think the most appropriate place to do this cleanup is cleanupAroundReplacements. What do you think?

Thanks, from the name that sounds like the perfect place to do it. If cleanupAroundReplacements also is used by clang-format, would we want to make the functionality optional, e.g. via a new bool parameter to cleanupAroundReplacements, a new option in FormatStyle, etc.?

Thanks, from the name that sounds like the perfect place to do it. If cleanupAroundReplacements also is used by clang-format, would we want to make the functionality optional, e.g. via a new bool parameter to cleanupAroundReplacements, a new option in FormatStyle, etc.?

cleanupAroundReplacements is not used by clang-format itself. I believe, it's a part of clang-format only because it was easier to implement it on top of some infrastructure clang-format provides.

poelmanc updated this revision to Diff 225523.EditedThu, Oct 17, 2:43 PM
poelmanc edited the summary of this revision. (Show Details)

cleanupAroundReplacements is not used by clang-format itself. I believe, it's a part of clang-format only because it was easier to implement it on top of some infrastructure clang-format provides.

Thanks for the correction @alexfh. I've updated the patch to follow @gribozavr's suggestion to perform the line removal within cleanupAroundReplacements().