Page MenuHomePhabricator

[clang-tidy] Add support for `NOLINTBEGIN` ... `NOLINTEND` comments
ClosedPublic

Authored by salman-javed-nz on Aug 23 2021, 8:41 AM.

Details

Summary

Add support for NOLINTBEGIN ... NOLINTEND comments to suppress
clang-tidy warnings over multiple lines. All lines between the "begin"
and "end" markers are suppressed.

Example:

// NOLINTBEGIN(some-check)
<Code with warnings to be suppressed, line 1>
<Code with warnings to be suppressed, line 2>
<Code with warnings to be suppressed, line 3>
// NOLINTEND(some-check)

Follows similar syntax as the NOLINT and NOLINTNEXTLINE comments
that are already implemented, i.e. allows multiple checks to be provided
in parentheses; suppresses all checks if the parentheses are omitted,
etc.

If the comments are misused, e.g. using a NOLINTBEGIN but not
terminating it with a NOLINTEND, a clang-tidy-nolint diagnostic
message pointing to the misuse is generated.

As part of implementing this feature, the following bugs were fixed in
existing code:

  • IsNOLINTFound(): IsNOLINTFound("NOLINT", Str) returns true when Str is "NOLINTNEXTLINE". This is because the textual search finds NOLINT as the stem of NOLINTNEXTLINE.
  • LineIsMarkedWithNOLINT(): NOLINTNEXTLINEs on the very first line of a file are ignored. This is due to rsplit('\n\').second returning a blank string when there are no more newline chars to split on.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

clang-tools-extra/test/clang-tidy/infrastructure/nolintbeginend.cpp:

  • Added test to check what happens when NOLINTEND marker is used before NOLINTBEGIN marker (class B1, line 7).
    • Warning is still displayed (NOLINTEND ends suppression but this is redundant as there was no suppression in the first place).
  • Added test to check what happens when NOLINTBEGIN marker is used but no NOLINTEND marker is used afterwards (classes H1, H2; lines 87, 88).
    • Warning is suppressed for the remainder of the file.

Is this syntax used by any other tools?

clang-tools-extra/test/clang-tidy/infrastructure/nolintbeginend.cpp
7

Do you think this should be diagnosed as a sign of user confusion with the markings?

87

Should this be diagnosed as user confusion?

My concern in both of these cases isn't so much that someone writes this intentionally, but that one of the begin/end pair gets removed accidentally when refactoring. Helping the user to identify *where* the unmatched delimiters are seems like it could be user-friendly behavior.

clang-tools-extra/test/clang-tidy/infrastructure/nolintbeginend.cpp
7

For a stray NOLINTEND like this one, I don't think so. The original warning is still raised, so I see this as clang-tidy failing safe. The user is forced to fix their mistake before the warning goes away.

The consequences are of the same severity as misusing the existing NOLINT and NOLINTNEXTLINE markers, e.g. putting NOLINT on the wrong line, or adding a blank line after NOLINTNEXTLINE.

87

The consequences of this one are higher, as there is the potential to suppress warnings unintentionally and allow clang-tidy rule violations to go undetected. I agree that something more could be done here.

I can think of two improvements:

  1. In LineIsMarkedWithNOLINT(), when a NOLINTBEGIN is found, only return true if the corresponding NOLINTEND is found as well. Raise the original warning if the NOLINTEND is omitted.
  1. Raise an additional warning regarding the unmatched pair of delimiters. Some guidance on how to implement it would be appreciated. In the call stack of the LineIsMarkedWithNOLINT() function, I can't see any exposed functionality to generate new diagnostics on the fly. Would a new clang-tidy check be the place to implement this?

Changes in this latest patch:

  • LineIsMarkedWithNOLINT(): Moved NOLINTBEGIN/END aspects into a separate function.
  • LineIsWithinNOLINTBEGIN(): A NOLINTBEGIN/END region is only considered valid if both the BEGIN and END markers are present. If the region is malformed, the original warning will be raised.

Discussion points:

  • Misuse of NOLINTBEGIN/END now always results in the original warning being raised. However, a diagnostic directing the user to location of the mismatched BEGIN and END markers is not implemented yet.

Is this syntax used by any other tools?

It seems Google have implemented NOLINTBEGIN and NOLINTEND support in cpplint. I see lines such as // NOLINTBEGIN(whitespace/line_length), // NOLINTBEGIN(readability/check), // NOLINTBEGIN(build/include) scattered across different Google projects on GitHub.

aaron.ballman added inline comments.Sep 14 2021, 5:58 AM
clang-tools-extra/test/clang-tidy/infrastructure/nolintbeginend.cpp
7

Hmm, I'm not yet convinced we don't want to diagnose this situation. I agree that the behavior of *other* diagnostics is good (the user still gets those diagnostics because no range has been suppressed). But it seems like the programmer made a mistake if they don't balance the begin and end markers. I don't think this causes major issues, but I think the code is a bit harder to read because someone who spots the end marker may try looking for the begin marker that doesn't exist.

I suppose there's a small chance that a stray END may surprise users for more than just code readability -- consider a file with a stray end marker where the user wants to lazily suppress the end file by putting NOLINTBEGIN at the head of the file and NOLINTEND at the end of the file -- the stray NOLINTEND in the middle interrupts the full range.

87

That's a good question -- I don't know that I would expect LineIsMarkedWithNOLINT() to generate a diagnostic, but it's the obvious place for checking the validity of the markers. Naively, I would not expect to have to run a linter to check my lint markings, I would expect the linting tool to do that for me.

Would it make sense for shouldSuppressDiagnostic() to take a container of diagnostics generated (so LineIsMarkedWithNOLINT() has a place to store diagnostics), and ClangTidyDiagnosticConsumer::HandleDiagnostic() then checks whether the container is empty and if not, emits the additional diagnostics from there?

salman-javed-nz edited the summary of this revision. (Show Details)Sep 18 2021, 7:54 AM
salman-javed-nz marked 5 inline comments as done.
salman-javed-nz added inline comments.
clang-tools-extra/test/clang-tidy/infrastructure/nolintbeginend.cpp
7

I see your point. Clang-Tidy should alert the user about any stray NOLINTBEGIN or NOLINTEND markers it sees, not leave it up to the user to find them themselves. Not diagnosing this is only going to create more headache for the user in the long run.

87

Thank you for the suggestion. That is exactly what I have implemented in my latest patch.

salman-javed-nz marked 2 inline comments as done.

shouldSuppressDiagnostic():

  • Changed to take a container of diagnostics as an argument. If it finds any stray NOLINTBEGIN/NOLINTEND markers, it creates a diagnostic regarding the stray marker and places it in the container.

HandleDiagnostic():

  • Emits any diagnostics returned by shouldSuppressDiagnostic().

New unit tests:

  • test\clang-tidy\infrastructure\nolintbeginend-begin-without-end.cpp
  • test\clang-tidy\infrastructure\nolintbeginend-end-without-begin.cpp.
  • test\clang-tidy\infrastructure\nolintbeginend-mismatched-delims.cpp.

IsNOLINTFound():

  • Bug fix. IsNOLINTFound("NOLINT", Str) returns true when Str is "NOLINTNEXTLINE". This is because the text search finds "NOLINT" as a substring of "NOLINTNEXTLINE".
  • Added test case in test\clang-tidy\infrastructure\nolint.cpp.

LineIsMarkedWithNOLINT():

  • Bug fix. NOLINTNEXTLINEs on the very first line of a file are ignored. This is due to rsplit('\n\').second returning a blank string when there are no more newline chars to split on.
  • Added test case in test\clang-tidy\infrastructure\nolintnextline.cpp.

Minor update to function signatures

  • Remove arguments that are not absolutely required
  • Added consts

This really should have been incorporated in my previous patch, so sorry about the double notification.

Thanks, I think this is getting close! There are two more test cases that I think are interesting (and should cause any issues, hopefully):

// NOLINTEND
// CHECK-MESSAGES: for diagnostic on the previous line

and

// CHECK-MESSAGES: for diagnostic on the subsequent line
// NOLINTBEGIN

as the only contents in the file. The idea is that we want to check that searching for a "begin" when seeing an "end" at the top of the file doesn't do anything silly like try to read off the start of the file, and similar for "end" when seeing a "begin" at the end of a file.

clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp
208

This helps the reader of the code to understand that *File below is doing something special (that includes an assert check, which is great).

367

This message works, but I think it'd be better if we split it into two messages:

unmatched 'NOLINTBEGIN' comment without a subsequent 'NOLINTEND' comment
unmatched 'NOLINTEND' comment without a previous 'NOLINTBEGIN' comment

381–387
390
509–510

Should we assert that Unused is empty before returning?

clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.h
237

Might be better as a SmallVectorImpl as this should almost always be a container of 0 or 1 elements.

clang-tools-extra/test/clang-tidy/infrastructure/nolintbeginend-begin-without-end.cpp
6–8

THANK YOU for these comments! :-)

Updated according to review comments.

  • test/clang-tidy/infrastructure/nolintbeginend-begin-at-eof.cpp: new test
  • test/clang-tidy/infrastructure/nolintbeginend-end-at-sof.cpp: new test
  • Use llvm::ErrorOr return type with getFile()
  • Split diagnostic message into 2 messages (one specific to NOLINTBEGIN and one specific to NOLINTEND)
  • if ... else brace wrapping
  • Spelling corrected to "nonexistent"
  • Assert that Unused diagnostic container is empty
  • change std::vector<ClangTidyError> to llvm::SmallVectorImpl

Sorry for not thinking of this sooner, but there is another diagnostic we might want to consider.

// NOLINTBEGIN(check)
// NOLINTEND(other-check)

where the file does not contain a NOLINTEND(check) comment anywhere. In this case, the markers are not actually matched, so it's a NOLINTBEGIN(check) comment with no end and a NOLINTEND(other-check) comment with no begin. That seems like user confusion we'd also want to diagnose, but it also could be tricky because you really want to add diagnostic arguments for the check name. My concern here is mostly with catching typos where the user types check in the first and chekc in the second, so if we wanted to use the edit distance between what was provided and the list of check names to provide a fix-it, that would be very handy (and also probably difficult to implement so I definitely don't insist on a fix-it).

Relatedly, I think this construct is perhaps fine:

// NOLINTBEGIN(check)
// NOLINTEND(*)

because the end "covers" the begin. I'm a bit less clear on this though:

// NOLINTBEGIN(*)
// NOLINTEND(check)

because the begin is not fully covered by the end. However, I'm a bit less clear on what should or should not be diagnosed here, so if we wanted to leave that for follow-up work, that'd be fine.

salman-javed-nz marked 6 inline comments as done.

lineIsWithinNolintBegin():

  • If the search through the preceding lines returns no active NOLINTBEGINs, carry on reading the rest of the file anyway.

Thanks, I think this is getting close! There are two more test cases that I think are interesting (and should cause any issues, hopefully):

// NOLINTEND
// CHECK-MESSAGES: for diagnostic on the previous line

and

// CHECK-MESSAGES: for diagnostic on the subsequent line
// NOLINTBEGIN

as the only contents in the file. The idea is that we want to check that searching for a "begin" when seeing an "end" at the top of the file doesn't do anything silly like try to read off the start of the file, and similar for "end" when seeing a "begin" at the end of a file.

The good news is that the program does not do anything silly like read off the boundaries of the file. :-)
What is noteworthy, however, is that in test/clang-tidy/infrastructure/nolintbeginend-begin-at-eof.cpp only the original clang-tidy check diag is shown, not the diag about the unmatched NOLINTBEGIN. This is because of the early exit in lineIsWithinNolintBegin():

// Check if there's an open NOLINT(BEGIN...END) block on the previous lines.
// ...
auto Error = tallyNolintBegins(Context, SM, CheckName, PrevLines,
                               FileStartLoc, NolintBegins);
// ...
if (NolintBegins.empty())
  return false;

// Check that every block is terminated correctly on the following lines.
// ...

This has been fixed in the patch I just posted.

As for your latest message about NOLINTBEGIN(check) ... NOLINTEND(other-check), I'll have a think about it and get back to you in a day or two.

Thanks!

Sorry for not thinking of this sooner, but there is another diagnostic we might want to consider.

// NOLINTBEGIN(check)
// NOLINTEND(other-check)

where the file does not contain a NOLINTEND(check) comment anywhere. In this case, the markers are not actually matched, so it's a NOLINTBEGIN(check) comment with no end and a NOLINTEND(other-check) comment with no begin. That seems like user confusion we'd also want to diagnose, but it also could be tricky because you really want to add diagnostic arguments for the check name.

See new test for this scenario in test\clang-tidy\infrastructure\nolintbeginend-mismatched-check-names.cpp.
Diagnostics are generated for both the NOLINTBEGIN(check) with no end and the NOLINTEND(other-check) with no begin.

My concern here is mostly with catching typos where the user types check in the first and chekc in the second

See new test in test\clang-tidy\infrastructure\nolintbeginend-typo-in-check-name.cpp.

Relatedly, I think this construct is perhaps fine:

// NOLINTBEGIN(check)
// NOLINTEND(*)

because the end "covers" the begin.

That was my initial feeling too. However, after doing a bit more thinking, I feel that this construct causes more problems than it's worth. Consider the following example:

// NOLINTBEGIN(check-A)
<code that violates check-A>
// NOLINTEND(*)

This seems OK, but consider what happens when we add a 4th line:

// NOLINTBEGIN(check-A)
<code that violates check-A>
// NOLINTEND(*)
<code that violates check-B>

...this generates a NOLINTEND without a previous NOLINTBEGIN diagnostic on line 3. From check-B's perspective, it has been ENDed on line 3 but there was no BEGIN(check-B) or BEGIN(*) on lines 1-2.
Which NOLINT(BEGIN/END) comments are acceptable on a given line depends on which check you're evaluating at a given moment.

This problem goes away if we don't allow the check-specific NOLINTs and the "all checks" NOLINTs to be mixed and matched:

Example 1:

// NOLINTBEGIN(check-A)
<code that violates check-A>
// NOLINTEND(check-A)
<code that violates check-B>

Example 2:

// NOLINTBEGIN
<code that violates check-A>
// NOLINTEND
<code that violates check-B>

Let's look at an example where auto-generated code with its own NOLINT(BEGIN/END)s is embedded in another file:

// NOLINTBEGIN(check-A)
<code that violates check-A>
<code that violates check-A>

/*
 *
 Place-holder for auto-generated code
 *
 */

<code that violates check-A>
// NOLINTEND(check-A)

// ...
// ...

If the auto-generated code is as follows:

// NOLINTBEGIN(check-B)
<code that violates check-B>
<code that violates check-B>
// NOLINTEND

Then the complete file is:

// NOLINTBEGIN(check-A)
<code that violates check-A>
<code that violates check-A>

// NOLINTBEGIN(check-B)
<code that violates check-B>
<code that violates check-B>
// NOLINTEND

<code that violates check-A>
// NOLINTEND(check-A)

// ...
// ...

The NOLINTEND in the autogenerated code ends check-B as well as the parent file's check-A, against the user's intention.
Clang-Tidy can't offer any helpful advice without mind-reading.

I'm a bit less clear on this though:

// NOLINTBEGIN(*)
// NOLINTEND(check)

because the begin is not fully covered by the end. However, I'm a bit less clear on what should or should not be diagnosed here, so if we wanted to leave that for follow-up work, that'd be fine.

The same rule as above, don't mix and match check-specific NOLINTs with "all checks" NOLINTs, should apply here.

Also, in your example, you have begun all checks (check, check2, check3 ... checkN) but only ended one of them. The remaining checks are still awaiting a NOLINTEND comment of some sort.

I say all this in the interest of keeping the Clang-Tidy code simple, and reducing the amount of weird behavior that is possible (due to mistakes, lack of knowledge, or "creative" use of the feature). I rather this feature be strict and limited in scope, rather than flexible enough to be used in unforeseen error-prone ways.

Perhaps this feature can be extended in the future (if need be) after it gets some use and user feedback comes in?

salman-javed-nz edited the summary of this revision. (Show Details)Sep 24 2021, 4:23 AM

lineIsWithinNolintBegin():
Put NOLINTBEGIN comments into 2 buckets:

  1. Comments that apply to a specific check - NOLINTBEGIN(check)
  2. Comments that apply to all checks - NOLINTBEGIN(*) and NOLINTBEGIN with no args

If a check is begun with one type of comment, it must be ended with the same type.

New tests:
nolintbeginend-begin-global-end-specific.cpp
nolintbeginend-begin-specific-end-global.cpp
nolintbeginend-mismatched-check-names.cpp
nolintbeginend-typo-in-check-name.cpp

Pre-merge build error doesn't seem related to my change, but rebasing patch anyway just to be sure.

Also, in your example, you have begun all checks (check, check2, check3 ... checkN) but only ended one of them. The remaining checks are still awaiting a NOLINTEND comment of some sort.

+1

I say all this in the interest of keeping the Clang-Tidy code simple, and reducing the amount of weird behavior that is possible (due to mistakes, lack of knowledge, or "creative" use of the feature). I rather this feature be strict and limited in scope, rather than flexible enough to be used in unforeseen error-prone ways.

Thank you for the careful evaluation -- I agree with you!

Perhaps this feature can be extended in the future (if need be) after it gets some use and user feedback comes in?

Agreed.

clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp
387–388

List is the same on every iteration through the loop, so we might as well set it once and reuse it.

391–393
397–399
clang-tools-extra/docs/clang-tidy/index.rst
306–309

We should also document the diagnostic behavior of the comments themselves (the fact that misusing these can generate diagnostics).

salman-javed-nz marked 4 inline comments as done.
salman-javed-nz retitled this revision from [clang-tidy] Add support for NOLINTBEGIN ... NOLINTEND comments to suppress clang-tidy warnings over multiple lines to [clang-tidy] Add support for `NOLINTBEGIN` ... `NOLINTEND` comments.
salman-javed-nz edited the summary of this revision. (Show Details)

Addressing review comments.

tallyNolintBegins():

  • Refactor so that the &List = SuppressionIsSpecific ? SpecificNolintBegins : GlobalNolintBegins line is not duplicated in 2 places.

index.rst:

  • Added documentation on the rules regarding proper usage of NOLINTBEGIN/NOLINTEND, and the diagnostic that is raised if they aren't followed.

Additional changes:

Phabricator Differential summary:

  • Shortened commit title
  • Word-wrapped commit message body for readability
clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp
387–388

I've refactored this so that the logic to select the appropriate list (the ternary operation) is only defined once.
However, this logic still has to be called within the for loop, not outside, because SuppressionIsSpecific can change on every iteration of the loop.

e.g.

// NOLINTBEGIN(check)  <-- for loop iteration 0: SuppressionIsSpecific == true,  &List = SpecificNolintBegins
// NOLINTBEGIN         <-- for loop iteration 1: SuppressionIsSpecific == false, &List = GlobalNolintBegins
// NOLINTEND           <-- for loop iteration 2: SuppressionIsSpecific == false, &List = GlobalNolintBegins
// NOLINTEND(check)    <-- for loop iteration 3: SuppressionIsSpecific == true,  &List = SpecificNolintBegins
clang-tools-extra/docs/clang-tidy/index.rst
361

Additional documentation added here about the new diagnostic and how it is triggered.

Also, the pre-merge build failure is confirmed to be a problem with the build system and not the contents of my patch.
https://github.com/google/llvm-premerge-checks/issues/353

aaron.ballman accepted this revision.Sep 28 2021, 4:06 AM

LGTM, thank you for this! Do you need me to commit on your behalf? If so, what name and email address would you like me to use for patch attribution in git?

This revision is now accepted and ready to land.Sep 28 2021, 4:06 AM

Yes, I will need your help to commit.

Salman Javed
mail@salmanjaved.org

Thank you very much for the review. Looking back at my initial initial submission a few weeks ago, I can see all the valuable improvements the review process has made. Hopefully people find this NOLINTBEGIN/NOLINTEND feature useful.

aaron.ballman closed this revision.Sep 28 2021, 4:54 AM

Yes, I will need your help to commit.

Salman Javed
mail@salmanjaved.org

Thank you very much for the review. Looking back at my initial initial submission a few weeks ago, I can see all the valuable improvements the review process has made. Hopefully people find this NOLINTBEGIN/NOLINTEND feature useful.

My pleasure, thank you for the discussion and the great new functionality! I've commit on your behalf in c0687e1984a82925918c874b7bb68ad34c32aed0.

I had to revert in 9b944c184396ce55a3ad608779cc326ba12c9ee3 because there are testing failures.

FAIL: Clang Tools :: clang-tidy/infrastructure/nolintbeginend.cpp (1 of 1)
******************** TEST 'Clang Tools :: clang-tidy/infrastructure/nolintbeginend.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   'C:/Program Files/Python39/python.exe' F:/source/llvm-project/clang-tools-extra/test/../test\clang-tidy\check_clang_tidy.py F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure\nolintbeginend.cpp google-explicit-constructor F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp -- --header-filter=.* -system-headers -- -isystem F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure/Inputs/nolintbeginend
--
Exit Code: 1

Command Output (stdout):
--
$ ":" "RUN: at line 1"
$ "C:/Program Files/Python39/python.exe" "F:/source/llvm-project/clang-tools-extra/test/../test\clang-tidy\check_clang_tidy.py" "F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure\nolintbeginend.cpp" "google-explicit-constructor" "F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp" "--" "--header-filter=.*" "-system-headers" "--" "-isystem" "F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure/Inputs/nolintbeginend"
# command output:
Running ['clang-tidy', 'F:\\source\\llvm-project\\llvm\\out\\build\\x64-Debug\\tools\\clang\\tools\\extra\\test\\clang-tidy\\infrastructure\\Output\\nolintbeginend.cpp.tmp.cpp', '-fix', '--checks=-*,google-explicit-constructor', '--header-filter=.*', '-system-headers', '-config={}', '--', '-isystem', 'F:\\source\\llvm-project\\clang-tools-extra\\test\\clang-tidy\\infrastructure/Inputs/nolintbeginend', '-std=c++11', '-nostdinc++']...
------------------------ clang-tidy output -----------------------
6 warnings generated.
F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure/Inputs/nolintbeginend\error_in_include.inc:1:11: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
class G { G(int i); };
          ^
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:3:11: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
class A { A(int i); };
          ^
          explicit
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:3:11: note: FIX-IT applied suggested code changes
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:30:12: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
class B5 { B5(int i); };
           ^
           explicit
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:30:12: note: FIX-IT applied suggested code changes
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:42:12: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
class C3 { C3(int i); };
           ^
           explicit
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:42:12: note: FIX-IT applied suggested code changes
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:57:12: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
class C6 { C6(int i); };
           ^
           explicit
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:57:12: note: FIX-IT applied suggested code changes
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:95:7: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
MACRO(D1)
      ^
      explicit
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:95:7: note: FIX-IT applied suggested code changes
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:93:28: note: expanded from macro 'MACRO'
#define MACRO(X) class X { X(int i); };
                           ^
clang-tidy applied 5 of 5 suggested fixes.
Suppressed 19 warnings (19 NOLINT).

------------------------------------------------------------------
------------------------------ Fixes -----------------------------
--- F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp.orig       2021-09-28 14:35:01.893193700 -0400
+++ F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp    2021-09-28 14:35:02.802959500 -0400
@@ -1,6 +1,6 @@
 // RUN: %check_clang_tidy %s google-explicit-constructor %t -- --header-filter=.* -system-headers -- -isystem %S/Inputs/nolintbeginend

-class A { A(int i); };
+class A { explicit A(int i); };
 //

 // NOLINTBEGIN
@@ -27,7 +27,7 @@

 // NOLINTBEGIN
 // NOLINTEND
-class B5 { B5(int i); };
+class B5 { explicit B5(int i); };
 //

 // NOLINTBEGIN(google-explicit-constructor)
@@ -39,7 +39,7 @@
 // NOLINTEND(*)

 // NOLINTBEGIN(some-other-check)
-class C3 { C3(int i); };
+class C3 { explicit C3(int i); };
 //
 // NOLINTEND(some-other-check)

@@ -54,7 +54,7 @@

 // NOLINTBEGIN(some-other-check, google-explicit-constructor)
 // NOLINTEND(google-explicit-constructor)
-class C6 { C6(int i); };
+class C6 { explicit C6(int i); };
 //
 // NOLINTEND(some-other-check)

@@ -92,7 +92,7 @@

 #define MACRO(X) class X { X(int i); };

-MACRO(D1)
+MACRO(explicit D1)
 //
 //


------------------------------------------------------------------
FileCheck failed:
command line:1:22: error: CHECK-MESSAGES-NOT: excluded string found in input
-implicit-check-not='{{warning|error}}:'
                     ^
F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp.msg:2:122: note: found here
F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure/Inputs/nolintbeginend\error_in_include.inc:1:11: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
                                                                                                                         ^~~~~~~~

Input file: F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp.msg
Check file: F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure\nolintbeginend.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: 6 warnings generated.
          2: F:\source\llvm-project\clang-tools-extra\test\clang-tidy\infrastructure/Inputs/nolintbeginend\error_in_include.inc:1:11: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
not:imp1                                                                                                                              !~~~~~~~                                                                                                                                 error: no match expected
          3: class G { G(int i); };
          4:  ^
          5: F:\source\llvm-project\llvm\out\build\x64-Debug\tools\clang\tools\extra\test\clang-tidy\infrastructure\Output\nolintbeginend.cpp.tmp.cpp:3:11: warning: single-argument constructors must be marked explicit to avoid unintentional implicit conversions [google-explicit-constructor]
          6: class A { A(int i); };
          7:  ^
          .
          .
          .
>>>>>>


# command stderr:
Traceback (most recent call last):
  File "F:\source\llvm-project\clang-tools-extra\test\clang-tidy\check_clang_tidy.py", line 252, in <module>
    main()
  File "F:\source\llvm-project\clang-tools-extra\test\clang-tidy\check_clang_tidy.py", line 248, in main
    run_test_once(args, extra_args)
  File "F:\source\llvm-project\clang-tools-extra\test\clang-tidy\check_clang_tidy.py", line 183, in run_test_once
    subprocess.check_output(
  File "c:\program files\python39\lib\subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "c:\program files\python39\lib\subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['FileCheck', '-input-file=F:\\source\\llvm-project\\llvm\\out\\build\\x64-Debug\\tools\\clang\\tools\\extra\\test\\clang-tidy\\infrastructure\\Output\\nolintbeginend.cpp.tmp.cpp.msg', 'F:\\source\\llvm-project\\clang-tools-extra\\test\\clang-tidy\\infrastructure\\nolintbeginend.cpp', '-check-prefixes=CHECK-MESSAGES', '-implicit-check-not={{warning|error}}:']' returned non-zero exit status 1.

error: command failed with exit status: 1

--

Resolving build error due to failed unit test.

On some build bots, the clang-tidy diagnostics coming from error_in_include.inc get printed BEFORE all other diagnostics, REGARDLESS of the location of the #include directive in the unit test cpp file.
On other build bots, these diagnostics are printed AFTER all other diagnostics. Due to this inconsistency between build bots, the CHECK-MESSAGES statements are not finding their expected messages on their expected lines.

Fixed by moving the unit tests related to including error_in_include.inc from nolintbeginend.cpp into its own file nolintbeginend-error-within-include.cpp.
This new file only checks for one diagnostic, so is therefore unaffected by the diagnostic display order.


You can see in the attached file, that when I ran the unit test on a x86_64 Windows 10 PC, the diagnostic from error_in_include.inc is printed at the end. However, on this this clang-s390x-linux-multistage build, it is printing the diagnostic from error_in_include.inc first:
https://lab.llvm.org/buildbot/#/builders/8/builds/1860/steps/5/logs/FAIL__Clang_Tools__nolintbeginend_cpp.

Thanks for the fix, those changes LGTM! I've recommit on your behalf in 722e705f72dd3077a1f51dc62717828e7ccf23e8

Aaron, Salman,

We've seen a serious perfomance regression caused by this patch. The issue is that the new code scans every file in its entirety for every single diagnostic encountered in that file. In fact, each lines is copied and scanned twice (respectively), so it's like O(n * m), n = size of files, m = number of diagnostics, with high constant factor.

Can we roll back this patch until a fix is developed? At the least, we'd request that this feature be disabled by default, either with a flag or in the preprocessor. Additionally, we'd ask you to reconsider the design to ensure that each file is processed at most once (via a cache or whatnot). We'd be happy to review the patches.

Thanks!

Aaron, Salman,

We've seen a serious perfomance regression caused by this patch. The issue is that the new code scans every file in its entirety for every single diagnostic encountered in that file. In fact, each lines is copied and scanned twice (respectively), so it's like O(n * m), n = size of files, m = number of diagnostics, with high constant factor.

Oof, good catch!

Can we roll back this patch until a fix is developed? At the least, we'd request that this feature be disabled by default, either with a flag or in the preprocessor. Additionally, we'd ask you to reconsider the design to ensure that each file is processed at most once (via a cache or whatnot). We'd be happy to review the patches.

I'm fine (if a bit sad) with reverting these changes until we have a better approach. @salman-javed-nz , can you look into that (either the revert or the fix)?

Demo patch here: https://reviews.llvm.org/D114981. Still needs fixes to the tests and any potential plumbing from flags in clang-tidy main.

Good catch! That explains the performance drop we observed as well.

We are however currently relying on this feature so it would be sad to revert it. Would it be possible/make sense to enable/disable via CLI/config?

Good catch! That explains the performance drop we observed as well.

We are however currently relying on this feature so it would be sad to revert it. Would it be possible/make sense to enable/disable via CLI/config?

I don't think that's a good initial approach. The command line flag ends up living a weird life once we eventually fix the performance issues (then the flag isn't necessary but we still have to maintain it until we can remove it). I'm not strictly opposed to a flag, but I'd want that to be more of a last resort than a first effort.

salman-javed-nz added a comment.EditedThu, Dec 2, 12:21 PM

Thanks for the investigation. Just exploring the options...

With regards to reverting it, is the cat out of the bag? I already see some usage of the NOLINTBEGIN feature in other projects.
There's also a NOLINT check globbing feature that builds on top of this, so it's not as simple as just backing out one commit.

I can have a go at coming up with a solution where it does caching of the NOLINT marker positions when a diagnostic is generated in a file. That would probably help a lot with the performance. It wouldn't be too complicated to implement either. Would you be happy with something like that?

If the current code stayed in the main repo, how long could you afford to wait for me to provide a patch?

I can have a go at coming up with a solution where it does caching of the NOLINT marker positions when a diagnostic is generated in a file. That would probably help a lot with the performance. It wouldn't be too complicated to implement either. Would you be happy with something like that?

That was me just thinking out loud. If you have other approaches, I would be keen to hear them.

Thanks for the investigation. Just exploring the options...

With regards to reverting it, is the cat out of the bag? I already see some usage of the NOLINTBEGIN feature in other projects.
There's also a NOLINT check globbing feature that builds on top of this, so it's not as simple as just backing out one commit.

I can have a go at coming up with a solution where it does some caching of the NOLINT marker positions when a diagnostic is generated in a file. That would probably help a lot with the performance. It wouldn't be too complicated to implement either. Would you be happy with something like that?

If the current code stayed in the main repo, how long could you afford to wait for me to provide a patch?

I would say that for me personally, at a maximum, we have to either fix or revert before this goes out in a release of Clang (because Clang 13 did not have support for NOLINTBEGIN/NOLINTEND, so this is technically unreleased functionality).

If the current code stayed in the main repo, how long could you afford to wait for me to provide a patch?

I would say that for me personally, at a maximum, we have to either fix or revert before this goes out in a release of Clang (because Clang 13 did not have support for NOLINTBEGIN/NOLINTEND, so this is technically unreleased functionality).

I see. Under normal circumstances, which release would this patch be going into? Clang 14, not the next 13 point release, right?

I'm happy to help reviewing the fix if we decide to go that way. I agree that a simple revert might not be straightforward due to other patches building on top of it.

Good catch! That explains the performance drop we observed as well.

We are however currently relying on this feature so it would be sad to revert it. Would it be possible/make sense to enable/disable via CLI/config?

I don't think that's a good initial approach. The command line flag ends up living a weird life once we eventually fix the performance issues (then the flag isn't necessary but we still have to maintain it until we can remove it). I'm not strictly opposed to a flag, but I'd want that to be more of a last resort than a first effort.

I see the problem now, thanks!

If the current code stayed in the main repo, how long could you afford to wait for me to provide a patch?

I would say that for me personally, at a maximum, we have to either fix or revert before this goes out in a release of Clang (because Clang 13 did not have support for NOLINTBEGIN/NOLINTEND, so this is technically unreleased functionality).

I see. Under normal circumstances, which release would this patch be going into? Clang 14, not the next 13 point release, right?

Correct! Our point releases don't add new features, they only cherry-pick bug fixes. So if NOLINTBEGIN did land in Clang 13, then the fix could potentially be a good one for Clang 13.0.1. But because NOLINTBEGIN didn't make it to Clang 13, Clang 14 will be the first release exposing the feature.

Thanks, all, for the prompt response to these concerns!!

Thanks for the investigation. Just exploring the options...

With regards to reverting it, is the cat out of the bag? I already see some usage of the NOLINTBEGIN feature in other projects.
There's also a NOLINT check globbing feature that builds on top of this, so it's not as simple as just backing out one commit.

Agreed. I think something along the lines of https://reviews.llvm.org/D114981 is more realistic -- turning it off without backing out the code.

I can have a go at coming up with a solution where it does caching of the NOLINT marker positions when a diagnostic is generated in a file. That would probably help a lot with the performance. It wouldn't be too complicated to implement either. Would you be happy with something like that?

Yes, that sounds good!

If the current code stayed in the main repo, how long could you afford to wait for me to provide a patch?

If there's something we could do ASAP, we'd prefer that to waiting given that its having measurable impact already. I wonder if we change my patch so that it leaves the feature enabled by default, it would still allow us to modify internal tools to disable the feature in the interim.

Also, I'm concerned that even an optimized implementation of this might have noticable (if much smaller) impact, since it will be adding an O(n) pass to clang-tidy. Compared with the cost of building the AST this may be trivial, but still. So, the parameter I added to the constructor might be valuable long term to allow clients of the class to disable this feature, even if we don't want to expose it as a CLI flag (for the reasons Aaron mentioned above).

Update: I modified https://reviews.llvm.org/D114981 to leave it enabled by default. That makes it essentially an NFC while still allowing clients to disable it easily in tooling that uses it.

We can still disable by default, but that will be a separate patch from adding the control over enable/disable.

salman-javed-nz added a comment.EditedThu, Dec 2, 2:05 PM

With regards to reverting it, is the cat out of the bag? I already see some usage of the NOLINTBEGIN feature in other projects.
There's also a NOLINT check globbing feature that builds on top of this, so it's not as simple as just backing out one commit.

If this is the approach we take, I prefer the logic to be switched around: the NOLINT block feature should be enabled by default. Once (the worst of) this performance issue is resolved, it would be nice from a usability perspective if the user did not have to explicitly opt in to use the feature.

I wonder if we change my patch so that it leaves the feature enabled by default, it would still allow us to modify internal tools to disable the feature in the interim.

Yes, that sounds better to me.

Also, I'm concerned that even an optimized implementation of this might have noticable (if much smaller) impact, since it will be adding an O(n) pass to clang-tidy. Compared with the cost of building the AST this may be trivial, but still. So, the parameter I added to the constructor might be valuable long term to allow clients of the class to disable this feature, even if we don't want to expose it as a CLI flag (for the reasons Aaron mentioned above).

I'll do my best to make this feature as efficient as possible. There will always be some performance impact - due to the inherent nature of this feature. I wonder how much the execution time increases when a new clang-tidy check is added to the repo.
I don't have the answer to what's the right balance between performance and functionality. The NOLINT block feature has been something people have been asking for for a long time. How much performance hit are they willing to take to make suppressing clang-tidy warnings a bit easier?

salman-javed-nz added a comment.EditedThu, Dec 2, 2:12 PM

Update: OK I see you have already changed your patch to make the feature enabled by default.
As far as the performance is concerned, I will have a go at improving it and hope to have something to share next week. How does that sound?

Update: OK I see you have already changed your patch to make the feature enabled by default.
As far as the performance is concerned, I will have a go at improving it and hope to have something to share next week. How does that sound?

That sounds great. Thank you for prompt attention and understanding response! :)

salman-javed-nz added a comment.EditedWed, Dec 8, 1:47 AM

Hi, just giving a progress update. This is just a early heads-up - no action needed from you at this stage.

I have a rough prototype that seems promising. The execution time is no longer heavily dependent on the number of headers being included and the number of diagnostics being raised. I need to tidy up and refactor the code before sharing with you but to give an early indication of the path I'm taking, it's essentially:

struct NolintBlock {
  SourceLocation BeginLoc;
  SourceLocation EndLoc;
  std::string Checks;
};

// Need to keep a cache of every NOLINT block seen, not only in the current file
// but in any other file seen during this run of the program. We may be context
// switching between files (if there is a check violation in a macro in an
// #included file, for instance), and we don't want to be starting from scratch
// each time.
some_kind_of_map<FileID, SmallVector<NolintBlock>> NolintBlocks;
// Performance of each type of map class still needs to be evaluated...

bool lineIsMarkedWithNOLINT(/* ... */) {
  // These two `if` statements will more or less be the same as they are now:
  //   If NOLINT statement found on this line, return true
  //   If NOLINTNEXTLINE statement found on previous line, return true

  // If no cache for the file exists, generate it (warning: heavy operation)
  
  // For NolintBlock in NolintBlocks[FileID]:
  //   if NolintBlock.StartLoc < Loc < NolintBlock.EndLoc:
  //      return true
  // Return false
}