This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Check to find unintended semicolons that changes the semantics.
ClosedPublic

Authored by xazax.hun on Jan 25 2016, 8:17 AM.

Details

Summary

This patch adds a checker to find semicolons that are probably unintended and modify the semantics of the program. See the examples and the documentation for more details.

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun updated this revision to Diff 45869.Jan 25 2016, 8:17 AM
xazax.hun retitled this revision from to [clang-tidy] Check to find unintended semicolons that changes the semantics..
xazax.hun updated this object.
xazax.hun added a reviewer: alexfh.
xazax.hun added subscribers: dkrupp, cfe-commits.
kimgr added a subscriber: kimgr.Jan 25 2016, 12:39 PM

Cool check, I like how it pays attention to indentation!

I found some minor doc nits, see inline.

docs/clang-tidy/checks/misc-suspicious-semicolon.rst
8–9 ↗(On Diff #45869)

"context" looks like it would fit on the wrapped line.

20 ↗(On Diff #45869)

incremented?

29 ↗(On Diff #45869)

Typo: identation

71 ↗(On Diff #45869)

There's been some preference for "check" over "checker" lately.

test/clang-tidy/misc-suspicious-semicolon.cpp
88 ↗(On Diff #45869)

Weird pointer alignment is a little distracting here, better stick with LLVM convention and attach it to s?

xazax.hun marked 5 inline comments as done.Jan 26 2016, 12:07 AM
xazax.hun updated this revision to Diff 45949.Jan 26 2016, 12:08 AM
  • Fixed the nits pointed out by the review.
alexfh edited edge metadata.

Sorry for the long delay. Currently rather swamped.

Haojian, could you take a look at this patch?

hokein added inline comments.Feb 9 2016, 5:23 AM
clang-tidy/misc/SuspiciousSemicolonCheck.cpp
23 ↗(On Diff #45949)

Looks like this check doesn't handle the case that unintended semicolon is in else statement.

if (condition1) {
} else if (condition2);
  a = 2
test/clang-tidy/misc-suspicious-semicolon.cpp
28 ↗(On Diff #45949)

Can you add the following if statement cases in the test?

if (condition)
  ;
if (condition)
  ;
else {
}

The behavior of the check is that both two cases are not warned. But I think we should warn the first one since there is no reason to write if code for that.

xazax.hun updated this revision to Diff 47316.Feb 9 2016, 6:59 AM
xazax.hun edited edge metadata.
  • Fixed the cases pointed out by the review.

Thank you, good catches!

hokein added inline comments.Feb 9 2016, 9:01 AM
docs/clang-tidy/checks/misc-suspicious-semicolon.rst
35 ↗(On Diff #47316)

The doc needs to be updated.

With your latest patch, this is also a warning case.

xazax.hun added inline comments.Feb 10 2016, 5:18 AM
docs/clang-tidy/checks/misc-suspicious-semicolon.rst
35 ↗(On Diff #47316)

The documentation implies that this is a warning case but I will make it clearer to be less confusing.

xazax.hun updated this revision to Diff 47446.Feb 10 2016, 5:19 AM
  • Documentation clarification.
hokein edited edge metadata.Feb 10 2016, 5:55 AM

LGTM, thanks for working on this! Ping @alexfh

hokein accepted this revision.Feb 10 2016, 5:55 AM
hokein edited edge metadata.
This revision is now accepted and ready to land.Feb 10 2016, 5:55 AM
alexfh accepted this revision.Feb 10 2016, 7:28 AM
alexfh edited edge metadata.

LG. Thanks!

Could you run the check on LLVM and post here a summary of results: how many warnings are generated, whether there are any false positives (based on a reasonably-sized random sample, if there are too many warnings to inspect all relevant code), etc.

This revision was automatically updated to reflect the committed changes.

Could you run the check on LLVM and post here a summary of results: how many warnings are generated, whether there are any false positives (based on a reasonably-sized random sample, if there are too many warnings to inspect all relevant code), etc.

This check did not give any results for LLVM codebase. I did not see any false positive on other projects yet, so I assume the false positive rate should be low (but might depend on coding style).

omtcyf0 added a subscriber: omtcyf0.EditedFeb 26 2016, 4:10 AM

@xazax.hun did you actually run the tool on the LLVM codebase?

Because this check generates tons of false-positive reports during codebase analysis.

See the minimal example below.

omtcyf0-laptop:playground omtcyf0$ cat main.cpp 
#include <vector>

int main() {
  std::vector<int> numbers = {1, 2, 3, 4, 5, 6};
  for (std::vector<int>::iterator it = std::begin(numbers),
                                  end = std::end(numbers);
       it != end; ++it) {
    (*it)++;
  }
  return 0;
}
omtcyf0-laptop:playground omtcyf0$ /Users/omtcyf0/Documents/dev/build/Release/llvm/bin/clang-tidy -checks=misc-suspicious-semicolon main.cpp 
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "main.cpp"
No compilation database found in /Users/omtcyf0/Documents/dev/playground or any parent directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
1 warning and 1 error generated.
Error while processing /Users/omtcyf0/Documents/dev/playground/main.cpp.
/Users/omtcyf0/Documents/dev/playground/main.cpp:6:17: warning: potentially unintended semicolon [misc-suspicious-semicolon]
       it != end; ++it) {
                ^
/usr/include/wchar.h:89:10: error: 'stdarg.h' file not found [clang-diagnostic-error]
#include <stdarg.h>
         ^

And this is happening all over the LLVM codebase, because there is nothing bad there.

Can you please fix that?

@xazax.hun did you actually run the tool on the LLVM codebase?

Because this check generates tons of false-positive reports during codebase analysis.

See the minimal example below.

omtcyf0-laptop:playground omtcyf0$ cat main.cpp 
#include <vector>

int main() {
  std::vector<int> numbers = {1, 2, 3, 4, 5, 6};
  for (std::vector<int>::iterator it = std::begin(numbers),
                                  end = std::end(numbers);
       it != end; ++it) {
    (*it)++;
  }
  return 0;
}
omtcyf0-laptop:playground omtcyf0$ /Users/omtcyf0/Documents/dev/build/Release/llvm/bin/clang-tidy -checks=misc-suspicious-semicolon main.cpp 
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "main.cpp"
No compilation database found in /Users/omtcyf0/Documents/dev/playground or any parent directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
1 warning and 1 error generated.
Error while processing /Users/omtcyf0/Documents/dev/playground/main.cpp.
/Users/omtcyf0/Documents/dev/playground/main.cpp:6:17: warning: potentially unintended semicolon [misc-suspicious-semicolon]
       it != end; ++it) {
                ^
/usr/include/wchar.h:89:10: error: 'stdarg.h' file not found [clang-diagnostic-error]
#include <stdarg.h>
         ^

And this is happening all over the LLVM codebase, because there is nothing bad there.

Can you please fix that?

Kirill, the problem in your case may be related to the check seeing incomplete AST due to compilation errors. Can you append -- -std=c++11 to your clang-tidy invocation and try again whether it will be able to parse the file completely (i.e. without any "file not found" and other compilation errors)?

Having said that, it makes sense to handle invalid AST in the check somehow (e.g. completely disable the check), so it doesn't generate spurious errors.

omtcyf0 added a comment.EditedFeb 26 2016, 4:45 AM

@xazax.hun did you actually run the tool on the LLVM codebase?

Because this check generates tons of false-positive reports during codebase analysis.

See the minimal example below.

omtcyf0-laptop:playground omtcyf0$ cat main.cpp 
#include <vector>

int main() {
  std::vector<int> numbers = {1, 2, 3, 4, 5, 6};
  for (std::vector<int>::iterator it = std::begin(numbers),
                                  end = std::end(numbers);
       it != end; ++it) {
    (*it)++;
  }
  return 0;
}
omtcyf0-laptop:playground omtcyf0$ /Users/omtcyf0/Documents/dev/build/Release/llvm/bin/clang-tidy -checks=misc-suspicious-semicolon main.cpp 
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "main.cpp"
No compilation database found in /Users/omtcyf0/Documents/dev/playground or any parent directory
json-compilation-database: Error while opening JSON database: No such file or directory
Running without flags.
1 warning and 1 error generated.
Error while processing /Users/omtcyf0/Documents/dev/playground/main.cpp.
/Users/omtcyf0/Documents/dev/playground/main.cpp:6:17: warning: potentially unintended semicolon [misc-suspicious-semicolon]
       it != end; ++it) {
                ^
/usr/include/wchar.h:89:10: error: 'stdarg.h' file not found [clang-diagnostic-error]
#include <stdarg.h>
         ^

And this is happening all over the LLVM codebase, because there is nothing bad there.

Can you please fix that?

Kirill, the problem in your case may be related to the check seeing incomplete AST due to compilation errors. Can you append -- -std=c++11 to your clang-tidy invocation and try again whether it will be able to parse the file completely (i.e. without any "file not found" and other compilation errors)?

Yes, you're right, it does suppress the warning.

However, running vanilla run-clang-tidy.py on LLVM codebase still outputs such issues. Is it run-clang-tidy.py's fault? My compilation database contains -std=c++11 specifiers, I assumed this gets added to clang-tidy options while performing analysis of these sources, doesn't it?

omtcyf0 added a comment.EditedFeb 26 2016, 5:12 AM

Hm. Seems like this is either me using run-clang-tidy.py wrong or it working not properly, because I still get compilation errors. I believe I've done same thing yesterday using older tree and I didn't get any.

Thus said, everything's alright if AST is acquired correctly. Otherwise the check works not as expected, so the fix Alexander wrote about is needed.

I think, the check can be submitted as is and guards against spurious errors on invalid AST can be added as a follow up.

I think, the check can be submitted as is and guards against spurious errors on invalid AST can be added as a follow up.

This check is already submitted. However I did not found any API in tidy to check whether the compilation failed. It would be great to be able to query that information at the time of registering matchers.

I have one more question though: does it make sense to run clang tidy at all, when the compilation failed? Isn't it a better default behaviour to not to run any of the checks in that case?

I think, the check can be submitted as is and guards against spurious errors on invalid AST can be added as a follow up.

This check is already submitted.

Yup, /me needs to sleep more ;)

However I did not found any API in tidy to check whether the compilation failed. It would be great to be able to query that information at the time of registering matchers.

This information is not available when registering matchers, since this happens before parsing. You can use isInvalidDecl() to check whether particular declaration had any errors, or Result.Context->getTranslationUnitDecl()->isInvalidDecl() to check the whole translation unit.

I have one more question though: does it make sense to run clang tidy at all, when the compilation failed? Isn't it a better default behaviour to not to run any of the checks in that case?

There are some valid use cases for running checks on a non-compilable code, e.g. editor integration.

LegalizeAdulthood added inline comments.
docs/clang-tidy/checks/misc-suspicious-semicolon.rst
48–49 ↗(On Diff #47316)

In this documentation, can we please format code so that control structures don't look like function calls? By this I mean write:

if (x >=y);
  x -= y;

while (readwhitespace());
  Token t = readNextToken();

instead of

if(x >=y);
  x -= y;

while(readwhitespace());
  Token t = readNextToken();

The comments should be addressed in: http://reviews.llvm.org/rL262615