Page MenuHomePhabricator

Add readability-redundant-void-arg check to clang-tidy
ClosedPublic

Authored by LegalizeAdulthood on Feb 13 2015, 9:57 PM.

Details

Summary

This check for clang-tidy looks for function with zero arguments declared as (void) and removes the unnecessary void token.

int foo(void);

becomes

int foo();

The check performs no formatting of the surrounding context but uses the lexer to look for the token sequence "(", "void", ")" in the prototype text. If this sequence of tokens is found, a removal is issued for the void token only.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Simplify check for nullary function.

xazax.hun added inline comments.Mar 2 2015, 11:20 PM
clang-tidy/readability/RedundantVoidArg.cpp
148 ↗(On Diff #21084)

You can also pass std::string to an llvm::StringRef. In the llvm codebase it is a rule of thumb to use StringRef instead of const std::string&.

158 ↗(On Diff #21084)

It is possible to create a lexer without creating an std::string. You can check an example in: http://reviews.llvm.org/D7375 (look for getBuffer and getCharacterData). Of course, this way you have to make sure to end lexing after the interested position is left, but you avoid the heap allocation.

alexfh added inline comments.Mar 3 2015, 7:17 AM
clang-tidy/readability/RedundantVoidArg.cpp
41 ↗(On Diff #21084)

Please use either "auto" or "const auto *" consistently. This function has usages of both.

clang-tidy/readability/RemoveVoidArg.h
38 ↗(On Diff #20949)

Language options only become available when a compiler instance is being initialized. The matchers are registered way before that and then multiple different compilations may be run potentially with different options (if multiple translation units have been passed to clang-tidy).

So there's no way to learn what language will be analyzed when we register matchers.

LegalizeAdulthood retitled this revision from Add readability-remove-void-arg check to clang-tidy to Add readability-redundant-void-arg check to clang-tidy.Mar 3 2015, 2:20 PM
LegalizeAdulthood updated this object.
clang-tidy/readability/RedundantVoidArg.cpp
41 ↗(On Diff #21084)

Fixed

148 ↗(On Diff #21084)

Fixed.

158 ↗(On Diff #21084)

This suggestion has been made several times during this review and every time I try it I get this error at runtime:

void clang::Lexer::InitLexer(const char *, const char *, const char *): Assertion `BufEnd[0] == 0 && "We assume that the input buffer has a null character at the end" " to simplify lexing!"' failed.

If you can show me how to lex a SourceRange without getting that error, I'm happy to change it. I tried the method shown in the linked diff and I get the same error.

However, one good thing came from the experiment. This checker only creates fixits that remove text, so I don't need to compute replacement text and this simplified the replacement code a little bit more.

clang-tidy/readability/RemoveVoidArg.cpp
42 ↗(On Diff #20949)

Fixed.

73 ↗(On Diff #20949)

Fixed.

176 ↗(On Diff #20949)

Fixed.

186 ↗(On Diff #20949)

Fixed.

219 ↗(On Diff #20949)

Fixed.

227 ↗(On Diff #20949)

Fixed

241 ↗(On Diff #20949)

Fixed.

255 ↗(On Diff #20949)

Fixed.

267 ↗(On Diff #20949)

Fixed

65 ↗(On Diff #19961)

I did find ways to reject more cases before beginning a lexical scan of text, so I'm opting out earlier than I was initially.

I couldn't find the right matcher incantation with clang-query to narrow it further in the matcher instead of in the callback.

Fixed.

78 ↗(On Diff #19961)

Fixed.

169 ↗(On Diff #19961)

Fixed

170 ↗(On Diff #19961)

Fixed

174 ↗(On Diff #19961)

Fixed.

105 ↗(On Diff #20580)

Fixed.

207 ↗(On Diff #20580)

Fixed

clang-tidy/readability/RemoveVoidArg.h
38 ↗(On Diff #20949)

Fixed.

26 ↗(On Diff #19961)

Fixed.

38 ↗(On Diff #19961)

const issues fixed.

test/clang-tidy/readability-remove-void-arg.cpp
28 ↗(On Diff #19961)

Fixed

145 ↗(On Diff #19961)

Fixed.

Correct issues found in review.

xazax.hun added a comment.EditedMar 8 2015, 12:43 AM

If you can show me how to lex a SourceRange without getting that error, I'm happy to change it. I tried the method shown in the linked diff and I get the same error.

As far as I know, the point is that you can not do that without creating a null terminated string, however you do not need to do that. What you can do insead: you can lex the whole file buffer starting from the beginning of a declaration and you can make sure that you finish lexing on the end of the declaration (e.g.: end lexing when you match the closing paren of the declaration). This way you might have a bit more complicated lexing logic but you avoid heap allocation. So the point is: you construct the lexer using the whole file buffer starting from the right position and not relying on the source range at all to finish lexing.

LegalizeAdulthood added a comment.EditedMar 8 2015, 7:29 AM

If you can show me how to lex a SourceRange without getting that error, I'm happy to change it. I tried the method shown in the linked diff and I get the same error.

As far as I know, the point is that you can not do that without creating a null terminated string, however you do not need to do that. What you can do insead: you can lex the whole file buffer starting from the beginning of a declaration and you can make sure that you finish lexing on the end of the declaration (e.g.: end lexing when you match the closing paren of the declaration). This way you might have a bit more complicated lexing logic but you avoid heap allocation. So the point is: you construct the lexer using the whole file buffer starting from the right position and not relying on the source range at all to finish lexing.

Relex the entire file just to avoid a heap allocation? That seems a bit excessive. Do we have any measurements on real code bases that show this to be the better approach? I don't want to optimize like this without real data to show that it is worthwhile.

Most of the time this string is going to either be "()" or "(void)". Small string optimization means that there isn't even a heap allocation.

If you can show me how to lex a SourceRange without getting that error, I'm happy to change it. I tried the method shown in the linked diff and I get the same error.

As far as I know, the point is that you can not do that without creating a null terminated string, however you do not need to do that. What you can do insead: you can lex the whole file buffer starting from the beginning of a declaration and you can make sure that you finish lexing on the end of the declaration (e.g.: end lexing when you match the closing paren of the declaration). This way you might have a bit more complicated lexing logic but you avoid heap allocation. So the point is: you construct the lexer using the whole file buffer starting from the right position and not relying on the source range at all to finish lexing.

Relex the entire file just to avoid a heap allocation? That seems a bit excessive. Do we have any measurements on real code bases that show this to be the better approach? I don't want to optimize like this without real data to show that it is worthwhile.

Most of the time this string is going to either be "()" or "(void)". Small string optimization means that there isn't even a heap allocation.

Well, you won't relex anything that does not need to be relexed. You only initialize the lexer using the whole buffer, so you get a null terminated string. But the starting point of the lexing would be the starting source location of the declaration you want to lex. And you would only lex the tokens you need. I think the lexer is lazy, so initializing with a bigger buffer should not cause any performance regressions in case you do not lex more tokens that you need. Of course measurements would tell this for sure.

alexfh added inline comments.Mar 10 2015, 11:02 AM
clang-tidy/readability/RedundantVoidArgCheck.cpp
14 ↗(On Diff #21435)

nit: Remove this and just start namespace clang { (or all three of them) here instead of line 50 below. Then clang::ast_matchers can be replaced with ast_matchers as well.

146 ↗(On Diff #21435)

nit: No need for clang:: here.

195 ↗(On Diff #21435)

It's better to use token ranges here:

CharSourceRange::getTokenRange(VoidLoc, VoidLoc)
test/clang-tidy/readability-redundant-void-arg.cpp
86 ↗(On Diff #21435)

Please use LLVM style in tests unless needed otherwise for testing whitespace handling: 2 character indentation, left brace on the previous line, etc.

I have experimented with using typeLoc() matchers to just match the appropriate type expressions and that mostly works, but not completely. If anyone has more experience with typeloc() style matching and how to use it, I'd like to hear their thoughts on this.

If you can show me how to lex a SourceRange without getting that error, I'm happy to change it. I tried the method shown in the linked diff and I get the same error.

As far as I know, the point is that you can not do that without creating a null terminated string, however you do not need to do that. What you can do insead: you can lex the whole file buffer starting from the beginning of a declaration and you can make sure that you finish lexing on the end of the declaration (e.g.: end lexing when you match the closing paren of the declaration). This way you might have a bit more complicated lexing logic but you avoid heap allocation. So the point is: you construct the lexer using the whole file buffer starting from the right position and not relying on the source range at all to finish lexing.

Relex the entire file just to avoid a heap allocation? That seems a bit excessive. Do we have any measurements on real code bases that show this to be the better approach? I don't want to optimize like this without real data to show that it is worthwhile.

Most of the time this string is going to either be "()" or "(void)". Small string optimization means that there isn't even a heap allocation.

Well, you won't relex anything that does not need to be relexed. You only initialize the lexer using the whole buffer, so you get a null terminated string. But the starting point of the lexing would be the starting source location of the declaration you want to lex. And you would only lex the tokens you need. I think the lexer is lazy, so initializing with a bigger buffer should not cause any performance regressions in case you do not lex more tokens that you need. Of course measurements would tell this for sure.

I'd like to address this in a subsequent changeset, since ti seems to be a matter of performance and not a matter of correctness with the existing code and I'd like to move this check forward.

clang-tidy/readability/RedundantVoidArgCheck.cpp
14 ↗(On Diff #21435)

Fixed.

146 ↗(On Diff #21435)

Fixed.

195 ↗(On Diff #21435)

Fixed.

test/clang-tidy/readability-redundant-void-arg.cpp
86 ↗(On Diff #21435)

Fixed.

Revise from review comments

run trunk clang-format on source files

If you can show me how to lex a SourceRange without getting that error, I'm happy to change it. I tried the method shown in the linked diff and I get the same error.

As far as I know, the point is that you can not do that without creating a null terminated string, however you do not need to do that. What you can do insead: you can lex the whole file buffer starting from the beginning of a declaration and you can make sure that you finish lexing on the end of the declaration (e.g.: end lexing when you match the closing paren of the declaration). This way you might have a bit more complicated lexing logic but you avoid heap allocation. So the point is: you construct the lexer using the whole file buffer starting from the right position and not relying on the source range at all to finish lexing.

Relex the entire file just to avoid a heap allocation? That seems a bit excessive. Do we have any measurements on real code bases that show this to be the better approach? I don't want to optimize like this without real data to show that it is worthwhile.

Most of the time this string is going to either be "()" or "(void)". Small string optimization means that there isn't even a heap allocation.

Well, you won't relex anything that does not need to be relexed. You only initialize the lexer using the whole buffer, so you get a null terminated string. But the starting point of the lexing would be the starting source location of the declaration you want to lex. And you would only lex the tokens you need. I think the lexer is lazy, so initializing with a bigger buffer should not cause any performance regressions in case you do not lex more tokens that you need. Of course measurements would tell this for sure.

I'd like to address this in a subsequent changeset, since ti seems to be a matter of performance and not a matter of correctness with the existing code and I'd like to move this check forward.

This looks like a very small and quick change, so it would be nice to address this concern in the very first version of the check.

clang-tidy/readability/RedundantVoidArgCheck.cpp
244 ↗(On Diff #25928)

Character-based offsets can be brittle in case escaped newlines and alternative tokens are used. I'd prefer getting the type range directly when it's possible. Here and for other casts something like this should work: CStyleCast->getTypeInfoAsWritten()->getTypeLoc().getSourceRange().

Also, do we need separate functions for ExplicitCastExpr and two of its children classes?

test/clang-tidy/readability-redundant-void-arg.c
77 ↗(On Diff #25928)

Special tests for whitespace handling seem to be superfluous when you just need to check that no changes are made. Also, the invariant "no warnings => no fixes" seems to be safe to rely on, so you can remove all CHECK-FIXES and add one "CHECK-MESSAGES-NOT: warning:" so that the script runs the FileCheck on the output.

clang-tidy/readability/RedundantVoidArgCheck.cpp
244 ↗(On Diff #25928)

I don't think the character based offsets can be avoided here. The nodes don't provide getter functions for all the locations I'm interested in and I have to construct the appropriate source location myself. I will try what you suggest and see how it goes.

I tried doing things with just ExplicitCastExpr initially and when I added test cases for the functional and C-style casts, I needed to do different replacement ranges.

clang-tidy/readability/RedundantVoidArgCheck.cpp
244 ↗(On Diff #25928)

I was able to avoid some of the custom SourceRange computations, which allowed me to eliminate shrinkRangeByOne as well.

test/clang-tidy/readability-redundant-void-arg.c
77 ↗(On Diff #25928)

I tried this, but it doesn't work with the shell script used to run clang-tidy. You need to pass --allow-empty to FileCheck, but the script doesn't allow for this. So I left the file unchanged.

Update from comments

Something wrong happened with this patch: there are tons of unrelated changes here. Maybe you need to update your working copy?

clang-tidy/readability/RedundantVoidArgCheck.cpp
50 ↗(On Diff #28202)

So why do we need isExpansionInMainFile() here and in all other matchers? ClangTidy has its own mechanism to filter warnings by location.

229 ↗(On Diff #28202)

Now processNamedCastExpr, processNamedCastExpr and processFunctionalCast methods are almost identical and can be replaced by a single method. And I think this method should be processExplicitCastExpr that should have the same implementation.

BTW, currently, the processExplicitCastExpr method can only be called if the code encounters the only class inherited from ExplicitCastExpr that is not handled explicitly: ObjCBridgedCastExpr.

Once again: replace processExplicitCastExpr implementation with this:

void RedundantVoidArgCheck::processExplicitCastExpr(
    const MatchFinder::MatchResult &Result,
    const ExplicitCastExpr *ExplicitCast) {
  if (protoTypeHasNoParms(ExplicitCast->getTypeAsWritten())) {
    removeVoidArgumentTokens(
      Result,
      ExplicitCast->getTypeInfoAsWritten()->getTypeLoc().getSourceRange(),
       "cast expression");
  }
}

... and remove the other three process...Cast methods and the corresponding if branches.

Also, register a single explicitCastExpr matcher instead of separate matchers for all kinds of casts.

sorry but I am personally skeptic about this checker.

why is the void removed?

it does not cause any wrong behaviour to keep it.

the void is not likely added there by mistake, is it? the developer probably wrote it by intention and this checker thinks that the developer intentions are wrong..

how about moving it to clang-modernize?

Rebase onto HEAD of master

sorry but I am personally skeptic about this checker.

why is the void removed?

it does not cause any wrong behaviour to keep it.

the void is not likely added there by mistake, is it? the developer probably wrote it by intention and this checker thinks that the developer intentions are wrong..

how about moving it to clang-modernize?

It is a readability check, it isn't designed to detect "wrong" behavior. (void) is a C-ism and is a holdover from C-style coding. It is completely redundant and unnecessary in C++. If a developer wants to keep unnecessary and redundant tokens in their code, then they can turn this check off or not run it.

sorry but I am personally skeptic about this checker.

why is the void removed?

it does not cause any wrong behaviour to keep it.

the void is not likely added there by mistake, is it? the developer probably wrote it by intention and this checker thinks that the developer intentions are wrong..

how about moving it to clang-modernize?

It is a readability check, it isn't designed to detect "wrong" behavior. (void) is a C-ism and is a holdover from C-style coding. It is completely redundant and unnecessary in C++. If a developer wants to keep unnecessary and redundant tokens in their code, then they can turn this check off or not run it.

Agree. Also, we want to migrate clang-modernize transformations to clang-tidy some day (it's an extremely low priority task though). Wrt this check, it might fit better in a new module named "legacy". Not extremely important and definitely not necessary for this patch.

What's necessary, is to address my comments (http://reviews.llvm.org/D7639?id=28202#inline-85173 and http://reviews.llvm.org/D7639?id=28202#inline-85176).

sorry but I am personally skeptic about this checker.

why is the void removed?

it does not cause any wrong behaviour to keep it.

the void is not likely added there by mistake, is it? the developer probably wrote it by intention and this checker thinks that the developer intentions are wrong..

how about moving it to clang-modernize?

It is a readability check, it isn't designed to detect "wrong" behavior. (void) is a C-ism and is a holdover from C-style coding. It is completely redundant and unnecessary in C++. If a developer wants to keep unnecessary and redundant tokens in their code, then they can turn this check off or not run it.

I know it is redundant in C++. And probably the developer knows it too? I expect that nearly 100% of the warnings will be false positives because as I said: "this checker thinks that the developer intentions are wrong.". Stylistic checkers are problematic because people have different taste so the warnings are seen by both those who wants to see it and those who don't, however in this case I have the feeling that your warning will only be seen by those who don't want to have the warning. I doubt people add void by mistake.

Does anybody know that some C++ developers use void because they think it's safer?

When you try this on real code.. what is the false positive ratio?

Agree. Also, we want to migrate clang-modernize transformations to clang-tidy some day (it's an extremely low priority task though). Wrt this check, it might fit better in a new module named "legacy". Not extremely important and definitely not necessary for this patch.

What's necessary, is to address my comments (http://reviews.llvm.org/D7639?id=28202#inline-85173 and http://reviews.llvm.org/D7639?id=28202#inline-85176).

I am afraid I don't understand. I had the impression that checkers should try to warn about stuff that is not intentional. Imho a warning about intentional code that does not do anything wrong is a false positive.

I can understand that some checkers could be highly useful in some projects but noisy in most projects. Can we try to separate these so they can be suppressed easily and/or must be enabled specifically? Instead of the other way around where most projects that get the warning must disable it manually. If many such checkers are added it will be a pain to disable them if they are not separated.

Sorry... I have changed my mind. This checker is fine. You can ignore my comments.

I think there are valid use cases for this checker. There are several C++ codebases that was converted from C, and half of the codebase contains the void keywords in the argument list half of them is not. Making such codebases consistent is a good thing. It makes it easier to grep for a function for instance.

I know it is redundant in C++. And probably the developer knows it too? I expect that nearly 100% of the warnings will be false positives because as I said: "this checker thinks that the developer intentions are wrong.".

In clang-tidy discussions, "false positive" means that the check incorrectly analyzed the code and that the resulting application of an automated fix would change the code to syntactically incorrect or semantically incorrect. Under this definition, the only false positive would be a case where the fix was applied to C code, where (void) is necessary to distinguish a function with no arguments as opposed to a function with an unspecified number and type of arguments.

There are plenty of readability checks in clang-tidy that impose a certain point of view. A good example is the LLVM namespace comment check. Here, a style guideline is being imposed on the code. If you don't want your code subjected to that style guideline, then you shouldn't be running that check and applying the suggested fixes to your code. clang-tidy is not part of the compiler, nothing is forcing you to run clang-tidy and even if you use clang-tidy, nothing is forcing you to run this check. You only get what you ask for.

Stylistic checkers are problematic because people have different taste so the warnings are seen by both those who wants to see it and those who don't, however in this case I have the feeling that your warning will only be seen by those who don't want to have the warning. I doubt people add void by mistake.

I created this check because I was working on C++ teams where people kept bringing their C habits into the C++ code. This is purely a C-ism and people coming from C routinely bring this into their C++ code because they operate under the (erroneous IMO) opinion that "C++ is just C with classes".

Does anybody know that some C++ developers use void because they think it's safer?

There is no issue of safety here. In C, function prototypes are underspecified if they don't write (void). In C++, no such thing occurs.

When you try this on real code.. what is the false positive ratio?

Again, there are no "false positives" in the clang-tidy sense. I believe I have added sufficient C oriented test cases to cover that. If you don't want the results of this check, or any other clang-tidy check, applied to your code, then don't ask for it.

clang-tidy/readability/RedundantVoidArgCheck.cpp
50 ↗(On Diff #28202)

The lexer crashes with an assert if I don't limit it to the main file. Why, I don't know, but that's what happens.

clang-tidy/readability/RedundantVoidArgCheck.cpp
229 ↗(On Diff #28202)

Fixed. When I took out the cStyleCast matcher, this test case fails:

/home/richard/dev/llvm/tools/clang/tools/extra/test/clang-tidy/readability-redundant-void-arg.cpp:348:19: error: expected string not found in input
  // CHECK-FIXES: void (*f4)() = (void (*)()) 0;{{$}}
                  ^
/home/richard/dev/llvm-build/tools/clang/tools/extra/test/clang-tidy/Output/readability-redundant-void-arg.cpp.tmp.cpp:341:3: note: scanning from here
  //
  ^
/home/richard/dev/llvm-build/tools/clang/tools/extra/test/clang-tidy/Output/readability-redundant-void-arg.cpp.tmp.cpp:345:3: note: possible intended match here
  void (*f4)() = (void (*)(void)) 0;
  ^

So I left in the matcher for cStyleCastExpr.

Update from comments

Improve documentation of check.

What is preventing to add this check to Clang-tidy? Just found another piece of fresh C++ code in LLDB with (void) as argument list...

What is preventing to add this check to Clang-tidy? Just found another piece of fresh C++ code in LLDB with (void) as argument list...

To be honest, I don't know. This review had taken SOOOOOOOOO long to get approved. At this point, the code is correct and I'd really like to get it committed and available for use instead of fussing with it any longer.

LegalizeAdulthood marked 17 inline comments as done.Oct 23 2015, 7:34 AM

Can we get this committed? I've addressed all comments.

sbenza edited edge metadata.Oct 23 2015, 7:46 AM

I'm sorry. This change fell through the cracks. I'll take a look today.

Just fyi, I am looking at this diff. It is very large with a lot of rounds of comments and I didn't have the context.
I don't know if I should giving comments at this point of the change, but here it is.
Have you considered matching on typeLoc() instead of having a large list of different cases?
For example, if you match typeLoc(loc(functionType())) it will match all the places where a function type is mentioned, including things like static_cast<XXX>, variable declarations, lambda return type declarations, etc. Might help remove redundancy in the check.

Just fyi, I am looking at this diff. It is very large with a lot of rounds of comments and I didn't have the context.
I don't know if I should giving comments at this point of the change, but here it is.
Have you considered matching on typeLoc() instead of having a large list of different cases?
For example, if you match typeLoc(loc(functionType())) it will match all the places where a function type is mentioned, including things like static_cast<XXX>, variable declarations, lambda return type declarations, etc. Might help remove redundancy in the check.

That occurred to me and I did an experiment and it didn't work out. I forget the exact details now as it was months ago and this review has been sitting here languishing with a correct implementation as-is. I really just want to get this committed and make incremental improvement, instead of re-evaluating the entire thing from scratch at this time.

I mean, look at this review. I created it on Feb 13th 2015. It's been getting ground through the review process for 8 months. Why can't we move forward?

sbenza accepted this revision.Oct 27 2015, 2:27 PM
sbenza edited edge metadata.

Just wanted to know it was considered, and it was.
It looks good to me then.

This revision is now accepted and ready to land.Oct 27 2015, 2:27 PM
alexfh accepted this revision.Oct 27 2015, 5:49 PM
alexfh edited edge metadata.

I mean, look at this review. I created it on Feb 13th 2015. It's been getting ground through the review process for 8 months. Why can't we move forward?

Yes, I know the long reviews may be annoying. And I apologize for missing the latest updates on this patch from June. Please feel free to ping the reviews when there are no updates on them for a few days.

There are still a few comments on this patch that were not completely addressed, but it seems fine to submit the patch at this stage and address the concerns after the patch is in.

I'll commit the patch for you.

clang-tidy/readability/RedundantVoidArgCheck.cpp
51 ↗(On Diff #28641)

I'm pretty sure the lexer is able to work on header files, if used correctly. If you need help figuring out what's wrong, please provide more information: which assert fails, call stack, and a reduced test case where the assert fails.

FYI, I had to update tests' RUN lines for some recent changes. I'm also going to move the check to the modernize module, where it belongs.

This revision was automatically updated to reflect the committed changes.