Page MenuHomePhabricator

[clang-tidy] Add readability-implicit-bool-cast check
ClosedPublic

Authored by piotrdz on Oct 11 2015, 4:51 AM.

Details

Summary

This is another check that I ported to clang-tidy from my colobot-lint tool.

As previously discussed on cfe-dev mailing list, this is one of those checks that I think is general and useful enough for contribution to clang-tidy.

This patch contains implementation of check taken from colobot-lint, but also extended a great deal, including FixIt hints for automated refactoring, exhaustive testcases, and user documentation.

Diff Detail

Event Timeline

piotrdz updated this revision to Diff 37054.Oct 11 2015, 4:51 AM
piotrdz retitled this revision from to [clang-tidy] Add readability-implicit-bool-cast check.
piotrdz updated this object.
piotrdz added a reviewer: alexfh.

I agree with Piotr idea to introduce option to enable/disable check for pointer casts. Same should be done for integers where zero/non-zero value is used by system calls to indicate success/failure or in case of strings – end of string. Without such options check may be too verbose on legacy code bases.

May be dedicated category for casts checks should be created? We have google-readability-casting (through it duplicates in many aspects -Wold-style-cast, see PR24370), cppcoreguidelines-pro-type-const-cast and cppcoreguidelines-pro-type-reinterpret-cast. Other possible check is PR24525.

sbenza edited edge metadata.Oct 14 2015, 9:07 AM

Thanks for the check.
I know a lot of people that would appreciate if these implicit conversions didn't exist. =)

clang-tidy/readability/ImplicitBoolCastCheck.cpp
33

We should move most of these matchers to ASTMatchers.h as they are very general.
For example, we already have isInteger() as a QualType matcher so adding isBool() makes sense.

37

Maybe name it isMacroExpansion like the functions in SourceManager.

40

Please use the punctuation symbols for the boolean operators. (ie || instead of 'or')
The rest of the LLVM project uses them and the code should be consistent.

46

No need for the nested anyOf(). You are already in an anyOf() context.

47

Use isInTemplateInstantiation() instead of 'hasAncestor(functionDecl(ast_matchers::isTemplateInstantiation()))'

51

wouldn't the cxxBoolLiteral() also pass the isBoolType() check?

90

What about removing from unary_not->getLocStart() until unary_not->getSubExpr()->getLocStart()?
Or if you just want to remove one token do
CharSourceRange::getTokenRange(unary_not->getLocStart(),unary_not->getLocStart()).

102

You also need to check for CXXOperatorCallExpr, but only for operators that require the parens.

113

I think it is better if we construct a prefix and a suffix strings and then add a single hint for each.
Having multiple insert hints in the same location is brittle.

137

There are more literals. For example, NULL (GNUNullExpr)

147

Why are you using the _or_null overloads everywhere?
Do you expect them to be null?

154

You can move the declaration into the if statement. It doesn't pollute the outer scope and allows you to have simpler variable names.
Eg:

if (const auto *FL = llvm::dyn_cast<FloatingLiteral>(LiteralExpression)) {
  return ...
}
167

use isa<>. You don't really need the pointer.

172

Use llvm_unreachable() instead

179

Just CastExpression->getLocEnd()
You are using getTokenRange so it already covers from the start of the start token to the end of the end token.

199

Construct a string and collapse all the insertions into a single one.

224

You might also want to add a case for pointers here.
'false' converts to pointer.
ie this is "valid" code:

bool is_null(void* p) { return p == false; }

232

just getLocEnd(), same as above.

255

isn't this createImplicitCastFromBoolMatcher() ?
if not it is very close.
Maybe you can modify createImplicitCastFromBoolMatcher to reuse it. It is only used once so no reason for it to exist otherwise.

256

remove anyOf()

263

why this anyOf()? It must have a parent stmt.

275

Instead of the custom hasOpcode(BO_EQ) you can use the provided hasOperatorName("==")

286

move declarations into if statement. same as above

docs/clang-tidy/checks/readability-implicit-bool-cast.rst
58

why do we prefer a static_cast<> here, but we use != 0 in the if() above?

test/clang-tidy/readability-implicit-bool-cast.cpp
4

You could have a single function template:
template <typename T>
void functionTaking(T);

Then call with explicit template type:

functionTaking<int>(boolean);

Might make the code below more explicit as you can see the real C++ type is the expression.

23

move the // CHECK* comments after the line you are checking. That is the convention on other tests.

23

The first check should have the full message, including the [name-of-check] suffix to make sure it is generated correctly.

161

// namespace ignoreImplicitCastFromBoolInTemplateInstantiations

329

What would happen with code like:

assert(false && "some message");

?

Will this rewrite it to:

assert(false && true);

?

368

// namespace ignoreImplicitCastToBoolInTemplateInstantiations

381

// namespace ignoreUserDefinedConversionOperator

piotrdz marked 22 inline comments as done.Oct 15 2015, 4:31 PM

@sbenza: Thanks for the review. I have added my comments with expanations as necessary, and I have already fixed some issues you reported. I'm not posting a new diff at this time, as I want to finish with remaining issues, and then post updated diff once I'm ready with everything. I'll also work on adding a config option to conditionally allow for pointer to bool casts, as discussed on cfe-dev mailing list.

clang-tidy/readability/ImplicitBoolCastCheck.cpp
33

I agree. The only question is, should I do this as part of this patch, or isn't it better to check in this code first, and later move the matchers to clang in separate commit? A separate problem is that I don't have commit rights myself, so in any case, I must ask someone else to do it for me.

90

Ah, so that's how I do it.

I must say that it is totally not obvious how to work with source ranges in this or similar situations.

I came up with this version with getting token location and "magic" -1, because this is the first thing that actually worked for me. Previously, I was banging my head against the wall, trying to use something obvious like CharSourceRange::getTokenRange(ParentStatement->getLocStart(), ParentStatement->getLocEnd()). Needless to say, it doesn't work as you would expect.

102

And I thought I had everything covered. You're right, I'll need to fix that.

137

Fair point. Currently NULL will not work also because it's seen as macro, and so eliminated by my unless(isMacroExpansion()). That case (e.g. functionTakingBool(NULL) is actually a bit weird, because the whole ImplicitCastExpr is marked as expanded from NULL macro, and not where it actually occurs in code.

I'll try fixing it later, but I don't relish this. Macros are just evil.

147

It's just a habit I picked up early on when working with Clang. I ran into too many crashes due to null pointers, so I decided to always write defensive code. I guess now that I know more about ins and outs of Clang, I can do without it.

154

Ah, as irony would have it, it's an example of an implicit bool cast which my check would warn against. But as they say, for every rule, there is an exception.

224

No, it doesn't work like that. false will decay to integer constant and result in compile error of trying to compare pointer to an integer:

/tmp/test.cpp:1:35: error: comparison between pointer and integer ('void *' and 'int') [clang-diagnostic-error]
bool test(void* ptr) { return ptr == false; }

At least this we can be sure of - booleans will not convert to pointers, thank heavens for that.

255

This is matcher for implicit cast *to* bool, not *from* bool. They only seem to be similar. The only common part is this unless() part, which I extracted to function createExceptionCasesMatcher().

263

It doesn't have to be a Stmt. Consider: int integer = false;. Here, the parent of ImplicitCastExpr is a VarDecl.

docs/clang-tidy/checks/readability-implicit-bool-cast.rst
58

I can change it to ternary operator but it still looks ugly. Unfortunately, we run here into the same problem as in any other cast - it's just ugly code which pretends to be pretty.

The case of casting from something to bool type is easier to deal with, as such expression always corresponds to some kind of logical comparison. This check aims to make this comparison visible in code, rather than hidden, even under explicit cast like static_cast<bool>. So the choice of comparison operators comes naturally, and in my humble opinion, actually looks better in code than ugly static_cast.

But in the case of casting from bool to other type, we are casting away types, hoping that the conversion happens "out of the box". There just isn't a clear way of representing this in code, other than static_cast<type>. Okay, we can propose here a ternary operator as alternative: (cond) ? 1 : 0, but I feel that this is a bit too verbose, and actually less readable than the static_cast version. Besides, I'm not 100% sure, but I think it may actually change semantics (i.e. add branch to the output code where there wasn't any before). Casts such as these are just ugly and there is little we can do about that :-(.

I don't remember where, but I once read an interesting passage on the topic of casts. Quoting from memory:
"static_cast looks ugly in code, and it looks ugly on purpose. It is an indication that the programmer is trying to outsmart the type system in some way".

test/clang-tidy/readability-implicit-bool-cast.cpp
329

No, currently all macros are ignored, and personally, I think it is for the better. I simply don't like messing with macros, as that can have all sort of nasty consequences.

But even if the check would perform such replacement, I have to ask: is it really the fault of this check? Using assert() in such way is a prime example of "hacked" solution to a missing functionality (adding message displayed with the assert).

It would work perfectly, if it was used in a bit more sane manner, such as:

#define assert_with_message(condition, message) assert(condition && message)

assert_with_message(false, "some message");

Then, the implicit cast in macro body would be ignored, and the actual assert statement written in such way would be clearer.

piotrdz updated this revision to Diff 37707.Oct 18 2015, 6:54 AM
piotrdz edited edge metadata.
piotrdz marked 6 inline comments as done.

I fixed the issues found in review and added a few other enhancements:

  • add config options to allow integer and pointer casts in conditionals, e.g. "if (pointer)",
  • add support for NULL macro cast to boolean,
  • ignore usage of variable declaration in control statements, e.g. "if (int x = function())",
  • add support case of false literal to pointer conversion in pre-C++11 standards
  • propose NULL instead of nullptr in replacements in pre-C++11 standards

I think this is a more complete version, and we're getting closer to getting my patch committed. @sbenza: Do you agree?

clang-tidy/readability/ImplicitBoolCastCheck.cpp
154

Also, it turns out, I missed this case in my tests. I added appropriate exclusion for using variable declarations in this way.

224

Ah, my bad. clang-tidy tests use by default C++11. Of course this code will be accepted in older C++ standards. I added provisions for that case, and also changed nullptr to NULL in suggested replacements when nullptr is not available.

sbenza added inline comments.Oct 20 2015, 1:59 PM
clang-tidy/readability/ImplicitBoolCastCheck.cpp
34

I can submit the matchers for you.
I'll do that when the change is stable.

44

I was a little confused by the name. I thought the expansion was somehow a null expansion, not that it is the NULL macro.
Maybe should be isNULLMacroExpansion?

88

NULL might need an #include

90

A little explanation...
When you get getLocStart/getLocEnd you get the location of the start of the first token and the start of the last token.

Eg: in 'foo + bar' getLocStart points to 'f' and getLocEnd points to 'b'.

When you create a CharSourceRange you have two options:

  • getTokenRange(): spans from the first char of the token at start to past-the-end of the token at end
  • getCharRange(): spans the locations passed exactly.

getTokenRange(loc at 'f', loc at 'b') will give a range spanning 'foo + bar'
getCharRange(loc at 'f', loc at 'b') will give a range spanning 'foo + '

They are both useful. Sometimes you need a little extra help from the lexer of manually offsetting the locations.

Note that if the node spans only one token, then start/end will be the same location.

120

fold declaration into the if() condition

137

Macros are the nemesis of clang-tidy checks...

docs/clang-tidy/checks/readability-implicit-bool-cast.rst
58

I agree with all that, but that is not what the comment is saying.
As you say, we want to say 'integer != 0' and not 'static_cast<bool>(integer)'

test/clang-tidy/readability-implicit-bool-cast-cxx98.cpp
21

Don't add the [readability...] on each hit. Just the first one.

test/clang-tidy/readability-implicit-bool-cast.cpp
307

Just out of curiosity, does it handle -0.0 correctly?

329

Is there any non-macro assert-like API where people do the same trick?

piotrdz updated this revision to Diff 37939.Oct 20 2015, 3:15 PM
piotrdz marked 4 inline comments as done.

Addressed latest review issues.

There you go, third revision. I think this version should be ready for commit.

clang-tidy/readability/ImplicitBoolCastCheck.cpp
88

I changed it to 0.

91

Thanks for explaining that, it makes sense to me now. Perhaps such description could be added to Doxygen documentation?

test/clang-tidy/readability-implicit-bool-cast.cpp
307

Yes, it will. -0.0 is parsed as expression using binary operator (operator unary minus + double const). It will be turned into (-0.0) != 0.0 which will evaluate to false, the same as original -0.0.

sbenza added inline comments.Oct 21 2015, 12:33 PM
docs/clang-tidy/checks/readability-implicit-bool-cast.rst
59

This is still a thing.
The replacement is not what this comment says.

test/clang-tidy/readability-implicit-bool-cast.cpp
307

This is for literals, so it will try to convert to the bool literal instead of using != 0.0
However, the check above does:

(FloatLit->getValue().bitcastToAPInt() == 0) ? "false" : "true";

and that will return "true" for -0.0, I think.

piotrdz updated this revision to Diff 38044.Oct 21 2015, 1:35 PM

Addressed last issues in review

Any other comments?

docs/clang-tidy/checks/readability-implicit-bool-cast.rst
59

Sorry, somehow I missed that one. I meant to show casts from bool to int here, so the code is wrong, as you say.

test/clang-tidy/readability-implicit-bool-cast.cpp
307

No, it doesn't work like that. As I said:
-0.0 will *never* be parsed as a literal. These are two tokens in source code: unary minus AND a 0.0 floating point literal. The standard says that literals are only numbers and - is never a part of a literal.

However, I can grant that someone, somewhere might invent negative literals, so to be absolutely on the safe side, I changed the code in question to take the absolute value of literal.

And to prove that -0.0 is parsed as an expression, I added tests for that.

Also note that (-0.0) != 0.0 is perfectly fine as a replacement, as it still evaluates to false. The standard says that -0.0 behaves just like regular 0.0 in most aspects, including equality with 0.0 and so on. The only difference is seen in operations like division or multiplication, where it changes the sign of result accordingly.

sbenza accepted this revision.Oct 21 2015, 2:03 PM
sbenza edited edge metadata.
sbenza added inline comments.
test/clang-tidy/readability-implicit-bool-cast.cpp
307

You are right.
I was thinking that -0.0 was a FloatingLiteral, but it is in fact a UnaryOperator with a FloatingLiteral inside. Sorry for the noise.

This revision is now accepted and ready to land.Oct 21 2015, 2:03 PM

Thanks for accepting the review. However, I still have to ask someone to make the commit. Or should I obtain commit rights myself? If so, how do I do that?

I have obtained now commit rights but I want to make sure: I can commit this patch now that it is accepted? That is, accepting the review is the same as saying "LGTM", right?

Accepted status is what is most important.

Please don't forget to mention Differential revision in commit comment, so is revision will be closed automatically. See example.

Please also be near computer for ~hour, just to make sure all build bots finished successfully. I learned this after breaking LLDB MSVC builds couple of times :-)

This revision was automatically updated to reflect the committed changes.

@eugene: thanks for clarifying that and your advice. I made the commit and it seems I didn't break anything, yet :-).