Page MenuHomePhabricator

new clang-tidy checker misc-long-cast
ClosedPublic

Authored by danielmarjamaki on Jan 19 2016, 2:26 AM.

Details

Summary

This is a new checker for clang-tidy.

This checker will warn when there is a explicit redundant cast of a calculation
result to a bigger type. If the intention of the cast is to avoid loss of
precision then the cast is misplaced, and there can be loss of precision.
Otherwise the cast is ineffective.

No warning is written when it is seen that the cast is ineffective.

I tested this on debian projects.. and I see quite little noise. in 1212 projects I got 76 warnings. The cast is either ineffective or there is loss of precision in all these cases. I don't see warnings where I think the warning is wrong. But I don't want to warn when it can be determined that the cast is just ineffective, I think that the perfect checker would be better at determining this.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Eugene.Zelenko set the repository for this revision to rL LLVM.
Eugene.Zelenko removed a subscriber: alexfh.

Clang-tidy has 6 cast related checks. May be this is good time to introduce dedicated category for them?

Why not supply a fixit that removes the cast?

Why not supply a fixit that removes the cast?

I am skeptic. There are different valid fixes.

Example code:

l = (long)(a*1000);

Fix1:

l = ((long)a * 1000);

Fix2:

l = (a * (long)1000);

Fix3:

l = (a * 1000L);
danielmarjamaki removed rL LLVM as the repository for this revision.

Fixed review comment; s/checker/check/

danielmarjamaki marked an inline comment as done.Jan 20 2016, 5:08 AM

Clang-tidy has 6 cast related checks. May be this is good time to introduce dedicated category for them?

I am not against this.

I don't know which ones you are thinking about.. but if a check is not in the misc category now then it should probably not be moved.

I can do this.. but I would prefer to do it separately. If there is agreement that this should be done I can create a new revision that move the cast checks. I suggest that is done before this long-cast checker is committed.

danielmarjamaki set the repository for this revision to rL LLVM.Jan 20 2016, 5:18 AM

If you state what the check does, then

Why not supply a fixit that removes the cast?

I am skeptic. There are different valid fixes.

Example code:

l = (long)(a*1000);

Fix1:

l = ((long)a * 1000);

Fix2:

l = (a * (long)1000);

Fix3:

l = (a * 1000L);

The way I see it, the check is complaining about the pointless cast and pointing the finger at the beginning of the cast. To me, my expectation is that the suggested fix is none of the options you gave but instead:

l = (a*1000);

In other words, the cast is superfluous for the code as written. Omitting the cast directly expresses the code as written without unnecessary casting. Because the check can't know your intent, it hints that you may have intended something else, but for what was written, this is the semantics.

If we want to get into detecting numeric overflow, then we're talking some kind of run-time assisted check and it's beyond the scope of clang-tidy.

Basically, I look at this unnecessary cast as clutter. I'm fine with static analysis that warns about intermediate expressions possibly overflowing when assigned to a "larger" type, but to me that sounds like something for the static analyzer and not for tidy.

In other words, I think of clang-tidy as a refactoring tool, not a static analysis tool.

clang-tidy/misc/LongCastCheck.cpp
43 ↗(On Diff #45378)

Prefer anonymous namespace over static to scope visibility.

97 ↗(On Diff #45378)

Why don't you check for casting a float expression to a double or long double?

Isn't this the exact same issue?

If so, add a test case for casting a float expression to double and a test case for casting a double expression to a long double.

clang-tidy/misc/MiscTidyModule.cpp
61 ↗(On Diff #45378)

The documentation describes this check as one that looks for a cast to a "bigger type", but the name of the check implies that it only works for casts to long.

The name of the check should be made more generic to reflect reality.

Perhaps misc-redundant-cast-to-larger-type or misc-redundant-bigger-type-cast?

docs/clang-tidy/checks/misc-long-cast.rst
11 ↗(On Diff #45378)

Please add an example for another type other than long, such as casting a float expression to a double.

If you state what the check does, then

Why not supply a fixit that removes the cast?

I am skeptic. There are different valid fixes.

Example code:

l = (long)(a*1000);

Fix1:

l = ((long)a * 1000);

Fix2:

l = (a * (long)1000);

Fix3:

l = (a * 1000L);

The way I see it, the check is complaining about the pointless cast and pointing the finger at the beginning of the cast. To me, my expectation is that the suggested fix is none of the options you gave but instead:

l = (a*1000);

In other words, the cast is superfluous for the code as written. Omitting the cast directly expresses the code as written without unnecessary casting. Because the check can't know your intent, it hints that you may have intended something else, but for what was written, this is the semantics.

If we want to get into detecting numeric overflow, then we're talking some kind of run-time assisted check and it's beyond the scope of clang-tidy.

Basically, I look at this unnecessary cast as clutter. I'm fine with static analysis that warns about intermediate expressions possibly overflowing when assigned to a "larger" type, but to me that sounds like something for the static analyzer and not for tidy.

In other words, I think of clang-tidy as a refactoring tool, not a static analysis tool.

Why is there a cast in the first place? It is unlikely that the programmer added a useless cast for no reason.

I say that it looks like the programmer wanted to avoid loss of precision.

If we remove the useless cast there will still be loss of precision. And the warnings go away so maybe a sloppy programmer thinks that all is good.

I tested this on debian projects.. and I see quite little noise. in 1212 projects I got 76 warnings.

In my opinion the proper fix for these are to cast the expressions properly.

danielmarjamaki marked an inline comment as done.Jan 21 2016, 6:19 AM
danielmarjamaki added inline comments.
clang-tidy/misc/LongCastCheck.cpp
43 ↗(On Diff #45378)

As I read the LLVM coding standards we prefer static for functions.

http://llvm.org/docs/CodingStandards.html#anonymous-namespaces

97 ↗(On Diff #45378)

in theory yes.. but somehow that feels strange to me. yes there will possibly be loss of precision in some decimals, that is normal when using floating point numbers. if such loss of precision would be unwanted then float should be avoided to start with.

so I do agree in theory but I don't think I would feel good about adding such warnings.

clang-tidy/misc/MiscTidyModule.cpp
61 ↗(On Diff #45378)

Yes I agree.. will fix..

I used long because that is the typical case that I envision.

alexfh added inline comments.Jan 21 2016, 6:43 AM
clang-tidy/misc/LongCastCheck.cpp
21 ↗(On Diff #45378)

Any reason to limit this to returnStmt, varDecl and assignment? This pattern can appear in any expression and lead to equally incorrect results.

22 ↗(On Diff #45378)

Why only c-style casts? The problem applies to static_cast as well. Not sure how likely a reinterpret_cast is to appear in this situation, maybe it should be handled as well.

23 ↗(On Diff #45378)

Why other operators are not considered here? Subtraction, for example, can suffer the same problem.

66 ↗(On Diff #45378)

No else after return, please.

71 ↗(On Diff #45378)

Please remove the extra line breaks.

100 ↗(On Diff #45378)

Ctx or Context, please.

clang-tidy/misc/MiscTidyModule.cpp
61 ↗(On Diff #45378)

How about misc-misplaced-widening-cast?

docs/clang-tidy/checks/misc-long-cast.rst
11 ↗(On Diff #45378)

Is the use of two colons intended?

test/clang-tidy/misc-long-cast.cpp
1 ↗(On Diff #45378)

Please add tests with templates: casting to or from a dependent type shouldn't trigger the warning.

danielmarjamaki marked an inline comment as done.Jan 21 2016, 7:20 AM
danielmarjamaki added inline comments.
clang-tidy/misc/LongCastCheck.cpp
21 ↗(On Diff #45378)

Yes. There could be some extra pattern we want to look for later. But I don't want to warn in general. Example:

int A,B,C;
long ABC = (long)(A * B) + C;

That code makes perfect sense if the calculation A*B won't overflow and you want that the addition is done with long precision.

22 ↗(On Diff #45378)

ok, I will handle that too in next patch.

23 ↗(On Diff #45378)

yes. true. the ~ operator also. I can't say how noisy that would be but I will test it.

clang-tidy/misc/MiscTidyModule.cpp
61 ↗(On Diff #45378)

I already changed.. but I like misc-misplaced-widening-cast better so I will change again..

alexfh added inline comments.Jan 21 2016, 7:28 AM
clang-tidy/misc/LongCastCheck.cpp
21 ↗(On Diff #45378)

Then you don't need to repeat the inner matcher. Assign it to an auto variable and then use it in the other matchers:

auto CastExpr = cStyleCastExpr(...);
Finder->addMatcher(returnStmt(has(CastExpr)));
Finder->addMatcher(varDecl(has(CastExpr)));
...
55 ↗(On Diff #45378)

nit: Braces are not needed around one-line if bodies.

61 ↗(On Diff #45378)

Please insert braces around the if body here: it's hard to see its scope otherwise.

Why is there a cast in the first place? It is unlikely that the programmer added a useless cast for no reason.

If this has universally been your experience on a code base, then I say you should count your blessings that you have worked only with such good programmers!

Sadly, I have encountered many code bases where such silly things were written. Sometimes people just write the cast because that is the return type of the function.

At any rate, the check simply cannot divine programmer intent from the source code alone, which is why for clang-tidy (not static analyzer) purposes, my expectation is that clang-tidy would say "this cast is redundant, so I suggest you remove it". Hence the fixit removes the cast.

Only a human being can look at the change suggested by clang-tidy and say "hmm.... looks like the correct thing here is that there was some loss of precision in the intermediate result, so I should fix that instead".

If a developer is going to blindly accept the output of clang-tidy, then there is nothing we can do about that. (A case in point: I am currently reviewing thousands of changes proposed by google-readability-braces-around-statements; it has quite a few bugs and can actually generate syntactically invalid code from its suggested fixits!) At a bare minimum, clang-tidy should never offer fixits that change the meaning of the code and my suggested fixit does not do that. The other proposed fixits discussed on this thread do change the meaning of the code by changing the precision of the intermediate calculations and are therefore not refactorings. They are bug fixes. Without dynamic analysis, only a human can decide on their validity by considering the larger context of the code.

Again, my feeling is that clang-tidy should be a refactoring tool when it comes to fixits -- the suggested changes should never change the semantic meaning of the code.

clang-tidy/misc/LongCastCheck.cpp
97 ↗(On Diff #45378)

For floating-point quantities, when I think of the term "precision" I am thinking of the number of bits allocated to the mantissa. This may or may not be correct terminology according to floating-point experts, but from what I can tell it seems to agree with how the term is used on wikipedias article on floating-point.

So while digits are technically lost when we add a small floating-point quantity to a large floating-point quantity (the large quantity gobbles up all the bits of the mantissa and the small quantity has its least-significant mantissa bits discarded), the precision of the result isn't changed -- the number of bits in the mantissa is the same for the result as it was for the inputs.

To then take a quantity of N bits of mantissa and cast that to a quantity of M bits of mantissa where M > N seems just as pointless as the approach of casting an int to a long. The extra tokens for casting do absolutely nothing and are redundant as far as the computation as written is concerned.

clang-tidy/misc/MiscTidyModule.cpp
61 ↗(On Diff #45378)

Yes, Alex's name suggestion is better than mine. Yay code reviews!

If you state what the check does, then

Why not supply a fixit that removes the cast?

I am skeptic. There are different valid fixes.

Example code:

l = (long)(a*1000);

Fix1:

l = ((long)a * 1000);

Fix2:

l = (a * (long)1000);

Fix3:

l = (a * 1000L);

The way I see it, the check is complaining about the pointless cast and pointing the finger at the beginning of the cast. To me, my expectation is that the suggested fix is none of the options you gave but instead:

l = (a*1000);

I expect that this warning will in most cases be fixed by moving the cast.

I believe there is often a bug in such code.

If you want that we hide these bugs by removing the casts I can do it.. but I personally think that is wrong.

And yes clang-tidy does not have static analysis. that would not help much to determine proper fix anyway - if we can see there is overflow then should it be fixed with fix 1,2,3 or is the overflow intentional.

I find this an interesting discussion. I do not mean to imply that my remarks constitute any sort of demand that this check produce my suggestion for a fixit.

I expect that this warning will in most cases be fixed by moving the cast.

For a seasoned C++ developer having authored the offending code, I agree. However, I have seen lots of code where things were done that were just unnecessary.

I believe there is often a bug in such code.

I think it depends on who was writing it.

If you want that we hide these bugs by removing the casts I can do it.. but I personally think that is wrong.

I'm simply saying that if it is going to suggest a fixit, then the fixit suggested should be one that doesn't change the meaning of the code as written.

I agree that such a suggested fixit would not be the correct change in all cases, since clang-tidy can't know the original intent of the author.

However, suppose we had code like this:

long f(int a) {
  return a*1000;
}

This code is not going to issue any warnings about potential loss of precision, but by simply adding the cast on the result expression, suddenly I'm getting warnings about possible loss of precision.

In both cases, the precision loss is a potentiality.

The check assumes that placing the cast there is an indicator of a bug.

When I look at such code with a redundant cast, I would infer the opposite: the programmer is adding stuff that isn't necessary.

Maybe our differing attitude on what the cast implies is a reflection of the difference in our experiences in working on different code bases. Am I just so unlucky as to have been exposed to so much badly written C++ over the years? The checks that I have added and would like to continue to add are primarily focused on "cleaning up the mess" that I've seen in so many C++ code bases over the years. I have seen many uses of unnecessary casting over the years, as opposed to misplaced casts that were intended to prevent loss of precision.

Without access to the original intent (locked up in someone's brain somewhere), whether or not a loss of precision is a problem can only be determined by dynamic analysis or some other form of runtime testing.

Perhaps because both of our interpretations are valid but different points of view, it is best that this check not offer any fixit at all. If it does offer a fixit, then there should be a configuration option to say which kind of fixit to prefer. A developer can then use a workflow like this:

  1. Run the tool with no fixits to see if there are any potential problems.
  2. Examine each instance and decide if the cast is redundant or misplaced.
  3. For redundant casts, run the tool again configured to drop the redundant cast and selectively apply the fixits on the appropriate instances.
  4. For misplaced casts, run the tool again configured to move the cast and selectively apply the fixits on the appropriate instances.

This leaves the tool operating conservatively by default, yet still providing the developer with the means of deciding the intent on each flagged instance and applying the appropriate fix for each case.

danielmarjamaki marked 3 inline comments as done.Jan 25 2016, 2:45 AM
danielmarjamaki added inline comments.
docs/clang-tidy/checks/misc-long-cast.rst
11 ↗(On Diff #45378)

The use of two colons is intended. When I look at the output here: http://rst.ninjs.org/ it will say "System Message: WARNING/2 (<string>, line 15)" if I use a single colon. If you have a different tool to look at rst that I should use let me know.

danielmarjamaki removed rL LLVM as the repository for this revision.

Fixed review comments

danielmarjamaki marked 5 inline comments as done.Jan 25 2016, 4:23 AM
danielmarjamaki marked 13 inline comments as done.Jan 25 2016, 4:25 AM
clang-tidy/misc/LongCastCheck.cpp
97 ↗(On Diff #45378)

It does feel strange to me. But I think you are technically right. I can implement it and see what warnings that generates.

For information I have now tested latest patch.

Statistics:
projects: 1804
files: 85976
warnings: 100

There were some new interesting warnings and imho they were TP.

There were some new interesting warnings and imho they were TP.

TP?

Refactoring the AST matching

There were some new interesting warnings and imho they were TP.

TP?

Sorry.. True Positive

clang-tidy/misc/MisplacedWideningCastCheck.cpp
32 ↗(On Diff #45854)

I would like to use a anyOf(cStyleCastExpr(..), cxxStaticCastExpr(..), ..) ... would that be possible somehow?

I would also like to use anyOf(binaryOperator(..), unaryOperator(..)) is that possible?

danielmarjamaki marked an inline comment as done.Jan 28 2016, 12:38 AM
danielmarjamaki added inline comments.
clang-tidy/misc/MisplacedWideningCastCheck.cpp
33 ↗(On Diff #46235)

I have refactored these expressions in latest patch. I did not know I have to use stmt().

danielmarjamaki marked an inline comment as done.Feb 5 2016, 12:01 AM

ping?

alexfh added inline comments.Feb 5 2016, 6:35 AM
clang-tidy/misc/MisplacedWideningCastCheck.cpp
20 ↗(On Diff #46235)

It makes sense to use the closest common parent, which is expr here, not stmt.

27 ↗(On Diff #46235)

You can use explicitCastExpr instead of stmt and restructure the matcher to reduce code duplication:

explicitCastExpr(anyOf(cStyleCastExpr(), cxxStaticCastExpr(),
                       cxxReinterpretCastExpr()),
                 has(Calc))
31 ↗(On Diff #46235)

FYI, these matchers can be combined using anyOf. Not sure whether this will be better (readability-wise and performance-wise) or not.

78 ↗(On Diff #46235)

IIUC, both "cast" and "calc" are bound in non-optional branches of the matcher, so Cast and Calc should never be nullptr. Please move the !Cast and !Calc checks from ifs to asserts.

88 ↗(On Diff #46235)

You should be able to check this in the matcher (using hasType(isInteger()) for the expr matcher and a bit more specific hasDestinationType(isInteger()) for explicitCastExpr).

clang-tidy/misc/MisplacedWideningCastCheck.h
16 ↗(On Diff #46235)

Please add namespace misc {. BTW, do you know the add_new_check.py script that uses the currently recommended template?

21 ↗(On Diff #46235)

Please add a link to the user docs:

// For the user-facing documentation see:
// http://clang.llvm.org/extra/clang-tidy/checks/misc-misplaced-widening-cast.html

(and next time use add_new_check.py ;)

docs/clang-tidy/checks/misc-misplaced-widening-cast.rst
1 ↗(On Diff #46235)

If you wonder how to test the RST you write:

  1. when running cmake, specify -DCLANG_TOOLS_EXTRA_INCLUDE_DOCS=ON (you can also turn on LLVM_INCLUDE_DOCS and CLANG_INCLUDE_DOCS, if you wish, but this shouldn't be necessary)
  2. make/ninja docs-clang-tools-html, the results should be generated in "<your-build-directory>/tools/clang/tools/extra/docs/html"
4 ↗(On Diff #46235)

nit: Please make the underline the same length as the previous line.

test/clang-tidy/misc-misplaced-widening-cast.cpp
31 ↗(On Diff #46235)

Please use proper punctuation and capitalization in the comments.

danielmarjamaki added inline comments.Feb 8 2016, 5:42 AM
clang-tidy/misc/MisplacedWideningCastCheck.cpp
31 ↗(On Diff #46235)

hmm.. are you sure .. for instance this does not work: stmt(anyOf(returnStmt(), varDecl()))

clang-tidy/misc/MisplacedWideningCastCheck.h
16 ↗(On Diff #46235)

yes I used add_new_check.py to create this checker, thank you.. so I did not think I would have to run it again. I'll try to rerun add_new_check.py in the future.

docs/clang-tidy/checks/misc-misplaced-widening-cast.rst
1 ↗(On Diff #46235)

thanks.. something does not work.

mkdir ~/llvm/build && cd ~/llvm/build
cmake -DCLANG_TOOLS_EXTRA_INCLUDE_DOCS=ON ..
make -j4
make docs-clang-tools-html

>

make: *** No rule to create target ”docs-clang-tools-html”.  Stopping.

I have debian on my computer.

Fixes according to review comments.

danielmarjamaki marked 9 inline comments as done.

Fixed review comments

clang-tidy/misc/MisplacedWideningCastCheck.cpp
79 ↗(On Diff #47181)

I looked in another checker , that did not have a condition nor assert so I removed it.

clang-tidy/misc/MisplacedWideningCastCheck.h
17 ↗(On Diff #47181)

I have rerun add_new_check.py for my latest patch.

docs/clang-tidy/checks/misc-misplaced-widening-cast.rst
5 ↗(On Diff #47181)

yes good catch.

danielmarjamaki added inline comments.Feb 8 2016, 7:12 AM
docs/clang-tidy/checks/misc-misplaced-widening-cast.rst
2 ↗(On Diff #47182)

For information, it seem I had to use -DLLVM_ENABLE_SPHINX=ON also ..

alexfh accepted this revision.Feb 8 2016, 7:55 AM
alexfh edited edge metadata.

Looks good after fixing capitalization in comments.

Thank you!

clang-tidy/misc/MisplacedWideningCastCheck.cpp
32 ↗(On Diff #47182)

Err, no. I was wrong. Not all of them can be combined: the stmts and decls still need to be separate, so just ignore that comment ;)

80 ↗(On Diff #47182)

The assert here could simplify debugging a bit in case the matchers are changed and the invariant doesn't hold any more (and the check starts crashing after this line). I don't feel strongly about the presence of an assert in similar places, but there definitely should be no if.

docs/clang-tidy/checks/misc-misplaced-widening-cast.rst
2 ↗(On Diff #47182)

Yes, thanks. Apparently, I haven't found all relevant flags in ccmake.

test/clang-tidy/misc-misplaced-widening-cast.cpp
32 ↗(On Diff #47182)

You missed the "capitalization" part ;)

This revision is now accepted and ready to land.Feb 8 2016, 7:55 AM
This revision was automatically updated to reflect the committed changes.
clang-tools-extra/trunk/clang-tidy/misc/MisplacedWideningCastCheck.cpp
21–23

Sorry for the late observation, but why doesn't this check for % and / operators?

clang-tools-extra/trunk/clang-tidy/misc/MisplacedWideningCastCheck.cpp
21–23

That is intentional.. you can't get overflow with / , % , & , | , etc...