This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] WIP implement cppcoreguidelines check for mixed integer arithmetic
Needs ReviewPublic

Authored by JonasToth on Dec 5 2017, 2:15 PM.

Details

Summary

This check implements rules ES.100 and ES.102 of the CppCoreGuidelines
forbidding to mix signed and unsigned integer types in arithmetic expression.

It currently functions for the "normal" arithmetic but when implicit casts
to its happen for short and char i have no idea how to implement that
properly. Here some advice would be really helpful.

Event Timeline

JonasToth created this revision.Dec 5 2017, 2:15 PM
JonasToth updated this revision to Diff 125614.Dec 5 2017, 2:17 PM
  • [Misc] remove iostream
Eugene.Zelenko added inline comments.
docs/ReleaseNotes.rst
131

I think will be good idea to shorten description to one sentence and also make first sentence in documentation same.

aaron.ballman edited edge metadata.Dec 8 2017, 12:51 PM

The "enforcement" listed on the C++ Core Guidelines is very unhelpful, so it might be worth bringing it up as an issue on their bug tracker. ES.100 basically says "you know what we mean, wink wink" as enforcement and doesn't give any guidance as to what is safe or unsafe. It gives no exceptions, which I think is unlikely to be useful to most developers. For instance: void f(unsigned i) { if (i > 0) {} }, this is a mixed signed and unsigned comparison (literals are signed by default) -- should this check diagnose? How about unsigned int i = 12; i += 1;? ES.102 is equally as strange with "Flag unsigned literals (e.g. -2) used as container subscripts." That's a signed literal (2) with a negation operator -- do they mean "Flag container subscripts whose value is known to be negative", or something else?

clang-tidy/cppcoreguidelines/MixedIntArithmeticCheck.cpp
31

What about % or comparisons operators?

test/clang-tidy/cppcoreguidelines-mixed-int-arithmetic.cpp
167–168

I think these are intended to be flagged under ES.102. "Flag results of unsigned arithmetic assigned to or printed as signed."

The "enforcement" listed on the C++ Core Guidelines is very unhelpful, so it might be worth bringing it up as an issue on their bug tracker. ES.100 basically says "you know what we mean, wink wink" as enforcement and doesn't give any guidance as to what is safe or unsafe. It gives no exceptions, which I think is unlikely to be useful to most developers. For instance: void f(unsigned i) { if (i > 0) {} }, this is a mixed signed and unsigned comparison (literals are signed by default) -- should this check diagnose?

I think the guidelines mostly aim for actual calculations here. For ES.102 there is an exception to "we want actual modulo arithmetic". They seem mostly concerned with the mixing of the signedness of integer types that comes from explicitly expressing non-negative values with unsigned types. Because this is a common pattern (e.g. STL containers), the mixing from literals and co is quickly overseen.
See ES.106: Don’t try to avoid negative values by using unsigned

How about unsigned int i = 12; i += 1;? ES.102 is equally as strange with "Flag unsigned literals (e.g. -2) used as container subscripts." That's a signed literal (2) with a negation operator -- do they mean "Flag container subscripts whose value is known to be negative", or something else?

I think here ES.106 is again the rationale but your example shows a hole. unsigned int i = -1 is explicitly forbidden and the example shows a chain of implicit conversion of integer types as bad code.

My gut feeling is that the guidelines want programers to write auto i = 12u or unsigned int = 12u but they are not clear about that. Other rules that could relate to this are:
ES.11: Use auto to avoid redundant repetition of type names, ES.23: Prefer the {} initializer syntax (forbidding implicit conversions in initialization), ES.48: Avoid casts (maybe, but implicit conversion are not covered there).

I will ask them about this issue and hopefully we have some clarification. But I didn't have much success before :D

@aaron.ballman You are a maintainer for the cert rules, are you? How do you handle these common issues with integer types? Maybe we could already propose some guidance based on cert. It is mentioned as well written standard in the guidelines :)

clang-tidy/cppcoreguidelines/MixedIntArithmeticCheck.cpp
31

I forgot %. Comparing signed and unsigned should be covered by front-end warnings (-Wsign-compare)

test/clang-tidy/cppcoreguidelines-mixed-int-arithmetic.cpp
167–168

Yes they are but that was unintended :)

I should add a new section for such cases and include some logic in the matchers for it.

The "enforcement" listed on the C++ Core Guidelines is very unhelpful, so it might be worth bringing it up as an issue on their bug tracker. ES.100 basically says "you know what we mean, wink wink" as enforcement and doesn't give any guidance as to what is safe or unsafe. It gives no exceptions, which I think is unlikely to be useful to most developers. For instance: void f(unsigned i) { if (i > 0) {} }, this is a mixed signed and unsigned comparison (literals are signed by default) -- should this check diagnose?

I think the guidelines mostly aim for actual calculations here. For ES.102 there is an exception to "we want actual modulo arithmetic". They seem mostly concerned with the mixing of the signedness of integer types that comes from explicitly expressing non-negative values with unsigned types. Because this is a common pattern (e.g. STL containers), the mixing from literals and co is quickly overseen.
See ES.106: Don’t try to avoid negative values by using unsigned

How about unsigned int i = 12; i += 1;? ES.102 is equally as strange with "Flag unsigned literals (e.g. -2) used as container subscripts." That's a signed literal (2) with a negation operator -- do they mean "Flag container subscripts whose value is known to be negative", or something else?

I think here ES.106 is again the rationale but your example shows a hole. unsigned int i = -1 is explicitly forbidden and the example shows a chain of implicit conversion of integer types as bad code.

My gut feeling is that the guidelines want programers to write auto i = 12u or unsigned int = 12u but they are not clear about that. Other rules that could relate to this are:
ES.11: Use auto to avoid redundant repetition of type names, ES.23: Prefer the {} initializer syntax (forbidding implicit conversions in initialization), ES.48: Avoid casts (maybe, but implicit conversion are not covered there).

I will ask them about this issue and hopefully we have some clarification. But I didn't have much success before :D

Thanks! As it stands, I'm not able to determine whether your implementation actually does or does not enforce those rules. That's why I was wondering what the extent of the coverage is according to the rule authors, because I am not certain whether we're actually implementing the intent of the rules or not.

@aaron.ballman You are a maintainer for the cert rules, are you? How do you handle these common issues with integer types? Maybe we could already propose some guidance based on cert. It is mentioned as well written standard in the guidelines :)

Yes, I used to maintain the CERT rules when I worked for the SEI (and I authored a considerable number of them). CERT and the C++ Core Guidelines have a somewhat fundamental difference in opinion on how we write the rules. CERT flags things that are demonstrably going to lead to incorrectness (specifically, vulnerabilities). The C++ Core Guidelines flag things that could possibly lead to correctness issues but may also be perfectly correct code. Because of this difference, CERT rules are a bit harder to implement in clang-tidy because they often will have some component of data or flow analysis to them, while C++ Core Guidelines are often blanket bans.

That said, CERT's guidance on integers mostly falls back to the C rules: https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152052

I would recommend looking at the C++ Core Guideline wording and try to come up with example code that would run afoul of the rule to see if *you* think a diagnostic would be useful with that code. If you can't convince yourself that diagnosing the code is a good idea, it's worth bringing up to the authors to see if they agree. Similarly, see if you can find example code that would not run afoul of the rules but you think *should* to see if they agree. (You can use the CERT rules on integers to help give ideas.) Once you have a corpus of examples you want to see diagnosed and not diagnosed, try to devise (possibly new) wording for the enforcement section of the rule and see if the rule authors agree with the enforcement. From there, writing the check becomes a pretty simple exercise in translating the enforcement into a clang-tidy check.

JonasToth updated this revision to Diff 128870.Jan 7 2018, 4:31 AM
  • rebase after release of 6.0

rebase to 7.0.0

The guideline authors will focus on mixing integer length and integer signdness.

Thank you for the suggestion! Per our editor's discussion, we agree with "If one then mixes integer lengths and signedness in calculations confusion on what happens might occur" and would like to focus the discussion there.
signed and unsigned are defined in the standard and we cannot quibble with those but we can discuss mixed arithmetic. We should consider this separately.

Bjarne will add some text to this item.

alexfh removed a reviewer: alexfh.Mar 14 2018, 7:51 AM
JonasToth updated this revision to Diff 139706.Mar 24 2018, 4:21 AM
  • update to trunk
  • [Doc] update to new doc style
JonasToth updated this revision to Diff 159064.Aug 3 2018, 11:54 AM
  • get check up to date
  • Merge branch 'master' into check_mixed_arithmetic
JonasToth updated this revision to Diff 181877.Jan 15 2019, 1:42 PM
  • Merge branch 'master' into check_mixed_arithmetic, a lot has happened...
  • avoid bitrot
MyDeveloperDay added inline comments.
clang-tidy/cppcoreguidelines/MixedIntArithmeticCheck.h
2

I guess the license has to be updated

docs/ReleaseNotes.rst
216

remove

docs/clang-tidy/checks/cppcoreguidelines-mixed-int-arithmetic.rst
12

probably needs running though validate_check.py from { D55523} to make this 80 chars wide

Herald added a project: Restricted Project. · View Herald TranscriptFeb 9 2019, 2:18 AM
MyDeveloperDay added a project: Restricted Project.Feb 9 2019, 2:19 AM