This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add portability-non-portable-integer-constant check
Needs RevisionPublic

Authored by Discookie on Mar 23 2023, 5:16 AM.

Details

Summary

This check finds integer literals that are being used in a non-portable manner.

Currently, the check detects cases where maximum or minimum values should be used
instead, as well as error-prone integer literals having leading zeroes, or
relying on most significant bits.

Diff Detail

Event Timeline

Discookie created this revision.Mar 23 2023, 5:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 5:16 AM
Discookie requested review of this revision.Mar 23 2023, 5:16 AM

Please add alias entry in Release Notes.

clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp
95

Excessive newline.

clang-tools-extra/docs/ReleaseNotes.rst
136

Incorrect entry.

clang-tools-extra/docs/clang-tidy/checks/portability/non-portable-integer-constant.rst
4

Please make is same length as title.

PiotrZSL added inline comments.Mar 23 2023, 8:21 AM
clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp
92

consider excluding system headers, or even exclude things like "equals(0)" to speed up check.

103

use type instead of auto

146

add here some hint to developer what to do, in cpp suggest using std::numeric_limits, in C some other MAX macro.
Same for other.

clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.h
20

this check is already in portability group, no need to duplicate "portable".
Maybe just portability-integer-constant

clang-tools-extra/docs/clang-tidy/checks/portability/non-portable-integer-constant.rst
22

well, not, right way would be using std::numeric_limits<std::uint32_t>::max().
And unsigned long can be 4 bytes or 8 bytes or anything compiler decide.
So this example got more issues., and 0XFFF.. is a last of them, and -1 is not valid in such case.

clang-tools-extra/test/clang-tidy/checkers/portability/non-portable-integer-constant.cpp
1

no need to limit this check to c++17 or later.
when it's targeting all versions of C and C++

229

exclude system macros, otherwise it may warn for things like ULONG_MAX

PiotrZSL requested changes to this revision.Mar 23 2023, 3:00 PM

Additionally, fix build.
Once you finish making changes to this review, switch it back to ready for review.

This revision now requires changes to proceed.Mar 23 2023, 3:00 PM
LegalizeAdulthood resigned from this revision.Mar 28 2023, 1:23 PM
whisperity requested changes to this revision.Jul 6 2023, 8:34 AM

Earlier (IIRC in March) we did an internal test of the check and the following results were obtained. The results are kinda weird, to say the least. Numbers are after CodeChecker has done a serverside uniqueing (most likely things such as "if X in H.h is reported in the compilation/analysis of A.cpp and B.cpp then store X only once") of the bugs.

  • memcache-d: 43
  • tmux: 4
  • curl: 283
  • twin: 161
  • vim: 2 206
  • OpenSSL: 23 724
  • sqlite: 752
  • ffmpeg: 87 121
  • postgres: 116 635
  • tinyxml2: 6
  • libwebm: 34
  • xerces: 48 732
  • bitcoin: 4 429
  • protobuf: 1 405
  • codechecker-ldlogger: 0
  • ACID: 4 831
  • Qt: 34 973
  • Contour v2: 14 246
  • LLVM: 6 018

No crashes (of the checker) were observed. Most of the large outlier numbers are attributable to heavy mathematics (ffmpeg, Qt), crypto (OpenSSL, PostgreSQL), or character encoding (Xerces, Postgres, Contour) operations in the project.

Unfortunately, this is a very significant amount of false positives to the point where the consumption of results is slowed down not on the mental load side, but also the technical side (such as triggering server hangs in CodeChecker and whatnot...). However, after putting some thoughts into it and trying to hand-clusterise some of the reports I've found a few significant opportunities where this check should be improved.

Note: These improvements might be worthwhile to implement as a follow-up patch which is merged together with this current review. As it stands right now, this check is way too noisy (on maths-based projects) to be merged!

1. Enums

One is when the literal is used to initialise an enum. So far I don't know whether simply just ignoring enums is good, or would it be worth to somehow emit a "unified" warning for the entire enum that follows the pattern as shown below.

2. (u)?int([\d]+)_t

The other are cases when the user explicitly initialises a fixed-width (or *constrained-width*) variable, such as int64_t or uint16_t with a literal. Right now, the implementation hard matches only the IntegerLiteral instances, i.e., lines such as static constexpr std::uint64_t X = 0x07F000000 where the width and characteristics of the literal might just as well *exactly* and *properly* match the variable it is assigned to (granted, we only match and silence Decls for this, and lose expressivity on expressions), and do not report then.

3. Large arrays of magic literals

While this is not the silencing of a "valid" use case where the check, as of the current implementation, it is too verbose for, but I believe we could reasonably cull the noisiness of the check by simply saying that if there are array initialisers or brace-init-lists or whatever that contain multiple non-portable literals, then either we simply suck it up and do not warn for any of the elements... or for the sake of collegiality give one warning for the user. The former option would be the most silencing of all, while the latter would allow the user to do a single // NOLINT for the array instead of having to deal with /* NOLINTBEGIN */ ... /* NOLINTEND */.

Several chunks of noisy warnings come out of places like SHA-256, FFT, and various other "very mathematical" contexts where a large constant array is created with very specific proven values, all of which expressed and reported as "non-portable literals". It is no use to do 500000 warnings for a single array. (This is perhaps after all the same(-ish) family as the "enum" problem.)

clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.cpp
87

(Unrealistic nit: This will consume hypothetical additional literals such as 0!ZZZbb= as if they were decimal literals.)

153–154

(Style nit: These lines could be joined and still fit 80-col, I think.)

clang-tools-extra/clang-tidy/portability/NonPortableIntegerConstantCheck.h
16

I am still not entirely convinced whether this one-line summary makes too much sense to a user who does not know the Standard out of heart... As we hunt for bit widths more or less perhaps we could weave this notion into the single sentence summary saying that integers with "non-portable representations because of bit width"... Will need to give this some thought.

clang-tools-extra/test/clang-tidy/checkers/portability/non-portable-integer-constant.cpp
1

Binary literals require C++17 and some version of newer C standard (think C11?). However, technically it could work to use c++11-or-later and then use the __cplusplus macro to only compile the binary literal test when appropriately C++17.

72–74

(Style nit: Did we lose a test here, or is this just an accidental contiguous area of newlines?)