This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add a check for "magic numbers"
ClosedPublic

Authored by 0x8000-0000 on Jul 9 2018, 5:44 PM.

Details

Summary

Add a clang-tidy check for "magic numbers", integers and floating point values embedded in the code instead of using symbols or constants.

Bad example:

double circleArea = 3.1415926535 * radius * radius;
double totalCharge = 1.08 * itemPrice;

Good example:

double circleArea = M_PI * radius * radius;
double totalCharge = (1.0 + TAX_RATE) * itemPrice;

This check supports HICPP 5.1.1 Use symbolic names instead of literal values in code and Rule ES.45: Avoid “magic constants”; use symbolic constants

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
JonasToth added inline comments.Jul 17 2018, 7:27 AM
clang-tidy/readability/MagicNumbersCheck.cpp
43

Use the type the function returns (what auto would deduce to).

Adding C++ Core Guidelines alias is definitely low-hanging fruit which could be implemented within this patch.

clang-tidy/readability/MagicNumbersCheck.cpp
15

Please remove empty line.

31

auto could be used here because it's range loop over container.

clang-tidy/readability/MagicNumbersCheck.h
15

Please remove empty line and also include <vector>.

test/clang-tidy/readability-magic-numbers.cpp
125

Please run Clang-format over test case.

0x8000-0000 marked 4 inline comments as done.Jul 17 2018, 4:27 PM

@Eugene.Zelenko thank you for suggesting the alias - I didn't know it is that easy, but a grep on "alias" led me to the right place.

clang-tidy/readability/MagicNumbersCheck.h
56

The IgnoredValuesInput is parsed via utils::options::parseStringList which returns a std::vector<std::string>. I am using that instance directly - copying it into a llvm::SmallVector would be more expensive.

test/clang-tidy/readability-magic-numbers.cpp
125

I do run it constantly. In this case there is no change. I am using the clang-6.0 formatter though.

0x8000-0000 marked 13 inline comments as done.Jul 17 2018, 4:28 PM
Eugene.Zelenko added inline comments.Jul 17 2018, 4:37 PM
test/clang-tidy/readability-magic-numbers.cpp
125

It's weird, since closing curly bracket should be in same line as opening. Also Rectangle constructor is longer then 80 symbols.

Address several review comments. Create alias for cppcoreguidelines-avoid-magic-numbers .

0x8000-0000 marked an inline comment as done.Jul 17 2018, 6:04 PM
0x8000-0000 added inline comments.
test/clang-tidy/readability-magic-numbers.cpp
125

If you download the code and run clang-format-6.0 do you get something different?

0x8000-0000 marked an inline comment as done.Jul 17 2018, 6:09 PM

florin@helios:~/tools/llvm$ find . -name .clang-format | sort
./.clang-format
./projects/compiler-rt/lib/asan/.clang-format
./projects/compiler-rt/lib/dfsan/.clang-format
./projects/compiler-rt/lib/hwasan/.clang-format
./projects/compiler-rt/lib/interception/.clang-format
./projects/compiler-rt/lib/lsan/.clang-format
./projects/compiler-rt/lib/msan/.clang-format
./projects/compiler-rt/lib/safestack/.clang-format
./projects/compiler-rt/lib/sanitizer_common/.clang-format
./projects/compiler-rt/lib/tsan/.clang-format
./projects/libcxxabi/.clang-format
./projects/libcxx/.clang-format
./projects/libunwind/.clang-format
./projects/lld/.clang-format
./projects/openmp/runtime/.clang-format
./test/.clang-format
./tools/clang/.clang-format
./tools/clang/test/.clang-format
./tools/clang/tools/extra/test/.clang-format
florin@helios:~/tools/llvm$ cat ./tools/clang/tools/extra/test/.clang-format
BasedOnStyle: LLVM
ColumnLimit: 0

Doesn't seem that weird ;)

0x8000-0000 marked an inline comment as done.Jul 17 2018, 6:10 PM
aaron.ballman added inline comments.Jul 18 2018, 1:16 PM
clang-tidy/readability/MagicNumbersCheck.cpp
24

Why 2, 10, and 100?

31–33

This can be replaced with llvm::transform(IgnoredValuesInput, IgnoredValues.begin(), std::stoll);

34

Please use llvm::sort() instead to help find non-determinism bugs. (I'm a bit surprised we don't have a range-based sort though.)

38

IgnoredIntegerValues as the public facing option as well, unless you want to allow users to specify ignored floating-point values too (which could be useful as well).

42–43

The names ii and ff could be a bit more user-friendly. Also, this can be written using a single matcher, I think.
anyOf(integerLiteral().bind("integer"), floatLiteral().bind("float"))

55

Rather than indent everything for the typical case, I'd prefer this to be an early return.

73

This diagnostic text doesn't really tell the user what's wrong with their code or how to fix it. Also, this function is largely duplicated in checkFloatMatch(). I think they should be combined where possible and the diagnostic should read: '%0' is a magic number; consider replacing with a named constant or something along those lines.

119

Can use const auto * here as the type is spelled out in the initialization.

120–124

The if statements can be combined into a single statement and the braces can be elided.

125–127

Same here for auto and single statement.

136–141

return Node.get<SubstNonTypeTemplateParmExpr> != nullptr; However, that makes me wonder why this should be a function at all.

148

Elide braces

152–159

I think this could be replaced by a call to llvm::any_of()

165

No need for this local variable.

169–175

llvm::any_of()

184–187

No need for fancy C-style comments here; can just use //.

clang-tidy/readability/MagicNumbersCheck.h
16

This include can be removed.

54

I think this should be IgnoredIntegerValue given that it only is checked for integers. That should be reflected in the documentation as well.

clang-tidy/readability/ReadabilityTidyModule.cpp
69–70

I think this should also be registered as a C++ Core Guideline checker, as I believe it covers at least the integer and floating-point parts of https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Res-magic

docs/clang-tidy/checks/readability-magic-numbers.rst
2

This doesn't document the user-facing configuration options, but it should.

5

Underlining is too long.

7

literal -> literals

lebedev.ri added inline comments.
clang-tidy/readability/MagicNumbersCheck.cpp
24

These really should be a config option.

aaron.ballman added inline comments.Jul 18 2018, 2:16 PM
clang-tidy/readability/MagicNumbersCheck.cpp
24

These are the default values for the config option. I think 0 and 1 make a lot of sense to have in the default list given how prevalent they are in real world code. But the other three seem to be used far less often.

I think if we wanted to have any values other than 0 and 1 (and perhaps -1) as defaults, I'd want to see some data on large amounts of code to justify anything else.

0x8000-0000 added inline comments.Jul 18 2018, 3:08 PM
clang-tidy/readability/MagicNumbersCheck.cpp
24

No need for -1, as it is covered by 1, plus unary operator.

I know 100 is used often for converting percentages.

10 is used for number parsing (yes, many people still do that by hand instead of libraries).

But this is the configuration - and I'm ok with reverting to 0 and 1 as I had originally.

clang-tidy/readability/MagicNumbersCheck.h
53

Ignored. Please spellcheck throughout.

docs/clang-tidy/checks/readability-magic-numbers.rst
34

I notice you changed -1 to -3. Is return -1 not an example of the "no magic numbers" rule? If not, why not — and can you put something in the documentation about it?
At least add a unit test proving that return -1; is accepted without any diagnostic, if that's the intended behavior.

test/clang-tidy/readability-magic-numbers.cpp
17

Please add test cases immediately following this one, for

const int BadLocalConstInt = 6;
constexpr int BadLocalConstexprInt = 6;
static const int BadLocalStaticConstInt = 6;
static constexpr int BadLocalStaticConstexprInt = 6;

(except of course changing "Bad" to "Good" in any cases where 6 is acceptable as an initializer).

Also

std::vector<int> BadLocalVector(6);
const std::vector<int> BadLocalConstVector(6);

etc. etc.

85

Constant. Please spellcheck throughout.

152

Surely you should expect some diagnostics on this line!

0x8000-0000 marked 17 inline comments as done.Jul 18 2018, 4:48 PM
0x8000-0000 added inline comments.
clang-tidy/readability/MagicNumbersCheck.cpp
31–33

/home/florin/tools/llvm/tools/clang/tools/extra/clang-tidy/readability/MagicNumbersCheck.cpp:30:3: error: no matching function for call to 'transform'

llvm::transform(IgnoredValuesInput, IgnoredIntegerValues.begin(), std::stoll);
^~~~~~~~~~~~~~~

/home/florin/tools/llvm/include/llvm/ADT/STLExtras.h:990:10: note: candidate template ignored: couldn't infer template argument 'UnaryPredicate'
OutputIt transform(R &&Range, OutputIt d_first, UnaryPredicate P) {

^

1 error generated.

Shall I wrap it in a lambda?

42–43

addMatcher(anyOf(...), this) is ambiguous. Is there any benefit over the two statements?

clang-tidy/readability/ReadabilityTidyModule.cpp
69–70

I have it as an alias in my branch. I will check why it was not exported in the patch.

0x8000-0000 marked 8 inline comments as done.Jul 18 2018, 5:06 PM
0x8000-0000 added inline comments.
docs/clang-tidy/checks/readability-magic-numbers.rst
34

-1 is parsed as "UnaryOperator" - "IntegerLiteral" 1. I think -1 should be acceptable by default.

test/clang-tidy/readability-magic-numbers.cpp
17

Again... all the "const .* (\d)+" patterns should be acceptable. We're initializing a constant. Would you prefer an explicit option?

152

I am using those values to initialize a constant, as described earlier.

We can disable this - but this will be a terribly noisy checker.

I think this requires a separate discussion - do we accept magic values only when they are used directly to initialize a constant of the same type? Or do we allow them generically when they are used to initialize any constant? Or do we only accept several layers of nesting, but not too much?

a) const int Foo = 42;
b) #define FROB 66
c) cont int Bar[] = { 43, 44 };
d) const geometry::Rectangle<double> mandelbrotCanvas{geometry::Point<double>{-2.5, -1}, geometry::Dimension<double>{3.5, 2}};

Which of these should produce a warning?

Significant refactoring to address review comments - mainly to reduce duplication and implement in functional style.

cppcoreguidelines-avoid-magic-numbers is documented as an alias to readability-magic-numbers check.

0x8000-0000 marked an inline comment as done.Jul 18 2018, 7:18 PM
0x8000-0000 added inline comments.
test/clang-tidy/readability-magic-numbers.cpp
17

I have template and constructor arguments already in the test. I have tried including <vector> but somehow it is not found and the std::vector is reported as an error in itself.

aaron.ballman added inline comments.Jul 19 2018, 5:13 AM
clang-tidy/readability/MagicNumbersCheck.cpp
24

Good point on -1. I'd say let's revert until we have statistics to justify any other default values. That said, I'd still be curious to know how chatty this is over large code bases. How many times does this trigger in LLVM, for instance?

31–33

Yeah, I'd wrap in a lambda then.

42–43

Better memoization, which may not really be critical given how trivial the matchers are.

test/clang-tidy/readability-magic-numbers.cpp
17

Tests need to be hermetic and cannot rely on STL headers or other system headers. Basically, you have to replicate the contents of <vector> (or whatever) within the test file for whatever you're trying to test.

0x8000-0000 marked 3 inline comments as done.Jul 19 2018, 3:00 PM
0x8000-0000 added inline comments.
test/clang-tidy/readability-magic-numbers.cpp
17

Ok - we have that already and there's nothing magic about std::vector or std::array.

Small refactoring and documentation update.

Revert built-in acceptable integers to -1, 0 and 1 and document them.

Add a (presently failing) test for not tripping up on LINE through several layers of macro expansion (as in GoogleTest library). This creates a lot of false positives in the unit tests and needs to be fixed.

0x8000-0000 updated this revision to Diff 156425.EditedJul 19 2018, 9:20 PM

Filter out synthetic integers (such as _LINE_) from the report. All the tests are passing now.

0x8000-0000 added a comment.EditedJul 19 2018, 10:33 PM
./tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -clang-tidy-binary ../llvm.rel/bin/clang-tidy -checks="-*,readability-magic-numbers" -j 12 -p ../llvm.rel -j 12 -quiet > /tmp/llvm.magic
grep "warning:" /tmp/llvm.magic | cut -d: -f5 | cut -d" " -f2 | sort | uniq -c | sort -rn | head -40

With about 2000 files scanned (on the current llvm+clang+extra tree), these are the popularity rankings:

9559 2
5041 4
4359 8
2811 3
2555 16
1748 32
1213 64
1170 10
1155 128
 910 5
 853 6
 606 7
 537 256
 385 12
 382 15
 349 20
 322 1024
 288 255
 252 100
 238 9
 233 11
 181 40
 174 63
 170 24
 163 31
 161 512
 146 65535
 144 13
 136 14
 126 18
 120 1
 113 17
 103 1000
 100 4096
  92 60
  85 192
  84 30
  81 21
  76 25
  71 23

Total warning count is ~40K.

So it seems powers of 2 are popular, as are 10, 100, 1000.

2 occurs 25% of the time; 4+8+16 another 27%, while 32... 1024 occur 14%

Avoid parsing and reformatting the input literal - just print the original source code.

Ignore literals implicitly added by the compiler, such as when using a range-for loop over a constant array.

Bail out early if a [grand-]parent node is a declaration, but not a constant declaration.
Search for enumerations and declarations in the same tree traversal.

@aaron.ballman , @JonasToth , @Eugene.Zelenko - would you be so kind to do one more review pass and let me know if there are any blockers or if this is ready to merge? I do not have commit privileges, so if it is alright, feel free to merge it.

Eugene.Zelenko added inline comments.Jul 24 2018, 7:00 PM
clang-tidy/readability/MagicNumbersCheck.h
43

Unnecessary semicolon. Please also add empty line after.

docs/clang-tidy/checks/readability-magic-numbers.rst
56

Please enclose numbers in `. Same for 0.0.

57

Please use `.

Fix a few typos

Eugene.Zelenko added inline comments.Jul 24 2018, 7:11 PM
docs/clang-tidy/checks/readability-magic-numbers.rst
57

Sorry, almost forgot. Please prefix IgnoredIntegerValues with :option:. You still need to use ` for it.

0x8000-0000 marked 4 inline comments as done.Jul 24 2018, 7:13 PM

Update links in documentation

0x8000-0000 marked an inline comment as done.Jul 24 2018, 7:16 PM

I did not find any major issue :)

clang-tidy/readability/MagicNumbersCheck.cpp
21

You move the using namespace clang::ast_matchers; up to shorten your signature.
Adding a using for ast_type_traits is possible, too .

80

is the clang:: necessary? The code should be in that namespace already. I think shortening the really long type qualifier helps with readability. Similar on other places.

docs/clang-tidy/checks/readability-magic-numbers.rst
56

-1 is not in the default list anymore.

0x8000-0000 marked 3 inline comments as done.Jul 26 2018, 5:13 PM
0x8000-0000 added inline comments.
docs/clang-tidy/checks/readability-magic-numbers.rst
56

I will update the comment. -1 is still accepted because it is (unary operator) - followed by accepted (integer literal) 1.

0x8000-0000 marked an inline comment as done.

Remove extra verbose namespaces and update documentation to indicate why -1 is accepted even when not explicitly called out.

0x8000-0000 marked 15 inline comments as done.Jul 26 2018, 6:12 PM
aaron.ballman added inline comments.Jul 28 2018, 9:43 AM
clang-tidy/cppcoreguidelines/CppCoreGuidelinesTidyModule.cpp
47–48

Please keep this alphabetized in the list.

clang-tidy/readability/MagicNumbersCheck.cpp
54

Is the trailing semicolon after the 1 necessary?

89

No need to use != nullptr, or having the parens for that expression.

110

Can elide the braces.

clang-tidy/readability/MagicNumbersCheck.h
50

boundName -> BoundName per naming conventions.

52

!MatchedLiteral

docs/clang-tidy/checks/list.rst
82

This should have a (redirects to blah) blurb after it.

docs/clang-tidy/checks/readability-magic-numbers.rst
13

Given that the C++ Core Guideline check redirects here, it would be good to link to the correct place in those guidelines from here.

62–64

I am curious to know how true this is. You got some data for integer values and reported it, but I'm wondering if you've tried the same experiment with floating-point numbers?

Address review comments to improve documentation and readability

0x8000-0000 marked 7 inline comments as done.Jul 28 2018, 10:13 AM
0x8000-0000 added inline comments.
clang-tidy/readability/MagicNumbersCheck.cpp
54

I have found so, experimentally.

docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

The problem with the floating point numbers as text is: they need to be parsed both from the configuration and from the source code _then_ compared. What is an acceptable epsilon? I don't know. Is the same epsilon acceptable on all source code? I don't know.

aaron.ballman added inline comments.Jul 28 2018, 10:40 AM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

Yeah, I'm not too worried about the situations in which the epsilon matters. I'm more worried that we'll see a lot of 1.0, 2.0 floating-point literals where the floating-point value is a nice, round, easy-to-represent number but users have no way to disable this diagnostic short of const float Two = 2.0f;

lebedev.ri added inline comments.Jul 28 2018, 10:45 AM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

Random thought: the types that are ignored should/could be configurable, i.e. there should be a switch
whether or not to complain about floats.

0x8000-0000 marked an inline comment as done.Jul 28 2018, 10:46 AM
0x8000-0000 added inline comments.
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

Even though they might be nice and round... they should mean _something_ other than 'Two'.

The thing is... magic integers are used as buffer sizes, or to map things that are discrete in nature - number of legs of a typical mammal for instance. Not sure what magic numbers exist in nature besides pi and e and some fundamental physical constants )Avogadro's number, etc). But even there, it is better to use a symbolic constant.

0x8000-0000 added inline comments.Jul 28 2018, 10:47 AM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

Actually that is a _great_ idea, thank you!

I will add one more option: IgnoreFloatingPointValues, to ignore all floats and doubles, because the FloatingLiteral does not distinguish between them (as implementation detail), and I don't see a good reason to be strict about doubles and lenient about floats, or viceversa. The default value for this option is false.

aaron.ballman added inline comments.Jul 28 2018, 10:59 AM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

The thing is... magic integers are used as buffer sizes, or to map things that are discrete in nature - number of legs of a typical mammal for instance. Not sure what magic numbers exist in nature besides pi and e and some fundamental physical constants )Avogadro's number, etc). But even there, it is better to use a symbolic constant.

That's my point -- I think there's a lot of uses of round floating-point values that are not magical numbers, they're sensible constants. Looking at LLVM's code base shows a *lot* of 1.0 and 2.0 values (hundreds of instances from a quick text-based search). No one should be forced to turn those into named constants. However, I've seen code using 1.02 and .98 in places -- those seem like sensible things to make named constants because the values have semantically interesting meaning to the surrounding code.

Random thought: the types that are ignored should/could be configurable, i.e. there should be a switch

whether or not to complain about floats.

I think this would be a useful option, for sure (I used to work at a place that did a ton of floating-point math that would benefit from the integer side of this check but could never use the floating-point side of it). However, the presence of such an option doesn't give us a pass on coming up with a data-driven list of default values to ignore for the floating-point side. If we don't want to make that list configurable, I think that's something we can discuss (I think I'm fine with not making it a user-facing configuration option). But I think that 0.0 is insufficient.

lebedev.ri added inline comments.Jul 28 2018, 11:02 AM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

Yep. I didn't mean for that flag to be a replacement for the ignore-list for fp constants.

Add option to ignore all floating point values.

0x8000-0000 added inline comments.Jul 28 2018, 11:22 AM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

This opens up an entire can of worms that requires quite a bit of thought and invites significant amount of bikesheding. Is "2.00" as acceptable as "2.0"? Do we compare textually, or with regular expressions? Is 2.00000001 represented the same way on PowerPC and ARM as on Intel?

As this check is specified and implemented right now, people have the following options:

  • not use the check at all
  • use it to flag all literals
  • use it to flag only integral literals
  • use it to flag all integral literals not on a white list.

If somebody can come up with a more logical spec for how to filter out certain floating point values - they can definitely extend this check, without impacting users that are happy with the menu offered above.

Does this sound reasonable?

aaron.ballman added inline comments.Jul 28 2018, 12:44 PM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

This opens up an entire can of worms that requires quite a bit of thought and invites significant amount of bikesheding. Is "2.00" as acceptable as "2.0"? Do we compare textually, or with regular expressions? Is 2.00000001 represented the same way on PowerPC and ARM as on Intel?

I guess my mental model for this check was a bit easier than that. We don't care about 0x2 vs 2, for instance, so why would we care about 2.0 vs 2.00? It's a value-based check, so it checks the values of the literals, just like with integers. If we want to expose a configuration list, we can push those strings into an APFloat and get their value as well. APFloat handles the underlying floating-point model semantics, so I don't think this will be particularly problematic. Once we have some data on the common floating-point literal values we can worry about equality issues; my hunch is that they won't be problematic because the most common literals are likely to be small, whole numbers that are precisely represented.

If somebody can come up with a more logical spec for how to filter out certain floating point values - they can definitely extend this check, without impacting users that are happy with the menu offered above.
Does this sound reasonable?

No, because I believe this check, as is, has too high of a false positive rate due to the floating-point literal portion of the check, which is defaulted on. I'm also hesitant to release the check where the options for floating-point values are either: on and everything other than 0.0 must have a named constant, or off.

0x8000-0000 added inline comments.Jul 28 2018, 1:31 PM
docs/clang-tidy/checks/readability-magic-numbers.rst
62–64

I guess my mental model for this check was a bit easier than that. We don't care about 0x2 vs 2, for instance, so why would we care about 2.0 vs 2.00?

Here is my mental model for how to implement it, and the fact that the implementation details leak into the specification is a feature, given the highly technical nature of this check, especially as it relates to floating point values:

  • we can either implement it by parsing the floating points and comparing them as numbers using some defined epsilon, or
  • we can compare them as text, in which case we run into complexities with extra 0s, scientific notation, etc.

Does this sound reasonable?

No, because I believe this check, as is, has too high of a false positive rate due to the floating-point literal portion of the check, which is defaulted on. I'm also hesitant to release the check where the options for floating-point values are either: on and everything other than 0.0 must have a named constant, or off.

If we rename this check to AvoidMagicIntegerLiterals and remove any mention of floating point values, would it be valuable in and of itself?

I am not sure I can distill your concern ("make it easy to accept _some_ nice looking floating points values") into a succinct and and coherent specification - but if you can write the documentation portion, I will do my best to implement it. If you want to default floating point values to "ignored" I'm good with that, too.

Not trying to be difficult here - I have attempted to implement the straight-forward check.

Added this to the MagicNumbersCheck::MagicNumbersCheck constructor:

+  // process set of ignored floating point values
+  const std::vector<std::string> IgnoredFloatingPointValuesInput =
+      utils::options::parseStringList(Options.get(
+          "IgnoredFloatingPointValues", DefaultIgnoredFloatingPointValues));
+  IgnoredFloatingPointValues.reserve(IgnoredFloatingPointValuesInput.size());
+  for (const auto &InputValue : IgnoredFloatingPointValuesInput) {
+    llvm::APFloat FloatValue(llvm::APFloat::IEEEdouble());
+    FloatValue.convertFromString(InputValue, DefaultRoundingMode);
+    const double Value = FloatValue.convertToDouble();
+    IgnoredFloatingPointValues.push_back(Value);
+  }
+  llvm::sort(IgnoredFloatingPointValues.begin(),
+             IgnoredFloatingPointValues.end());

and changed isIgnoredValue(const FloatingLiteral *Literal) like this:

 bool MagicNumbersCheck::isIgnoredValue(const FloatingLiteral *Literal) const {
-  llvm::APFloat FloatValue = Literal->getValue();
-  return FloatValue.isZero();
+  const llvm::APFloat FloatValue = Literal->getValue();
+  double Value;
+  if (&FloatValue.getSemantics() == &llvm::APFloat::IEEEdouble()) {
+    Value = FloatValue.convertToDouble();
+  } else if (&FloatValue.getSemantics() == &llvm::APFloat::IEEEsingle()) {
+    Value = static_cast<double>(FloatValue.convertToFloat());
+  } else
+    return false;
+
+  return std::binary_search(IgnoredFloatingPointValues.begin(),
+                            IgnoredFloatingPointValues.end(), Value);
 }

When running under the debugger and printing various values, it seems that 3.14 read as double and converted to double prints as 3.1399999999999997, while 3.14f read as float and cast to double prints as 3.1400001049041748

I will parse the IgnoredFloatingPointValues twice and store the results in two arrays, one of doubles and one of floats.

Add support for ignoring specific floating point numbers.

Thank you for working on this, I think it's getting closer! I'd use a slightly different approach to handling floating-point values, but if that turns out to be a clean implementation we may want to think about whether there are improvements from modelling integers similarly (only using APInt instead of APFloat).

clang-tidy/readability/MagicNumbersCheck.cpp
58

I would still like to see some data on common floating-point literal values used in large open source project so that we can see what sensible values should be in this list.

64

Comments should be grammatically correct, including capitalization and punctuation (elsewhere in this function as well).

77–87

This is where I would construct an APFloat object from the string given. As for the semantics to be used, I would recommend getting it from TargetInfo::getDoubleFormat() on the belief that we aren't going to care about precision (explained in the documentation).

129–140

Here is where I would compare FloatValue against everything in the IgnoredFloatingPointValues array using APFloat::compare(). However, you need the floats to have the same semantics for that to work, so you'd have to convert FloatValue using APFloat::convert() to whatever the target floating-point format is.

clang-tidy/readability/MagicNumbersCheck.h
86–89

The model I was thinking of would have this be: llvm::SmallVector<APFloat, SensibleNumberOfMagicValueExceptions> IgnoredFloatingPointValues; so that you don't need to distinguish between floats and doubles (which also nicely supports long doubles, __float128, and _Float16).

docs/clang-tidy/checks/readability-magic-numbers.rst
64–68

This isn't far off the mark, but I'd go with something like:

For each value in that set, the given string value is converted to a floating-point value representation used by the target architecture <NNN>. If a floating-point literal value compares equal to one of the converted values, then that literal is not diagnosed by this check. Because floating-point equality is used to determine whether to diagnose or not, the user needs to be aware of the details of floating-point representations for any values that cannot be precisely represented for their target architecture.

The <NNN> bit is where we can stick details we haven't worked out yet, like what rounding mode is used to perform the conversion and whether lossy conversions are allowed or ignored.

Indicate that 0 and 0.0 are accepted unconditionally (because it makes sense in the source code, and speeds-up many checks as 0s are very common and we don't want to spend log2(n) to find them at the beginning of the vector).

0x8000-0000 marked 4 inline comments as done.Jul 29 2018, 6:51 AM

See inline comments. Basically we need two arrays because APFloats of different semantics don't compare well, and even if we coerce them, they sometimes are not equal.

clang-tidy/readability/MagicNumbersCheck.cpp
58

What value would that bring? The ideal target is that there are no magic values - no guideline that I have seen makes exception for 3.141 or 9.81. Each project is special based on how they evolved, and they need to decide for themselves what is worth cleaning vs what can be swept under the rug for now. Why would we lend authority to any particular floating point value?

77–87

Here is the problem I tried to explain last night but perhaps I wasn't clear enough.

When we parse the input list from strings, we have to commit to one floating point value "semantic" - in our case single or double precision.

When we encounter the value in the source code and it is captured by a matcher, it comes as either one of those values.

Floats with different semantics can't be directly compared - so we have to maintain two distinct arrays.

If we do that, rather than store APFloats and sort/compare them with awkward lambdas, we might as well just use the native float/double and be done with it more cleanly.

129–140

If you do that, 3.14f will not be equal to 3.14 . You need two arrays.

aaron.ballman added inline comments.Jul 29 2018, 7:14 AM
clang-tidy/readability/MagicNumbersCheck.cpp
58

Because that's too high of a high false positive rate for an acceptable clang-tidy check. As mentioned before, there are literally hundreds of unnameable floating-point literals in LLVM alone where the value is 1.0 or 2.0. Having statistical data to pick sensible defaults for this list is valuable in that it lowers the false positive rate. If the user dislikes the default list for some reason (because for their project, maybe 2.0 is a supremely nameable literal value), they can pick a different set of defaults.

Right now, I'm operating off an assumption that most floating-point literals that should not be named are going to be whole numbers that are precisely represented in all floating-point semantic models. This data will tell us if that assumption is wrong, and if the assumption is wrong, we might want to go with separate lists like you've done.

77–87

When we encounter the value in the source code and it is captured by a matcher, it comes as either one of those values.

It may also come in as long double or __float128, for instance, because there are type suffixes for that.

Floats with different semantics can't be directly compared - so we have to maintain two distinct arrays.

Yes, floats with different semantics cannot be directly compared. That's why I said below that we should coerce the literal values.

If we do that, rather than store APFloats and sort/compare them with awkward lambdas, we might as well just use the native float/double and be done with it more cleanly.

There are too many different floating-point semantics for this to be viable, hence why coercion is a reasonable behavior.

129–140

The point to a named constants check is to give names to literal values that should have names because there's something special about them. No one wants to name 1 because they're incrementing by one, or 2.0 because they're cutting a value in half. So the thrust of this check should be to make it hard for users to have literals without names while making it easy to provide exceptions for truly unnameable values because they don't have semantic meaning. I am not worried that 3.14 is not equal to 3.14f because numbers which require precision most likely *should* be given named constants because there's most likely *something* special about that value.

0x8000-0000 marked 2 inline comments as done.Jul 29 2018, 7:22 AM
0x8000-0000 added inline comments.
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

Let me see if I understood it - your proposal is: store only doubles, and when a floating-point literal is encountered in code, do not use the FloatingLiteral instance, but parse it again into a double and compare exactly. If the comparison matches - ignore it.

In that case what is the value of storing APFloats with double semantics in the IgnoredValues array, instead of doubles?

aaron.ballman added inline comments.Jul 29 2018, 7:35 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

Let me see if I understood it - your proposal is: store only doubles, and when a floating-point literal is encountered in code, do not use the FloatingLiteral instance, but parse it again into a double and compare exactly. If the comparison matches - ignore it.

My proposal is to use APFloat as the storage and comparison medium. Read in strings from the configuration and convert them to an APFloat that has double semantics. Read in literals and call FloatLiteral::getValue() to get the APFloat from it, convert it to one that has double semantics as needed, then perform the comparison between those two APFloat objects.

In that case what is the value of storing APFloats with double semantics in the IgnoredValues array, instead of doubles?

Mostly that it allows us to modify or extend the check for more complicated semantics in the future. Also, it's good practice to use something with consistent semantic behavior across hosts and targets (comparisons between numbers that cannot be precisely represented will at least be consistently compared across hosts when compiling for the same target).

0x8000-0000 added inline comments.Jul 29 2018, 8:27 AM
clang-tidy/readability/MagicNumbersCheck.cpp
58

Here are the results with the check as-is, run on the llvm code base as of last night:

top-40

10435 2
 5543 4
 4629 8
 3271 3
 2702 16
 1876 32
 1324 64
 1309 10
 1207 5
 1116 128
  966 6
  733 7
  575 256
  421 20
  406 12
  339 9
  331 1024
  311 100
  281 42
  253 11
  226 15
  189 40
  172 24
  169 0xff
  168 13
  168 0x80
  166 512
  137 1.0
  133 14
  132 31
  129 0xDEADBEEF
  120 18
  120 17
  120 1000
  115 4096
  100 30
   94 60
   94 0x1234
   89 0x20
   86 0xFF

1.0 is in position 28 with 137 occurrences
2.0 is in position 93 with 27 occurrences
100.0 is in position 96 with 26 occurences
1.0f is in position 182 with 11 occurences

we also have 2.0e0 four times :)

This data suggests that there would be value in a IgnorePowerOf2IntegerLiterals option.

77–87

ok - coming right up!

aaron.ballman added inline comments.Jul 29 2018, 8:46 AM
clang-tidy/readability/MagicNumbersCheck.cpp
58

Any chance you can hack the check run on LLVM where it doesn't report any integer values (I'm curious about the top ten or so for floating-point values)? Additionally, it'd be great to get similar numbers from some other projects, like https://github.com/qt just to see how it compares (I've used bear to get compile_commands.json file out of Qt before, so I would guess that would work here).

0x8000-0000 added inline comments.Jul 29 2018, 8:55 AM
clang-tidy/readability/MagicNumbersCheck.cpp
58

No need for the hack, I just grep for dot in my report:

137 1.0
 27 2.0
 26 100.0
 20 0.5
 11 1.0f
  8 1.02
  7 3.0
  7 1.98
  7 1.5
  6 .5
  6 1.1
  5 0.01
  4 89.0f
  4 4294967296.f
  4 3.14159
  4 2.0e0
  4 10.0
  3 88.0f
  3 255.0
  3 127.0f
  3 12345.0f
  3 123.45
  3 0.2f
  3 0.25
  3 0.1f
  3 0.1
  2 80.0
  2 710.0f
  2 710.0
  2 709.0f
  2 6.0
  2 3.14f
  2 3.14
  2 2.5
  2 2349214918.58107
  2 149.0f
  2 14.5f
  2 1.17549435e-38F
  2 11357.0f
  2 11356.0f
  2 103.0f
  2 0.99
  2 0.9
  2 0.01f
  1 745.0f
  1 745.0
  1 7.1653228759765625e2f
  1 709.0
  1 7.08687663657301358331704585496e-268
  1 6.62E-34
  1 64.0f
  1 6.02E23
  1 4950.0f
  1 4932.0f
  1 45.0f
  1 42.42
  1 4.2
  1 4.0
  1 38.0f
  1 3.7
  1 323.0f
  1 32.0f
  1 32.0
  1 3.1415
  1 308.0f
  1 2.718
  1 2.7
  1 225.0f
  1 21.67
  1 2.0f
  1 2.088
  1 .17532f
  1 16445.0f
  1 1.616f
  1 128.0f
  1 12346.0f
  1 12.0f
  1 1.2
  1 1.1f
  1 11399.0f
  1 11383.0f
  1 1.0L
  1 1.0E+9
  1 1.0E+6
  1 1.0e-5f
  1 1.0E+12
  1 1074.0f
  1 1074.0
  1 1023.0f
  1 1023.0
  1 1.01F
  1 1.01
  1 10.0f
  1 100.
  1 0.999999
  1 0.8f
  1 .08215f
  1 0.7
  1 0.6
  1 0.5F
  1 0.5f
  1 0.59
  1 0.4f
  1 0.2
  1 0.06
  1 0.02
  1 0.005

I'll check out Qt shortly.

0x8000-0000 added inline comments.Jul 29 2018, 9:14 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

My proposal is to use APFloat as the storage and comparison medium. Read in strings from the configuration and convert them to an APFloat that has double semantics.

This is easy.

Read in literals and call FloatLiteral::getValue() to get the APFloat from it,

this is easy as well,

convert it to one that has double semantics as needed, then perform the comparison between those two APFloat objects.

The conversion methods in APFloat only produce plain-old-data-types:

double convertToDouble() const;                                                                                                                                                              
float convertToFloat() const;

There is no

APFloat convertToOtherSemantics(const fltSemantics &) const;

method.

What I can do is

  1. convert the APFloat to float or double, depending on what the semantics is; cast to double then load into an APFloat with double semantics and then search into the set
  2. parse the textual representation of the FloatingLiteral directly into an APFloat with double semantics.
77–87

TargetInfo::getDoubleFormat() is not accessible directly from ClangTidyContext or from MatchFinder

0x8000-0000 added inline comments.Jul 29 2018, 9:29 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87
  1. doesn't quite work; APFloat.convertFromString chokes on 3.14f ;(
0x8000-0000 added inline comments.Jul 29 2018, 9:43 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

Option 1 doesn't work, either:

llvm::APFloat DoubleFloatValue(llvm::APFloat::IEEEdouble());                                                                                                                             
if (&FloatValue.getSemantics() == &llvm::APFloat::IEEEdouble())                                                                                                                          
  DoubleFloatValue = FloatValue;                                                                                                                                                         
else if (&FloatValue.getSemantics() == &llvm::APFloat::IEEEsingle()) {                                                                                                                   
  const float Value = FloatValue.convertToFloat();                                                                                                                                       
  DoubleFloatValue = llvm::APFloat(static_cast<double>(Value));                                                                                                                          
}                                                                                                                                                                                        
                                                                                                                                                                                         
 return std::binary_search(IgnoredFloatingPointValues.begin(),                                                                                                                            
                          IgnoredFloatingPointValues.end(),                                                                                                                              
                          DoubleFloatValue, compareAPFloats);

has problems with floating point values that are not integers.

Unless somebody knows how to convert APFloats with any semantics into APFloats with double semantics, I'm stuck ;(

Add a flag to ignore all powers of two integral values.

aaron.ballman added inline comments.Jul 29 2018, 11:17 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

There is no
APFloat convertToOtherSemantics(const fltSemantics &) const;
method.

opStatus convert(const fltSemantics &ToSemantics, roundingMode RM,
                 bool *losesInfo);

This function does exactly that?

77–87

TargetInfo::getDoubleFormat() is not accessible directly from ClangTidyContext or from MatchFinder

MatchFinder has an ASTContext object on which you can call ASTContext::getTargetInfo(), I believe.

Top 40 magic numbers in https://github.com/qt/qtbase

4859 2
2901 3
1855 4
 985 5
 968 8
 605 6
 600 7
 439 16
 432 10
 363 
 356 32
 241 1.0f
 217 12
 209 255
 207 100
 205 9
 205 20
 204 50
 177 0.5
 174 15
 162 0x2
 144 24
 140 0x80
 135 11
 127 256
 113 14
 110 0xff
 101 1.0
  99 64
  99 200
  96 13
  86 30
  84 1000
  68 18
  66 150
  62 127
  62 0xFF
  58 19
  58 0.05f
  57 128

Top 40 floating point magic numbers in https://github.com/qt/qtbase

241 1.0f
177 0.5
101 1.0
 58 0.05f
 44 2.0
 42 0.5f
 31 10.0
 28 30.0
 24 20.0
 22 60.0
 20 100.0
 19 0.8
 19 0.25
 17 0.2
 16 1000.0
 14 1.224744871
 14 100.
 13 25.0
 13 0.1
 12 90.0
 12 40.0
 12 0.707106781
 12 0.30
 12 0.20
 11 80.0
 11 6.0
 11 50.0
 11 2.0f
 11 0.75
 11 0.66f
 11 0.1f
 10 6.28
 10 5.0
 10 4.0
 10 1.414213562
  9 360.0
  9 25.4
  9 2.54
  8 70.0
  8 55.0

Top 40 magic numbers in https://github.com/facebook/rocksdb

2131 2
 896 3
 859 4
 858 10
 685 100
 678 1024
 600 8
 445 5
 323 1000
 244 20
 231 301
 227 200
 223 6
 209 16
 189 7
 154 10000
 131 1000000
 119 100000
 111 30
 105 256
 104 32
 103 5U
 103 50
  94 128
  91 64
  89 60
  88 3U
  85 2U
  84 500
  72 4U
  67 9
  65 300
  63 13
  59 0xff
  57 6U
  52 4096
  52 24
  52 12
  51 600
  50 10U

Top 40 floating point numbers in rocksdb:

37 100.0
30 1.0
27 0.5
24 0.001
12 1048576.0
12 0.25
11 1.1
 8 50.0
 8 1.5
 8 10000.0
 5 .3
 5 .1
 5 0.8
 4 99.99
 4 99.9
 4 20000.0
 4 1.048576
 4 100.0f
 4 0.9
 4 0.75
 4 0.69
 4 0.02
 4 0.00001
 3 1000000.0
 3 0.4
 3 0.1
 2 0.7
 2 0.6
 2 0.45
 1 8.0
 1 5.6
 1 40.00002
 1 40.00001
 1 3.25
 1 2.0
 1 2.
 1 116.00002
 1 116.00001
 1 110.5e-4
 1 1024.0
0x8000-0000 added inline comments.Jul 29 2018, 11:29 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

Yes, it does! Thank you!

77–87

Looking at the code, it declares a structure for MatchResult which indeed contains an ASTContext. But MatchFinder does not hold a reference to it, it is passed into it from a MatchVisitor.

What we would need is to parse the configuration before we start matching (either in the constructor or in ::registerMatchers - and there is no visitor available then.

0x8000-0000 added inline comments.Jul 29 2018, 11:46 AM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87

So functionally it works, but it suffers from the same problem as doing the conversion by hand using the `static_cast<double>(FloatValue.convertToFloat())`; the resulting value for 3.14f does not match the existing value for 3.14 present in the IgnoredFloatingPointValues array.

Top 40 magic numbers in https://github.com/qt/qtbase

4859 2
2901 3
1855 4
 985 5
 968 8
 605 6
 600 7
 439 16
 432 10
 363 
 356 32
 241 1.0f
 217 12
 209 255
 207 100
 205 9
 205 20
 204 50
 177 0.5
 174 15
 162 0x2
 144 24
 140 0x80
 135 11
 127 256
 113 14
 110 0xff
 101 1.0
  99 64
  99 200
  96 13
  86 30
  84 1000
  68 18
  66 150
  62 127
  62 0xFF
  58 19
  58 0.05f
  57 128

Top 40 floating point magic numbers in https://github.com/qt/qtbase

241 1.0f
177 0.5
101 1.0
 58 0.05f
 44 2.0
 42 0.5f
 31 10.0
 28 30.0
 24 20.0
 22 60.0
 20 100.0
 19 0.8
 19 0.25
 17 0.2
 16 1000.0
 14 1.224744871
 14 100.
 13 25.0
 13 0.1
 12 90.0
 12 40.0
 12 0.707106781
 12 0.30
 12 0.20
 11 80.0
 11 6.0
 11 50.0
 11 2.0f
 11 0.75
 11 0.66f
 11 0.1f
 10 6.28
 10 5.0
 10 4.0
 10 1.414213562
  9 360.0
  9 25.4
  9 2.54
  8 70.0
  8 55.0

Top 40 magic numbers in https://github.com/facebook/rocksdb

2131 2
 896 3
 859 4
 858 10
 685 100
 678 1024
 600 8
 445 5
 323 1000
 244 20
 231 301
 227 200
 223 6
 209 16
 189 7
 154 10000
 131 1000000
 119 100000
 111 30
 105 256
 104 32
 103 5U
 103 50
  94 128
  91 64
  89 60
  88 3U
  85 2U
  84 500
  72 4U
  67 9
  65 300
  63 13
  59 0xff
  57 6U
  52 4096
  52 24
  52 12
  51 600
  50 10U

Top 40 floating point numbers in rocksdb:

37 100.0
30 1.0
27 0.5
24 0.001
12 1048576.0
12 0.25
11 1.1
 8 50.0
 8 1.5
 8 10000.0
 5 .3
 5 .1
 5 0.8
 4 99.99
 4 99.9
 4 20000.0
 4 1.048576
 4 100.0f
 4 0.9
 4 0.75
 4 0.69
 4 0.02
 4 0.00001
 3 1000000.0
 3 0.4
 3 0.1
 2 0.7
 2 0.6
 2 0.45
 1 8.0
 1 5.6
 1 40.00002
 1 40.00001
 1 3.25
 1 2.0
 1 2.
 1 116.00002
 1 116.00001
 1 110.5e-4
 1 1024.0

Awesome, thank you for this! Based on this, I think the integer list should also include 2, 3, and 4 as defaults -- those show up a lot more than I'd have expected. As for floating-point values, 1.0 certainly jumps out at me, but none of the rest seem particularly common. What do you think?

clang-tidy/readability/MagicNumbersCheck.cpp
77–87

Hmm, that's certainly annoying.

@alexfh -- I think we should have access to the ASTContext object from within the ClangTidyContext passed in when registering the checks. We already have ClangTidyContext::setASTContext() that gets called before createChecks() in ClangTidyASTConsumerFactory::CreateASTConsumer(); can you think of a reason we should not persist that value in the ClangTidyContext? If there is a good reason to not persist it, perhaps it would be a reasonable argument to be passed to each check's constructor (by threading it through createChecks()?

Based on this, I think the integer list should also include 2, 3, and 4 as defaults -- those show up a lot more than I'd have expected. As for floating-point values, 1.0 certainly jumps out at me, but none of the rest seem particularly common. What do you think?

I'm good with 0, 1, 2, 3, 4 for integers and 0.0, 1.0 and also 100.0 (used to compute percentages) for floating point values.

0x8000-0000 added inline comments.Jul 29 2018, 12:58 PM
clang-tidy/readability/MagicNumbersCheck.cpp
77–87
bool MagicNumbersCheck::isIgnoredValue(const FloatingLiteral *Literal) const {
  llvm::APFloat FloatValue = Literal->getValue();
  if (FloatValue.isZero())
    return true;
  else {
    bool LosesInfo = true;
    const llvm::APFloat::opStatus ConvertStatus = FloatValue.convert(
        llvm::APFloat::IEEEdouble(), DefaultRoundingMode, &LosesInfo);

    if ((ConvertStatus == llvm::APFloat::opOK) ||
        (ConvertStatus == llvm::APFloat::opInexact))
      return std::binary_search(IgnoredFloatingPointValues.begin(),
                                IgnoredFloatingPointValues.end(), FloatValue,
                                compareAPFloats);
    else
      return false;
  }
}

"3.14f" FloatValue converts cleanly... to something that is not present in the IgnoredFloatingPointValues set.

Update the list of magic values ignored by default.

mgrang removed a subscriber: mgrang.Jul 30 2018, 10:43 AM

The state of this patch:

  • user interface is as agreed-upon, giving end-users the capability to filter out floats and ints as it makes sense for their code base
  • code is clean
  • documentation is up to date
  • default ignore lists are sensible

There is one outstanding improvement request - to combine the two float/double lists into a single APFloat but that requires more refactoring outside this check and the benefit lies mainly in a future extensibility and the saving of a couple hundred bytes.

@aaron.ballman - can we get this merged as-is and improve it later? I'd like to see this check available in Clang7.

The state of this patch:

  • user interface is as agreed-upon, giving end-users the capability to filter out floats and ints as it makes sense for their code base
  • code is clean
  • documentation is up to date
  • default ignore lists are sensible

There is one outstanding improvement request - to combine the two float/double lists into a single APFloat but that requires more refactoring outside this check and the benefit lies mainly in a future extensibility and the saving of a couple hundred bytes.

@aaron.ballman - can we get this merged as-is and improve it later? I'd like to see this check available in Clang7.

In my opinion this check is in good shape.
One thing that came to my mind: hexadecimal floating point literals. They should work out of the box, but could you add a test that they are recognized correctly?

Add tests to show that we can detect, report and ignore floating point numbers represented using hexadecimal notation

0x8000-0000 edited the summary of this revision. (Show Details)Jul 31 2018, 7:00 PM

I think this is close. If @alexfh doesn't chime in on the open question in the next few days, I would say the check is ready to go in and we can address the open question in follow-up commits.

clang-tidy/readability/MagicNumbersCheck.cpp
35–38

This can be reduced to return AsDecl->isImplicit();

77–87

@alexfh: pinging on this question.

134

Drop the custom function and instead use IntValue.isPowerOf2() instead.

Also, no else after a return.

146–154

No else after return.

Address several style comments. Rebase to current trunk (8.0).

0x8000-0000 marked 3 inline comments as done.Aug 5 2018, 9:09 AM

Rebased on curent master.

aaron.ballman accepted this revision.Aug 12 2018, 7:01 AM

Alex has had plenty of time to respond, so I'm fine handling any concerns he has post-commit. Do you need me to commit this on your behalf?

This revision is now accepted and ready to land.Aug 12 2018, 7:01 AM

Alex has had plenty of time to respond, so I'm fine handling any concerns he has post-commit. Do you need me to commit this on your behalf?

Yes, please! Thank you!

Author: "Florin Iucha <florin@signbit.net>"

aaron.ballman closed this revision.Aug 12 2018, 7:36 AM

Thank you for the patch and great discussion! I've commit in r339516.