Clang-tidy - Enum misuse check

The checker detects various cases when an enum is probably misused (as a bitmask).

Suggestions and improvement ideas for the heuristics are welcome.

alexfh added inline comments. Aug 17 2016, 2:06 AM
59 ↗(On Diff #65966)

nit: The parameters are not iterators, so I'd change It1 and It2 to EnumConst1 and EnumConst2, E1 and E2, First and Second, Left and Right or something else not related to iterators. Same above in ValueRange and below in countNonPowOfTwoNum.

25 ↗(On Diff #65310)

I'd move the private section to the bottom of the class definition.

Changes based on comments.

75 ↗(On Diff #65966)

Thanks for the recommendations! As you can see my grammar and vocabulary is a "bit strange" so I really appreciated the correction!

209 ↗(On Diff #65966)

Here you can see the results on LLVM (weak options, fewer false positives).

Here I have to mention that the last 4 results could be combined into one, because it's actually the usage of the same enum in different files. If you wish, I could easily change it.

alexfh added inline comments.
215 ↗(On Diff #68968)

Diagnostic messages are not full sentences, so they shouldn't start with a capital letter and end with a period.

26 ↗(On Diff #68968)
  1. Please move the constructor body to the .cpp file so that the code reading and storing options are together.
  2. Let's use a global flag StrictMode (currently used by one more check: clang-tidy/misc/ArgumentCommentCheck.cpp). It can still be configured separately for each check, but overall it improves consistency. Also, let's make it non-strict by default.


should change to

StrictMode(Options.getLocalOrGlobal("StrictMode", 0) != 0)
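A minimal sketch of the lookup order this suggestion relies on, assuming a check-specific value takes precedence over a global one and the default applies when neither is set. `FakeOptions` and `FakeEnumCheck` are hypothetical stand-ins written for illustration; they are not clang-tidy's actual classes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a check's option view: the check-local table is
// consulted first, then the global table, then the caller's default.
struct FakeOptions {
  std::map<std::string, int> Local;
  std::map<std::string, int> Global;

  int getLocalOrGlobal(const std::string &Name, int Default) const {
    auto It = Local.find(Name);
    if (It != Local.end())
      return It->second;
    It = Global.find(Name);
    if (It != Global.end())
      return It->second;
    return Default;
  }
};

// Hypothetical check that reads the flag once in its constructor and is
// non-strict unless an option says otherwise.
struct FakeEnumCheck {
  explicit FakeEnumCheck(const FakeOptions &Options)
      : StrictMode(Options.getLocalOrGlobal("StrictMode", 0) != 0) {}

  const bool StrictMode;
};
```

With this shape, a single global setting can switch several checks at once, while a per-check entry can still override it.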
21 ↗(On Diff #68968)

Please enclose inline code snippets in backquotes (+=, |=, etc.). Many places in this file and in doxygen comments.

29 ↗(On Diff #68968)

Should be .. code-block:: c++.

31 ↗(On Diff #68968)

Code block should be indented. Please compile the doc and make sure the result seems reasonable.

31 ↗(On Diff #68968)

Could you add a description of what 1 stands for here: strict or non-strict? Please leave a space between "//" and the comment text, here and below. Make sure the code here follows the code style.

Still not addressed.

38 ↗(On Diff #68968)

nit: space after //
here and below.

This is still not addressed.

2 ↗(On Diff #68968)

The format still seems off.

66 ↗(On Diff #68968)

Is the commented line needed?

2 ↗(On Diff #68968)

Doesn't seem to be done: the format is still off.

Changes based on comments, fix a cast to dyn_cast bug, description updated (hopefully it became more clear).

szepet added inline comments. Sep 1 2016, 8:48 AM
3 ↗(On Diff #70017)

Could you specify which part of the file seems off? I have run the clang format with the same options on testfiles as on the others.

aaron.ballman added inline comments. Sep 2 2016, 12:20 PM
3 ↗(On Diff #70017)
enum A { A = 1,
         B = 2,
         C = 4,
         D = 8,
         E = 16,
         F = 32,
         G = 63

should be:

enum A {
  A = 1,
  B = 2,
  C = 4,
  D = 8,
  E = 16,
  F = 32,
  G = 63

is what I was thinking was incorrect, but perhaps clang-format allows such constructs?

66 ↗(On Diff #70017)

Do you intend to have the return 0; here?

cast to dyn-cast change in order to fix a bug, changes based on comments

Close, but still a bunch of comments in the docs and a suggestion to fix a class of false positives.

210 ↗(On Diff #70324)
  1. llvm/lib/MC/ELFObjectWriter.cpp:903 - the warning looks reasonable.
  2. llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoderCommon.h:66 - the warning looks reasonable (ATTR_max doesn't seem to be useful for the bitmask enum).
  3. llvm/tools/clang/lib/Basic/IdentifierTable.cpp:95 - two issues here:
    1. the "possibly contains misspelled number(s) " message could be more useful, if it specified which member corresponds to the possibly misspelled number
    2. I suppose the check considers KEYALL = (0x1fffff & ~KEYNOMS18 & ~KEYNOOPENCL) to be incorrect. I think it should exclude enum members initialized with bit arithmetic expressions, since it's rather common to define aliases for a certain combination of flags.
  4. llvm/tools/clang/lib/Frontend/Rewrite/RewriteModernObjC.cpp:5083 and friends - the warning looks reasonable, since it's hard to understand the motivation for the BLOCK_FIELD_IS_OBJECT = 3. If it's a combination of flags, it should be written as such, and the check should ignore enum members initialized with a bit arithmetic expression.
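The distinction drawn in items 3 and 4 can be sketched as follows. The enum and its member names are hypothetical and chosen only for illustration; the point is that an alias written as a bit arithmetic expression documents its intent, whereas a bare literal such as 3 does not.

```cpp
// Hypothetical flag enum; names and values are illustrative only.
enum KeywordFlags {
  KEY_A = 0x1,
  KEY_B = 0x2,
  KEY_C = 0x4,
  // Alias built from other members with bit arithmetic. Its value (0x5) is
  // not a power of two, yet it is intentional rather than a typo.
  KEY_ALL_BUT_B = (0x7 & ~KEY_B),
  // Combination written as such, instead of a bare literal 3.
  KEY_A_OR_B = KEY_A | KEY_B
};

static_assert(KEY_ALL_BUT_B == 0x5, "alias keeps every flag except KEY_B");
static_assert(KEY_A_OR_B == 3, "same value as the literal, clearer intent");
```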
9 ↗(On Diff #70324)

Too much indentation here.

38 ↗(On Diff #70324)
  1. Enum should not start with a capital letter, since it's a keyword.
  2. Please indent the code block contents by 2 spaces (currently, it's indented by 1).
  3. Please clang-format all code samples.
47 ↗(On Diff #70324)

Commas should be used instead of semicolons.

54 ↗(On Diff #70324)

Missing semicolon.

In general, please make sure code snippets are valid code. Otherwise, it creates unneeded obstacles in reading the code.

In order to decrease the false positive rate, the bitmask-specific part of the checker now investigates only the enum constants that were initialized by a literal. (If this is too strong, it can be modified.)
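A rough sketch of what such a literal-only restriction could look like. Everything here is hypothetical: `EnumConstantInfo`, `isPowerOfTwo` and `countSuspiciousLiterals` are illustration names, and ignoring zero is an assumption of this sketch rather than something stated in the review.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of an enum constant: its value and whether the
// initializer was a plain literal (as opposed to an expression like A | B).
struct EnumConstantInfo {
  std::uint64_t Value;
  bool InitializedByLiteral;
};

// True when exactly one bit is set.
inline bool isPowerOfTwo(std::uint64_t V) {
  return V != 0 && (V & (V - 1)) == 0;
}

// Counts literal-initialized, nonzero constants whose value is not a single
// bit. Constants built from expressions are skipped entirely.
inline int countSuspiciousLiterals(const std::vector<EnumConstantInfo> &Constants) {
  int Count = 0;
  for (const EnumConstantInfo &C : Constants) {
    if (!C.InitializedByLiteral || C.Value == 0)
      continue;
    if (!isPowerOfTwo(C.Value))
      ++Count;
  }
  return Count;
}
```

Applied to the test enum quoted earlier in this thread (1, 2, 4, 8, 16, 32, 63, all literals), only 63 would be counted.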

Renamed the checker to be more consistent with the checkers used for similar purpose.

Documentation code examples updated.

Thank you for the updates!

Please re-run the check on LLVM to see what has changed.

53 ↗(On Diff #71925)

nit: Add an empty line above.

61 ↗(On Diff #71925)

nit: Add an empty line above.

67 ↗(On Diff #71925)

nit: Please add braces, since the body doesn't fit on a line.

155 ↗(On Diff #71925)


175 ↗(On Diff #71925)

Use early return here.

208 ↗(On Diff #71925)

One variable definition at a time, please.

216 ↗(On Diff #71925)

Use early return here.

32–35 ↗(On Diff #71925)

These can be just static constants in the .cpp file. Apart from that, const char X[] = ...; is a better way to define string constants, otherwise you would have to go with const char * const X = ...; to make the pointer const as well.
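The two forms mentioned here look like this side by side. The identifiers and message wording are hypothetical, not taken from the patch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical diagnostic text; identifiers and wording are illustrative.

// Array form: one `const` is enough, and sizeof yields the string length
// plus the terminating null character.
static const char ArrayMessage[] = "enum values are from different enum types";

// Pointer form: the first `const` protects the characters, the second keeps
// the pointer itself from being reassigned.
static const char *const PointerMessage = "enum type seems like a bitmask";
```

Dropping the second `const` in the pointer form would still compile, but the pointer could then be redirected to another string at run time, which is why the array form is the simpler choice.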

4 ↗(On Diff #71925)

This line should be aligned with the line above.

9 ↗(On Diff #71925)

Remove two spaces at the start of the line.

11 ↗(On Diff #71925)

There's no "Weak" option, it's the "Strict" option set to false / 0.

12 ↗(On Diff #71925)

Please use the :option: notation and add the option description (.. option:: ...). See, for example, modernize-use-emplace.rst.

31 ↗(On Diff #71925)

This notation is used to make the previous line a heading. Doesn't seem to be the case here. See for some examples. Please also try to build your docs to check for obvious issues.

18 ↗(On Diff #71925)

Add check lines for notes as well (I don't think you'll be able to use [[@LINE]] for most of them, but you can probably just skip the line).

Updates based on comments (the testfile note comments will be added in the next commit)

Some changes in the algorithm/design:
In non-strict mode the checker will only investigate the operations between different enum types.
In strict mode we check the suspicious bitmasks too.
(In the previous comments we always talked only about the strict mode heuristics and looked at the strict results.)
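To make the non-strict case concrete, here is a minimal sketch of an operation between different enum types. The enums `Color` and `Permission` and both helper functions are hypothetical and exist only for illustration.

```cpp
// Hypothetical enums used only for illustration.
enum Color { Red = 1, Green = 2, Blue = 4 };
enum Permission { Read = 1, Write = 2, Execute = 4 };

// Mixing constants of unrelated enum types compiles, because unscoped
// enumerators convert to int, but the result has no sensible meaning.
// This is the kind of expression the non-strict mode is described as
// investigating. Newer language modes may warn about it.
inline int mixedUp() { return Red | Write; }

// Combining constants of a single bitmask-like enum is the intended use.
inline int readWrite() { return Read | Write; }
```

Both functions return the same integer, which is exactly why the mistake in `mixedUp` is easy to miss without a dedicated check.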

So strict mode results in this revision where we investigate only the literals:

szepet added inline comments. Sep 30 2016, 3:56 PM
155 ↗(On Diff #71925)

Because the hasDisjointValueRange function could not determine the value ranges properly for an empty enum, calling it in that case would not make sense. Fortunately, we know that the empty case should not be reported, so I used an early return for it.

That is why this is needed if we want a deterministic check.
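The reasoning above can be sketched with a simplified model in which an enum is just a list of values. `rangesAreDisjoint` and `shouldReportMixedUse` are hypothetical names, not the patch's functions, and treating overlapping ranges as the reportable case is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model: an enum is just the list of its constants' values.
using EnumValues = std::vector<long long>;

// Returns true when the two [min, max] value ranges do not overlap.
// An empty enum has no minimum or maximum, so callers must rule that
// case out before calling this function.
inline bool rangesAreDisjoint(const EnumValues &A, const EnumValues &B) {
  assert(!A.empty() && !B.empty());
  auto MinMaxA = std::minmax_element(A.begin(), A.end());
  auto MinMaxB = std::minmax_element(B.begin(), B.end());
  return *MinMaxA.second < *MinMaxB.first || *MinMaxB.second < *MinMaxA.first;
}

// Hypothetical driver: empty enums are never reported, so return early
// instead of asking for a range that does not exist.
inline bool shouldReportMixedUse(const EnumValues &A, const EnumValues &B) {
  if (A.empty() || B.empty())
    return false;
  return !rangesAreDisjoint(A, B);
}
```

The early return keeps the range comparison from ever seeing an empty enum, so the outcome no longer depends on uninitialized or arbitrary bounds.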

Note message checks added to testfiles.

What is your opinion about the new results? I hope the checker can make it into 4.0.

LG with one nit. Feel free to ping earlier next time.

170–171 ↗(On Diff #73439)

Looks like this is the same as in case 3 below, so you could just move this check out of the branch and remove the duplication below.

alexfh added inline comments. Dec 13 2016, 6:53 AM
155 ↗(On Diff #71925)

BTW, this might make sense to be explained in the comment in the code itself (code review comments are bad means of documenting code).

Minor changes to improve the readability of the code according to comments.

A few more notes, all fine for a follow up.

202 ↗(On Diff #81848)

Looks like you're doing exactly same thing twice for lhs and rhs. Pull this out to a function. Fine for a follow up.

212–213 ↗(On Diff #81848)

There's not much value in these variables.

215–219 ↗(On Diff #81848)

This code doesn't need the lhs/rhs variable declared above it. Move it up.

220–227 ↗(On Diff #81848)

This code can be pulled to a function / method to avoid repeating it twice (starting from the const auto *LhsExpr = Result.Nodes.getNodeAs<Expr>("lhsExpr"); part).

31 ↗(On Diff #71925)

This is not done yet.

aaron.ballman added inline comments. Dec 19 2016, 7:39 AM
31 ↗(On Diff #81848)

Please drop the (s) from the diagnostic. The phrase "but some literal are not" is incorrect. Alternatively, you could use the %plural diagnostic modifier (see note_constexpr_baa_value_insufficient_alignment in for an example usage).

The requested changes have been made.
Some more refactoring of Case2, since it is the same as the LHS/RHS case. Moved more common statements out of the branch (Case2-3) for better readability (and less code duplication).

