This is an archive of the discontinued LLVM Phabricator instance.

Clang-tidy - Enum misuse check
ClosedPublic

Authored by szepet on Jul 19 2016, 4:25 AM.

Details

Summary

The checker detects various cases when an enum is probably misused (as a bitmask).

For the heuristics suggestions and improvement ideas are welcome.

Diff Detail

Repository
rL LLVM

Event Timeline

szepet updated this revision to Diff 64464.Jul 19 2016, 4:25 AM
szepet retitled this revision from to Clang-tidy - Enum misuse check.
szepet updated this object.
szepet added reviewers: xazax.hun, alexfh.
hokein edited edge metadata.Jul 19 2016, 7:30 AM

Please make sure the code align with LLVM code style (clang-format can help you to that).

clang-tidy/misc/EnumMisuseCheck.cpp
20 ↗(On Diff #64464)

You can use std::distance(Enum->enumerator_begin(), Enum->enumerator_end()).

32 ↗(On Diff #64464)

One variable declaration per line.

61 ↗(On Diff #64464)

LLVM code style prefers to use preincrement (++I), the same below.

217 ↗(On Diff #64464)

An unrelated comment.

clang-tidy/misc/EnumMisuseCheck.h
19 ↗(On Diff #64464)

Fix FIXME.

24 ↗(On Diff #64464)

Put it to private member.

28 ↗(On Diff #64464)

As you defined a IsStrict option in the check, you need to override storeOptions method.

void EnumMisuseCheck::storeOptions(
    ClangTidyOptions::OptionMap &Opts) {
  Options.store(Opts, "IsStrict", IsStrict);
}
docs/clang-tidy/checks/misc-enum-misuse.rst
4 ↗(On Diff #64464)

Could you add a few code examples here?

Eugene.Zelenko set the repository for this revision to rL LLVM.
Eugene.Zelenko added a subscriber: cfe-commits.

Please mention this check in docs/ReleaseNotes.rst. See pre-4.0 branch versions as example.

clang-tidy/misc/EnumMisuseCheck.cpp
116 ↗(On Diff #64464)

Please separate with empty line.

test/clang-tidy/misc-enum-misuse-weak.cpp
9 ↗(On Diff #64464)

Please put closing }; on next line. Same in other places.

83 ↗(On Diff #64464)

Please fix -Wextra-semi. Is next empty line excessive?

test/clang-tidy/misc-enum-misuse.cpp
42 ↗(On Diff #64464)

Unnecessary empty line. Same at the end of function.

77 ↗(On Diff #64464)

Please fix -Wextra-semi.

etienneb edited edge metadata.Jul 19 2016, 10:27 AM

drive-by, some comments.
Thanks for the check

clang-tidy/misc/EnumMisuseCheck.cpp
20 ↗(On Diff #64464)

We tend to keep name meaningful and avoid abbreviation.
enumLen -> enumLength

35 ↗(On Diff #64464)

nit: remove empty line

55 ↗(On Diff #64464)

nit: static

81 ↗(On Diff #64464)

nit: remove empty line

117 ↗(On Diff #64464)

nit: const int x --> int x for all parameters

119 ↗(On Diff #64464)

nit: line too long

127 ↗(On Diff #64464)

A common pattern is:

if (const auto *DiffEnumOp = Result.Nodes.getNodeAs<BinaryOperator>("diffEnumOp")) {
   [...]
}

This way the scope is smaller and there is less chance to use it by mistake.

128 ↗(On Diff #64464)

nit: no empty line

150 ↗(On Diff #64464)

ditto for th "if (var = ...)" pattern

170 ↗(On Diff #64464)

nit: remove the space after (s)

177 ↗(On Diff #64464)

To be consistent with your comments, add newline here.

szepet updated this revision to Diff 64897.Jul 21 2016, 8:07 AM
szepet edited edge metadata.
szepet removed rL LLVM as the repository for this revision.

updating patch based on review comments

szepet marked 18 inline comments as done.Jul 21 2016, 8:09 AM
szepet marked an inline comment as done.Jul 21 2016, 8:09 AM
aaron.ballman added inline comments.
clang-tidy/misc/EnumMisuseCheck.cpp
31 ↗(On Diff #64464)

I think this class can be replaced by std::minmax_element() over the EnumDec->enumerators() and then grabbing the actual values out of the resulting std::pair.

59 ↗(On Diff #64464)

This function doesn't do what is described. It appears to be checking if the last value in the enumeration has all its bits set, not if the max value has all the bits set. e.g., it won't check:

enum E {
  MaxValue = 0xFFFFFFFF,
  First = 0,
  Second
};
clang-tidy/misc/EnumMisuseCheck.h
24 ↗(On Diff #64464)

It is a private member already. I think this usage is fine.

Seems that you only uploaded the diff part, I only see the updated part of your patch now, and can't see the whole patch now (The review page says "Context not available"), could you upload the whole patch again?

docs/clang-tidy/checks/misc-enum-misuse.rst
31 ↗(On Diff #64897)

Could you add some descriptions about what 1 stands for here? strict or non-strict? please leave a blank between "//" and comment words, the same below. Make sure the code here align with code style.

62 ↗(On Diff #64897)

extra indentations.

szepet updated this revision to Diff 65310.Jul 25 2016, 1:16 AM

full diff submitted

Prazek added a project: Restricted Project.Jul 25 2016, 8:31 AM

some nits

docs/clang-tidy/checks/misc-enum-misuse.rst
37 ↗(On Diff #65310)

nit: space after //
here and below.

37 ↗(On Diff #65310)

nit: intervalls (typo)

test/clang-tidy/misc-enum-misuse-weak.cpp
48 ↗(On Diff #65310)

you can remove "[misc-enum-misuse]" from the following line.
Only the first occurrence is enough.

etienneb added a subscriber: etienneb.
Prazek added a subscriber: Prazek.Jul 25 2016, 8:45 AM
Prazek added inline comments.
clang-tidy/misc/EnumMisuseCheck.cpp
19 ↗(On Diff #65310)

s/\/\//\/\/\/

In other words, change to / (doxygen comment)

192 ↗(On Diff #65310)

If the pointer returned from dyn_cast<EnumConstantDecl>(LhsDeclExpr->getDecl()) is assumed to be always not null, thn you can change it to cast<> (It will assert if null instead of returning nullptr)

docs/clang-tidy/checks/misc-enum-misuse.rst
29 ↗(On Diff #65310)

I think all the code here needs to be indented by one or more to get into ..code:: block.

szepet updated this revision to Diff 65966.Jul 28 2016, 12:17 PM
szepet marked 12 inline comments as done.

updates based on comments, counter and search functions replaced by std functions

szepet marked 2 inline comments as done.Jul 28 2016, 12:18 PM
aaron.ballman added inline comments.Jul 29 2016, 10:33 AM
clang-tidy/misc/EnumMisuseCheck.cpp
40 ↗(On Diff #65966)

Doxygen comment here as well.

75 ↗(On Diff #65966)

The wrapping for this comment is a bit strange, also should be using doxygen-style comments. Also, the grammar is a bit off for the comment. I would recommend:

Check if there are two or more enumerators that are not a power of 2 in the enum type, and that the enumeration contains enough elements to reasonably act as a bitmask. Exclude the case where the last enumerator is the sum of the lesser values or when it could contain consecutive values.

Also, I would call this isPossiblyBitMask instead of using "bit field" because a bit-field is a syntactic construct that is unrelated. Bitmask types are covered in the C++ standard under [bitmask.types] and are slightly different, but more closely related to what this check is looking for.

209 ↗(On Diff #65966)

I think this diagnostic text is a bit confusing. The enum type shouldn't seem like a bit-field (but more like a bitmask), and I'm not certain what a misspelled number would be. I think the diagnostic is effectively saying that we guess this might be a bitmask, but some of the enumerator values don't make sense for that guess, and so this may be suspicious code -- but I really worry about the false positive rate for such a diagnostic. Have you run this check over a large body of work (like LLVM, Firefox, Chrome, etc)? If so, how do the diagnostics look?

Perhaps a different way to word the diagnostic is: "enum type used as a bitmask with an enumerator value that is not a power of 2" and place the diagnostic on the enumerator(s) that cause the problem rather than the enumeration as a whole.

212 ↗(On Diff #65966)

s/bitmask/bitfield

214 ↗(On Diff #65966)

Are these spurious parens around LhsConstant->getInitVal()?

219 ↗(On Diff #65966)

Same comments here as above; if you can remove the duplicate code, that would be great, too.

clang-tidy/misc/EnumMisuseCheck.h
24 ↗(On Diff #65310)

Can remove the private access specifier -- it's already private.

test/clang-tidy/misc-enum-misuse-weak.cpp
1 ↗(On Diff #65966)

Please run clang-format over this file.

test/clang-tidy/misc-enum-misuse.cpp
1 ↗(On Diff #65966)

Please run clang-format over this file.

alexfh requested changes to this revision.Aug 17 2016, 2:06 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-tidy/misc/EnumMisuseCheck.cpp
28 ↗(On Diff #65966)

I don't think the compound statement here is needed.

48 ↗(On Diff #65966)

Please remove the outermost parentheses, they are superfluous.

56 ↗(On Diff #65966)

nit: Please remove the empty line.

128 ↗(On Diff #65966)

Please remove the empty line.

146 ↗(On Diff #65966)

Formatting of the list seems rather uncommon for English text. Let's change to:

// Case 2:
//   a. ......
//   b. ......

(note the period after a and b, and Case N instead of 1. case).

164 ↗(On Diff #65966)

No parentheses needed after !.

164 ↗(On Diff #65966)

Braces should be used consistently in each if-else chain. Please add braces around the first if body.

172 ↗(On Diff #65966)

No capitalization and trailing period needed.

224 ↗(On Diff #65966)

Inner parentheses are not needed.

alexfh added inline comments.Aug 17 2016, 2:06 AM
clang-tidy/misc/EnumMisuseCheck.cpp
59 ↗(On Diff #65966)

nit: The parameters are not iterators, so I'd change It1 and It2 to EnumConst1 and EnumConst2, E1 and E2, First and Second, Left and Right or something else not related to iterators. Same above in ValueRange and below in countNonPowOfTwoNum.

clang-tidy/misc/EnumMisuseCheck.h
25 ↗(On Diff #65310)

I'd move the private section to the bottom of the class definition.

This revision now requires changes to proceed.Aug 17 2016, 2:06 AM
szepet updated this revision to Diff 68968.Aug 23 2016, 5:03 AM
szepet edited edge metadata.
szepet marked 18 inline comments as done.

Changes based on comments.

clang-tidy/misc/EnumMisuseCheck.cpp
75 ↗(On Diff #65966)

Thanks for the recommendations! As you can see my grammar and vocabulary is a "bit strange" so I really appreciated the correction!

209 ↗(On Diff #65966)

Here you can see the results on LLVM. (weak options, less false positive)

Here I have to mention that the last 4 results could be combined into one, because it`s actually the usage of the same enum in different files. If you wish I could easily change it.

alexfh requested changes to this revision.Aug 24 2016, 11:55 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-tidy/misc/EnumMisuseCheck.cpp
215 ↗(On Diff #68968)

Diagnostic messages are not full sentences, so they shouldn't start with a capital letter and end with a period.

clang-tidy/misc/EnumMisuseCheck.h
26 ↗(On Diff #68968)
  1. Please move the constructor body to the .cpp file so that the code reading and storing options are together.
  2. Let's use a global flag StrictMode (used by one more check currently: http://clang.llvm.org/extra/clang-tidy/checks/misc-argument-comment.html, clang-tidy/misc/ArgumentCommentCheck.cpp). It can still be configured separately for each check, but overall it improves consistency. Also, let's make it non-strict by default.

    Options.get(...)

should change to

StrictMode(Options.getLocalOrGlobal("StrictMode", 0) != 0)
docs/clang-tidy/checks/misc-enum-misuse.rst
21 ↗(On Diff #68968)

Please enclose inline code snippets in backquotes (+=, |=, etc.). Many places in this file and in doxygen comments.

29 ↗(On Diff #68968)

Should be .. code-block:: c++.

31 ↗(On Diff #68968)

Code block should be indented. Please compile the doc and make sure the result seems reasonable.

31 ↗(On Diff #68968)

Could you add some descriptions about what 1 stands for here? strict or non-strict? please leave a blank between "//" and comment words, the same below. Make sure the code here align with code style.

Still not addressed.

38 ↗(On Diff #68968)

nit: space after //
here and below.

This is still not addressed.

test/clang-tidy/misc-enum-misuse-weak.cpp
2 ↗(On Diff #68968)

The format still seems off.

66 ↗(On Diff #68968)

Is the commented line needed?

test/clang-tidy/misc-enum-misuse.cpp
2 ↗(On Diff #68968)

Doesn't seem to be done: the format is still off.

This revision now requires changes to proceed.Aug 24 2016, 11:55 AM
szepet updated this revision to Diff 70017.Sep 1 2016, 8:38 AM
szepet edited edge metadata.
szepet marked 11 inline comments as done.

Changes based on comments, fix a cast to dyn_cast bug, description updated (hopefully it became more clear).

szepet added inline comments.Sep 1 2016, 8:48 AM
test/clang-tidy/misc-enum-misuse.cpp
3 ↗(On Diff #70017)

Could you specify which part of the file seems off? I have run the clang format with the same options on testfiles as on the others.

aaron.ballman added inline comments.Sep 2 2016, 12:20 PM
test/clang-tidy/misc-enum-misuse.cpp
3 ↗(On Diff #70017)
enum A { A = 1,
         B = 2,
         C = 4,
         D = 8,
         E = 16,
         F = 32,
         G = 63
};

should be:

enum A {
  A = 1,
  B = 2,
  C = 4,
  D = 8,
  E = 16,
  F = 32,
  G = 63
};

is what I was thinking was incorrect, but perhaps clang-format allows such constructs?

66 ↗(On Diff #70017)

Do you intend to have the return 0; here?

szepet updated this revision to Diff 70324.Sep 5 2016, 6:53 AM
szepet edited edge metadata.
szepet marked 4 inline comments as done.

cast to dyn-cast change in order to fix a bug, changes based on comments

alexfh requested changes to this revision.Sep 7 2016, 5:08 AM
alexfh edited edge metadata.

Close, but still a bunch of comments in the docs and a suggestion to fix a class of false positives.

clang-tidy/misc/EnumMisuseCheck.cpp
210 ↗(On Diff #70324)
  1. llvm/lib/MC/ELFObjectWriter.cpp:903 - the warning looks reasonable.
  2. llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoderCommon.h:66 - the warning looks reasonable (ATTR_max doesn't seem to be useful for the bitmask enum).
  3. llvm/tools/clang/lib/Basic/IdentifierTable.cpp:95 - two issues here:
    1. the "possibly contains misspelled number(s) " message could be more useful, if it specified which member corresponds to the possibly misspelled number
    2. I suppose, the check considers KEYALL = (0x1fffff & ~KEYNOMS18 & ~KEYNOOPENCL) to be incorrect. I think, it should exclude enum members initialized with a bit arithmetic expressions, since it's rather common to define aliases for a certain combination of flags.
  4. llvm/tools/clang/lib/Frontend/Rewrite/RewriteModernObjC.cpp:5083 and friends - the warning looks reasonable, since it's hard to understand the motivation for the BLOCK_FIELD_IS_OBJECT = 3. If it's a combination of flags, it should be written as such, and the check should ignore enum members initialized with a bit arithmetic expression.
docs/clang-tidy/checks/misc-enum-misuse.rst
9 ↗(On Diff #70324)

Too much indentation here.

38 ↗(On Diff #70324)
  1. Enum should not start with a capital letter, since it's a keyword.
  2. Please indent the code block contents by 2 spaces (currently, it's indented by 1).
  3. Please clang-format all code samples.
47 ↗(On Diff #70324)

Commas should be used instead of semicolons.

54 ↗(On Diff #70324)

Missing semicolon.

In general, please make sure code snippets are valid code. Otherwise, it creates unneeded obstacles in reading the code.

This revision now requires changes to proceed.Sep 7 2016, 5:08 AM
szepet updated this revision to Diff 71925.Sep 20 2016, 6:50 AM
szepet updated this object.
szepet edited edge metadata.
szepet marked 7 inline comments as done.

In order to decrease false positive rate, the bitmask specific checker part investigate only the enumconstans which was initilized by a literal. (If this is too strong it can be modified)

Renamed the checker to be more consistent with the checkers used for similar purpose.

Documentation code examples updated.

alexfh requested changes to this revision.Sep 23 2016, 7:06 PM
alexfh edited edge metadata.

Thank you for the updates!

Please re-run the check on LLVM to see what has changed.

clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
53 ↗(On Diff #71925)

nit: Add an empty line above.

61 ↗(On Diff #71925)

nit: Add an empty line above.

67 ↗(On Diff #71925)

nit: Please add braces, since the body doesn't fit on a line.

155 ↗(On Diff #71925)

Why?

175 ↗(On Diff #71925)

Use early return here.

208 ↗(On Diff #71925)

One variable definition at a time, please.

216 ↗(On Diff #71925)

Use early return here.

clang-tidy/misc/SuspiciousEnumUsageCheck.h
32–35 ↗(On Diff #71925)

These can be just static constants in the .cpp file. Apart from that, const char X[] = ...; is a better way to define string constants, otherwise you would have to go with const char * const X = ...; to make the pointer const as well.

docs/clang-tidy/checks/misc-suspicious-enum-usage.rst
4 ↗(On Diff #71925)

This line should be aligned with the line above.

9 ↗(On Diff #71925)

Remove two spaces at the start of the line.

11 ↗(On Diff #71925)

There's no "Weak" option, it's the "Strict" option set to false / 0.

12 ↗(On Diff #71925)

Please use the :option: notation and add the option description (.. option:: ...). See, for example, modernize-use-emplace.rst.

31 ↗(On Diff #71925)

This notation is used to make the previous line a heading. Doesn't seem to be the case here. See http://llvm.org/docs/SphinxQuickstartTemplate.html for some examples. Please also try to build your docs to check for obvious issues.

test/clang-tidy/misc-suspicious-enum-usage-strict.cpp
18 ↗(On Diff #71925)

Add check lines for notes as well (I don't think you'll be able to use [[@LINE]] for most of them, but you can probably just skip the line.

This revision now requires changes to proceed.Sep 23 2016, 7:06 PM
szepet updated this revision to Diff 73157.EditedSep 30 2016, 3:48 PM
szepet edited edge metadata.
szepet marked 13 inline comments as done.

Updates based on comments (the testfile note comments will be added in the next commit)

Some changes in the algorithm/design:
In non-strict mode the checker will only investigate the operations between different enum types.
In strict mode we check the suspicious bitmasks too.
(In the previous comments we always talked about only the strict mode heuristics and looked on the strict results. )

So strict mode results in this revision where we investigate only the literals:

szepet added inline comments.Sep 30 2016, 3:56 PM
clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
155 ↗(On Diff #71925)

Because the hasDisjointValueRange function could not decide the values properly. So in case of an empty Enum it would not make sense. Fortunately we know that the empty case should not be reported so used early return on this.

That is why this is needed if we want a deterministic check.

szepet updated this revision to Diff 73439.Oct 4 2016, 3:15 AM
szepet edited edge metadata.
szepet marked an inline comment as done.

Note message checks added to testfiles.

What is your opinion about the new results? I hope the checker can make it into 4.0.

LG with one nit. Feel free to ping earlier next time.

clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
170–171 ↗(On Diff #73439)

Looks like this is the same as in case 3 below, so you could just move this check out of the branch and remove the duplication below.

alexfh added inline comments.Dec 13 2016, 6:53 AM
clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
155 ↗(On Diff #71925)

BTW, this might make sense to be explained in the comment in the code itself (code review comments are bad means of documenting code).

alexfh accepted this revision.Dec 13 2016, 6:57 AM
alexfh edited edge metadata.
This revision is now accepted and ready to land.Dec 13 2016, 6:57 AM
szepet updated this revision to Diff 81848.Dec 17 2016, 11:56 AM
szepet edited edge metadata.

Minor changes to improve the readability of the code according to comments.

szepet marked 2 inline comments as done.Dec 17 2016, 11:58 AM

A few more notes, all fine for a follow up.

clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
202 ↗(On Diff #81848)

Looks like you're doing exactly same thing twice for lhs and rhs. Pull this out to a function. Fine for a follow up.

212–213 ↗(On Diff #81848)

There's not much value in these variables.

215–219 ↗(On Diff #81848)

This code doesn't need the lhs/rhs variable declared above it. Move it up.

220–227 ↗(On Diff #81848)

This code can be pulled to a function / method to avoid repeating it twice (starting from the const auto *LhsExpr = Result.Nodes.getNodeAs<Expr>("lhsExpr"); part).

docs/clang-tidy/checks/misc-suspicious-enum-usage.rst
31 ↗(On Diff #71925)

This is not done yet.

aaron.ballman added inline comments.Dec 19 2016, 7:39 AM
clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
31 ↗(On Diff #81848)

Please drop the (s) from the diagnostic. The phrase "but some literal are not" is incorrect. Alternatively, you could use the %plural diagnostic modifier (see note_constexpr_baa_value_insufficient_alignment in DiagnosticASTKinds.td for an example usage).

szepet updated this revision to Diff 82488.Dec 25 2016, 4:19 PM
szepet marked 6 inline comments as done.

The requested changes have been made.
Some more refactor on the Case2 since it is the same as the LHS/RHS case. Moved more common statements out of the branch (Case2-3) for better readabilty. (And less code duplication.)

This revision was automatically updated to reflect the committed changes.