Page MenuHomePhabricator

Clang-tidy - Enum misuse check
ClosedPublic

Authored by szepet on Jul 19 2016, 4:25 AM.

Details

Summary

The checker detects various cases when an enum is probably misused (as a bitmask).

For the heuristics suggestions and improvement ideas are welcome.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
alexfh added inline comments.Aug 17 2016, 2:06 AM
clang-tidy/misc/EnumMisuseCheck.cpp
59 ↗(On Diff #65966)

nit: The parameters are not iterators, so I'd change It1 and It2 to EnumConst1 and EnumConst2, E1 and E2, First and Second, Left and Right or something else not related to iterators. Same above in ValueRange and below in countNonPowOfTwoNum.

clang-tidy/misc/EnumMisuseCheck.h
25 ↗(On Diff #65310)

I'd move the private section to the bottom of the class definition.

This revision now requires changes to proceed.Aug 17 2016, 2:06 AM
szepet updated this revision to Diff 68968.Aug 23 2016, 5:03 AM
szepet edited edge metadata.
szepet marked 18 inline comments as done.

Changes based on comments.

clang-tidy/misc/EnumMisuseCheck.cpp
75 ↗(On Diff #65966)

Thanks for the recommendations! As you can see my grammar and vocabulary is a "bit strange" so I really appreciated the correction!

209 ↗(On Diff #65966)

Here you can see the results on LLVM. (weak options, less false positive)

Here I have to mention that the last 4 results could be combined into one, because it`s actually the usage of the same enum in different files. If you wish I could easily change it.

alexfh requested changes to this revision.Aug 24 2016, 11:55 AM
alexfh edited edge metadata.
alexfh added inline comments.
clang-tidy/misc/EnumMisuseCheck.cpp
215 ↗(On Diff #68968)

Diagnostic messages are not full sentences, so they shouldn't start with a capital letter and end with a period.

clang-tidy/misc/EnumMisuseCheck.h
26 ↗(On Diff #68968)
  1. Please move the constructor body to the .cpp file so that the code reading and storing options are together.
  2. Let's use a global flag StrictMode (used by one more check currently: http://clang.llvm.org/extra/clang-tidy/checks/misc-argument-comment.html, clang-tidy/misc/ArgumentCommentCheck.cpp). It can still be configured separately for each check, but overall it improves consistency. Also, let's make it non-strict by default.

    Options.get(...)

should change to

StrictMode(Options.getLocalOrGlobal("StrictMode", 0) != 0)
docs/clang-tidy/checks/misc-enum-misuse.rst
21 ↗(On Diff #68968)

Please enclose inline code snippets in backquotes (+=, |=, etc.). Many places in this file and in doxygen comments.

29 ↗(On Diff #68968)

Should be .. code-block:: c++.

31 ↗(On Diff #68968)

Code block should be indented. Please compile the doc and make sure the result seems reasonable.

31 ↗(On Diff #68968)

Could you add some descriptions about what 1 stands for here? strict or non-strict? please leave a blank between "//" and comment words, the same below. Make sure the code here align with code style.

Still not addressed.

38 ↗(On Diff #68968)

nit: space after //
here and below.

This is still not addressed.

test/clang-tidy/misc-enum-misuse-weak.cpp
2 ↗(On Diff #68968)

The format still seems off.

66 ↗(On Diff #68968)

Is the commented line needed?

test/clang-tidy/misc-enum-misuse.cpp
2 ↗(On Diff #68968)

Doesn't seem to be done: the format is still off.

This revision now requires changes to proceed.Aug 24 2016, 11:55 AM
szepet updated this revision to Diff 70017.Sep 1 2016, 8:38 AM
szepet edited edge metadata.
szepet marked 11 inline comments as done.

Changes based on comments, fix a cast to dyn_cast bug, description updated (hopefully it became more clear).

szepet added inline comments.Sep 1 2016, 8:48 AM
test/clang-tidy/misc-enum-misuse.cpp
3 ↗(On Diff #70017)

Could you specify which part of the file seems off? I have run the clang format with the same options on testfiles as on the others.

aaron.ballman added inline comments.Sep 2 2016, 12:20 PM
test/clang-tidy/misc-enum-misuse.cpp
3 ↗(On Diff #70017)
enum A { A = 1,
         B = 2,
         C = 4,
         D = 8,
         E = 16,
         F = 32,
         G = 63
};

should be:

enum A {
  A = 1,
  B = 2,
  C = 4,
  D = 8,
  E = 16,
  F = 32,
  G = 63
};

is what I was thinking was incorrect, but perhaps clang-format allows such constructs?

66 ↗(On Diff #70017)

Do you intend to have the return 0; here?

szepet updated this revision to Diff 70324.Sep 5 2016, 6:53 AM
szepet edited edge metadata.
szepet marked 4 inline comments as done.

cast to dyn-cast change in order to fix a bug, changes based on comments

alexfh requested changes to this revision.Sep 7 2016, 5:08 AM
alexfh edited edge metadata.

Close, but still a bunch of comments in the docs and a suggestion to fix a class of false positives.

clang-tidy/misc/EnumMisuseCheck.cpp
210 ↗(On Diff #70324)
  1. llvm/lib/MC/ELFObjectWriter.cpp:903 - the warning looks reasonable.
  2. llvm/lib/Target/X86/Disassembler/X86DisassemblerDecoderCommon.h:66 - the warning looks reasonable (ATTR_max doesn't seem to be useful for the bitmask enum).
  3. llvm/tools/clang/lib/Basic/IdentifierTable.cpp:95 - two issues here:
    1. the "possibly contains misspelled number(s) " message could be more useful, if it specified which member corresponds to the possibly misspelled number
    2. I suppose, the check considers KEYALL = (0x1fffff & ~KEYNOMS18 & ~KEYNOOPENCL) to be incorrect. I think, it should exclude enum members initialized with a bit arithmetic expressions, since it's rather common to define aliases for a certain combination of flags.
  4. llvm/tools/clang/lib/Frontend/Rewrite/RewriteModernObjC.cpp:5083 and friends - the warning looks reasonable, since it's hard to understand the motivation for the BLOCK_FIELD_IS_OBJECT = 3. If it's a combination of flags, it should be written as such, and the check should ignore enum members initialized with a bit arithmetic expression.
docs/clang-tidy/checks/misc-enum-misuse.rst
9 ↗(On Diff #70324)

Too much indentation here.

38 ↗(On Diff #70324)
  1. Enum should not start with a capital letter, since it's a keyword.
  2. Please indent the code block contents by 2 spaces (currently, it's indented by 1).
  3. Please clang-format all code samples.
47 ↗(On Diff #70324)

Commas should be used instead of semicolons.

54 ↗(On Diff #70324)

Missing semicolon.

In general, please make sure code snippets are valid code. Otherwise, it creates unneeded obstacles in reading the code.

This revision now requires changes to proceed.Sep 7 2016, 5:08 AM
szepet updated this revision to Diff 71925.Sep 20 2016, 6:50 AM
szepet updated this object.
szepet edited edge metadata.
szepet marked 7 inline comments as done.

In order to decrease false positive rate, the bitmask specific checker part investigate only the enumconstans which was initilized by a literal. (If this is too strong it can be modified)

Renamed the checker to be more consistent with the checkers used for similar purpose.

Documentation code examples updated.

alexfh requested changes to this revision.Sep 23 2016, 7:06 PM
alexfh edited edge metadata.

Thank you for the updates!

Please re-run the check on LLVM to see what has changed.

clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
53 ↗(On Diff #71925)

nit: Add an empty line above.

61 ↗(On Diff #71925)

nit: Add an empty line above.

67 ↗(On Diff #71925)

nit: Please add braces, since the body doesn't fit on a line.

155 ↗(On Diff #71925)

Why?

175 ↗(On Diff #71925)

Use early return here.

208 ↗(On Diff #71925)

One variable definition at a time, please.

216 ↗(On Diff #71925)

Use early return here.

clang-tidy/misc/SuspiciousEnumUsageCheck.h
32–35 ↗(On Diff #71925)

These can be just static constants in the .cpp file. Apart from that, const char X[] = ...; is a better way to define string constants, otherwise you would have to go with const char * const X = ...; to make the pointer const as well.

docs/clang-tidy/checks/misc-suspicious-enum-usage.rst
4 ↗(On Diff #71925)

This line should be aligned with the line above.

9 ↗(On Diff #71925)

Remove two spaces at the start of the line.

11 ↗(On Diff #71925)

There's no "Weak" option, it's the "Strict" option set to false / 0.

12 ↗(On Diff #71925)

Please use the :option: notation and add the option description (.. option:: ...). See, for example, modernize-use-emplace.rst.

31 ↗(On Diff #71925)

This notation is used to make the previous line a heading. Doesn't seem to be the case here. See http://llvm.org/docs/SphinxQuickstartTemplate.html for some examples. Please also try to build your docs to check for obvious issues.

test/clang-tidy/misc-suspicious-enum-usage-strict.cpp
18 ↗(On Diff #71925)

Add check lines for notes as well (I don't think you'll be able to use [[@LINE]] for most of them, but you can probably just skip the line.

This revision now requires changes to proceed.Sep 23 2016, 7:06 PM
szepet updated this revision to Diff 73157.EditedSep 30 2016, 3:48 PM
szepet edited edge metadata.
szepet marked 13 inline comments as done.

Updates based on comments (the testfile note comments will be added in the next commit)

Some changes in the algorithm/design:
In non-strict mode the checker will only investigate the operations between different enum types.
In strict mode we check the suspicious bitmasks too.
(In the previous comments we always talked about only the strict mode heuristics and looked on the strict results. )

So strict mode results in this revision where we investigate only the literals:

szepet added inline comments.Sep 30 2016, 3:56 PM
clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
155 ↗(On Diff #71925)

Because the hasDisjointValueRange function could not decide the values properly. So in case of an empty Enum it would not make sense. Fortunately we know that the empty case should not be reported so used early return on this.

That is why this is needed if we want a deterministic check.

szepet updated this revision to Diff 73439.Oct 4 2016, 3:15 AM
szepet edited edge metadata.
szepet marked an inline comment as done.

Note message checks added to testfiles.

What is your opinion about the new results? I hope the checker can make it into 4.0.

LG with one nit. Feel free to ping earlier next time.

clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
170–171 ↗(On Diff #73439)

Looks like this is the same as in case 3 below, so you could just move this check out of the branch and remove the duplication below.

alexfh added inline comments.Dec 13 2016, 6:53 AM
clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
155 ↗(On Diff #71925)

BTW, this might make sense to be explained in the comment in the code itself (code review comments are bad means of documenting code).

alexfh accepted this revision.Dec 13 2016, 6:57 AM
alexfh edited edge metadata.
This revision is now accepted and ready to land.Dec 13 2016, 6:57 AM
szepet updated this revision to Diff 81848.Dec 17 2016, 11:56 AM
szepet edited edge metadata.

Minor changes to improve the readability of the code according to comments.

szepet marked 2 inline comments as done.Dec 17 2016, 11:58 AM

A few more notes, all fine for a follow up.

clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
202 ↗(On Diff #81848)

Looks like you're doing exactly same thing twice for lhs and rhs. Pull this out to a function. Fine for a follow up.

212–213 ↗(On Diff #81848)

There's not much value in these variables.

215–219 ↗(On Diff #81848)

This code doesn't need the lhs/rhs variable declared above it. Move it up.

220–227 ↗(On Diff #81848)

This code can be pulled to a function / method to avoid repeating it twice (starting from the const auto *LhsExpr = Result.Nodes.getNodeAs<Expr>("lhsExpr"); part).

docs/clang-tidy/checks/misc-suspicious-enum-usage.rst
31 ↗(On Diff #71925)

This is not done yet.

aaron.ballman added inline comments.Dec 19 2016, 7:39 AM
clang-tidy/misc/SuspiciousEnumUsageCheck.cpp
31 ↗(On Diff #81848)

Please drop the (s) from the diagnostic. The phrase "but some literal are not" is incorrect. Alternatively, you could use the %plural diagnostic modifier (see note_constexpr_baa_value_insufficient_alignment in DiagnosticASTKinds.td for an example usage).

szepet updated this revision to Diff 82488.Dec 25 2016, 4:19 PM
szepet marked 6 inline comments as done.

The requested changes have been made.
Some more refactor on the Case2 since it is the same as the LHS/RHS case. Moved more common statements out of the branch (Case2-3) for better readabilty. (And less code duplication.)

This revision was automatically updated to reflect the committed changes.