This is an archive of the discontinued LLVM Phabricator instance.

[Clang-tidy] add check misc-prefer-switch-for-enums
AbandonedPublic

Authored by jbcoe on Mar 13 2017, 8:53 AM.

Details

Summary

Add a check to find enums used in == or != expressions. Using a switch statement leads to more easily maintained code.

Diff Detail

Repository
rL LLVM

Event Timeline

jbcoe created this revision.Mar 13 2017, 8:53 AM

I like that check. But I think it could take another feature.
In my opinion it would be valueable to check if enums are compared against ints as well.

For example

enum Kind k = Kind::a;
if (k == 3) { /* something */ }

This usecase is not specially considered here, but i think that check would be the right place. Maybe it is already covered by your matcher, since you check for everything that compares against enums.
If yes, additional diagnostics would be nice.

This check should be added to the ReleaseNotes as well.

clang-tools-extra/clang-tidy/misc/PreferSwitchForEnumsCheck.cpp
21

remove FIXME

27

maybe a more telling name than x would be better?

clang-tools-extra/test/clang-tidy/misc-prefer-switch-for-enums.cpp
3

maybe another test for non enum class values would be nice.

enum another_kind {e, f, g};
jbcoe updated this revision to Diff 91588.Mar 13 2017, 10:50 AM

Minor edits in response to review comments. A few questions outstanding.

jbcoe marked 2 inline comments as done.Mar 13 2017, 10:51 AM

Handling

enum Kind k = Kind::a;
if (k == 3) { /* something */ }

is intentionally out of scope for now as the author is doing something that I can't trivially replace with a switch.

clang-tools-extra/test/clang-tidy/misc-prefer-switch-for-enums.cpp
3

I don't need to write code to specifically deal with enum classes. Perhaps changing the test to just test enums is simpler. I'm not sure what the LLVM/Clang approach normally is.

alright. then i have a good check for myself to write ;)

Eugene.Zelenko added inline comments.
clang-tools-extra/docs/ReleaseNotes.rst
96 ↗(On Diff #91588)

Please highlight switch with `` and indent preferred on same column as beginning of sentence.

clang-tools-extra/docs/clang-tidy/checks/misc-prefer-switch-for-enums.rst
9

Please highlight switch with ``.

11

Please highlight if with ``.

Eugene.Zelenko retitled this revision from [CLANG-TIDY] add check misc-prefer-switch-for-enums to [Clang-tidy] add check misc-prefer-switch-for-enums.Mar 13 2017, 12:04 PM
Eugene.Zelenko set the repository for this revision to rL LLVM.
aaron.ballman edited edge metadata.Mar 13 2017, 3:28 PM

I'm curious what kind of results you get when running this over a large code base. There are definitely times when using == or != for a specific value is *way* cleaner than using a switch statement, and I worry about this being so chatty that it needs to be disabled by default.

If it turns out to be very chatty, perhaps the check could be relaxed to only consider cases where you have multiple if/else if clauses (tuned via an option) for checking several values? At the very least, the check should not be triggered on an enumeration that has only one enumerator.

Also, maybe the readability module would be a better place for this check.

jbcoe added a comment.EditedMar 14 2017, 3:21 AM

@aaron.ballman

re: "perhaps the check could be relaxed to only consider cases where you have multiple if/else if clauses".

Sadly this does not satisfy my use case where most of the time a single enum value is used.

enum class animal_kind { herbivore, carnivore };
if(k == animal_kind::carnivore) { ... }

I now add omnivore and play 'hunt-the-if' knowing anything I miss will be a bug.


re: "There are definitely times when using == or != for a specific value is *way* cleaner than using a switch statement".

I totally agree. That's why I want this check. People don't remember to use switch for enums because it's harder and looks uglier.

Perhaps this check should not go in misc as it might be better to have it opt-in? I know I want it in one code-base I work on.


@xazax.hun

re: "Also, maybe the readability module would be a better place for this check."

I like the idea of it not being in misc as it should be opt-in. It really does little to enhance readability though. Should we add a maintainability module? I'm not sure where this check should live.

@aaron.ballman

re: "perhaps the check could be relaxed to only consider cases where you have multiple if/else if clauses".

Sadly this does not satisfy my use case where most of the time a single enum value is used.

enum class animal_kind { herbivore, carnivore };
if(k == animal_kind:: carnivore) { ... }

I now add omnivore and play 'hunt-the-if' knowing anything I miss will be a bug.

Hmm, okay, I can see that as a reasonable use case, but I still would like some hard numbers as to how often this check triggers on a large code base to understand how chatty this is in practice. Clang-tidy checks can be more chatty than compiler diagnostics, but only up to a point.


re: "There are definitely times when using == or != for a specific value is *way* cleaner than using a switch statement".

I totally agree. That's why I want this check. People don't remember to use switch for enums because it's harder and looks uglier.

Perhaps this check should not go in misc as it might be better to have it opt-in? I know I want it in one code-base I work on.

Misc might be a better place for it, but I'm definitely not keen on adding a check that is so noisy that it needs to be opt-in unless we've exhausted the ways to reduce chattiness with heuristics or reasonable options. If it's opt-in, the user has to remember to opt in to it every time they add a new enumerator to an enum, otherwise it provides them with no benefit anyway. That's why it's good to run it over a large code base (I'd use LLVM and Clang for this; we use a lot of enums which are frequently covered by switch statements) -- the diagnostics you see may demonstrate some code patterns used to set that heuristic or suggest good options.

jbcoe added a comment.Mar 15 2017, 7:25 AM

I've run the check on clang. It's noisy (690 warnings). The (few) cases I've manually inspected look like firm candidates for replacing with switches. Not my call though.

I'm going to look at quantlib next as that's a domain of interest to me.

jbcoe added a comment.Mar 15 2017, 8:02 AM

Quantlib produces no warnings but relies heavily on virtual-dispatch so that may be unsurprising.

I've run the check on clang. It's noisy (690 warnings). The (few) cases I've manually inspected look like firm candidates for replacing with switches. Not my call though.

The first thing I noticed is that this definitely emits false positives that would not be improved by a switch. Consider SemaExprCXX.cpp:3288:

return ConditionResult(*this, ConditionVar, MakeFullExpr(E.get(), StmtLoc),
                       CK == ConditionKind::ConstexprIf);

or SemaLambda.cpp:1023:

assert(C->InitKind == LambdaCaptureInitKind::NoInit &&
       "init capture has valid but null init?");

Out of the five random diagnostics I picked, three would be made worse by a switch (SemaExprCXX.cpp:3288, SemaLambda.cpp:1023, CGDebugInfo.cpp:4000), one would be made better (LLParser.cpp:6206), and one would be questionable (PDB.cpp:30), all in my opinion, of course. Given the number of diagnostics it produces, and the fact that my very small random sampling produced a 60% false-positive rate, that's a bit concerning. It could also be a total coincidence due to my small sample size, of course.

I think this check has utility and would be useful, but I think it needs some tuning (whether in the form of heuristics or options) to help reduce the noise out of the box. Taking a more thorough look at the results from running it over clang may identify some usage patterns that suggest what kind of heuristics or tuning would help.

jbcoe planned changes to this revision.Mar 15 2017, 10:03 AM

@Aaron I agree that some instances from your sample would not be improved by a switch. I'll look into narrowing the matched.

@jbcoe and @aaron.ballman i think an valid warning would be if there is another check in an else if statement.
so singular if(enum_value == Enum::Kind) is still ok, but another branch checking on the enum is suspiscious.

I've played around with a few heuristics but it's still far too contentious to have this check on by default and have it warn in places I want warnings. Where should it go?

I've played around with a few heuristics but it's still far too contentious to have this check on by default and have it warn in places I want warnings. Where should it go?

Perhaps it should live as a private check rather than a public one?

Alternatively, are you aware of any public coding standard rule that matches the behavior you want? We tend to be a little more relaxed about chattiness when it's part of a check for a public, relatively well-known coding standard because it guides the discussion about what constitutes a false positive. As a case in point, the CERT module has a check for MSC30-C, which prohibits using the rand() function. That would be way too contentious of a check normally except that it matches the behavior expected of a check against that CERT rule.

I'm not certain we want a module named "chatty", or something like it, for checks that are over-eager to diagnose code, but that might be another option. It would require a community discussion outside of this code review. (Personally, I'd be opposed to such a module, but don't let that dissuade you from pitching the idea. I'm just one contributor and my mind can certainly be changed in the presence of good rationale.)

I think that clang-tidy allows case-specific subsetting of C++. It is said that C++ contains a beautiful language and I've found that definition of beauty to be very use-case specific. Checks for side-effect free, Haskell-like, functional code would be of enormous interest to some subsections of the C++ community ( financial asset pricing and risk) but of very little interest to others (device drivers for embedded code). It would seem an enormous loss to me to require all clang-tidy checks to be generally useful. It's a simple matter to define a clang-tidy config file to enable different checks in different parts of a code base.

alexfh requested changes to this revision.May 11 2017, 5:24 AM

I think that clang-tidy allows case-specific subsetting of C++. It is said that C++ contains a beautiful language and I've found that definition of beauty to be very use-case
specific. Checks for side-effect free, Haskell-like, functional code would be of enormous interest to some subsections of the C++ community ( financial asset pricing and risk)
but of very little interest to others (device drivers for embedded code). It would seem an enormous loss to me to require all clang-tidy checks to be generally useful.

We don't require checks to be useful to every single C++ developer, but there is a certain bar the check should meet. It's not well defined, but I don't think we want checks that are only useful to a single person / company / code base (maybe with an exception of LLVM, since, well, clang-tidy is a part of it ;). The rationale is simple: every check added to clang-tidy adds up to all sorts of maintenance costs (source code maintenance, build and test times, binary sizes, complexity for the user, bugs filed, etc.), which should be outweighed by the usefulness of the check. When there's a single beneficiary of the check, they better care the costs of maintaining an out-of-tree check themselves.

Going back to what Aaron asked: is there any well-known coding standard, book or any other credible publication that formalizes recommendations on where using switch is preferred over if? I'm sure "enormous interest" would warrant existence of this kind of a recommendation, which might justify the presence of the check.

It's a simple matter to define a clang-tidy config file to enable different checks in different parts of a code base.

jbcoe abandoned this revision.Sep 8 2017, 12:39 PM