Page MenuHomePhabricator

[clang-tidy] Fix cppcoreguidelines-init-variables with enum judgement
Needs ReviewPublic

Authored by Sockke on Wed, Jul 21, 4:49 AM.

Details

Summary

In C++, the enumeration is never Integer, and the enumeration condition judgment is added to avoid compiling errors when it is initialized to an integer.
As the following case show, clang-tidy will give a wrong automatic fix:

enum Color {Red, Green, Blue};
void func() {
  Color color = 0; // <--- fix bug
}

Diff Detail

Event Timeline

Sockke created this revision.Wed, Jul 21, 4:49 AM
Sockke requested review of this revision.Wed, Jul 21, 4:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptWed, Jul 21, 4:49 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Is this the right decision to make, conceptually? It will leave the variable uninitialised still, and reading such an uninit variable is still an issue, even if it is an enum. Could we consider the alternative of warning the user about the uninitialized variable, just not offering an automatic (and potentially bad / incorrect) fix?

steven.zhang added inline comments.Wed, Jul 21, 4:58 AM
clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp
103

Technical speaking, we should warn about such case, ad the color is undefined now. Can we initialize it to the lowerest value of the enum type ?

MTC added a comment.EditedWed, Jul 21, 4:59 AM

Is this the right decision to make, conceptually? It will leave the variable uninitialised still, and reading such an uninit variable is still an issue, even if it is an enum.

Yeah, that's right. However, it's much more difficult to give enum an initial value than an integer.

Could we consider the alternative of warning the user about the uninitialized variable, just not offering an automatic (and potentially bad / incorrect) fix?

Make sense, we (ByteDance) are also hesitating whether we should provide automatic fix for uninitialized variables, because automatic fix may change the program semantics.

MTC added inline comments.Wed, Jul 21, 5:02 AM
clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp
103

Good point! However, which value should be chosen as automatic fix is a problem worth discussing.

Is this the right decision to make, conceptually? It will leave the variable uninitialised still, and reading such an uninit variable is still an issue, even if it is an enum.

Yeah, that's right. However, it's much more difficult to give enum an initial value than an integer.

Could we consider the alternative of warning the user about the uninitialized variable, just not offering an automatic (and potentially bad / incorrect) fix?

Make sense, we (ByteDance) are also hesitating whether we should provide automatic repair for uninitialized variables, because automatic fix may change the program semantics.

This is the same as what we did for integer in essential. We are changing an uninitialized value to the default one, which makes sense as uninitialized means it could be anything.

Sockke edited the summary of this revision. (Show Details)Wed, Jul 21, 8:34 PM
Sockke edited the summary of this revision. (Show Details)
Sockke edited the summary of this revision. (Show Details)Wed, Jul 21, 8:38 PM
Sockke edited the summary of this revision. (Show Details)

At present, Our views on preventing UB are basically the same, so a warning still needs to be reported (BTW, the original version will not report warnings for enum class types). The final question is what is the recommended value for initialization and whether to provide the automatic fix. Hope you can give us a direction, thanks!@aaron.ballman

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

To give an example

enum class BaseColour { BLACK, RED, GREEN, BLUE, WHITE };

BLACK is the 0, so we can fix-it. If the 0 is the first member (usually in enums that have a well-defined "default" or "None" case...), then, it is alright to use that as the automated fix.

However, if there isn't a "zero case" in the enum, for example:

enum class Something {
  FOO = -2,
  BAR, // This will be -1
  BAZ = 84,
  QUX  // This will be 85
};

then simply using the first member might not be the best fitting action, so in such case, I would opt for no FixIt. But, obviously, yes for giving a warning.

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

I think we shouldn't give any fix-it for enumerations. Zero doesn't always mean "the right default value" -- for example, another common idiom is for the *last* member of the enumeration to be a sentinel value.

clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp
103

FWIW, I'd expect a warning but no fix-it for these.

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

I think we shouldn't give any fix-it for enumerations. Zero doesn't always mean "the right default value" -- for example, another common idiom is for the *last* member of the enumeration to be a sentinel value.

I agree with you, but we need to consider that the checker works in a way that it gives the "zero" for integers. If we are here, was that the right decision? I mean... I wonder how much consistency we should shoot for. (nullptr for the pointers as default is at least somewhat sensible.)

But definitely, the warning must be given, that is true.

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

I think we shouldn't give any fix-it for enumerations. Zero doesn't always mean "the right default value" -- for example, another common idiom is for the *last* member of the enumeration to be a sentinel value.

I agree with you, but we need to consider that the checker works in a way that it gives the "zero" for integers. If we are here, was that the right decision?

If we're here, I think the right decision would have been to not support the C++ Core Guidelines in clang-tidy in the first place because of the lack of attention that's been paid to enforcement for the rules by the authors. ;-) But snark aside, I think = 0 is a somewhat defensible naïve default value for integers, but today I would prefer it give its fix-it on a note rather than on the warning. When there's only one fix-it to be applied to a warning, it's possible to apply the fix-it without any per-case user intervention and that default value isn't statically known to be safe. With nullptr for pointers, it's more defensible because that's a well-understood sentinel value that code *should* be checking for (and there's good tooling to catch null pointer dereferences for the other situations). With NAN for floats, it's more defensible because that (usually!) is a "sticky" value that poisons further uses of the type. There's nothing quite like that for integers.

I mean... I wonder how much consistency we should shoot for. (nullptr for the pointers as default is at least somewhat sensible.)

There are at least two ways to chase consistency here -- one is being consistent with other fix-its in the check, the other is being consistent with other fix-its across the project. My feeling is we should aim for consistency with the rest of the toolchain, which is to only issue a fix-it on a warning when the fix is known to be correct. https://clang.llvm.org/docs/InternalsManual.html#fix-it-hints

Perhaps one way forward would be to issue the integer fix-it on a note rather than a warning, and for enumerations, we could issue up to two fix-its on a note, one for the first and one for the last enumerator in an enumeration (and if the enum only contains one enumerator, there's only one fix-it to generate which suggests it could be on the warning rather than a note, but that seems like a lot of trouble for an unlikely scenario). However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

Another way forward would be to not issue a fix-it for integers or enumerations.

However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

I have one information from January that if you're viewing the diagnostic output as a sequence of [warning, note, note, ...] elements (so you "group by" warning), Clang-Tidy will apply the first fix (be it on the warning or the note) in the order of diag() calls. (There was a (never-upstreamed) check in which I had to abuse this fact.) This behaviour could've changed, though...

Another way forward would be to not issue a fix-it for integers or enumerations.

This might be the best course of action, and could be fixed in the same patch (this one)...

However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

I have one information from January that if you're viewing the diagnostic output as a sequence of [warning, note, note, ...] elements (so you "group by" warning), Clang-Tidy will apply the first fix (be it on the warning or the note) in the order of diag() calls. (There was a (never-upstreamed) check in which I had to abuse this fact.) This behaviour could've changed, though...

Another way forward would be to not issue a fix-it for integers or enumerations.

This might be the best course of action, and could be fixed in the same patch (this one)...

I think that's a reasonable way forward, though I don't insist on changing the integer behavior if others have strong opinions that it is correct. I do have strong opinions on fixing the enumeration behavior because that fix-it is wrong for C++ code.