Download Raw Diff

Details

Reviewers

aaron.ballman
MTC
steven.zhang
whisperity
rsmith

Commits

rG4a097efe7784: [clang-tidy] Fix cppcoreguidelines-init-variables by removing the enum

Summary

In C++, the enumeration is never Integer, and the enumeration condition judgment is added to avoid compiling errors when it is initialized to an integer.
Add support for initialization check of scope enum.
As the following case show, clang-tidy will give a wrong automatic fix:

enum Color {Red, Green, Blue};
enum class Gender {Male, Female};
void func() {
  Color color; // Color color = 0; <--- fix bug
  Gender gender; // <--- no warning
}

Diff Detail

Event Timeline

Sockke created this revision.Jul 21 2021, 4:49 AM

Herald added subscribers: shchenz, kbarton, xazax.hun, nemanjai. · View Herald TranscriptJul 21 2021, 4:49 AM

Sockke requested review of this revision.Jul 21 2021, 4:49 AM

Herald added a project: Restricted Project. · View Herald TranscriptJul 21 2021, 4:49 AM

Herald added a subscriber: cfe-commits. · View Herald Transcript

Is this the right decision to make, conceptually? It will leave the variable uninitialised still, and reading such an uninit variable is still an issue, even if it is an enum. Could we consider the alternative of warning the user about the uninitialized variable, just not offering an automatic (and potentially bad / incorrect) fix?

steven.zhang added inline comments.Jul 21 2021, 4:58 AM

clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp
103	Technical speaking, we should warn about such case, ad the color is undefined now. Can we initialize it to the lowerest value of the enum type ?

In D106431#2892859, @whisperity wrote:

Is this the right decision to make, conceptually? It will leave the variable uninitialised still, and reading such an uninit variable is still an issue, even if it is an enum.

Yeah, that's right. However, it's much more difficult to give enum an initial value than an integer.

Could we consider the alternative of warning the user about the uninitialized variable, just not offering an automatic (and potentially bad / incorrect) fix?

Make sense, we (ByteDance) are also hesitating whether we should provide automatic fix for uninitialized variables, because automatic fix may change the program semantics.

steven.zhang added a reviewer: whisperity.Jul 21 2021, 5:00 AM

Herald added a subscriber: rnkovacs. · View Herald TranscriptJul 21 2021, 5:00 AM

MTC added inline comments.Jul 21 2021, 5:02 AM

clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp
103	Good point! However, which value should be chosen as automatic fix is a problem worth discussing.

In D106431#2892866, @MTC wrote:

In D106431#2892859, @whisperity wrote:

Is this the right decision to make, conceptually? It will leave the variable uninitialised still, and reading such an uninit variable is still an issue, even if it is an enum.

Yeah, that's right. However, it's much more difficult to give enum an initial value than an integer.

Could we consider the alternative of warning the user about the uninitialized variable, just not offering an automatic (and potentially bad / incorrect) fix?

Make sense, we (ByteDance) are also hesitating whether we should provide automatic repair for uninitialized variables, because automatic fix may change the program semantics.

This is the same as what we did for integer in essential. We are changing an uninitialized value to the default one, which makes sense as uninitialized means it could be anything.

Harbormaster completed remote builds in B115279: Diff 360405.Jul 21 2021, 5:56 AM

Sockke edited the summary of this revision. (Show Details)Jul 21 2021, 8:34 PM

Sockke edited the summary of this revision. (Show Details)

Sockke edited the summary of this revision. (Show Details)Jul 21 2021, 8:38 PM

Sockke edited the summary of this revision. (Show Details)

At present, Our views on preventing UB are basically the same, so a warning still needs to be reported (BTW, the original version will not report warnings for enum class types). The final question is what is the recommended value for initialization and whether to provide the automatic fix. Hope you can give us a direction, thanks！@aaron.ballman

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

To give an example

enum class BaseColour { BLACK, RED, GREEN, BLUE, WHITE };

BLACK is the 0, so we can fix-it. If the 0 is the first member (usually in enums that have a well-defined "default" or "None" case...), then, it is alright to use that as the automated fix.

However, if there isn't a "zero case" in the enum, for example:

enum class Something {
  FOO = -2,
  BAR, // This will be -1
  BAZ = 84,
  QUX  // This will be 85
};

then simply using the first member might not be the best fitting action, so in such case, I would opt for no FixIt. But, obviously, yes for giving a warning.

In D106431#2896206, @whisperity wrote:

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

I think we shouldn't give any fix-it for enumerations. Zero doesn't always mean "the right default value" -- for example, another common idiom is for the *last* member of the enumeration to be a sentinel value.

clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp
103	FWIW, I'd expect a warning but no fix-it for these.

In D106431#2896334, @aaron.ballman wrote:

In D106431#2896206, @whisperity wrote:

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

I think we shouldn't give any fix-it for enumerations. Zero doesn't always mean "the right default value" -- for example, another common idiom is for the *last* member of the enumeration to be a sentinel value.

I agree with you, but we need to consider that the checker works in a way that it gives the "zero" for integers. If we are here, was that the right decision? I mean... I wonder how much consistency we should shoot for. (nullptr for the pointers as default is at least somewhat sensible.)

But definitely, the warning must be given, that is true.

In D106431#2896367, @whisperity wrote:

In D106431#2896334, @aaron.ballman wrote:

In D106431#2896206, @whisperity wrote:

The problem with enums is that translating zero (0, 0.0, nullptr, etc...) to the enum case is not always apparent. A warning should always be given. And if you can find a zero member in the enum, we can report an automated suggestion for that.

I think we shouldn't give any fix-it for enumerations. Zero doesn't always mean "the right default value" -- for example, another common idiom is for the *last* member of the enumeration to be a sentinel value.

I agree with you, but we need to consider that the checker works in a way that it gives the "zero" for integers. If we are here, was that the right decision?

If we're here, I think the right decision would have been to not support the C++ Core Guidelines in clang-tidy in the first place because of the lack of attention that's been paid to enforcement for the rules by the authors. ;-) But snark aside, I think = 0 is a somewhat defensible naïve default value for integers, but today I would prefer it give its fix-it on a note rather than on the warning. When there's only one fix-it to be applied to a warning, it's possible to apply the fix-it without any per-case user intervention and that default value isn't statically known to be safe. With nullptr for pointers, it's more defensible because that's a well-understood sentinel value that code *should* be checking for (and there's good tooling to catch null pointer dereferences for the other situations). With NAN for floats, it's more defensible because that (usually!) is a "sticky" value that poisons further uses of the type. There's nothing quite like that for integers.

I mean... I wonder how much consistency we should shoot for. (nullptr for the pointers as default is at least somewhat sensible.)

There are at least two ways to chase consistency here -- one is being consistent with other fix-its in the check, the other is being consistent with other fix-its across the project. My feeling is we should aim for consistency with the rest of the toolchain, which is to only issue a fix-it on a warning when the fix is known to be correct. https://clang.llvm.org/docs/InternalsManual.html#fix-it-hints

Perhaps one way forward would be to issue the integer fix-it on a note rather than a warning, and for enumerations, we could issue up to two fix-its on a note, one for the first and one for the last enumerator in an enumeration (and if the enum only contains one enumerator, there's only one fix-it to generate which suggests it could be on the warning rather than a note, but that seems like a lot of trouble for an unlikely scenario). However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

Another way forward would be to not issue a fix-it for integers or enumerations.

In D106431#2896441, @aaron.ballman wrote:

However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

I have one information from January that if you're viewing the diagnostic output as a sequence of [warning, note, note, ...] elements (so you "group by" warning), Clang-Tidy will apply the first fix (be it on the warning or the note) in the order of diag() calls. (There was a (never-upstreamed) check in which I had to abuse this fact.) This behaviour could've changed, though...

In D106431#2896441, @aaron.ballman wrote:

Another way forward would be to not issue a fix-it for integers or enumerations.

This might be the best course of action, and could be fixed in the same patch (this one)...

In D106431#2896472, @whisperity wrote:

In D106431#2896441, @aaron.ballman wrote:

However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

I have one information from January that if you're viewing the diagnostic output as a sequence of [warning, note, note, ...] elements (so you "group by" warning), Clang-Tidy will apply the first fix (be it on the warning or the note) in the order of diag() calls. (There was a (never-upstreamed) check in which I had to abuse this fact.) This behaviour could've changed, though...

In D106431#2896441, @aaron.ballman wrote:

Another way forward would be to not issue a fix-it for integers or enumerations.

This might be the best course of action, and could be fixed in the same patch (this one)...

I think that's a reasonable way forward, though I don't insist on changing the integer behavior if others have strong opinions that it is correct. I do have strong opinions on fixing the enumeration behavior because that fix-it is wrong for C++ code.

In D106431#2896472, @whisperity wrote:

In D106431#2896441, @aaron.ballman wrote:

However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

I have one information from January that if you're viewing the diagnostic output as a sequence of [warning, note, note, ...] elements (so you "group by" warning), Clang-Tidy will apply the first fix (be it on the warning or the note) in the order of diag() calls. (There was a (never-upstreamed) check in which I had to abuse this fact.) This behaviour could've changed, though...

In D106431#2896441, @aaron.ballman wrote:

Another way forward would be to not issue a fix-it for integers or enumerations.

This might be the best course of action, and could be fixed in the same patch (this one)...

I may be more supportive of the first view. Based on clang -Wuninitialized, we hope that clang-tidy can do more. In addition, if there are two recommended values for enum initialization, we can put them in one note.

Sockke added a reviewer: rsmith.Jul 25 2021, 11:41 PM

Any thoughts? : )

In D106431#2907002, @Sockke wrote:

Any thoughts? : )

First, let's first fix that we should still warn for the uninitialised enum case, without a FixIt. That's the issue at hand, right now, Clang-Tidy generates, as you identified, broken output. We can discuss the later steps after this is fixed. Please implement this logic, and update the patch, so we have a snapshot of how that would look like and the thing working.

Afterwards, as Aaron suggested:

In D106431#2896441, @aaron.ballman wrote:

for enumerations, we could issue up to two fix-its on a note, one for the first and one for the last enumerator in an enumeration (and if the enum only contains one enumerator, there's only one fix-it to generate which suggests it could be on the warning rather than a note, but that seems like a lot of trouble for an unlikely scenario). However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

This might require nontrivial changes to the check's code, and investigating the potential problem with automated fix-it application when multiple conflicting fix-its are given on a note (not a warning line).

I think we all would be fine with only doing the first step (reintroduce the warning, without a fixit) in this patch, so it can be merged and hit the the mainline code quickly. The next step can be its own patch. 🙂

In addition, I wager that adding a line about this fix to the release notes at clang-tools-extra/docs/ReleaseNotes.rst is a good option, I can imagine people having turned this check off due to it being broken on enums.

In D106431#2907688, @whisperity wrote:

In D106431#2907002, @Sockke wrote:

Any thoughts? : )

First, let's first fix that we should still warn for the uninitialised enum case, without a FixIt. That's the issue at hand, right now, Clang-Tidy generates, as you identified, broken output. We can discuss the later steps after this is fixed. Please implement this logic, and update the patch, so we have a snapshot of how that would look like and the thing working.

Afterwards, as Aaron suggested:

In D106431#2896441, @aaron.ballman wrote:

for enumerations, we could issue up to two fix-its on a note, one for the first and one for the last enumerator in an enumeration (and if the enum only contains one enumerator, there's only one fix-it to generate which suggests it could be on the warning rather than a note, but that seems like a lot of trouble for an unlikely scenario). However, I don't recall how clang-tidy interacts with fix-its on notes off the top of my head, so I'm making an assumption that clang-tidy's automatic fixit applying mode handles notes the same way as clang and we should double-check that assumption.

This might require nontrivial changes to the check's code, and investigating the potential problem with automated fix-it application when multiple conflicting fix-its are given on a note (not a warning line).

I think we all would be fine with only doing the first step (reintroduce the warning, without a fixit) in this patch, so it can be merged and hit the the mainline code quickly. The next step can be its own patch. 🙂

In addition, I wager that adding a line about this fix to the release notes at clang-tools-extra/docs/ReleaseNotes.rst is a good option, I can imagine people having turned this check off due to it being broken on enums.

I think, the intention of the auto fix for such case is to make the program's behavior well-defined, not with some uninitialize value that cannot be predicted. It is ok as far as we select any enumerator to initialize it from my understanding. We should separate it into anther patch of cause.

Update!

Harbormaster completed remote builds in B116629: Diff 362291.Jul 28 2021, 1:07 AM

whisperity added inline comments.Jul 28 2021, 2:32 AM

clang-tools-extra/docs/ReleaseNotes.rst
156 ↗	(On Diff #362291)	Unterminated sequence. Also, where is this feature implemented?

Update!

Harbormaster completed remote builds in B116646: Diff 362320.Jul 28 2021, 3:42 AM

whisperity added inline comments.Jul 28 2021, 5:01 AM

clang-tools-extra/docs/ReleaseNotes.rst
156 ↗	(On Diff #362291)	@Sockke Sorry if I'm ultra dense here, but I still do not see what sort of support was added, or where. Do you mean that scoped enums were never reported before by this check, and that 5 lines of new code you added fixes that too? If that is the case, I'd rather rephrase the sentence here. I think `FixIt` is also an internal term (at least when written as such), `clang-tidy --help` only says: --fix - Apply suggested fixes. Suggestion: Removed generating fixes for enums because the code generated was broken, trying to initialise the enum from an integer. And if my understanding of what "support was added" is true, then also: The check now also warns for uninitialised scoped enums.

Thanks for your reply @whisperity. I have updated!

Harbormaster completed remote builds in B116665: Diff 362345.Jul 28 2021, 5:49 AM

Alright, thanks! So it does work that way, I didn't know. I think this is okay from my end. You can mark the inline discussion done, by the way. 🙂 Let's see if @aaron.ballman has any comments.

LGTM -- it's not the cleanest solution (we're duplicating the call to diag()), but it'll suffice. If we wanted to clean it up a bit, we could use an llvm::Optional<const char *> Diagnose = llvm::None; where a non-None value means "diagnose" and if the stored pointer value is not null, it additionally adds the fixit information.

This revision is now accepted and ready to land.Jul 28 2021, 9:47 AM

Thank you for your reply! I generally understand what you mean, but there are three cases, namely, no diagnosis, providing diagnosis, and providing diagnosis & FixIt. Only by judging whether "diagnosis" is null may fail to achieve the goal. I have revised it a little bit, do you think that's alright? @aaron.ballman

Harbormaster completed remote builds in B116882: Diff 362651.Jul 28 2021, 11:36 PM

If we are tinkering with the code post-acceptance, could you please also change const char * to StringRef? They are the same thing essentially, but accessing the length is (AFIAK) O(1) and there isn't a concerning strlen call in the code.

whisperity added inline comments.Jul 29 2021, 12:30 AM

clang-tools-extra/clang-tidy/cppcoreguidelines/InitVariablesCheck.cpp
87	Also, if we're using Optional, this could also be `= nullptr` instead, and then we have the state when the Optional `hasValue()` but the value itself is null. In all other cases, it both `hasValue()` and the value is non-null. This would also eliminate the need for the `strlen` call.

whisperity added inline comments.Jul 29 2021, 12:36 AM

clang-tools-extra/clang-tidy/cppcoreguidelines/InitVariablesCheck.cpp
87	In the StringRef case, you'd need to `emplace` a default-constructed StringRef (which has the semantics of not pointing anywhere and having length 0) into the Optional.

You made it very clear, thanks a lot！ @whisperity

Harbormaster completed remote builds in B116893: Diff 362663.Jul 29 2021, 1:00 AM

In D106431#2912589, @Sockke wrote:

Thank you for your reply! I generally understand what you mean, but there are three cases, namely, no diagnosis, providing diagnosis, and providing diagnosis & FixIt. Only by judging whether "diagnosis" is null may fail to achieve the goal. I have revised it a little bit, do you think that's alright? @aaron.ballman

By how the code looks, you made the exact changes I was hoping to see. Thank you!

whisperity accepted this revision.Jul 29 2021, 5:40 AM

In D106431#2913101, @aaron.ballman wrote:

In D106431#2912589, @Sockke wrote:

Thank you for your reply! I generally understand what you mean, but there are three cases, namely, no diagnosis, providing diagnosis, and providing diagnosis & FixIt. Only by judging whether "diagnosis" is null may fail to achieve the goal. I have revised it a little bit, do you think that's alright? @aaron.ballman

By how the code looks, you made the exact changes I was hoping to see. Thank you!

Cheers! Can it be merged? @aaron.ballman

clang-tools-extra/docs/ReleaseNotes.rst
156 ↗	(On Diff #362291)	Before that, isIntegerType() would not treat scoped enum as an integer, so scoped enum was ignored. But isEnumeralType() will not.

In D106431#2916542, @Sockke wrote:

Cheers! Can it be merged? @aaron.ballman

Yes, but I think the patch needs to be rebased first as the pre-merge checks CI system said "patch application failed". Do you have commit access? If not, could you please specify the username and e-mail (the values of Git config variable user.name and user.email) you would like to have associated with the commit?

In D106431#2916551, @whisperity wrote:

In D106431#2916542, @Sockke wrote:

Cheers! Can it be merged? @aaron.ballman

Yes, but I think the patch needs to be rebased first as the pre-merge checks CI system said "patch application failed". Do you have commit access? If not, could you please specify the username and e-mail (the values of Git config variable user.name and user.email) you would like to have associated with the commit?

Thanks a lot!

Name: liuke
Email: liuke.gehry@bytedance.com

This revision was landed with ongoing or failed builds.Jul 30 2021, 9:25 AM

Closed by commit rG4a097efe7784: [clang-tidy] Fix cppcoreguidelines-init-variables by removing the enum (authored by Sockke, committed by whisperity). · Explain Why

This revision was automatically updated to reflect the committed changes.

whisperity added a commit: rG4a097efe7784: [clang-tidy] Fix cppcoreguidelines-init-variables by removing the enum.

whisperity mentioned this in rG39bbbc2c2a24: [clang-tidy][docs] Remove the entry about 'cppcoreguidelines-init-variables'….Aug 12 2021, 1:45 AM

@Sockke Sorry to circle back here, but it seems to me that Clang now has a flag -Wuninitialized. Could you check how it behaves, compared to this check? If there are overlaps, what should we do, @aaron.ballman? Should parts of the check be deprecated?

This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Fix cppcoreguidelines-init-variables by removing the enum FixIt, and add support for initialization check of scoped enum.
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 360405

clang-tools-extra/clang-tidy/cppcoreguidelines/InitVariablesCheck.cpp

clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp

This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Fix cppcoreguidelines-init-variables by removing the enum FixIt, and add support for initialization check of scoped enum.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 360405

clang-tools-extra/clang-tidy/cppcoreguidelines/InitVariablesCheck.cpp

clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines-init-variables.cpp

[clang-tidy] Fix cppcoreguidelines-init-variables by removing the enum FixIt, and add support for initialization check of scoped enum.
ClosedPublic