This is an archive of the discontinued LLVM Phabricator instance.

Add warning when assigning enums to bitfields without an explicit unsigned underlying type
ClosedPublic

Authored by sashab on Sep 6 2016, 6:26 PM.

Details

Summary

Add a warning when assigning enums to bitfields without an explicit unsigned underlying type. This is to prevent problems with MSVC compatibility, since the Microsoft ABI defaults to storing enums with a signed type, causing inconsistencies with saving to/reading from bitfields.

Also disabled the warning in the dr0xx.cpp test which throws the error, and added a test for the warning.

The warning can be disabled with -Wno-signed-enum-bitfield.

Diff Detail

Repository
rL LLVM

Event Timeline

sashab updated this revision to Diff 70500.Sep 6 2016, 6:26 PM
sashab retitled this revision from to Add warning when assigning enums to bitfields without an explicit unsigned underlying type.
sashab updated this object.
sashab added a reviewer: rnk.
sashab added subscribers: dcheng, thakis.
rnk added inline comments.Sep 7 2016, 3:56 PM
include/clang/Basic/DiagnosticSemaKinds.td
2962 ↗(On Diff #70500)

I think this should be DefaultIgnore, since it is only relevant for projects that care about cross-platform portability.

lib/Sema/SemaChecking.cpp
7841 ↗(On Diff #70500)

Are we sure we want it to behave this way? The test case you already have looks problematic:

enum E : signed { E1, E2 };
struct { E e1 : 1; E e2; F f1 : 1; F f2; } s;
s.e2 = E2; // comes out as -1 instead of 1

Did you want to warn on the assignment instead of the bitfield?

sashab updated this revision to Diff 70652.Sep 7 2016, 11:58 PM

Changed warning to be DefaultIgnore

sashab marked 2 inline comments as done.Sep 8 2016, 12:05 AM
sashab added inline comments.
lib/Sema/SemaChecking.cpp
7841 ↗(On Diff #70500)

Yes, but this is fine because in this case the behavior is consistent with clang (which also prints -1)

If you run the code below on msvc and clang (which you can do here - http://rextester.com/l/cpp_online_compiler_visual)

#include <iostream>
using namespace std;

int main()
{
    enum E : signed { E1, E2 };
    struct { E e1 : 1; E e2; } s;
    s.e1 = E2;
    cout << s.e1 << endl;
}

It prints -1 with both MSVC and Clang, but if you removed the ': signed' from E it prints -1 with MSVC and 1 with Clang, which is the dangerous inconsistency. We're assuming here that if the developer specified a signed underlying type then they know what they are doing. :)

And yes, the warning is on the assignment, which is correct.

rnk added a subscriber: cfe-commits.Sep 8 2016, 8:55 AM
rnk accepted this revision.Sep 8 2016, 9:17 AM
rnk edited edge metadata.

lgtm

lib/Sema/SemaChecking.cpp
7841 ↗(On Diff #70652)

Got it.

This revision is now accepted and ready to land.Sep 8 2016, 9:17 AM
thakis added inline comments.Sep 8 2016, 1:10 PM
include/clang/Basic/DiagnosticSemaKinds.td
2962 ↗(On Diff #70652)

Hm, isn't the more safe default to have it on by default? A "what you're doing isn't portable" warning seems useful, and people who don't care can easily turn it off. And if it's not off by default, it's hard for people who do care to turn it on (they'll have to know about the warning).

Maybe it should be off-by-default but part of -Wall?

aaron.ballman added inline comments.
include/clang/Basic/DiagnosticSemaKinds.td
2962 ↗(On Diff #70652)

I would also vote for this being on by default rather than off by default. We usually try to avoid adding new, off-by-default warnings because they are basically undiscoverable warnings that no one enables (unless we're adding them for GCC compatibility and GCC default ignores). I would be fine if it was only enabled with -Wall.

rnk added a comment.Sep 8 2016, 2:28 PM

I was thinking of suggesting to put it under -Wextra, but -Wall is fine too.

I still don't think it should be on by default. I think there are a lot of Unix-only projects out there that aren't concerned with MSVC compatibility, and a warning about how things with in the Microsoft ABI would be surprising and noisy. Maybe we could tack on "... to make this code portable" or something to the diagnostic text to make it more clear why the user should be concerned about this?

sashab updated this revision to Diff 70760.Sep 8 2016, 4:40 PM
sashab edited edge metadata.
sashab marked an inline comment as done.

Thanks for your feedback everyone; left the flag as DefaultIgnore but added it to -Wall.

Keep in mind I am planning on adding two additional warnings after this, namely
"%0 is too small to hold all values of %1"
and
"%0 is too large to hold all values of %1"
for all enum bitfields.

Also, just going back to rnk's original comment, I don't think I properly replied -- you are correct that the following code behaves strangely:

enum E : signed { E1, E2 };
struct { E e1 : 1; E e2; F f1 : 1; F f2; } s;
s.e2 = E2; // comes out as -1 instead of 1

If you are storing signed enums in a bitfield, you need 1 additional bit to store the sign bit otherwise the enums will not serialize back correctly. This is equivalent to storing unsigned enums in bitfields that are too small; the whole value is not stored.

This is handled by the warning I'm adding next "%0 is too small to hold all values of %1", which can fire for signed enum bitfields when you don't store max(numberOfBitsNeededForUnsignedValues, numberOfBitsNeededForSignedValues) + 1.

Pinging for another LGTM since I have added it to -Wall and added a portability comment to the error message :)

Works for me if rnk likes it :-)

(We could bikeshed on if this should be one of the -Wmicrosoft warnings, but the warning can't be both in -Wall and -Wmicrosoft, so let's don't).

I do have a question about the test (the first commit below):

test/SemaCXX/warn-msvc-enum-bitfield.cpp
12 ↗(On Diff #70760)

Shouldn't this be the version that warns? The assignment with E1 assigns 0 which is portable, but this assigns 1 which overflows, right?

39 ↗(On Diff #70760)

We have -Wconversion warnings for similar cases; maybe we should have this here too (but not in this change):

Nicos-MBP:llvm-build thakis$ cat test2.cc
enum E {e1, e2};

struct S {

E e1 : 1;

};
void f() {

short s1 = 65536;
short s2 = 32769;

S s;
s.e1 = e2;

}
Nicos-MBP:llvm-build thakis$ clang -Wconversion -c test2.cc
test2.cc:8:14: warning: implicit conversion changes signedness: 'int' to 'short' [-Wsign-conversion]

short s2 = 32769;
      ~~   ^~~~~

test2.cc:7:14: warning: implicit conversion from 'int' to 'short' changes value from 65536 to 0 [-Wconstant-conversion]

short s1 = 65536;
      ~~   ^~~~~

2 warnings generated.

rnk added a comment.Sep 15 2016, 11:33 AM

Works for me if rnk likes it :-)

Yep, looks good.

(We could bikeshed on if this should be one of the -Wmicrosoft warnings, but the warning can't be both in -Wall and -Wmicrosoft, so let's don't).

I also don't think it's a good fit for -Wmicrosoft, which is usually trying to warn you that "hey, this is an MS extension!", and not "hey, your Unix code isn't portable to Windows!".

aaron.ballman accepted this revision.Sep 15 2016, 11:35 AM
aaron.ballman added a reviewer: aaron.ballman.

LGTM as well, thank you!

Thanks all! :)

test/SemaCXX/warn-msvc-enum-bitfield.cpp
12 ↗(On Diff #70760)

e2 is not a bitfield :) So this code is fine.

And we should warn on all assignments, since any assigned value could potentially be incorrect. Also, most assignments are not static so we don't always have that information :)

sashab marked an inline comment as done.Sep 15 2016, 10:32 PM
sashab closed this revision.Sep 15 2016, 10:36 PM

Is this how I commit this? Hopefully this lands... :-)

thakis added inline comments.Sep 16 2016, 7:37 AM
test/SemaCXX/warn-msvc-enum-bitfield.cpp
12 ↗(On Diff #70760)

Do you have any data on false positive rate when we just always warn unconditionally? i.e. did you build some large-ish open-source program (clang, chromium, firefox, ...) and is the true positive / false positive rate ok? (we have some problems with "always warn" with other enum warning which we currently always have to disable, e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=621097)

To land, you need to svn commit or ask someone with commit access to do that for you -- llvm doesn't have a commit queue.

mehdi_amini reopened this revision.Oct 26 2016, 4:32 PM
mehdi_amini added a subscriber: mehdi_amini.

Reopen: this hasn't been committed yet apparently.

@sashab : are you still interested in moving this forward?

This revision is now accepted and ready to land.Oct 26 2016, 4:32 PM

Sorry, had this discussion elsewhere (https://bugs.chromium.org/p/chromium/issues/detail?id=648462).

I'm uncertain at this point. There is currently a bug in GCC that means enums with an explicit underlying type (or enum classes, although the latter is to be fixed) are given the size of their underlying type. For example:

enum Foo { A, B };

can be stored in a bitfield of size 1, but:

enum Foo : unsigned char { A, B };

will error in a bitfield of any size smaller than 8.

So when this modification tells the developer to add 'unsigned' to their enum, they are subsequently causing a warning to occur in GCC.

I have commented on the bug on GCC for this (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51242#c28), but it looks unlikely to be fixed.

Should we go ahead and add this warning when following its instructions will cause a warning in the GCC compiler? Even though GCC is at fault here, I'm not sure what the right thing is to do.

So when this modification tells the developer to add 'unsigned' to their enum, they are subsequently causing a warning to occur in GCC.

I have commented on the bug on GCC for this (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51242#c28), but it looks unlikely to be fixed.

Should we go ahead and add this warning when following its instructions will cause a warning in the GCC compiler? Even though GCC is at fault here, I'm not sure what the right thing is to do.

GCC seems to agree that they should fix it: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61414 ; so I wouldn't consider it a blocker.

But I'm not using GCC either, and I don't know what is our usual policy, so it'd be nice to have another opinion here.

So when this modification tells the developer to add 'unsigned' to their enum, they are subsequently causing a warning to occur in GCC.

I have commented on the bug on GCC for this (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51242#c28), but it looks unlikely to be fixed.

Should we go ahead and add this warning when following its instructions will cause a warning in the GCC compiler? Even though GCC is at fault here, I'm not sure what the right thing is to do.

GCC seems to agree that they should fix it: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61414 ; so I wouldn't consider it a blocker.

But I'm not using GCC either, and I don't know what is our usual policy, so it'd be nice to have another opinion here.

We don't need to be bug-for-bug compatible with GCC unless there's a compelling reason to do so (which I don't think this is), so I think this is fine to commit, modulo the comments from @thakis regarding the false-positive rate.

This revision was automatically updated to reflect the committed changes.
rsmith added a subscriber: rsmith.Nov 16 2016, 6:07 PM

This is causing warnings to fire for headers shared between C and C++, where the "give the enum an unsigned underlying type" advice doesn't work, and where the code in question will never be built for the MS ABI. It seems really hard to justify this being on by default.

I'm going to turn it off by default for now, but we should probably consider turning it back on by default when targeting the MS ABI (as a "your code is wrong" warning rather than a "your code is not portable" warning).

What is the correct solution for MSVC in C? (If any)

rnk added a comment.Nov 17 2016, 8:56 AM

This is causing warnings to fire for headers shared between C and C++, where the "give the enum an unsigned underlying type" advice doesn't work, and where the code in question will never be built for the MS ABI. It seems really hard to justify this being on by default.

I'm going to turn it off by default for now, but we should probably consider turning it back on by default when targeting the MS ABI (as a "your code is wrong" warning rather than a "your code is not portable" warning).

Seems reasonable. I asked to make it off by default, but got argued down to putting it under -Wall.

Yeah, suggesting adding an underlying type to the enum to solve this problem seems like a bad idea, since that fundamentally changes the nature of the enum -- typically allowing it to store a lot more values, and making putting it in a bitfield a bad idea.

Any time you use a bitfield it stores fewer values than the original integer type. I don't see how enums are special here. Even if I can store values in the enum not representable as a bitwise combination of enumerators, people usually don't, and adding an underlying type doesn't change the conventions of normal C++ code.

Do you have any better suggestions for people that want this code to do the right thing when built with MSVC?

enum E /* : unsigned */ { E0, E1, E2, E3 };
struct A { E b : 2; };
int main() {
  A a;
  a.b = E3;
  return a.b; // comes out as -1 without the underlying type
}

Widening the bitfield wastes a bit. Making the bitfield a plain integer and cast in and out of the enum type, but that obscures the true type of the bitfield. So, I still support this suggestion.