Details
- Reviewers
rsmith aaron.ballman - Group Reviewers
Restricted Project - Commits
- rG82afc9b169a6: Fix -Wbitfield-constant-conversion on 1-bit signed bitfield
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thank you for working on this! I think this is going in the right direction, but I think there's some more work needed here because the rules depend on the type of the bit-field (at least in C, I've not checked C++ yet). See C2x 6.7.2.1p12 (and it's footnote): "A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits. ..." (and the footnote goes on to say "As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned. ...").
Assuming that we actually do treat plain int bit-fields as being signed (which I'm pretty sure we do), then I agree that these changes are fine. But you should add some CodeGen test coverage (or find existing coverage) that proves how we behave in that case and an int test in Sema showing we match the codegen behavior.
Also, can you add a release note for the fix?
clang/test/Sema/constant-conversion.c | ||
---|---|---|
145 |
Thanks for the quick reply and the reference on the C standard!
On the C++ side, Section C.1.8 specified that int bit-fields are signed:
Change: Bit-fields of type plain int are signed.
Rationale: Leaving the choice of signedness to implementations could lead to inconsistent definitions of template specializations. For consistency, the implementation freedom was eliminated for non-dependent types, too.
Effect on original feature: The choise is implementation-defined in C, but not so in C++.
Implementation-wise, I'll see what I can find in CodeGen on whether int bit-fields are signed for C
Agreed; I think that's falling out from https://eel.is/c++draft/class.bit#4.sentence-1 and https://eel.is/c++draft/basic.fundamental#1.sentence-1.
Implementation-wise, I'll see what I can find in CodeGen on whether int bit-fields are signed for C
Thanks!
- Added CodeGen tests.
- Fixed failed tests that assigned 1 to int:1.
- Modified the Sema test to use int: instead of signed char:1.
- Added release note.
I did a pass through, this seems like it is the right thing. I looked over the codegen tests, and I'm convinced it does what we want from that perspective. I'll leave it to @aaron.ballman to take another look for +1'ing this though.
LGTM! Do you need someone to commit on your behalf? If so, what name and email address would you like used for patch attribution?
I'm new to Phabricator. I don't need someone else to commit on my behalf, but I guess I don't have permission to merge a commit on my own.
If the patch submitted misses author information for some reason, feel free to use Shawn Zhong <github@shawnzhong.com>.
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 | This was to suppress false positives. All instances we've seen of this are in fact false positives. Was there any analysis on true / false positives for this change? At least for our code, this seems to make the warning strictly worse. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 | I've not performed any such analysis, but it would be good to know. FWIW, this is the kind of situation I was thinking this diagnostic would help make far more clear: https://godbolt.org/z/336n9xex3 (not that I expect people to write that static assert, but I very much expect people who assign the value 1 into a bit-field and then check for the value 1 to come back out are going to be surprised). That said, another approach to that specific scenario, which might have a better true positive rate would be: struct S { int bit : 1; }; int main() { constexpr S s{1}; // No warning if (s.bit == 1) { // Warn about explicit comparison of a 1-bit bit-field ... } else if (s.bit == 0) { // Warn about explicit comparison of a 1-bit bit-field? ... } else if (s.bit) { // No warning ... } else if (!s.bit) { // No warning ... } } |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 | Do you have something in mind that counts as false positives? | |
13069 | BTW, I realized that no warning is reported when a bool is assigned to a single-bit signed bit-field: https://godbolt.org/z/M5ex48T8s int main() { struct S { int b : 1; } s; s.b = true; if (s.b == true) { puts("T"); } else { puts("F"); } } |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 | For reference, I found this issue on chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1352183 |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 |
In my example above, I consider the use of s.bit and !s.bit to count as false positives. The value in the bit-field being 1 or -1 has no bearing on the semantics of the program, the only thing that matters is zero vs nonzero because of the boolean conversion.
Good catch, the conversion from bool to integer does give the value 1 explicitly. At the same time, I would consider the assignment to the bit-field from a boolean to be a non-issue, the problem only stems from attempting to use inspect the value. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 |
Actually, that's a more interesting case than I had originally realized. See C2x 6.7.2.1p12. "If the value 0 or 1 is stored into a nonzero-width bit-field of type bool, the value of the bit-field shall compare equal to the value stored; a bool bit-field has the semantics of a bool." So if you store a 1 into a bool bit-field, you have to get a 1 back out of it. The bit-field is implicitly unsigned, effectively. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 |
I agree with you that if s.bit is only used to be converted to bool, then the value stored being 1 or -1 does not matter that much. But hypothetically, one could use s.bit for arithmetic (e.g., index into an array), assignment (say into another int or enum), or, in your example, comparison against another value. If a warning is not generated during conversion, handling them in other places would be harder.
That makes sense to me. I'm not particularly concerned with assigning 1 to bool:1 or any uint*_t:1 in general. People may want to use them to compactly store bools in a struct.
The issue I referred to is the following: int main() { struct S { int b : 1; } s; s.b = 1; // warning generated after this patch s.b = true; // still no warning, but there should be } |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
13069 |
Harder, but with less false positives. We try to keep the false positive rate as low as we can for diagnostics in clang so that users are less tempted to entirely disable the diagnostic as being low-value.
On the one hand, yes -- pedantically, that is true. On the other hand, does use of a boolean indicate programmer intent that this field be treated as a boolean, and thus the actual value stored doesn't matter? At this point, I think we need some evidence that this is finding true positives instead of only false positives. We've already heard about Chromium, but I think you should try against some other larger C and C++ code bases to see what the true positive rate is. If there's evidence that this it catching actual bugs in practice, then it strengthens the case for the current design. If we don't find many true positives but find it increases the false positive rate, we should either rework this patch or back it out entirely. (I don't have a particular corpus in mind, but I'm thinking things like OpenSSL or sqlite3 for C code, and LLVM or Qt for C++ code in terms of size.) |
FWIW, just some things noticed when I examined some of the new warning that popped up after this patch:
https://github.com/llvm/llvm-project/issues/53253 mentioned that for example gcc complained about this. Although, as shown here https://godbolt.org/z/bq34Kexac there are some other differences that clang now complains already with -Wall, but that is not the case with gcc (I think one need to enable -Wpedantic in gcc to get a warning about signed overflow).
A similar warning (when assigning -1 to an unsigned bitfield) can be given by both gcc and clang by using -Wsign-conversion but that is not part of -Wall either. But maybe that is a totally different thing.
After some more thought and some offline discussions, I think I have a reasonable way forward: let's add -Wsingle-bit-bitfield-constant-conversion as a new warning group under -Wbitfield-constant-conversion that controls the diagnostic for one-bit bitfields. The diagnostic behavior here is pedantically correct and will help catch some bugs for folks, but there's enough existing code using 1 and not misbehaving (probably because they're not inspecting the value except in a boolean context) that having separate control seems useful.
What do others think?
It's a different thing. When the bit-field is unsigned, the conversion from signed is well-defined and so we only issue a -Wconversion warning, but it doesn't technically truncate the bit-field. That's also why we don't diagnose: https://godbolt.org/z/WW9477jfG
This was to suppress false positives. All instances we've seen of this are in fact false positives.
Was there any analysis on true / false positives for this change?
At least for our code, this seems to make the warning strictly worse.