Cover a new use case when using a 'signed char' as an integer
might lead to issue with non-ASCII characters. Comparing
a 'signed char' with an 'unsigned char' using equality / unequality
operator produces an unexpected result for non-ASCII characters.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I run the check on LLVM code and found only one catch of suspicious comparison:
clang-tidy -p=/home/zolnai/clang/build /home/zolnai/clang/llvm-project/clang/lib/Lex/Lexer.cpp /home/zolnai/clang/llvm-project/clang/lib/Lex/Lexer.cpp:1293:39: warning: comparison between signed and unsigned char [bugprone-signed-char-misuse] if ((C == '\n' || C == '\r') && C != PrevC)
In the LibreOffice codebase, I found 8 catches:
https://gist.github.com/tzolnai/2b22492c0cf79f5dba577f6a878657e3
All catches seem valid to me.
clang-tools-extra/clang-tidy/bugprone/SignedCharMisuseCheck.cpp | ||
---|---|---|
74 | All node matchers are implicitly allOf matchers so you can safely drop the allOf matcher. | |
97 | binaryOperator(hasAnyOperatorName("==", "!=") ... | |
103 | Any reason for this trailing ;? | |
122–145 | This should probably be a condition variable and bring the init from the start of check inside. | |
138–140 | } else if (const auto *IntegerType = Result.Nodes.getNodeAs<QualType>("integerType")) { ... } else llvm_unreachable("Unexpected match"); | |
clang-tools-extra/test/clang-tidy/checkers/bugprone-signed-char-misuse.cpp | ||
210 | New line |
Fix pre-merge check error
Plus remove unneeded "no warning" comments
and add a missing quote mark in docs.
Changed code based on reviewer comments.
- Removed unneeded allOf()
- Remove unneeded ';'
- Added new line at the end of the file
- Moved variable declarations inside if()
clang-tools-extra/clang-tidy/bugprone/SignedCharMisuseCheck.cpp | ||
---|---|---|
97 | I think these are different cases when the programmer checks the equality of two characters or checking some kind of order of characters. In case of equality, there seems to be a clear assumption, that if the two character variables represent the same character (semantically the same), then the equality should return true. This assumption fails with mixed signed and unsigned char variables. |
clang-tools-extra/clang-tidy/bugprone/SignedCharMisuseCheck.cpp | ||
---|---|---|
97 | Fair enough, in that case can you specify in the docs that you only check for == and != comparison. Currently they only say it looks for comparison rather than maybe equality comparison. |
Update code based on reviewer comments.
- Update docs: comparison -> equality/inequality comparison.
- Use hasAnyOperatorName
- Plus fix a typo
LGTM
However, how does this handle cases when the type is written as char. They can be signed/unsigned based on what platform is being targeted. But on a platform where char is signed, comparison to an unsigned char is dangerous and the complimented case is the same. Surely any char comparison to signed char or unsigned char should be warned. But I'm not sure if that's in the scope of this check.
It uses the default signedness what clang specifies for char (which depends on the platform). For example, on my Linux x64 system, char is equivalent to signed char and so this clang-tidy check catches also the char <-> unsigned char comparisons. This default behavior can be overridden with -funsigned-char and -fsigned-char compilation options. So for a cross-platform C++ code, it's an option to run the clang-tidy twice, using these two options and so both char <-> unsigned char and char <-> signed char comparisons can be caught.
All node matchers are implicitly allOf matchers so you can safely drop the allOf matcher.