This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Warn on ternary comparison
Needs ReviewPublic

Authored by mgrabovsky on Oct 11 2015, 11:16 PM.

Details

Summary

This change adds a Sema diagnostic for comparisons like a < b < c,
which usually do not behave as intended by the author.

Fixes bug 20082.

Diff Detail

Event Timeline

mgrabovsky retitled this revision from to [Sema] Warn on ternary comparison.
mgrabovsky updated this object.
mgrabovsky added a subscriber: cfe-commits.
mgrabovsky updated this object.Oct 12 2015, 1:09 AM
aaron.ballman edited edge metadata.Oct 13 2015, 7:04 AM

Should there be an exception to this diagnostic for code involving Boolean values? e.g.,

void f(bool a, bool b, bool c) {

if (a == b == c)
  ;

}

At the very least, it seems like this should also follow GCC's behavior and suggest parenthesis directly.

include/clang/Basic/DiagnosticSemaKinds.td
5875

This diagnostic somewhat implies that ?: does not work as expected. I prefer GCC's wording for this:

"comparisons like 'X<=Y<=Z' do not have their mathematical meaning"

However, I would love it if we could do one step better and use the same operators the user wrote, if reasonable. ;-)

Should there be an exception to this diagnostic for code involving Boolean values? e.g.,

void f(bool a, bool b, bool c) {

if (a == b == c)
  ;

}

At the very least, it seems like this should also follow GCC's behavior and suggest parenthesis directly.

Interesting. This might be more complex than I had predicted. I'll try to dig into GCC's code and see what they do exactly. I can't guarantee returning sane, though.

Some thoughts: Even if a, b, c are Booleans, what did the user mean by a == b == c? Do they realise that the compiler interprets it as (a == b) == c? What about 'type mismatches', i.e. if a, b are ints, while b is bool – clearly the user must be aware of the interpretation above here. What if a, b, c are all ints – C90 doesn't have Booleans, so it might be intended as either.

include/clang/Basic/DiagnosticSemaKinds.td
5875

This diagnostic somewhat implies that ?: does not work as expected.

?: is a ternary operator but it's not a comparison.

I prefer GCC's wording for this

I wasn't sure if we wanted to be copycats, but OK.

would love it if we could do one step better and use the same operators the user wrote

Yes, that would be nice.

Should there be an exception to this diagnostic for code involving Boolean values? e.g.,

void f(bool a, bool b, bool c) {

if (a == b == c)
  ;

}

At the very least, it seems like this should also follow GCC's behavior and suggest parenthesis directly.

Interesting. This might be more complex than I had predicted. I'll try to dig into GCC's code and see what they do exactly. I can't guarantee returning sane, though.

Some thoughts: Even if a, b, c are Booleans, what did the user mean by a == b == c? Do they realise that the compiler interprets it as (a == b) == c? What about 'type mismatches', i.e. if a, b are ints, while b is bool – clearly the user must be aware of the interpretation above here. What if a, b, c are all ints – C90 doesn't have Booleans, so it might be intended as either.

I would spend some time digging into how GCC handles those cases, and use that as a baseline that we can then improve upon. I like the fact that GCC basically says "use parens to clarify your intent", as that solves all of the cases regarding equality and inequality.

I would imagine type == type == bool should behave the same as bool == bool == bool; it can be valid code that just needs parens to clarify intent. As for C90 behavior, I'm not too worried about what we do there.

include/clang/Basic/DiagnosticSemaKinds.td
5875

I wasn't sure if we wanted to be copycats, but OK.

If we can improve on their wording, I am all for it. I just find their wording more clear than the proposed wording.

I would spend some time digging into how GCC handles those cases, and use that as a baseline that we can then improve upon. I like the fact that GCC basically says "use parens to clarify your intent", as that solves all of the cases regarding equality and inequality.

I would imagine type == type == bool should behave the same as bool == bool == bool; it can be valid code that just needs parens to clarify intent. As for C90 behavior, I'm not too worried about what we do there.

I'm back. The warning is emitted in gcc/c-family/c-common.c L11488–11503; GCC only does this for what LLVM calls 'relational' operators (i.e., comparisons except == and !=) and for integral types.

mgrabovsky edited edge metadata.

Change message wording

Interesting checker.

I'll test it on some debian projects.

If you're interested.. it does not currently warn about "1.0 < 2.0 < 3.0" as far as I see.

I tested this on 2199 debian projects.. and got 16 warnings.

ftp://ftp.se.debian.org/debian/pool/main/t/tf5/tf5_5.0beta8.orig.tar.gz
expr.c:738:22: warning: comparisons such as 'a < b != c' do not have their mathematical meaning [-Wparentheses]
        if (neg0 == neg1 && sum<0 != neg0) {
                            ^~~~~~~~~~~~~

I guess it could be intentional. I haven't looked in the real code.

ftp://ftp.se.debian.org/debian/pool/main/m/medusa/medusa_2.1.1.orig.tar.gz
cvs.c:111:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
ftp.c:140:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
imap.c:143:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 2) )
         ^~~~~~~~~~~~~~
mysql.c:135:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
nntp.c:110:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
pcanywhere.c:133:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
pop3.c:133:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
rexec.c:95:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
rlogin.c:94:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
rsh.c:96:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
smtp.c:135:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
smtp-vrfy.c:114:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
snmp.c:154:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
telnet.c:143:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~
vmauthd.c:109:10: warning: comparisons such as 'a <= b <= c' do not have their mathematical meaning [-Wparentheses]
  if ( !(0 <= argc <= 3) )
         ^~~~~~~~~~~~~~

these are surely true positives.

mgrabovsky updated this revision to Diff 37985.Oct 21 2015, 4:30 AM

Fix comparisons with implicit casts and mismatched types

If you're interested.. it does not currently warn about "1.0 < 2.0 < 3.0" as far as I see.

I have now fixed this for C, but the issue remains for C++ and I have absolutely no idea what's going on. Can someone knowledgeable chime in?

mgrabovsky updated this revision to Diff 37987.Oct 21 2015, 4:49 AM

Correction for C++ tests

I tested this on 2199 debian projects.. and got 16 warnings.

ftp://ftp.se.debian.org/debian/pool/main/t/tf5/tf5_5.0beta8.orig.tar.gz
expr.c:738:22: warning: comparisons such as 'a < b != c' do not have their mathematical meaning [-Wparentheses]
        if (neg0 == neg1 && sum<0 != neg0) {
                            ^~~~~~~~~~~~~

I guess it could be intentional. I haven't looked in the real code.
....

Do you have any comments on that one.

I tested this on 2199 debian projects.. and got 16 warnings.

ftp://ftp.se.debian.org/debian/pool/main/t/tf5/tf5_5.0beta8.orig.tar.gz
expr.c:738:22: warning: comparisons such as 'a < b != c' do not have their mathematical meaning [-Wparentheses]
        if (neg0 == neg1 && sum<0 != neg0) {
                            ^~~~~~~~~~~~~

I guess it could be intentional. I haven't looked in the real code.
....

Do you have any comments on that one.

In the code you posted, neg0 should be Boolean (technically, it's an int, but it should be either 0 or 1, which clang can detect using Expr::isKnownToHaveBooleanValue). Although, aaron.ballman suggested above that such code is unclear and should use parentheses. Thus this warning is de facto intentional.

Any other remarks?

Mostly looks good; the only thing left is a test to ensure the fixits are placing the parens in the expected location.

lib/Sema/SemaChecking.cpp
6725

Missing a period to end the sentence.

rsmith edited edge metadata.Mar 18 2016, 11:33 AM

Once this lands, we should also consider the corresponding case from a fold-expression:

template<typename ...T> bool f(T ...t) { return (t < ...); } // warn if sizeof...(T) != 2 and we use built-in operator <.
bool k = f(1.0, 2.0, 3.0);