Page MenuHomePhabricator

Missing tautological compare warnings due to unary operators
ClosedPublic

Authored by Codesbyusman on Jul 25 2022, 12:36 PM.

Details

Summary

The patch mainly focuses on the no warnings for -Wtautological-compare. I t work fine for the positive numbers but doesn't for the negative numbers as mention in the post that this is because the warning explicitly checks for an IntegerLiteral AST node, but -1 is represented by a UnaryOperator with an IntegerLiteral sub-Expr.
So, for the solution lib/Analysis/CFG.cpp was updated by using the different hellper functions to identify that it is a integer lliteral and then will take value from the expression. The functions checkIncorrectEqualityOperator and also functions for the relational operators will be updated.

For the below code we have warnings:

if (0 == (5 | x)) {}

but not for

if (0 == (-5 | x)) {}

Thus updating to tackle this issue.

GitHub Issue

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Thank you for this, I think this is good incremental progress and is almost ready to go. Just a few small nits, but also, can you also add a release note for the fix (be sure to mention which issue is being closed too).

Note, the precommit CI failures are unrelated to the changes in this patch.

clang/test/Sema/warn-bitwise-compare.c
42
85
clang/test/SemaCXX/warn-unreachable.cpp
399–410

We should move these up above the TODO comment on line 399, since these are now getting the expected warnings.

Thank you for this, I think this is good incremental progress and is almost ready to go. Just a few small nits, but also, can you also add a release note for the fix (be sure to mention which issue is being closed too).

Note, the precommit CI failures are unrelated to the changes in this patch.

where should I write this release note. Like in comment or somewhere else?

Thank you for this, I think this is good incremental progress and is almost ready to go. Just a few small nits, but also, can you also add a release note for the fix (be sure to mention which issue is being closed too).

Note, the precommit CI failures are unrelated to the changes in this patch.

where should I write this release note. Like in comment or somewhere else?

All of Clang's release notes live in: https://github.com/llvm/llvm-project/blob/main/clang/docs/ReleaseNotes.rst and you'd want to put your note under Improvements to Clang's diagnostics (since you're making the diagnostic behavior better) or Bug Fixes (since there's a GitHub issue associated with it). I'd probably list it under Bug Fixes.

updated with the small fixes and addition of release note

aaron.ballman accepted this revision.Jul 28 2022, 4:41 AM

LGTM! I'll make some editorial corrections for grammar to the release note when I land.

This revision is now accepted and ready to land.Jul 28 2022, 4:41 AM
This revision was landed with ongoing or failed builds.Jul 28 2022, 4:49 AM
This revision was automatically updated to reflect the committed changes.

LGTM! I'll make some editorial corrections for grammar to the release note when I land.

Thank you.

rtrieu reopened this revision.Aug 1 2022, 11:04 PM
rtrieu added a subscriber: rtrieu.

This warning is now flagging some code which I believe is a false positive. In particular, when template arguments are involved, their values can be calculated during template instantiation, but they should be treated as variables instead of constants. For example:

template <int I, class T>
void foo(int x) {
    bool b1 = (x & sizeof(T)) == 8;
    bool b2 = (x & I) == 8;
    bool b3 = (x & 4) == 8;
}

void run(int x) {
    foo<4, int>(8);
}

In this instantiation, x is and'ed with the value 4 in different ways. sizeof(T) is type-dependent, I is value-dependent, and 4 is an integer literal. With your code, each of them would produce a warning. However, the first two values of 4 will change in different template instantiations, so the warning is not useful when it is okay for some instantiations to have constant values. Because of this, warnings should treat dependent expressions as non-constant even when they can be evaluated, so only b3 should get a warning. This is causing the warning to be emitted on code heavy in template meta-programming, such as array libraries. Please revert or fix.

I believe that evaluating the expression would make this warning too broad and would need more testing that what was included here. Only handling UnaryOperator with IntegerLiteral sub-expression makes more sense for the warning, and adding in any new cases if we find them.

This revision is now accepted and ready to land.Aug 1 2022, 11:04 PM

This warning is now flagging some code which I believe is a false positive. In particular, when template arguments are involved, their values can be calculated during template instantiation, but they should be treated as variables instead of constants. For example:

template <int I, class T>
void foo(int x) {
    bool b1 = (x & sizeof(T)) == 8;
    bool b2 = (x & I) == 8;
    bool b3 = (x & 4) == 8;
}

void run(int x) {
    foo<4, int>(8);
}

In this instantiation, x is and'ed with the value 4 in different ways. sizeof(T) is type-dependent, I is value-dependent, and 4 is an integer literal. With your code, each of them would produce a warning. However, the first two values of 4 will change in different template instantiations, so the warning is not useful when it is okay for some instantiations to have constant values.

Yeah, this is definitely a false positive we need to avoid, thank you for bringing it up!

Because of this, warnings should treat dependent expressions as non-constant even when they can be evaluated, so only b3 should get a warning. This is causing the warning to be emitted on code heavy in template meta-programming, such as array libraries. Please revert or fix.

Yeah, I agree. Unfortunately, the CFG makes this exceptionally difficult because it walks over the instantiated code, so we've lost that the original expression was dependent by the time we get to checking the binary expressions. The original code worked by virtue of overfitting to *just* integer literals.

I believe that evaluating the expression would make this warning too broad and would need more testing that what was included here. Only handling UnaryOperator with IntegerLiteral sub-expression makes more sense for the warning, and adding in any new cases if we find them.

I agree that the warning is too broad right now and that's unintentional. However, manually handling every single case in the CFG as something special is fragile and what got us this bug in the first place. We have a constant expression evaluator (two, actually) and we shouldn't have to reimplement it a third time in the CFG. However, asking a GSoC mentee to address that is well beyond the scope of what they should be working on. So for now I'm going to revert this change, reopen the issue, and we'll discuss the next steps off-list with Usman.

aaron.ballman requested changes to this revision.Aug 2 2022, 6:43 AM

I've reverted in c783ca0de1e1e00f364cf4745b8444a020ddd29b. Marking as requesting changes to make it clear more work needs to be done.

This revision now requires changes to proceed.Aug 2 2022, 6:43 AM

Because of this, warnings should treat dependent expressions as non-constant even when they can be evaluated, so only b3 should get a warning. This is causing the warning to be emitted on code heavy in template meta-programming, such as array libraries. Please revert or fix.

Yeah, I agree. Unfortunately, the CFG makes this exceptionally difficult because it walks over the instantiated code, so we've lost that the original expression was dependent by the time we get to checking the binary expressions. The original code worked by virtue of overfitting to *just* integer literals.

Not being able to detect when expressions are dependent inside template instantiations has been a pain for warnings since forever.

I believe that evaluating the expression would make this warning too broad and would need more testing that what was included here. Only handling UnaryOperator with IntegerLiteral sub-expression makes more sense for the warning, and adding in any new cases if we find them.

I agree that the warning is too broad right now and that's unintentional. However, manually handling every single case in the CFG as something special is fragile and what got us this bug in the first place. We have a constant expression evaluator (two, actually) and we shouldn't have to reimplement it a third time in the CFG. However, asking a GSoC mentee to address that is well beyond the scope of what they should be working on. So for now I'm going to revert this change, reopen the issue, and we'll discuss the next steps off-list with Usman.

I agree, the CFG should be as broadly applicable as possible, so using an evaluator there is fine. Manually handling every single case may be needed to keep the warning under control. It's possible to put that handling to the Sema side of things, right before the warning is emitted. There's already a filter for macros, so maybe adding the filtering logic there would be a good fix.

Adopted another approach working for the error caught. Kindly review this.

rtrieu added a comment.Aug 3 2022, 8:12 PM

Can you add my earlier test case or something like it to SemaCXX/warn-bitwise-compare.cpp ?

template <int I, class T>
void foo(int x) {
    bool b1 = (x & sizeof(T)) == 8;
    bool b2 = (x & I) == 8;
    bool b3 = (x & 4) == 8;  // only warn here
}

void run(int x) {
    foo<4, int>(8);
}
clang/lib/Analysis/CFG.cpp
58

For debugging?

Have you tried llvm::errs() << "message"; ? A few streams are provided by LLVM support which most places have already, so no extra header is needed to make it work.

1020

Is this any better than just having the callers use EvaluateAsInt themselves?

Can you add my earlier test case or something like it to SemaCXX/warn-bitwise-compare.cpp ?

template <int I, class T>
void foo(int x) {
    bool b1 = (x & sizeof(T)) == 8;
    bool b2 = (x & I) == 8;
    bool b3 = (x & 4) == 8;  // only warn here
}

void run(int x) {
    foo<4, int>(8);
}

Yes sure, will update you

clang/lib/Analysis/CFG.cpp
58

For debugging?

Yes. I just forget to remove them. Will updaye them.

Have you tried llvm::errs() << "message"; ? A few streams are provided by LLVM support which most places have already, so no extra header is needed to make it work.

okay, Great. Will use in future. Thank you.

rtrieu added a comment.Aug 4 2022, 3:03 PM
void foo(long x) {
  if ((x & 1) == 1L) return;  // bad always false warning here
  static_assert(sizeof(int) < sizeof(long), "long is bigger than int");
  static_assert((long(15) & 1) == 1L, "proof that condition can be true");
}

I found this false positive case when testing your new patch. The condition is fine, but it gives an always false warning. When fixed, this would be another good test case to include.

updating for the missing tautological compare warnings.

updated also the test cases.

This is looking much closer to what I think we had in mind, so I mostly have some cleanup suggestions.

clang/lib/Analysis/CFG.cpp
983

You can drop the IgnoreParens() here -- BoolExpr is either LHSExpr or RHSExpr, and both of those were initialized from a call to IgnoreParens().

1017–1019
1021–1023

Removing an unnecessary variable.

1024–1025

Coding style nit -- if the type is spelled out explicitly in the initialization, we tend to prefer using auto for the type so we don't have to repeat the type information twice.

Also removes a declaration that's not needed.

1031

This ensures everything past here doesn't have to care about parens.

1032–1035
1036
1037–1038
1045–1049
1051

If there's a unary expression operating on an integer literal... it has to be one of the listed operations above, otherwise I don't think you can get here. So might as well assert and bail out rather than return a value that we didn't properly perform the unary operation on.

1054–1062
Codesbyusman marked 10 inline comments as done.

updated.

Nice! Just a few more small nits to fix that I can see.

clang/lib/Analysis/CFG.cpp
1020–1025

Oops, I realized we could simplify this further and keep UnOp scoped more tightly to where it's used.

1030–1036

Another simplification to scope things more tightly.

1049–1051
Codesbyusman marked 3 inline comments as not done.

updated.

aaron.ballman accepted this revision.Aug 16 2022, 12:38 PM

This looks correct to me! Any further concerns @erichkeane or @rtrieu?

This revision is now accepted and ready to land.Aug 16 2022, 12:38 PM

This patch has been moving back and forth between IsIntegerLiteralConstantExpr and getIntegerLiteralSubexpressionValue. The first function is preexisting and the second one is a new function. The final patch seems to settle on using just getIntegerLiteralSubexpressionValue. Can you explain why the existing function does not meet your needs? It wasn't clear from the update messages why you went that way.

Besides that, there is added support for multiple unary operators, but only minus is tested. Each one should have at least a positive and a negative test to show it is supported.

This patch has been moving back and forth between IsIntegerLiteralConstantExpr and getIntegerLiteralSubexpressionValue. The first function is preexisting and the second one is a new function. The final patch seems to settle on using just getIntegerLiteralSubexpressionValue. Can you explain why the existing function does not meet your needs? It wasn't clear from the update messages why you went that way.

The existing function returns whether the expression is an ICE (sort of), the replacement function calculates the actual value and returns an optional APInt so you can perform operations on it directly. That said, a future refactoring could probably remove IsIntegerLiteralConstantExpr() and replace it with getIntegerLiteralSubexpressionValue(), but the only caller of IsIntegerLiteralConstantExpr() doesn't actually need the value at that point.

Besides that, there is added support for multiple unary operators, but only minus is tested. Each one should have at least a positive and a negative test to show it is supported.

Good catch! I agree those tests should be added.

working on the test. Will update the patch soon.

updated test cases.

This patch has been moving back and forth between IsIntegerLiteralConstantExpr and getIntegerLiteralSubexpressionValue. The first function is preexisting and the second one is a new function. The final patch seems to settle on using just getIntegerLiteralSubexpressionValue. Can you explain why the existing function does not meet your needs? It wasn't clear from the update messages why you went that way.

The existing function returns whether the expression is an ICE (sort of), the replacement function calculates the actual value and returns an optional APInt so you can perform operations on it directly. That said, a future refactoring could probably remove IsIntegerLiteralConstantExpr() and replace it with getIntegerLiteralSubexpressionValue(), but the only caller of IsIntegerLiteralConstantExpr() doesn't actually need the value at that point.

In that case, I suggest adding a comment to pointing to the other function. I expected that some day, someone will notice that the calculation for literals is slightly different between the two warnings and we should be helpful to them. I have no other concerns.

This patch has been moving back and forth between IsIntegerLiteralConstantExpr and getIntegerLiteralSubexpressionValue. The first function is preexisting and the second one is a new function. The final patch seems to settle on using just getIntegerLiteralSubexpressionValue. Can you explain why the existing function does not meet your needs? It wasn't clear from the update messages why you went that way.

The existing function returns whether the expression is an ICE (sort of), the replacement function calculates the actual value and returns an optional APInt so you can perform operations on it directly. That said, a future refactoring could probably remove IsIntegerLiteralConstantExpr() and replace it with getIntegerLiteralSubexpressionValue(), but the only caller of IsIntegerLiteralConstantExpr() doesn't actually need the value at that point.

In that case, I suggest adding a comment to pointing to the other function. I expected that some day, someone will notice that the calculation for literals is slightly different between the two warnings and we should be helpful to them. I have no other concerns.

That's a good suggestion, I've added two suggested edits for the comment. Thanks for the extra set of eyes on this review @rtrieu!

clang/lib/Analysis/CFG.cpp
74–79
1019–1020

Hi! A bit of late feedback on this patch. We found a failure in our downstream testing likely originating from here.

The failing case is:

void f(int a)
{
    (0 != (a | !0LL));
}

built with clang -cc1 -emit-llvm-bc -O2 -v -o foo.bc -x c foo.c

clang/lib/Analysis/CFG.cpp
1044

This isn't returning an APInt of the right width. It will construct an APInt with a width of the input value, but that isn't the same as the required width of what a logical not produces; that should have a width of 'int'.

Hi! A bit of late feedback on this patch. We found a failure in our downstream testing likely originating from here.

Thank you for the feedback! I've addressed the issue in 96a79cb308d1b8c00a83b180d9fecc5d54bacb9c.

clang/lib/Analysis/CFG.cpp
1044

Good catch, that is correct -- logical not always returns an int.

Hi! A bit of late feedback on this patch. We found a failure in our downstream testing likely originating from here.

Thank you for the feedback! I've addressed the issue in 96a79cb308d1b8c00a83b180d9fecc5d54bacb9c.

Thanks!