This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][solver] Improve reasoning for not equal to operator
ClosedPublic

Authored by manas on Dec 14 2022, 6:57 PM.

Details

Summary

This patch fixes certain cases where solver was not able to infer
disequality due to overlapping of values in rangeset. This case was
casting from lower signed type to bigger unsigned type.

Diff Detail

Event Timeline

manas created this revision.Dec 14 2022, 6:57 PM
Herald added a project: Restricted Project. · View Herald Transcript
manas requested review of this revision.Dec 14 2022, 6:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 14 2022, 6:57 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Thanks for going the extra mile to address this last thing. I really appreciate it.
I've got only a few minor comments and suggestions.

I'd recommend spell-checking the comments and the summary of this revision.
See my technical comments inline.

The test coverage looks good to me.

Good job.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1639

I think in this context != should achieve the same, and we usually prefer this operator for this.

1639–1640

I was thinking that maybe if (LHS.isUnsigned() && RHS.isSigned()) {} ... else if (LHS.isSigned() && RHS.isUnsigned()) results in cleaner code, as it would require one level fewer indentations.
The control-flow looks complicated already enough.

1640

Why do we need this additional condition?
If I remove these, I get no test failures, which suggests to me that we have some undertested code paths here.

1642–1647
1664–1667

I was thinking of using init-ifs, but on second thought I'm not sure if that's more readable.

if (RHS.getAPSIntType().convert(LHS.getMaxValue()) < RHS.getMinValue())
  return getTrueRange(T);

Shouldn't be too bad.

manas marked 2 inline comments as done.Dec 17 2022, 3:48 PM

Thanks for going the extra mile to address this last thing. I really appreciate it.
I've got only a few minor comments and suggestions.

I'd recommend spell-checking the comments and the summary of this revision.
See my technical comments inline.

Thank you once again for reviewing @steakhal :)

I did the spell-checking.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1639–1640

I was thinking that maybe if (LHS.isUnsigned() && RHS.isSigned()) {} ... else if (LHS.isSigned() && RHS.isUnsigned()) results in cleaner code

I did this, and I also combined both blocks into one to remove redundant code.

1640

Why do we need this additional condition?

Bitwidth was important because we should ideally cast smaller bitwidth type to bigger bitwidth type.

Consider if we have LHS(u8), RHS(i32), then without checking for bitwidth, we would be casting RHS's maxValue to LHS's type, which will result in lose of information and will not serve our purpose.

1664–1667

I think its readable. But to combine two blocks into one, I had to use different variable names, which makes this expression bigger. Should we still go with init-ifs?

manas updated this revision to Diff 483786.Dec 17 2022, 3:51 PM

Remove redundant branches

About spellings. In the summary you used 'lesser', I think as a synonym for 'smaller' or something like that. Anyway, not important.
Great stuff.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1640

If you think we need that bitwidth check, why did you remove it?
I'd like to see test cases demonstrating what we are talking about and see if we want that behavior or not.

1640

Interesting. I like it. I'd however recommend to move this and the other variable to the beginning of this guarded block. That way it would be easier to see that the guard condition relates to this ternary condition.

manas added inline comments.Dec 18 2022, 3:35 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1640

This test fails.

void testfoo(unsigned char u, signed int s) {
  if (u >= 253 && u <= 255 && s < INT_MAX - 2) {
    // u: [253, 254], s: [INT_MIN, INT_MAX - 2]
    clang_analyzer_eval(u != s); // expected-warning{{UNKNOWN}}
                                 // but returns TRUE
  }
}
steakhal requested changes to this revision.Dec 21 2022, 1:10 AM

Bitwidth was important because we should ideally cast smaller bitwidth type to bigger bitwidth type.
Consider if we have LHS(u8), RHS(i32), then without checking for bitwidth, we would be casting RHS's maxValue to LHS's type, which will result in lose of information and will not serve our purpose.

If you think we need that bitwidth check, why did you remove it?
I'd like to see test cases demonstrating what we are talking about and see if we want that behavior or not.

This test fails.

void testfoo(unsigned char u, signed int s) {
  if (u >= 253 && u <= 255 && s < INT_MAX - 2) {
    // u: [253, 254], s: [INT_MIN, INT_MAX - 2]
    clang_analyzer_eval(u != s); // expected-warning{{UNKNOWN}}
                                 // but returns TRUE
  }
}

I feel like we have something to talk about.
When I do the review pro bono, I'd like to focus on higher-level issues and let the submitter deal with the smaller concerns.
That's why I'm expecting the submitter to:

  • Explain in the summary what the patch aims to solve (aka. why did your work on it)
  • What & how it implemented it
  • What obstacles you had when you tried to implement it? Because the reviewer will most likely think the same way, it's better to highlight what you tried and why you failed that way.
  • Most importantly, attach the test cases you uncovered during development about the edge-cases of the previous point.

I'm also expecting that the change compiles, works, and is well-tested. This generally means that tests are covering all the branches in the modified parts and the change runs and are capable of analyzing non-trivial projects without crashing or producing unacceptable reports.
In particular, the Core and the range-based solver are the foundation of the engine, hence even more rigorous testing is required, so correctness is a must in these contexts.

Coming back to this review, I don't want to validate the correctness of the math. I trust you to do this, which you prove by tests or (Z3 solution in addition to that).
Getting more concrete, returning Unknown is fine, but returning the wrong answer like True for cases where we should not be able to deduce it, it's a serious issue.

I hope it helps to align us.

This revision now requires changes to proceed.Dec 21 2022, 1:10 AM
manas updated this revision to Diff 486425.Jan 4 2023, 4:24 PM

Re-introduce bitwidth comparison

manas added a comment.Jan 4 2023, 4:24 PM

This test fails.

void testfoo(unsigned char u, signed int s) {
  if (u >= 253 && u <= 255 && s < INT_MAX - 2) {
    // u: [253, 254], s: [INT_MIN, INT_MAX - 2]
    clang_analyzer_eval(u != s); // expected-warning{{UNKNOWN}}
                                 // but returns TRUE
  }
}

I feel like we have something to talk about.
When I do the review pro bono, I'd like to focus on higher-level issues and let the submitter deal with the smaller concerns.

I think there has been some miscommunication. When I mentioned the failing example, I didn't mean to leave/delegate the pending work.

Nonetheless, I fixed it by re-introducing bitwidth comparison. Here is a Z3 proof https://gist.github.com/weirdsmiley/a9917815e71e4ec09e076522df039841

manas marked 5 inline comments as done.Jan 4 2023, 4:26 PM
manas added inline comments.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1641–1642

I combined the two conditionals as they both were returning TrueRange. Is it readable?

Sorry, I don't have the time this week.

manas added a comment.Jan 24 2023, 3:23 AM

Gentle ping.

steakhal accepted this revision.Jan 24 2023, 4:42 AM

Looks good. Thanks.

This revision is now accepted and ready to land.Jan 24 2023, 4:42 AM