Page MenuHomePhabricator

[analyzer][solver] Introduce reasoning for not equal to operator
Needs ReviewPublic

Authored by manas on Oct 27 2021, 6:36 AM.

Details

Summary

With this patch, the solver can infer results for not equal to operator
over Ranges as well. This also fixes the issue of comparison between
different types, by first converting the RangeSets to resulting type,
which then can be used for comparisons.

Diff Detail

Event Timeline

manas created this revision.Oct 27 2021, 6:36 AM
manas requested review of this revision.Oct 27 2021, 6:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 27 2021, 6:36 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
manas added a comment.Oct 27 2021, 6:40 AM

@steakhal @martong This patch solves the comparison between different types by bypassing the RangeSets first to VisitBinaryOperator which coarses/converts, and then only it checks for any True/False ranges via comparison. This is similar to what happens with BO_Rem, BO_And etc. as well.

Please also demonstrate that the patch can deal with sign mismatching ranges.
Aside from that, it looks clean and sweet.
Although, I still miss the point of how it would bypass anything.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1622

I would probably host this logic into a Range::intersects(Range Other) const member function.

1626–1631

Technically, it's correct, but I would rather follow a similar logic to what the comment would suggest.

steakhal requested changes to this revision.Oct 27 2021, 7:31 AM

It seems like the test does not exercise the modified code segment.
Please investigate, and make sure that the code you submit is actually exercised by the test you provide.

This revision now requires changes to proceed.Oct 27 2021, 7:31 AM
manas added a comment.Oct 27 2021, 8:40 AM

It seems like the test does not exercise the modified code segment.
Please investigate, and make sure that the code you submit is actually exercised by the test you provide.

I am really sorry. I will fix it soon.

manas updated this revision to Diff 390526.Nov 29 2021, 5:11 PM

Fix comparison between different types and add a test to check it

manas added a comment.Nov 29 2021, 5:17 PM

I have made few changes:

  1. The failed assertions due to comparison between different types have been fixed by converting all the Ranges to a given type. This will allow comparisons without throwing errors.
  1. There was also a build error due to explicit specialization in non-namespace scope. This was fixed by @martong previously, but that patch led to the above mentioned bug. So I used @martong 's patch here to make GCC happy.
  1. I have added a small check for comparison between different types.

https://reviews.llvm.org/D106102 differential contains the background story.

steakhal requested changes to this revision.Nov 30 2021, 6:37 AM

I have made few changes:

  1. The failed assertions due to comparison between different types have been fixed by converting all the Ranges to a given type. This will allow comparisons without throwing errors.
  2. There was also a build error due to explicit specialization in non-namespace scope. This was fixed by @martong previously, but that patch led to the above mentioned bug. So I used @martong 's patch here to make GCC happy.
  3. I have added a small check for comparison between different types.

https://reviews.llvm.org/D106102 differential contains the background story.

I'm not quite happy.
Next time please make sure that the tests you provide actually exercise the changed code segments.
I deleted all the old lines of the constant-folding.c test file and placed a llvm_unreachable() into SymbolicRangeInferrer::VisitBinaryOperator<BO_NE>() and the test did not cause a crash, which is a clear indication that the test does not test the modified code segments.
Unless I'm overly anxious I would draw false conclusions about the results of the tests and if I approve we could get crashes and complaints, and reverts.

I would kindly ask you to double-check your tests prior to submitting them to review.


About the technical content of this patch.

I would recommend following the same pattern as in the rest of the functions. So, fillGaps() then convert(). That way you would reduce the time-complexity of your code to O(1) from O(N).
I would like to ask for a coverage report that shows that all of the branches of the modified code have been exercised by the constant-folding.c test file.

That being said, I would like to thank you for your effort in fixing this, I'm really looking forward!

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1661–1662

This and the subsequent similar block could be implemented by a member function of a RangeSet.
It's highly likely that many of these VisitBinaryOperator<Op>() functions could benefit from reusing them in the future if we decide to handle more of them.

1668–1670

ConvertedLHS.getConcreteValue() && ConvertedLHS.getConcreteValue() == ConvertedRHS.getConcreteValue()

clang/test/Analysis/constant-folding.c
326

I would assume from [x, y] that x <= y.

This revision now requires changes to proceed.Nov 30 2021, 6:37 AM
manas updated this revision to Diff 420962.Apr 6 2022, 11:45 AM

Fix test cases to make them reachable via VisiBinaryOperator<BO_NE> and using getConcreteValue()

Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2022, 11:45 AM
manas added a comment.Apr 6 2022, 11:49 AM

@steakhal apologies for holding onto this for so long. I managed to fix previously untestable test cases. The issue was that I was building expressions as (u1 != u2) != 0 and the solver was canonicalizing this to an equivalent BO_EQ expression. That's why, it wasn't reaching VisitBinaryOperator<BO_NE>(). So I changed all tests as following: (u1 != u2) (removing the latter != 0 part. Also, I utilized the getConcreteValue() for checking of single APSInt in the RangeSets.

Apart from this, I am building for the coverage reports (which I will upload later).

manas marked 2 inline comments as done.Apr 6 2022, 11:50 AM
manas added inline comments.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1661–1662

I agree. I will try to find similarities which can be extracted from remaining binops and put them I a member function.

manas updated this revision to Diff 421122.Apr 7 2022, 1:23 AM

Fix comments and rebase

manas updated this revision to Diff 421134.Apr 7 2022, 2:09 AM

Format constant-folding.c

Here are my remarks.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1628–1629

Avoid filling the gaps, because it's completely possible to compare all the ranges.
E.g. LHS [1,2]U[8,9] and RHS [5,6] are not equal.
And you don't need to fiil the gap in LHS, because it will lead you to a wrong answer, like [1,9] != [5,6] => UNKNOWN instead of TRUE.

1633–1634

The ResultType of BO_NE is bool. You don't need to convert your integer ranges to boolean. Otherwise, you'll lose the information to compare.

1643–1646

You can simply check an intersection (RangeSet::Factory::intersect) between the ranges.
If the result range is empty then return TRUE.

1650–1654

This is OK but check ConvertedCoarseRHS->getConcreteValue() for nullptr before getting the value.

clang/test/Analysis/constant-folding.c
339

I'd like to see the cases with concrete integers as well.

manas added a comment.Jul 11 2022, 5:53 PM

I was busy with other stuff. I will take a look at it now.

manas updated this revision to Diff 444911.Jul 15 2022, 1:34 AM

Add concrete tests for same and different types

manas added a comment.Jul 15 2022, 1:34 AM

The coverage showing unreachability of VisitBinaryOperator<BO_NE> for concrete integer cases.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1628–1629

If we don't fill gaps, then we will be making this method O(n) instead of O(1) as of now. As I see it, filling RHS (and not filling LHS), it can iterate over all ranges in LHS, and check each range's intersection with CoarseRHS.

Before, I do this, I just wanted to point out the unreachability of concrete cases to this method. I have tried to explain this below.

clang/test/Analysis/constant-folding.c
339

I have checked the coverage reports which show that concrete values are not reaching VisitBinaryOperator<BO_NE>. This is due to them being trivially inferred in RangedConstraintManager.cpp. I attached the coverage for RangeConstraintManager.cpp.

I think that I should remove the handling of concrete APSInt all together from SymbolicRangeInferrer for above reason. What do you guys think?

ASDenysPetrov added a comment.EditedJul 15 2022, 10:03 AM

@manas

I'm sorry but it seems like I brought you a new work :-) I've just loaded these two patches (D129678 and D129498) and now you have to rebase your changes. But there is a good news as well. You will be able to use clang_analyzer_value function for your tests if needed.

I appreciate your efforts but I should tell that I'm currently working on the thing that should resolve the issue you are trying to solve D103096.

The coverage showing unreachability of VisitBinaryOperator<BO_NE> for concrete integer cases.

Maybe it's better to remove that unreachable part but leave the tests for concrete ints just to verify that all the cases are covered.

Also I expect to see test case for short-ushort, char-uchar pairs, because if it would turn out that the int-uint is only pair that we handle, the patch would be disappointing, unfortunately.

manas updated this revision to Diff 445549.Jul 18 2022, 9:43 AM

Remove filling gaps and convert, use castTo, and add tests for short-ushort,
char-uchar pairs

manas marked 5 inline comments as done.EditedJul 18 2022, 9:57 AM

Considering @ASDenysPetrov 's example of LHS = [1, 2] U [8, 9] and RHS = [5, 6], I constructed a test case as following:

(((u1 >= 1 && u1 <= 2) || (u1 >= 8 && u1 <= 9)) && u2 >= 5 && u2 <= 6)

but I can see that the analyzer is bifurcating paths at the OR operator. Essentially, there are two diverged paths:

  1. 1 <= u1 && u1 <= 2 && 5 <= u2 && u2 <= 6
  1. 8 <= u1 && u1 <= 9 && 5 <= u2 && u2 <= 6

Separately, they hit VisitBinaryOperator<BO_NE> and in both cases, we get TRUE for (u1 != u2).

Is there any other way to formulate the expression so that it constructs LHS = [1, 2] U [8, 9] and doesn't bifurcate?

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1633–1634

I have utilized castTo method for conversion of different types so that they can be compared/intersected.

1650–1654

I have removed the concrete case handling as it was unreachable.

clang/test/Analysis/constant-folding.c
289

@ASDenysPetrov I have used your example, but I realized that the path bifurcates and sizeof LHS RangeSet is always 1 inside VisitBinaryOperator<BO_NE>.

manas marked 2 inline comments as done.Jul 18 2022, 10:00 AM
manas updated this revision to Diff 445671.Jul 18 2022, 6:47 PM

Rebase and fix comments

manas marked an inline comment as done.Jul 18 2022, 6:48 PM
manas marked an inline comment as done.Jul 18 2022, 6:52 PM
manas added inline comments.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1661–1662

This comment is about finding intersecting Ranges in ConvertedLHS/ConvertedRHS which was previously done by comparing getMinValue/getMaxValue. Now, we use intersect family of methods for it.

Considering @ASDenysPetrov 's example of LHS = [1, 2] U [8, 9] and RHS = [5, 6], I constructed a test case as following:

(((u1 >= 1 && u1 <= 2) || (u1 >= 8 && u1 <= 9)) && u2 >= 5 && u2 <= 6)

but I can see that the analyzer is bifurcating paths at the OR operator. Essentially, there are two diverged paths:

  1. 1 <= u1 && u1 <= 2 && 5 <= u2 && u2 <= 6
  1. 8 <= u1 && u1 <= 9 && 5 <= u2 && u2 <= 6

Separately, they hit VisitBinaryOperator<BO_NE> and in both cases, we get TRUE for (u1 != u2).

Is there any other way to formulate the expression so that it constructs LHS = [1, 2] U [8, 9] and doesn't bifurcate?

@manas, constrain into [8, 9] first and then pop out each intermediate element. This should work:

void clang_analyzer_eval(bool);

template <typename T>
void clang_analyzer_value(T x);

extern void abort() __attribute__((__noreturn__));
#define assert(expr) ((expr) ? (void)(0) : abort())

void testDisequalityRules(unsigned int u1, unsigned int u2, unsigned int u3,
                          int s1, int s2, int s3, unsigned char uch,
                          signed char sch, short ssh, unsigned short ush) {
    assert(1 <= u1 && u1 <= 9);
    assert(u1 != 3);
    assert(u1 != 4);
    assert(u1 != 5);
    assert(u1 != 6);
    assert(u1 != 7);
    clang_analyzer_value(u1);      // expected-warning{{32u:{ [1, 2], [8, 9] }}}
    assert(5 <= u2 && u2 <= 6);
    clang_analyzer_value(u2);      // expected-warning{{32u:{ [5, 6] }}}

    clang_analyzer_eval(u1 != u2); // expected-warning{{TRUE}}
}
ASDenysPetrov added a comment.EditedJul 19 2022, 11:09 AM

Is there any other way to formulate the expression so that it constructs LHS = [1, 2] U [8, 9] and doesn't bifurcate?

Try this u1 > 0 && u1 < 10 && u1 != 3 && u1 != 4 && u1 != 5 && u1 != 6 && u1 != 7 && u2 > 4 && u2 < 7. This is a bit verbose but it will avoid bifurcation.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1624–1626

Syntax: We usually don't use braces for single-line statements.

1628–1629

Here I'd rather get a correct result and wait a bit then contra-versa.

1630–1631

And again. This is incorrect to cast your un/signeds to bool.
First, castTo currently does not support this operation correctly AFAIR (My fault. I'll add the support later). And thus, I've no idea why your tests pass.
Second, you will get from e.g. u32[1,9] - bool[1,1] and from i32[42, 45] - bool[1,1]. Then bool[1,1] would be equal to bool[1,1], but it shouldn't in terms of u/i32.

Here you should emulate the behavior of C++ abstract machine, hence cast both to the biggest type or unsigned one.

1633–1636

castTo won't produce you empty RangeSets unless the originals were already empty, but we checked for this previously. So, you don't need this check.

clang/test/Analysis/constant-folding.c
289

Just do as I advised in the main comment.