This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Extend ConstraintAssignor to handle remainder op
ClosedPublic

Authored by martong on Sep 23 2021, 11:25 AM.

Details

Summary

a % b != 0 implies that a != 0 for any a and b. This patch
extends the ConstraintAssignor to do just that. In fact, we could do
something similar with division and in case of multiplications we could
have some other inferences, but I'd like to keep these for future
patches.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51940

Diff Detail

Event Timeline

martong created this revision.Sep 23 2021, 11:25 AM
martong requested review of this revision.Sep 23 2021, 11:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 23 2021, 11:25 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I had to move the definition of ConstraintAssignor after the definition of RangeConstraintManager b/c I am using assumeSymNE in the new logic. Unfortunately, the diff does not show clearly the changes inside the moved hunk, so I try to indicate the important changes with a New code comment.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1740

New code

1750–1764

New code

1767–1770

New code

martong updated this revision to Diff 374741.Sep 24 2021, 12:36 AM
  • Break out the movement of RangeConstraintManager into a parent patch, this way the diff here is clearly visible and makes the review easier.

Great work!

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1601–1606

Hm, why don't we acquire RCM, Builder, F in the constructor? I'm expecting all of them to remain the same for all the assign() calls.

1613–1614

Why is this not a const member function?

1615

I think you should also implement a % b == 0 implies that a == 0.

1639–1641

IMO we should pass a reference here, just like the rest of the parameters.

clang/test/Analysis/constraint-assignor.c
19

It's still mindboggling that we need to do this.

steakhal added inline comments.Sep 27 2021, 9:29 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1601–1606

My bad. It's a static method. Ignore this thread.

ASDenysPetrov added inline comments.Sep 28 2021, 5:47 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614

IMO it's better to rename the function handleRemainderOp.

Add a function description in comments above.
E.g. Handle expressions like: a % b == 0. ... Returns true when bla-bla, otherwise returns false.

1619–1622

Maybe make some more complex assumptions to cover complex LHS's?

1719
clang/test/Analysis/constraint-assignor.c
31

Add some nested cases like x % y % z == 0.

ASDenysPetrov added inline comments.Sep 28 2021, 5:49 AM
clang/test/Analysis/constraint-assignor.c
19

This is needed because of RemoveDeadBindings.

steakhal added inline comments.Sep 28 2021, 6:43 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619–1622

Oh nice.

clang/test/Analysis/constraint-assignor.c
19

I know, I just wanted to highlight that in contrast to this test code, on real code we would not draw the same conclusion, since we reclaimed the constraints too early. And this is what worries me.
Not in this patch in particular, but I still think that we need something better than this 'keeping symbols alive manually'.

martong updated this revision to Diff 379017.Oct 12 2021, 6:44 AM
  • Rebase on top of new Parent (Use &RCM)
martong updated this revision to Diff 379039.Oct 12 2021, 7:56 AM
martong marked 11 inline comments as done.
  • Handle a % b == 0 implies that a == 0.
  • Add more test cases
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614

Why is this not a const member function?

Because RCM.assumeSymNE is not const.

IMO it's better to rename the function handleRemainderOp.

I agree, changed it.

1615

Good point, I've just added that.

1619–1622

State->assume goes through many higher level abstractions and finally calls assumeSymNE, so I think calling that would be a pessimization in this case.

clang/test/Analysis/constraint-assignor.c
31

Good point! I've added one such test case.

ASDenysPetrov added inline comments.Oct 12 2021, 8:29 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619–1622

I agree, but then you lose an internal simplification of LHS symbol.

steakhal added inline comments.Oct 12 2021, 10:22 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614

Why is this not a const member function?

Because RCM.assumeSymNE is not const.

I don't think it's an excuse: https://godbolt.org/z/nEcsozheq


Aside from that, I would clarify that it will try to apply the following *two* assumptions:
a % b == 0 implies that a == 0.
a % b != 0 implies that a != 0.
Now the comment only highlights the latter.

1619
1623

In case the Constraint might be zero, why do you constrain LHS so that it must be zero?
I could not come up with a concrete example, but I hope you got my concern.
We either have a logic flaw or we lack an assertion here stating this hidden assumption about Constraints.

1626–1628

BWT the following lines are uncovered by tests: L1627, L1651, L1758
Please adjust your tests accordingly.

ASDenysPetrov added inline comments.Oct 12 2021, 12:51 PM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619

Since SymbolRef actually is already const. (using SymbolRef = const SymExpr *;)

martong marked 5 inline comments as done.Oct 12 2021, 2:26 PM
martong added inline comments.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614

Why is this not a const member function?

Because RCM.assumeSymNE is not const.

I don't think it's an excuse: https://godbolt.org/z/nEcsozheq

Yeah you're right. It has nothing to do with the non-constness of RCM.assumeSymNE, my bad. The problem is that we try to assign a new state to the member non-const State and assignment is non-const:

No viable overloaded '='
/home/egbomrt/WORK/llvm4/git/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:188:23:
note: candidate function not viable: 'this' argument has type 'const clang::ento::ProgramStateRef' (aka 'const IntrusiveRefCntPtr<const clang::ento::ProgramState>'), but method is not marked const

Aside from that, I would clarify that it will try to apply the following *two* assumptions:
a % b == 0 implies that a == 0.
a % b != 0 implies that a != 0.
Now the comment only highlights the latter.

Ok, I've added those comments.

1619

Fair enough.

1619–1622

Well, I cannot come up any disadvantages of using`State->assume` other than some performance constraints. On the other hand, that could simplify the implementation by eliminating the need for RCM. I'll give it a try and perhaps will update the patch accordingly.

1623

Ahh, very good catch, sharp eyes! This mistake slipped in when I extended the original logic with a % b == 0 implies that a == 0. Fixed it.

martong updated this revision to Diff 379189.Oct 12 2021, 2:26 PM
martong marked 3 inline comments as done.
  • Fix logic error
  • Add more test cases
  • Use SymbolRef
steakhal accepted this revision.Oct 13 2021, 12:22 AM

Excellent! All lines are covered.
Great job.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619

They mean different things. What I wanted is to declare both the pointer and the pointee const.

This revision is now accepted and ready to land.Oct 13 2021, 12:22 AM
martong updated this revision to Diff 379830.Oct 14 2021, 1:46 PM
  • Fix signedness mismatch assertaion and add a test case for that

Ok. Let's see what the benefits it brings.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619

Oh, I see. const T * const

ASDenysPetrov accepted this revision.Oct 15 2021, 1:36 AM
steakhal accepted this revision.Oct 15 2021, 5:36 AM

Ok. Let's see what the benefits it brings.

According to our measurements, it has some effects but is probably difficult to draw clear conclusions.
By using the github/martong/constraint_assignor_rem, it seems like the runtime is within measurable errors, and the memory consumption and the total result count remained approximately the same.



It seems like we have more results since we more often have more concrete values instead of unsimplified symbols, which makes the analyzer 'smarter' and reports more problems. At least, that's my theory.
Sorry for not sharing directly the csa-testbench summary HTML, but that could contain sensitive information.
If you require, I could repeat the test and publish the results to the publicly reachable demo server to let you all inspect and verify my theory.

Additionally to my previous observation, a surprising amount of the new findings are of deadcode detections, and most of them there are loops.

Other than that, I've seen a true-positive report as well:


At line 'A', on the path where valid_modulus is assumed to be true, we now correctly constrain spec->modulus to zero. The only problem is that we never mention this in the bugreport xD
That being said, I would recommend introducing a bug-report visitor to complement this surprising behavior, OR somehow add a NoteTag to this transition where this operation gets evaluated.

I suspect we will need to tune the trackExpressionValue stuff to keep up, whenever we extend ConstraintAssignor.

After taking a look at the new findings we discovered, there is a logic error with this patch, actually a % b == 0 implies that a == 0 does not hold, one counter example is 10 % 2 == 0. Argh, probably we should be using Z3 next time to prove or disprove such things. Or perhaps other reviewers with strong math background could have had a look (knock knock @NoQ :).

martong updated this revision to Diff 379988.Oct 15 2021, 6:54 AM
  • Remove the wrong inferrence of a % b == 0 implies that a == 0 and related test cases.
steakhal accepted this revision.Oct 18 2021, 1:33 AM

I see. Now it looks correct.

Next time we shall have a z3 proof about the theory.
A => B <=> not(A) or B. which is SAT only if A and not(B) UNSAT.

a = z3.BitVec('a', 32)
b = z3.BitVec('b', 32)
zero = z3.BitVecVal(0, 32)
s = z3.Solver()
s.add((a % b) != zero)
s.add(a == zero)
s.check() # reports UNSAT

Note: it requires the z3-solver pip package.

ASDenysPetrov added inline comments.Oct 18 2021, 2:10 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627

How about using the family of ProgramState::isNonNull or ProgramState::isNull or RangeConstraintManager::checkNull functoins for this stuff?

martong marked an inline comment as done.Oct 22 2021, 1:17 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627

I've been checking this and turend out that ProgramState::isNull does not modify the State (this is aligned with being a const member function). So, these functions do not "assume" anything, they can be used only to query some property of an SVal (or Symbol) from the State.

However, this comment and your other previous comment made me to do further investigations towards exploiting the "assume" machinery better. The result is a new child patch, where we can handle "adjustments" as well.

This revision was landed with ongoing or failed builds.Oct 22 2021, 1:54 AM
This revision was automatically updated to reflect the committed changes.
martong marked an inline comment as done.
ASDenysPetrov added inline comments.Oct 25 2021, 4:32 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627

But I don't see you use the modified State in any way. Why it's important for you to change the State?

1618–1627

Let me suggest possible changes.

1619–1620

Howerver, put this line inside if-body below, since Zero isn't needed wherever else.

martong marked 6 inline comments as done.Oct 25 2021, 7:25 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627

But I don't see you use the modified State in any way.

Actually, we do use it. The State we overwrite here is the member of the class ConstraintAssignor, it is not a local variable.

Why it's important for you to change the State?

It is important, because we'd like to assign new information to the existing things we know (i.e. to the State). So, once we see a modulo then we can defer some extra constraints and that is done via the ConstraintAssignor.

1619–1620

Thanks, that is correct. I am going to address this in the child patch.

ASDenysPetrov added inline comments.Oct 25 2021, 12:02 PM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627

OK. I see. Thanks :)