This is an archive of the discontinued LLVM Phabricator instance.

Differential D110357

[Analyzer] Extend ConstraintAssignor to handle remainder op
Closed · Public

Authored by martong on Sep 23 2021, 11:25 AM.


Details

Reviewers

NoQ
vsavchenko
steakhal
Szelethus
ASDenysPetrov

Commits

rG5f8dca023504: [Analyzer] Extend ConstraintAssignor to handle remainder op

Summary

a % b != 0 implies that a != 0 for any a and b. This patch
extends the ConstraintAssignor to do just that. In fact, we could do
something similar with division and in case of multiplications we could
have some other inferences, but I'd like to keep these for future
patches.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51940
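
The implication from the summary can be sanity-checked by brute force over a small range. This is only a sketch under the assumption of plain `int` operands and C++ `%` semantics; the helper name `remainderImplicationHolds` is hypothetical and is not part of the patch.

```cpp
// Hypothetical helper, not part of the patch: exhaustively checks that
// `a % b != 0` implies `a != 0` for all a, b in [-Limit, Limit].
// b == 0 is skipped because `%` by zero is undefined in C and C++.
constexpr bool remainderImplicationHolds(int Limit) {
  for (int A = -Limit; A <= Limit; ++A) {
    for (int B = -Limit; B <= Limit; ++B) {
      if (B == 0)
        continue;
      if (A % B != 0 && A == 0)
        return false; // Counterexample: nonzero remainder although a == 0.
    }
  }
  return true;
}

// 0 % b is 0 for every nonzero b, so a nonzero remainder rules out a == 0.
static_assert(remainderImplicationHolds(16), "a % b != 0 must imply a != 0");
```

A bounded check like this is not a proof, but it is cheap to run before encoding a new inference in the constraint solver.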

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

martong created this revision. Sep 23 2021, 11:25 AM

Herald added subscribers: manas, gamesh411, dkrupp and 8 others. Sep 23 2021, 11:25 AM

martong requested review of this revision. Sep 23 2021, 11:25 AM

Herald added a project: Restricted Project. Sep 23 2021, 11:25 AM

Herald added a subscriber: cfe-commits.

I had to move the definition of ConstraintAssignor after the definition of RangeConstraintManager because I am using assumeSymNE in the new logic. Unfortunately, the diff does not clearly show the changes inside the moved hunk, so I have marked the important changes with a "New code" comment.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1740	New code
1750–1764	New code
1767–1770	New code

Harbormaster completed remote builds in B125411: Diff 374632. Sep 23 2021, 12:19 PM

Break out the movement of RangeConstraintManager into a parent patch; this way the diff here is clearly visible, which makes the review easier.

martong added a parent revision: D110387: [Analyzer][NFC] Move RangeConstraintManager's def before ConstraintAssignor's def. Sep 24 2021, 12:37 AM

Harbormaster completed remote builds in B125488: Diff 374741. Sep 24 2021, 12:42 AM

Great work!

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1601–1606	Hm, why don't we acquire `RCM`, `Builder`, `F` in the constructor? I'm expecting all of them to remain the same for all the `assign()` calls.
1613–1614	Why is this not a `const` member function?
1615	I think you should also implement `a % b == 0 implies that a == 0`.
1639–1641	IMO we should pass a reference here, just like the rest of the parameters.
clang/test/Analysis/constraint-assignor.c
19	It's still mindboggling that we need to do this.

steakhal added inline comments. Sep 27 2021, 9:29 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1601–1606	My bad. It's a static method. Ignore this thread.

ASDenysPetrov added inline comments. Sep 28 2021, 5:47 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614	IMO it's better to rename the function `handleRemainderOp`. Add a function description in comments above. E.g. Handle expressions like: `a % b == 0`. ... Returns `true` when bla-bla, otherwise returns `false`.
1619–1622	Maybe make some more complex assumptions to cover complex LHS's?
1719
clang/test/Analysis/constraint-assignor.c
31	Add some nested cases like `x % y % z == 0`.

ASDenysPetrov added inline comments. Sep 28 2021, 5:49 AM

clang/test/Analysis/constraint-assignor.c
19	This is needed because of `RemoveDeadBindings`.

steakhal added inline comments. Sep 28 2021, 6:43 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619–1622	Oh nice.
clang/test/Analysis/constraint-assignor.c
19	I know, I just wanted to highlight that in contrast to this test code, on real code we would not draw the same conclusion, since we reclaimed the constraints too early. And this is what worries me. Not in this patch in particular, but I still think that we need something better than this 'keeping symbols alive manually'.

Rebase on top of new Parent (Use &RCM)

Harbormaster completed remote builds in B128360: Diff 379017. Oct 12 2021, 6:45 AM

martong edited parent revisions, added: D111640: [Analyzer][NFC] Add RangedConstraintManager to ConstraintAssignor; removed: D110387: [Analyzer][NFC] Move RangeConstraintManager's def before ConstraintAssignor's def. Oct 12 2021, 6:48 AM

Handle a % b == 0 implies that a == 0.
Add more test cases

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614	> Why is this not a const member function?
Because `RCM.assumeSymNE` is not `const`.
> IMO it's better to rename the function handleRemainderOp.
I agree, changed it.
1615	Good point, I've just added that.
1619–1622	`State->assume` goes through many higher level abstractions and finally calls `assumeSymNE`, so I think calling that would be a pessimization in this case.
clang/test/Analysis/constraint-assignor.c
31	Good point! I've added one such test case.

Harbormaster completed remote builds in B128375: Diff 379039. Oct 12 2021, 7:57 AM

ASDenysPetrov added inline comments. Oct 12 2021, 8:29 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619–1622	I agree, but then you lose an internal simplification of LHS symbol.

steakhal added inline comments. Oct 12 2021, 10:22 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614	>> Why is this not a const member function?
> Because RCM.assumeSymNE is not const.
I don't think it's an excuse: https://godbolt.org/z/nEcsozheq
Aside from that, I would clarify that it will try to apply the following two assumptions:
`a % b == 0 implies that a == 0.`
`a % b != 0 implies that a != 0.`
Now the comment only highlights the latter.
1619
1623	In case the `Constraint` might be zero, why do you constrain `LHS` so that it must be zero? I could not come up with a concrete example, but I hope you got my concern. We either have a logic flaw or we lack an assertion here stating this hidden assumption about `Constraints`.
1626–1628

BTW, the following lines are not covered by tests: L1627, L1651, L1758.
Please adjust your tests accordingly.

ASDenysPetrov added inline comments. Oct 12 2021, 12:51 PM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619	Since `SymbolRef` actually is already `const`. (`using SymbolRef = const SymExpr *;`)

martong marked 5 inline comments as done. Oct 12 2021, 2:26 PM

martong added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1613–1614	>>> Why is this not a const member function?
>> Because RCM.assumeSymNE is not const.
> I don't think it's an excuse: https://godbolt.org/z/nEcsozheq
Yeah you're right. It has nothing to do with the non-constness of `RCM.assumeSymNE`, my bad. The problem is that we try to assign a new state to the non-const member `State`, and the assignment operator is non-const:
No viable overloaded '='
/home/egbomrt/WORK/llvm4/git/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:188:23:
note: candidate function not viable: 'this' argument has type 'const clang::ento::ProgramStateRef' (aka 'const IntrusiveRefCntPtr<const clang::ento::ProgramState>'), but method is not marked const
> Aside from that, I would clarify that it will try to apply the following two assumptions:
> `a % b == 0 implies that a == 0.`
> `a % b != 0 implies that a != 0.`
> Now the comment only highlights the latter.
Ok, I've added those comments.
1619	Fair enough.
1619–1622	Well, I cannot come up with any disadvantages of using `State->assume` other than some performance cost. On the other hand, that could simplify the implementation by eliminating the need for `RCM`. I'll give it a try and perhaps will update the patch accordingly.
1623	Ahh, very good catch, sharp eyes! This mistake slipped in when I extended the original logic with `a % b == 0 implies that a == 0`. Fixed it.
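
The const-ness point from the 1613–1614 thread can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`Manager` for `RangeConstraintManager`, `Assignor` for `ConstraintAssignor`, an `int` for the program state); none of these names come from the patch. It shows that a `const` member function may call a non-const method through a reference member, while assigning to a member requires a non-const function.

```cpp
// Stand-in for RangeConstraintManager; `assume` is deliberately non-const.
struct Manager {
  int Calls = 0;
  constexpr int assume(int S) {
    ++Calls;
    return S + 1;
  }
};

// Stand-in for ConstraintAssignor, holding a state and a manager reference.
class Assignor {
public:
  constexpr Assignor(int State, Manager &M) : State(State), M(M) {}

  // OK in a const member function: `const` applies to the reference member
  // itself, not to the Manager object it refers to.
  constexpr int peek() const { return M.assume(State); }

  // Cannot be const: it assigns to the member `State`.
  constexpr bool update() {
    State = M.assume(State);
    return State != 0;
  }

  constexpr int state() const { return State; }

private:
  int State;
  Manager &M;
};

constexpr bool constDemo() {
  Manager M{};
  Assignor A(1, M);
  const int Peeked = A.peek(); // Non-const call made from a const method.
  A.update();                  // Mutates the member, hence non-const.
  return Peeked == 2 && A.state() == 2 && M.Calls == 2;
}

static_assert(constDemo(), "const does not propagate through references");
```

Marking `update()` as `const` would make the assignment to `State` ill-formed, which mirrors the "No viable overloaded '='" diagnostic quoted above.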

Fix logic error
Add more test cases
Use SymbolRef

Harbormaster completed remote builds in B128478: Diff 379189. Oct 12 2021, 2:27 PM

Excellent! All lines are covered.
Great job.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619	They mean different things. What I wanted is to declare both the pointer and the pointee `const`.

This revision is now accepted and ready to land. Oct 13 2021, 12:22 AM

Fix signedness mismatch assertion and add a test case for that

Harbormaster completed remote builds in B128946: Diff 379830. Oct 14 2021, 1:47 PM

Ok. Let's see what benefits it brings.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1619	Oh, I see. `const T * const`
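
The distinction discussed in the 1619 thread can be checked with type traits. This is a minimal sketch in which `SymExpr` is an empty stand-in for the real class; only the alias has the same shape as the one quoted in the thread.

```cpp
#include <type_traits>

struct SymExpr {};                 // Stand-in for clang::ento::SymExpr.
using SymbolRef = const SymExpr *; // Same shape as the alias quoted above.

// `SymbolRef` alone: the pointee is const, the pointer can still be reseated.
static_assert(std::is_same<SymbolRef, const SymExpr *>::value, "");
static_assert(!std::is_const<SymbolRef>::value, "");

// `const SymbolRef`: the pointer is const too, i.e. `const SymExpr *const`.
static_assert(std::is_same<const SymbolRef, const SymExpr *const>::value, "");
static_assert(std::is_const<const SymbolRef>::value, "");
```

So `const SymbolRef LHS` and `SymbolRef LHS` both protect the pointee, but only the former prevents `LHS` from being reassigned.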

ASDenysPetrov accepted this revision. Oct 15 2021, 1:36 AM

In D110357#3066207, @ASDenysPetrov wrote:

> Ok. Let's see what benefits it brings.

According to our measurements, it has some effect, but it is difficult to draw clear conclusions.
By using the github/martong/constraint_assignor_rem, it seems like the runtime difference is within measurement error, and the memory consumption and the total result count remained approximately the same.

It seems like we have more results because we more often have concrete values instead of unsimplified symbols, which makes the analyzer 'smarter' and lets it report more problems. At least, that's my theory.
Sorry for not sharing the csa-testbench summary HTML directly, but it could contain sensitive information.
If required, I could repeat the test and publish the results to the publicly reachable demo server to let you all inspect and verify my theory.

In addition to my previous observation, a surprising number of the new findings are dead-code detections, and most of them involve loops.

Other than that, I've seen a true-positive report as well:

At line 'A', on the path where valid_modulus is assumed to be true, we now correctly constrain spec->modulus to zero. The only problem is that we never mention this in the bug report xD
That being said, I would recommend introducing a bug-report visitor to explain this surprising behavior, OR somehow adding a NoteTag to the transition where this operation gets evaluated.

I suspect we will need to tune the trackExpressionValue stuff to keep up whenever we extend ConstraintAssignor.

After taking a look at the new findings, we discovered that there is a logic error in this patch: `a % b == 0 implies that a == 0` does not hold. One counterexample is `10 % 2 == 0`. Argh, probably we should be using Z3 next time to prove or disprove such things. Or perhaps other reviewers with a strong math background could have had a look (knock knock @NoQ :).
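
The counterexample is easy to confirm at compile time. A minimal sketch, assuming `int` operands; the predicate name `converseHolds` is hypothetical and not part of the patch.

```cpp
// Hypothetical predicate, not part of the patch: evaluates the rejected
// implication `a % b == 0 implies a == 0` for one pair (a, b), with b != 0.
constexpr bool converseHolds(int A, int B) {
  return !(A % B == 0) || A == 0;
}

static_assert(!converseHolds(10, 2), "10 % 2 == 0 although 10 != 0");
static_assert(converseHolds(0, 3), "holds trivially when a == 0");
static_assert(converseHolds(7, 2), "holds vacuously when a % b != 0");
```

Any nonzero multiple of `b` breaks the implication, so a zero remainder alone says nothing about whether `a` is zero.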

Remove the wrong inference of a % b == 0 implies that a == 0 and related test cases.

Harbormaster completed remote builds in B129051: Diff 379988. Oct 15 2021, 6:55 AM

I see. Now it looks correct.

Next time we shall have a z3 proof about the theory.
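A => B is equivalent to not(A) or B, which holds for all inputs only if A and not(B) is UNSAT.

import z3  # provided by the z3-solver pip package

a = z3.BitVec('a', 32)
b = z3.BitVec('b', 32)
zero = z3.BitVecVal(0, 32)
s = z3.Solver()
# Look for a counterexample to "a % b != 0 implies a != 0",
# i.e. an assignment where a % b != 0 and a == 0 hold together.
s.add((a % b) != zero)
s.add(a == zero)
print(s.check())  # reports UNSAT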

Note: it requires the z3-solver pip package.

ASDenysPetrov added inline comments. Oct 18 2021, 2:10 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627	How about using the family of `ProgramState::isNonNull` or `ProgramState::isNull` or `RangeConstraintManager::checkNull` functions for this stuff?

martong marked an inline comment as done. Oct 22 2021, 1:17 AM

martong added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627	I've been checking this, and it turned out that `ProgramState::isNull` does not modify the State (this is aligned with being a `const` member function). So, these functions do not "assume" anything; they can be used only to query some property of an SVal (or Symbol) from the State. However, this comment and your other previous comment made me investigate further how to exploit the "assume" machinery better. The result is a new child patch, where we can handle "adjustments" as well.

martong added a child revision: D112296: [Analyzer][solver] Handle adjustments in constraint assignor remainder. Oct 22 2021, 1:18 AM

This revision was landed with ongoing or failed builds. Oct 22 2021, 1:54 AM

Closed by commit rG5f8dca023504: [Analyzer] Extend ConstraintAssignor to handle remainder op (authored by martong).

This revision was automatically updated to reflect the committed changes.

martong marked an inline comment as done.

martong added a commit: rG5f8dca023504: [Analyzer] Extend ConstraintAssignor to handle remainder op.

ASDenysPetrov added inline comments. Oct 25 2021, 4:32 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627	But I don't see you use the modified `State` in any way. Why is it important for you to change the `State`?
1618–1627	Let me suggest possible changes.
1619–1620	However, put this line inside the if-body below, since `Zero` isn't needed anywhere else.

martong marked 6 inline comments as done. Oct 25 2021, 7:25 AM

martong added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627	> But I don't see you use the modified `State` in any way.
Actually, we do use it. The `State` we overwrite here is the member of the class `ConstraintAssignor`; it is not a local variable.
> Why is it important for you to change the `State`?
It is important because we'd like to assign new information to the existing things we know (i.e. to the State). So, once we see a modulo, we can infer some extra constraints, and that is done via the `ConstraintAssignor`.
1619–1620	Thanks, that is correct. I am going to address this in the child patch.

ASDenysPetrov added inline comments. Oct 25 2021, 12:02 PM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1618–1627	OK. I see. Thanks :)

Revision Contents

Path	Size
clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h	5 lines
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp	28 lines
clang/test/Analysis/constraint-assignor.c	69 lines

Diff 381483

clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h

  public:
    const llvm::APSInt &getMaxValue() const;

    /// Test whether the given point is contained by any of the ranges.
    ///
    /// Complexity: O(logN)
    ///             where N = size(this)
    bool contains(llvm::APSInt Point) const { return containsImpl(Point); }

+   bool containsZero() const {
+     APSIntType T{getMinValue()};
+     return contains(T.getZeroValue());
+   }

    void dump(raw_ostream &OS) const;
    void dump() const;

    bool operator==(const RangeSet &Other) const { return Impl == Other.Impl; }
    bool operator!=(const RangeSet &Other) const { return !(*this == Other); }

  private:
    /* implicit */ RangeSet(ContainerType RawContainer) : Impl(RawContainer) {}

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp

  /// them.
  ///
  /// It has a nice symmetry with SymbolicRangeInferrer. When the latter
  /// can provide more precise ranges by looking into the operands of the
  /// expression in question, ConstraintAssignor looks into the operands
  /// to see if we can imply more from the new constraint.
  class ConstraintAssignor : public ConstraintAssignorBase<ConstraintAssignor> {
  public:
    template <class ClassOrSymbol>
    LLVM_NODISCARD static ProgramStateRef
    assign(ProgramStateRef State, RangeConstraintManager &RCM,
           SValBuilder &Builder, RangeSet::Factory &F, ClassOrSymbol CoS,
           RangeSet NewConstraint) {
      if (!State || NewConstraint.isEmpty())

steakhalUnsubmitted

Done

Hm, why don't we acquire RCM, Builder, F in the constructor? I'm expecting all of them to remain the same for all the assign() calls.


steakhalUnsubmitted

Done

My bad. It's a static method. Ignore this thread.


        return nullptr;
      ConstraintAssignor Assignor{State, RCM, Builder, F};
      return Assignor.assign(CoS, NewConstraint);
    }

+   /// Handle expressions like: a % b != 0.
+   template <typename SymT>

steakhalUnsubmitted

Done

Why is this not a const member function?


martongAuthorUnsubmitted

Done

> Why is this not a const member function?

Because RCM.assumeSymNE is not const.

> IMO it's better to rename the function handleRemainderOp.

I agree, changed it.

steakhalUnsubmitted

Done

>> Why is this not a const member function?

> Because RCM.assumeSymNE is not const.

I don't think it's an excuse: https://godbolt.org/z/nEcsozheq

Aside from that, I would clarify that it will try to apply the following *two* assumptions:
a % b == 0 implies that a == 0.
a % b != 0 implies that a != 0.
Now the comment only highlights the latter.

martongAuthorUnsubmitted

Done

>>> Why is this not a const member function?

>> Because RCM.assumeSymNE is not const.

> I don't think it's an excuse: https://godbolt.org/z/nEcsozheq

Yeah you're right. It has nothing to do with the non-constness of RCM.assumeSymNE, my bad. The problem is that we try to assign a new state to the non-const member State, and the assignment operator is non-const:

No viable overloaded '='
/home/egbomrt/WORK/llvm4/git/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:188:23:
note: candidate function not viable: 'this' argument has type 'const clang::ento::ProgramStateRef' (aka 'const IntrusiveRefCntPtr<const clang::ento::ProgramState>'), but method is not marked const

> Aside from that, I would clarify that it will try to apply the following *two* assumptions:
> a % b == 0 implies that a == 0.
> a % b != 0 implies that a != 0.
> Now the comment only highlights the latter.

Ok, I've added those comments.

ASDenysPetrovUnsubmitted

Done

IMO it's better to rename the function handleRemainderOp.

Add a function description in comments above.
E.g. Handle expressions like: a % b == 0. ... Returns true when bla-bla, otherwise returns false.


+   bool handleRemainderOp(const SymT *Sym, RangeSet Constraint) {

steakhalUnsubmitted

Done

I think you should also implement a % b == 0 implies that a == 0.


martongAuthorUnsubmitted

Done

Good point, I've just added that.


+     if (Sym->getOpcode() != BO_Rem)
+       return true;
+     const SymbolRef LHS = Sym->getLHS();
+     const llvm::APSInt &Zero =

steakhalUnsubmitted

Done

        return true;
-     const SymExpr *LHS = Sym->getLHS();
+     const SymbolRef LHS = Sym->getLHS();
      const llvm::APSInt &Zero =

martongAuthorUnsubmitted

Done

Fair enough.


ASDenysPetrovUnsubmitted

Done

        return true;
-     const SymExpr *LHS = Sym->getLHS();
+     SymbolRef LHS = Sym->getLHS();
      const llvm::APSInt &Zero =

Since SymbolRef actually is already const. (using SymbolRef = const SymExpr *;)

steakhalUnsubmitted

Done

They mean different things. What I wanted is to declare both the pointer and the pointee const.


ASDenysPetrovUnsubmitted

Done

Oh, I see. const T * const


+         Builder.getBasicValueFactory().getValue(0, Sym->getType());

ASDenysPetrovUnsubmitted

Not Done

However, put this line inside the if-body below, since Zero isn't needed anywhere else.

martongAuthorUnsubmitted

Done

Thanks, that is correct. I am going to address this in the child patch.


+     // a % b != 0 implies that a != 0.
+     if (!Constraint.containsZero()) {

ASDenysPetrovUnsubmitted

Done

      if (!Constraint.containsZero()) {
-       const SymExpr *LHS = Sym->getLHS();
-       const llvm::APSInt &Zero =
-           Builder.getBasicValueFactory().getValue(0, LHS->getType());
-       State = RCM->assumeSymNE(State, LHS, Zero, Zero);
+       State = State->assume(Sym->getLHS(), true);
        if (!State)

Maybe make some more complex assumptions to cover complex LHS's?

steakhalUnsubmitted

Done

Oh nice.


martongAuthorUnsubmitted

Done

State->assume goes through many higher level abstractions and finally calls assumeSymNE, so I think calling that would be a pessimization in this case.


ASDenysPetrovUnsubmitted

Done

I agree, but then you lose an internal simplification of LHS symbol.


martongAuthorUnsubmitted

Done

Well, I cannot come up with any disadvantages of using `State->assume` other than some performance cost. On the other hand, that could simplify the implementation by eliminating the need for RCM. I'll give it a try and perhaps will update the patch accordingly.

+       State = RCM.assumeSymNE(State, LHS, Zero, Zero);

steakhalUnsubmitted

Done

In case the Constraint might be zero, why do you constrain LHS so that it must be zero?
I could not come up with a concrete example, but I hope you got my concern.
We either have a logic flaw or we lack an assertion here stating this hidden assumption about Constraints.


martongAuthorUnsubmitted

Done

Ahh, very good catch, sharp eyes! This mistake slipped in when I extended the original logic with a % b == 0 implies that a == 0. Fixed it.


+       if (!State)
+         return false;
+     }
+     return true;

ASDenysPetrovUnsubmitted

Done

How about using the family of ProgramState::isNonNull or ProgramState::isNull or RangeConstraintManager::checkNull functions for this stuff?

martongAuthorUnsubmitted

Done

I've been checking this, and it turned out that ProgramState::isNull does not modify the State (this is aligned with being a const member function). So, these functions do not "assume" anything; they can be used only to query some property of an SVal (or Symbol) from the State.

However, this comment and your other previous comment made me investigate further how to exploit the "assume" machinery better. The result is a new child patch, where we can handle "adjustments" as well.

ASDenysPetrovUnsubmitted

Not Done

But I don't see you use the modified State in any way. Why is it important for you to change the State?

martongAuthorUnsubmitted

Done

> But I don't see you use the modified State in any way.

Actually, we do use it. The State we overwrite here is the member of the class ConstraintAssignor; it is not a local variable.

> Why is it important for you to change the State?

It is important because we'd like to assign new information to the existing things we know (i.e. to the State). So, once we see a modulo, we can infer some extra constraints, and that is done via the ConstraintAssignor.

ASDenysPetrovUnsubmitted

Not Done

OK. I see. Thanks :)


ASDenysPetrovUnsubmitted

Done

        return true;
      const SymbolRef LHS = Sym->getLHS();
-     const llvm::APSInt &Zero =
-         Builder.getBasicValueFactory().getValue(0, Sym->getType());
      // a % b != 0 implies that a != 0.
      if (!Constraint.containsZero()) {
-       State = RCM.assumeSymNE(State, LHS, Zero, Zero);
-       if (!State)
+       if (checkNull(State, LHS).isConstrainedTrue())
          return false;
      }
      return true;
    }
    inline bool assignSymExprToConst(const SymExpr *Sym, Const Constraint);

Let me suggest possible changes.

+   }

steakhalUnsubmitted

Done

        State = RCM.assumeSymNE(State, LHS, Zero, Zero);
-       if (!State)
-         return false;
-     return true;
+     return static_cast<bool>(State);
    }
    inline bool assignSymExprToConst(const SymExpr *Sym, Const Constraint);

    inline bool assignSymExprToConst(const SymExpr *Sym, Const Constraint);
+   inline bool assignSymIntExprToRangeSet(const SymIntExpr *Sym,
+                                          RangeSet Constraint) {
+     return handleRemainderOp(Sym, Constraint);
+   }
    inline bool assignSymSymExprToRangeSet(const SymSymExpr *Sym,
                                           RangeSet Constraint);

  private:
    ConstraintAssignor(ProgramStateRef State, RangeConstraintManager &RCM,
                       SValBuilder &Builder, RangeSet::Factory &F)
        : State(State), RCM(RCM), Builder(Builder), RangeFactory(F) {}

steakhalUnsubmitted

Done

IMO we should pass a reference here, just like the rest of the parameters.


    using Base = ConstraintAssignorBase<ConstraintAssignor>;

    /// Base method for handling new constraints for symbols.
    LLVM_NODISCARD ProgramStateRef assign(SymbolRef Sym, RangeSet NewConstraint) {
      // All constraints are actually associated with equivalence classes, and
      // that's what we are going to do first.
      State = assign(EquivalenceClass::find(State, Sym), NewConstraint);
      if (!State)
  [...]

    }

    LLVM_NODISCARD Optional<bool> interpreteAsBool(RangeSet Constraint) {
      assert(!Constraint.isEmpty() && "Empty ranges shouldn't get here");
      if (Constraint.getConcreteValue())
        return !Constraint.getConcreteValue()->isZero();
-     APSIntType T{Constraint.getMinValue()};
-     Const Zero = T.getZeroValue();
-     if (!Constraint.contains(Zero))
+     if (!Constraint.containsZero())
        return true;
      return llvm::None;
    }

    ProgramStateRef State;
    RangeConstraintManager &RCM;

ASDenysPetrovUnsubmitted

Done

    ProgramStateRef State;
-   RangeConstraintManager *RCM;
+   RangeConstraintManager &RCM;
    SValBuilder &Builder;

    SValBuilder &Builder;
    RangeSet::Factory &RangeFactory;
  };

  bool ConstraintAssignor::assignSymExprToConst(const SymExpr *Sym,
                                                const llvm::APSInt &Constraint) {
    llvm::SmallSet<EquivalenceClass, 4> SimplifiedClasses;
    // Iterate over all equivalence classes and try to simplify them.
    ClassMembersTy Members = State->get<ClassMembers>();
    for (std::pair<EquivalenceClass, SymbolSet> ClassToSymbolSet : Members) {
      EquivalenceClass Class = ClassToSymbolSet.first;
      State = EquivalenceClass::simplify(Builder, RangeFactory, State, Class);
      if (!State)
        return false;
      SimplifiedClasses.insert(Class);
    }

    // Trivial equivalence classes (those that have only one symbol member) are
    // not stored in the State. Thus, we must skim through the constraints as
    // well. And we try to simplify symbols in the constraints.

martongAuthorUnsubmitted

Done

New code


    ConstraintRangeTy Constraints = State->get<ConstraintRange>();
    for (std::pair<EquivalenceClass, RangeSet> ClassConstraint : Constraints) {
      EquivalenceClass Class = ClassConstraint.first;
      if (SimplifiedClasses.count(Class)) // Already simplified.
        continue;
      State = EquivalenceClass::simplify(Builder, RangeFactory, State, Class);
      if (!State)
        return false;
    }

    return true;
  }

  bool ConstraintAssignor::assignSymSymExprToRangeSet(const SymSymExpr *Sym,
                                                      RangeSet Constraint) {
+   if (!handleRemainderOp(Sym, Constraint))
+     return false;

    Optional<bool> ConstraintAsBool = interpreteAsBool(Constraint);

    if (!ConstraintAsBool)
      return true;

    if (Optional<bool> Equality = meansEquality(Sym)) {

martong (Author): New code

    // Here we cover two cases:
    //   * if Sym is equality and the new constraint is true -> Sym's operands
    //     should be marked as equal
    //   * if Sym is disequality and the new constraint is false -> Sym's
    //     operands should be also marked as equal
    if (*Equality == *ConstraintAsBool) {

martong (Author): New code

      State = trackEquality(State, Sym->getLHS(), Sym->getRHS());
    } else {
      // Other combinations leave as with disequal operands.
      State = trackDisequality(State, Sym->getLHS(), Sym->getRHS());
    }

    if (!State)
      return false;

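The branch on `*Equality == *ConstraintAsBool` is a four-row decision table. The standalone sketch below spells it out; `Tracking` and `decideTracking` are hypothetical names, and `std::optional` (C++17) stands in for the `Optional<bool>` values used in the patch. It models only the decision, not the state update.

```cpp
#include <optional>

enum class Tracking { None, Equal, Disequal };

// Equality:         true for `a == b`, false for `a != b`,
//                   empty if the symbol is neither.
// ConstraintAsBool: the new constraint read as a truth value,
//                   empty if it cannot be read as one.
Tracking decideTracking(std::optional<bool> Equality,
                        std::optional<bool> ConstraintAsBool) {
  if (!ConstraintAsBool || !Equality)
    return Tracking::None; // Nothing to infer about the operands.
  // (==, true) and (!=, false) both mean the operands are equal;
  // the two remaining combinations mean they are disequal.
  return *Equality == *ConstraintAsBool ? Tracking::Equal
                                        : Tracking::Disequal;
}
```

For example, learning that `a != b` is false gives `decideTracking(false, false)`, which selects `Tracking::Equal`.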

clang/test/Analysis/constraint-assignor.c

This file was added.

				// RUN: %clang_analyze_cc1 %s \
				// RUN: -analyzer-checker=core \
				// RUN: -analyzer-checker=debug.ExprInspection \
				// RUN: -verify

				// expected-no-diagnostics

				void clang_analyzer_warnIfReached();

				void rem_constant_rhs_ne_zero(int x, int y) {
				if (x % 3 == 0) // x % 3 != 0 -> x != 0
				return;
				if (x * y != 0) // x * y == 0
				return;
				if (y != 1) // y == 1 -> x == 0
				return;
				clang_analyzer_warnIfReached(); // no-warning
				(void)x; // keep the constraints alive.
				}
steakhal: It's still mindboggling that we need to do this.
ASDenysPetrov: This is needed because of `RemoveDeadBindings`.
steakhal: I know, I just wanted to highlight that in contrast to this test code, on real code we would not draw the same conclusion, since we reclaimed the constraints too early. And this is what worries me. Not in this patch in particular, but I still think that we need something better than this 'keeping symbols alive manually'.

				void rem_symbolic_rhs_ne_zero(int x, int y, int z) {
				if (x % z == 0) // x % z != 0 -> x != 0
				return;
				if (x * y != 0) // x * y == 0
				return;
				if (y != 1) // y == 1 -> x == 0
				return;
				clang_analyzer_warnIfReached(); // no-warning
				(void)x; // keep the constraints alive.
				}

ASDenysPetrov: Add some nested cases like `x % y % z == 0`.
martong (Author): Good point! I've added one such test case.

				void rem_symbolic_rhs_ne_zero_nested(int w, int x, int y, int z) {
				if (w % x % z == 0) // w % x % z != 0 -> w % x != 0
				return;
				if (w % x * y != 0) // w % x * y == 0
				return;
				if (y != 1) // y == 1 -> w % x == 0
				return;
				clang_analyzer_warnIfReached(); // no-warning
				(void)(w * x); // keep the constraints alive.
				}

				void rem_constant_rhs_ne_zero_early_contradiction(int x, int y) {
				if ((x + y) != 0) // (x + y) == 0
				return;
				if ((x + y) % 3 == 0) // (x + y) % 3 != 0 -> (x + y) != 0 -> contradiction
				return;
				clang_analyzer_warnIfReached(); // no-warning
				(void)x; // keep the constraints alive.
				}

				void rem_symbolic_rhs_ne_zero_early_contradiction(int x, int y, int z) {
				if ((x + y) != 0) // (x + y) == 0
				return;
				if ((x + y) % z == 0) // (x + y) % z != 0 -> (x + y) != 0 -> contradiction
				return;
				clang_analyzer_warnIfReached(); // no-warning
				(void)x; // keep the constraints alive.
				}

				void internal_unsigned_signed_mismatch(unsigned a) {
				int d = a;
				// Implicit casts are not handled, thus the analyzer models `d % 2` as
				// `(reg_$0<unsigned int a>) % 2`
				// However, this should not result in internal signedness mismatch error when
				// we assign new constraints below.
				if (d % 2 != 0)
				return;
				}
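The inference the tests rely on, `a % b != 0` implies `a != 0`, can be sanity-checked by brute force outside the analyzer. The sketch below is only an illustration over a small window of values, not a proof and not part of the patch; `remainderImpliesNonZero` and `countReachingPairs` are hypothetical helper names. The window is kept small so that `X * Y` cannot overflow.

```cpp
// True if "a % b != 0 implies a != 0" has no counterexample with
// a, b in [-Bound, Bound] and b != 0.
bool remainderImpliesNonZero(int Bound) {
  for (int A = -Bound; A <= Bound; ++A)
    for (int B = -Bound; B <= Bound; ++B) {
      if (B == 0)
        continue; // a % 0 is undefined behavior, so skip it.
      if (A % B != 0 && A == 0)
        return false; // Counterexample found.
    }
  return true;
}

// Mirrors the three guards of rem_constant_rhs_ne_zero and counts the
// (x, y) pairs that would get past all of the early returns.
int countReachingPairs(int Bound) {
  int Count = 0;
  for (int X = -Bound; X <= Bound; ++X)
    for (int Y = -Bound; Y <= Bound; ++Y) {
      if (X % 3 == 0) // Past this guard: x % 3 != 0, hence x != 0.
        continue;
      if (X * Y != 0) // Past this guard: x * y == 0.
        continue;
      if (Y != 1) // Past this guard: y == 1, which forces x == 0.
        continue;
      ++Count;
    }
  return Count;
}
```

Within the window no pair passes all three guards, which matches the `// no-warning` expectation on `clang_analyzer_warnIfReached()`.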

Revision Contents

Diff 381483
