This is an archive of the discontinued LLVM Phabricator instance.


Differential D35110

[Analyzer] Constraint Manager Negates Difference
ClosedPublic

Authored by baloghadamsoftware on Jul 7 2017, 1:08 AM.


Details

Reviewers

NoQ
dcoughlin
george.karpenkov

Commits

rG77660ee89a01: [Analyzer] Constraint Manager Negates Difference
rL335814: [Analyzer] Constraint Manager Negates Difference
rC335814: [Analyzer] Constraint Manager Negates Difference

Summary

If range [m .. n] is stored for symbolic expression A - B, then we can deduce the range for B - A which is [-n .. -m]. This is only true for signed types, unless the range is [0 .. 0].
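As a rough illustration of the deduction in the summary (not the patch's implementation), the sketch below negates a single closed range over a signed 32-bit type. The `Range` struct and the `negate` helper are hypothetical names; the minimum value is excluded by an assertion because its negation is not representable in the type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical closed interval [from, to] over a signed 32-bit type.
struct Range {
  int32_t from;
  int32_t to;
};

// If A - B lies in [m, n], then B - A lies in [-n, -m].
// Assumption: the range does not contain the minimum value of the type,
// whose negation does not fit in int32_t.
Range negate(Range r) {
  assert(r.from != std::numeric_limits<int32_t>::min());
  return Range{static_cast<int32_t>(-r.to), static_cast<int32_t>(-r.from)};
}
```

For example, a stored range of [-3 .. 5] for A - B gives [-5 .. 3] for B - A, and [0 .. 0] maps to itself.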

Diff Detail

Repository: rC Clang

Event Timeline

The wrong patch files were uploaded first.

baloghadamsoftware added a parent revision: D35109: [Analyzer] SValBuilder Comparison Rearrangement. Jul 7 2017, 1:10 AM

baloghadamsoftware added a child revision: D32642: [Analyzer] Iterator Checker - Part 2: Increment, decrement operators and ahead-of-begin checks.

xazax.hun added inline comments. Jul 11 2017, 2:36 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
476	Could you give an example why do you need this (probably as a test), or constrain the transformation when all the types are the same?
483	I think it would be better to describe why don't we want to do that rather than describing what the code does.

NoQ added inline comments. Jul 12 2017, 12:59 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
472	With this, it sounds as if we're half-way into finally supporting the unary minus operator (: Could you add a FIXME here: "Once SValBuilder supports unary minus, we should use SValBuilder to obtain the negated symbolic expression instead of constructing the symbol manually. This will allow us to support finding ranges of not only negated SymSymExpr-type expressions, but also of other, simpler expressions which we currently do not know how to negate."
475–480	I'm quite sure that types of `A - B` and `B - A` are always equal when it comes to integer promotion rules. Also, due to our broken `SymbolCast` (which is often missing where it ideally should be), the type of the `A - B` symbol may not necessarily be the same as the type that you obtain by applying integer promotion rules to types of `A` and `B`. So i think you should always take the type of `A - B` and expect to find `B - A` of the same type in the range map, otherwise give up.
487	Pointer types are currently treated as unsigned, so i'm not sure you want them here.
test/Analysis/ptr-arith.c
267–268	The tests that are now supported should be moved above this comment.

baloghadamsoftware added inline comments. Jul 13 2017, 7:36 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
487	For me it seems that pointer differences are still pointer types and they are signed. The range becomes negative upon negative assumption. From test `ptr-arith.c`:

    void use_symbols(int lhs, int rhs) {
      clang_analyzer_eval(lhs < rhs); // expected-warning{{UNKNOWN}}
      if (lhs < rhs)
        return;
      clang_analyzer_eval(lhs < rhs); // expected-warning{{FALSE}}

      clang_analyzer_eval(lhs - rhs); // expected-warning{{UNKNOWN}}
      if ((lhs - rhs) != 5)
        return;
      clang_analyzer_eval((lhs - rhs) == 5); // expected-warning{{TRUE}}
    }

If I put `clang_analyzer_printState()` into the empty line in the middle, I get the following range for the difference: `[-9223372036854775808, 0]`. If I replace `int` with `unsigned`, this range becomes `[0, 0]`, so `int` is handled as a signed type here.

Type selection simplified, FIXME added.

NoQ added inline comments. Jul 17 2017, 8:03 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
487	Umm, yeah, i was wrong. *looks closer* `T` is the type of the difference, right? I don't think i'd expect pointer type as the type of the difference. Could you add test cases for pointers if you intend to support them (and maybe for unsigned types)?

baloghadamsoftware added inline comments. Jul 18 2017, 2:31 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
487	I do not know exactly the type, but if I remove the `T->isPointerType()` condition the test in `ptr-arith.c` will fail with `UNKNOWN`. So the type of the difference is a type that returns `true` from `T->isPointerType()`. Pointer tests are already there in `ptr-arith.c`. Should I duplicate them?

NoQ added inline comments. Jul 18 2017, 4:25 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
487	I don't see any failing tests when i remove `T->isPointerType()`. Also this shouldn't be system-specific, because the target triple is hardcoded in `ptr-arith.c` runline. Could you point out which test is failing and dump the type in question (`-ast-dump`, or `Type->dump()`, or `llvm::errs() << QualType::getAsString()`, or whatever)?

I think I checked the type of the left side of the difference, not the difference itself. Thus the difference is not a pointer type, it is a signed integer type, the tests pass when I remove that line.

zaks.anna added a reviewer: dcoughlin. Jul 20 2017, 9:20 AM

Is this blocked on the same reasons as what was raised in https://reviews.llvm.org/D35109?

In D35110#854334, @zaks.anna wrote:

Is this blocked on the same reasons as what was raised in https://reviews.llvm.org/D35109?

No, it is blocked because D35109 is a prerequisite.

NoQ mentioned this in D35109: [Analyzer] SValBuilder Comparison Rearrangement. Oct 31 2017, 12:07 PM

baloghadamsoftware removed a parent revision: D35109: [Analyzer] SValBuilder Comparison Rearrangement. Dec 4 2017, 6:15 AM

This one is not blocked anymore since I removed the dependency.

Herald added subscribers: a.sidorin, rnkovacs, szepet. Jan 5 2018, 4:43 AM

In D35110#968284, @baloghadamsoftware wrote:

This one is not blocked anymore since I removed the dependency.

But I have to modify the test cases...

Strange, but modifying the tests from m <relation> n to m - n <relation> 0 does not help. The statement if (m - n <relation> 0) does not store a range for m - n in the constraint manager. With the other patch which automatically changes m <relation> n to m - n <relation> 0 the range is stored automatically.

baloghadamsoftware added a parent revision: D35109: [Analyzer] SValBuilder Comparison Rearrangement. Jan 10 2018, 4:37 AM

In D35110#969782, @baloghadamsoftware wrote:

Strange, but modifying the tests from m <relation> n to m - n <relation> 0 does not help. The statement if (m - n <relation> 0) does not store a range for m - n in the constraint manager. With the other patch which automatically changes m <relation> n to m - n <relation> 0 the range is stored automatically.

I guess we can easily assume how a SymIntExpr relates to a number by storing a range on the opaque left-hand-side symbol, no matter how complicated it is, but we cannot assume how a symbol relates to a symbol (there's no obvious range to store). That's just how assumeSym currently works.

In D35110#972430, @NoQ wrote:

In D35110#969782, @baloghadamsoftware wrote:

Strange, but modifying the tests from m <relation> n to m - n <relation> 0 does not help. The statement if (m - n <relation> 0) does not store a range for m - n in the constraint manager. With the other patch which automatically changes m <relation> n to m - n <relation> 0 the range is stored automatically.

I guess we can easily assume how a SymIntExpr relates to a number by storing a range on the opaque left-hand-side symbol, no matter how complicated it is, but we cannot assume how a symbol relates to a symbol (there's no obvious range to store). That's just how assumeSym currently works.

Actually it happens because m - n evaluates to Unknown. The code part responsible for this is the beginning of SValBuilder::makeSymExprValNN(), which prevents m - n-like symbolic expressions unless one of m or n is Tainted. Anna added this part 5-6 years ago because of some kind of bug, but it seems that it still exists. If I try to remove it then one test executes for days (with loop max count 63 or 64) and two tests fail with an assert.

baloghadamsoftware edited parent revisions, added: D41938: [Analyzer] SValBuilder Comparison Rearrangement (with Restrictions and Analyzer Option); removed: D35109: [Analyzer] SValBuilder Comparison Rearrangement. Jan 11 2018, 2:11 AM

Updated to be based upon D41938.

baloghadamsoftware mentioned this in D32642: [Analyzer] Iterator Checker - Part 2: Increment, decrement operators and ahead-of-begin checks. Jan 11 2018, 6:48 AM

baloghadamsoftware removed a reviewer: zaks.anna.

Rebased to the newly committed SValBuilder rearranger patch.

Herald added a reviewer: george.karpenkov. Apr 11 2018, 12:43 AM

No more pending dependency, so we can continue the review.

LGTM! Minor nitpicking in comments.

Currently there's no such problem, but as getRange becomes more complicated, we'll really miss the possibility of saying something like "if there's a range for negated symbol, return getRange(the negated symbol)", so that all other special cases applied. We could have allowed that by canonicalizing symbols (i.e. announce that $A always goes before $B and therefore we will store a range for $A - $B even if we're told to store the range for $B - $A) and then the "if" will become "if the symbol is not canonical, return getRange(the canonicalized symbol)".
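The canonicalization idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not analyzer code: symbols are modelled as plain integer IDs, `DiffRanges`, `SymId`, `Diff` and `Range` are hypothetical names, and negation ignores the minimum-value corner case.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

using SymId = unsigned;               // hypothetical symbol identifier
using Diff = std::pair<SymId, SymId>; // (A, B) stands for the symbol A - B

struct Range {
  int64_t from;
  int64_t to;
};

class DiffRanges {
  // Invariant: every key is canonical, i.e. first <= second.
  std::map<Diff, Range> Stored;

  // Assumption: bounds stay far away from the minimum of int64_t.
  static Range negate(Range r) { return Range{-r.to, -r.from}; }

public:
  // Store a range for A - B; a non-canonical difference is stored
  // as the negated range of the canonical one.
  void setRange(Diff d, Range r) {
    if (d.first <= d.second)
      Stored[d] = r;
    else
      Stored[Diff(d.second, d.first)] = negate(r);
  }

  // Look up a range for A - B through the canonical key.
  bool getRange(Diff d, Range &out) const {
    bool Swapped = d.first > d.second;
    Diff Key = Swapped ? Diff(d.second, d.first) : d;
    auto It = Stored.find(Key);
    if (It == Stored.end())
      return false;
    out = Swapped ? negate(It->second) : It->second;
    return true;
  }
};
```

With this layout there is a single stored entry per pair of symbols, so any special-casing in the lookup applies to both `$A - $B` and `$B - $A`.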

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
472	Can we move the whole `if` into a function? Eg., if (RangeSet *R = getRangeForMinusSymbol(Sym)) return R->Negate(BV, F) ?
483	`ConstraintRangeTy::data_type` ~> `RangeSet` should be easier to read.

NoQ added inline comments. Apr 27 2018, 8:29 PM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
167–178	Hmm, wait a minute, is this actually correct? For the range [-2³¹, -2³¹ + 1] over a 32-bit integer, the negated range will be [-2³¹, -2³¹] U [2³¹ - 1, 2³¹ - 1]. So there must be a place in the code where we take one range and add two ranges.
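The arithmetic behind this example can be checked with a small sketch of two's complement negation. `wrappingNegate` is a hypothetical helper, not part of the patch; it does the subtraction in unsigned arithmetic so that the sketch itself contains no signed overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Two's complement negation of a 32-bit value, computed modulo 2^32.
// Assumption: converting the unsigned result back to int32_t wraps
// (guaranteed since C++20, implementation-defined but usual before).
int32_t wrappingNegate(int32_t x) {
  uint32_t u = static_cast<uint32_t>(x);
  uint32_t n = static_cast<uint32_t>(0u - u);
  return static_cast<int32_t>(n);
}
```

Under that assumption the minimum value negates to itself and the minimum plus one negates to the maximum, which is why the single range [-2³¹, -2³¹ + 1] turns into two ranges.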

baloghadamsoftware added inline comments. May 2 2018, 6:56 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
167–178	The two ends of the range of the type usually stand for +/- infinity. If we add the minimum of the type when negating a negative range, then we lose the whole point of this transformation. Example: If `A - B < 0`, then the range of `A - B` is `[-2³¹, -1]`. If we negate this, and keep the `-2³¹` range end, then we get `[-2³¹, -2³¹]U[1, 2³¹-1]`. However, this does not mean `B - A > 0`. If we make an assumption about this, we get two states instead of one: in the true state `A - B`'s range is `[1, 2³¹-1]` and in the false state it is `[-2³¹, -2³¹]`. This is surely not what we want.

NoQ added inline comments. May 2 2018, 2:39 PM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
167–178	Well, we can't turn math into something we want, it is what it is. Iterator-related symbols are all planned to be within range [-2²⁹, 2²⁹], right? So if we subtract one such symbol from another, it's going to be in range [-2³⁰, 2³⁰]. Can we currently infer that? Or maybe we should make the iterator checker to enforce that separately? Because this range doesn't include -2³¹, so we won't have any problems with doing negation correctly. So as usual i propose to get this code mathematically correct and then see if we can ensure correct behavior by enforcing reasonable constraints on our symbols.
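The bound on the difference follows from ordinary interval subtraction. The sketch below is an illustration only, assuming the intended per-symbol bound is [-2²⁹, 2²⁹]; `Interval`, `subtract` and `containsInt32Min` are hypothetical names, and 64-bit arithmetic is used so that 32-bit bounds cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval with 64-bit bounds.
struct Interval {
  int64_t lo;
  int64_t hi;
};

// [a, b] - [c, d] = [a - d, b - c]
Interval subtract(Interval x, Interval y) {
  return Interval{x.lo - y.hi, x.hi - y.lo};
}

// True if the interval contains the minimum 32-bit signed value,
// the only value whose negation wraps onto itself.
bool containsInt32Min(Interval x) {
  return x.lo <= INT32_MIN && INT32_MIN <= x.hi;
}
```

Subtracting two symbols bounded this way yields [-2³⁰, 2³⁰], which excludes -2³¹, so negating such a difference never hits the wrap-around case.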

baloghadamsoftware added inline comments. May 2 2018, 11:46 PM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
167–178	I agree that the code should do mathematically correct things, but what I argue here is what math here means. Computer science is based on math, but it is somewhat different because of finite ranges and overflows. So I initially regarded the minimal and maximal values as infinity. Maybe that is not correct. However, I am sure that negating `-2³¹` should never be `-2³¹`. This is mathematically incorrect, and renders the whole calculation useless, since the union of a positive range and a negative range is unsuitable for any reasoning. I see two options here:
1. Remove the extension when negating a range which ends at the maximal value of the type, so the negated range begins at the minimal value plus one. However, cut the range which begins at the minimal value of the type by one, so the negated range ends at the maximal value, as in the current version in the patch.
2. Remove the extension as in 1. and disable the whole negation if the range begins at the minimal value.
Iterator checkers are of course not affected because of the max/4 constraints.

Fixed according to the comments.

I chose option 1 for now.

Can we continue the discussion here, please? We should involve Devin and/or George as well if we cannot agree ourselves.

NoQ added inline comments. May 25 2018, 2:53 PM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
167–178	> However, I am sure that negating `-2³¹` should never be `-2³¹`. This is mathematically incorrect, and renders the whole calculation useless, since the union of a positive range and a negative range is unsuitable for any reasoning.

Well, that's how computers already work. And that's how all sorts of abstract algebra work as well, so this is totally mathematically correct. We promise to support the two's complement semantics in the analyzer when it comes to signed integer overflows. Even though it's technically UB, most implementations follow this semantics and a lot of real-world applications explicitly rely on that. Also we cannot simply drop values from our constraint ranges in the general case because the values we drop might be the only valid values, and the assumption that at least one non-dropped value can definitely be taken is generally incorrect. Finding cornercases like this one is one of the strong sides of any static analysis: it is in fact our job to make the user aware of it if he doesn't understand overflow rules. If it cannot be said that the variable on a certain path is non-negative because it might as well be -2³¹, we should totally explore this possibility. If for a certain checker it brings no benefit because such value would be unlikely in certain circumstances, that checker is free to cut off the respective paths, but the core should perform operations precisely. I don't think we have much room for personal preferences here.

I still disagree, but I want the review to continue so I did the requested modifications.

NoQ added inline comments. May 29 2018, 11:16 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
191	We'll also need to merge the two adjacent segments if the original range had both a [MinSignedValue, MinSignedValue] and a [X, MaxSignedValue]: Because our immutable sets are sorted, you can conduct the check for the first and the last segment separately. I think this code needs comments because even though it's short it's pretty hard to get right.
192	Return value of `add` seems to be accidentally discarded here. I guess i'll look into adding an `__attribute__((warn_unused_result))` to these functions, because it's a super common bug.
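To make the wrap-around and merging requirements concrete, here is a standalone sketch over plain 32-bit integers. It is not the patch's `RangeSet::Negate` (which works on immutable sets of `llvm::APSInt` ranges); `Range`, `RangeSet` and `negateRangeSet` here are hypothetical stand-ins built on `std::vector`, and the sketch sorts and merges in a separate pass rather than checking the first and last segment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using Range = std::pair<int32_t, int32_t>; // closed interval [first, second]
using RangeSet = std::vector<Range>;       // assumed sorted and disjoint

RangeSet negateRangeSet(const RangeSet &in) {
  constexpr int32_t Min = std::numeric_limits<int32_t>::min();
  RangeSet out;
  for (const Range &r : in) {
    int32_t from = r.first, to = r.second;
    if (from == Min) {
      // -Min wraps onto Min under two's complement semantics.
      out.push_back(Range(Min, Min));
      if (to == Min)
        continue;
      from = Min + 1; // the rest of the range negates without overflow
    }
    out.push_back(Range(static_cast<int32_t>(-to),
                        static_cast<int32_t>(-from)));
  }
  std::sort(out.begin(), out.end());

  // Merge overlapping or adjacent segments, e.g. [Min, Min] and [Min + 1, X].
  RangeSet merged;
  for (const Range &r : out) {
    if (!merged.empty() &&
        static_cast<int64_t>(merged.back().second) + 1 >= r.first)
      merged.back().second = std::max(merged.back().second, r.second);
    else
      merged.push_back(r);
  }
  return merged;
}
```

With this sketch, [MIN, -1] becomes [MIN, MIN] U [1, MAX], while a set holding both [MIN, MIN] and [5, MAX] collapses into the single range [MIN, -5].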

NoQ added inline comments. May 29 2018, 11:17 AM

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
192	Also tests would have saved us.

george.karpenkov resigned from this revision. May 30 2018, 4:46 PM

Sorry, Artem, but it does not work this way. Even if the symbolic expressions are constrained to [-MAX/4..MAX/4], after rearrangement the difference still uses the whole range, thus m>n becomes m-n>0, where in the false branch the range for m-n is [MIN..0]. Then if we check n>=m the range set is reversed to [MIN..MIN]U[0..MAX] which results UNKNOWN for n-m. It does not solve any of our problems and there is no remedy on the checker's side.

Maybe if we could apply somehow a [-MAX/2..MAX/2] constraint to both sides of the rearranged equality in SimpleSValBuilder.

In D35110#1117401, @baloghadamsoftware wrote:

Sorry, Artem, but it does not work this way. Even if the symbolic expressions are constrained to [-MAX/4..MAX/4], after rearrangement the difference still uses the whole range, thus m>n becomes m-n>0, where in the false branch the range for m-n is [MIN..0]. Then if we check n>=m the range set is reversed to [MIN..MIN]U[0..MAX] which results UNKNOWN for n-m. It does not solve any of our problems and there is no remedy on the checker's side.

Which expressions are constrained? Why does the difference use the whole range? Is it something that could have been fixed by the "enforce that separately" part in my old comment:

iterator-related symbols are all planned to be within range [-2²⁹, 2²⁹], right? So if we subtract one such symbol from another, it's going to be in range [-2³⁰, 2³⁰]. Can we currently infer that? Or maybe we should make the iterator checker to enforce that separately?

In D35110#1119496, @NoQ wrote:

Which expressions are constrained? Why does the difference use the whole range? Is it something that could have been fixed by the "enforce that separately" part in my old comment:

iterator-related symbols are all planned to be within range [-2²⁹, 2²⁹], right? So if we subtract one such symbol from another, it's going to be in range [-2³⁰, 2³⁰]. Can we currently infer that? Or maybe we should make the iterator checker to enforce that separately?

?

RangedConstraintManager currently does not support Sym+Sym-type of expressions, only Sym+Int-type ones. That is why it cannot calculate that the result is within [-2³⁰, 2³⁰]. In the iterator checkers we do not know anything about the rearranged expressions, it has no access to the sum/difference, the whole purpose of your proposal was to put it into the infrastructure. The checker enforces everything it can but it does not help.

Any idea how to proceed?

Herald added a subscriber: mikhail.ramalho. Jun 13 2018, 9:48 AM

In the iterator checkers we do not know anything about the rearranged expressions, it has no access to the sum/difference, the whole purpose of your proposal was to put it into the infrastructure.

It wasn't. The purpose was merely to move (de-duplicate) the code that computes the sum/difference away from the checker. The checker can still operate on the result of such calculation if it knows something about that result that the core doesn't.

I still don't think i fully understand your concern. Could you provide an example and point out what exactly goes wrong?

-(-2³¹) == -2³¹

I added extra assertion into the test for the difference. Interestingly, it also works if I assert n-m is in the range instead of m-n. However, I wonder how can I apply such constraint to the difference of iterator positions without decomposing them to symbols and constants.

baloghadamsoftware added a comment. Jun 19 2018, 12:03 PM

This comment was removed by baloghadamsoftware.

Ok, code makes sense to me now!

I think we still need a few new tests to cover the corner cases.

In D35110#1135306, @baloghadamsoftware wrote:

I added extra assertion into the test for the difference. Interestingly, it also works if I assert n-m is in the range instead of m-n. However, I wonder how can I apply such constraint to the difference of iterator positions without decomposing them to symbols and constants.

I don't see how iterator checker is different from the tests. The code of the program in your tests doesn't decompose m - n into symbols and constants, it simply subtracts an opaque value n (whatever it is) from an opaque value m (whatever it is) and makes assumptions on the opaque result of the subtraction (whatever it is). The checker should do the same, i guess?

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
177–178	I guess the comment needs to be updated.
189	I suggest a few sanity checks here (untested): `assert(newRanges.begin()->To().isMinSignedValue());` because we shouldn't ever get an overlap. `assert(!from.isMinSignedValue())` for the same reason; it's good to know because it tells us what `newTo` is equal to on this path.

Comment fixed, assertions inserted, new tests added.

baloghadamsoftware marked 2 inline comments as done. Jun 25 2018, 12:56 PM

Thank you!! Please commit.

test/Analysis/constraint_manager_negate_difference.c
96	Whitespace (:

This revision is now accepted and ready to land. Jun 27 2018, 10:44 AM

Closed by commit rC335814: [Analyzer] Constraint Manager Negates Difference (authored by baloghadamsoftware). Jun 28 2018, 12:40 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                                         Size
include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h    2 lines
lib/StaticAnalyzer/Core/RangeConstraintManager.cpp                           69 lines
test/Analysis/constraint_manager_negate_difference.c                         98 lines
test/Analysis/ptr-arith.c                                                    23 lines

Diff 153263

include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h

[109 lines hidden; enclosing context: private:]
   const llvm::APSInt &getMinValue() const;

   bool pin(llvm::APSInt &Lower, llvm::APSInt &Upper) const;

 public:
   RangeSet Intersect(BasicValueFactory &BV, Factory &F, llvm::APSInt Lower,
                      llvm::APSInt Upper) const;

+  RangeSet Negate(BasicValueFactory &BV, Factory &F) const;
+
   void print(raw_ostream &os) const;

   bool operator==(const RangeSet &other) const {
     return ranges == other.ranges;
   }
 };

[89 lines hidden]

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp

[158 lines hidden; enclosing context: if (!pin(Lower, Upper))]
     return F.getEmptySet();

   PrimRangeSet newRanges = F.getEmptySet();

   PrimRangeSet::iterator i = begin(), e = end();
   if (Lower <= Upper)
     IntersectInRange(BV, F, Lower, Upper, newRanges, i, e);
   else {
     // The order of the next two statements is important!
     // IntersectInRange() does not reset the iteration state for i and e.
     // Therefore, the lower range most be handled first.
     IntersectInRange(BV, F, BV.getMinValue(Upper), Upper, newRanges, i, e);
     IntersectInRange(BV, F, Lower, BV.getMaxValue(Lower), newRanges, i, e);
   }

   return newRanges;
 }

+// Turn all [A, B] ranges to [-B, -A]. Ranges [MIN, B] are turned to range set
+// [MIN, MIN] U [-B, MAX], when MIN and MAX are the minimal and the maximal
+// signed values of the type.
+RangeSet RangeSet::Negate(BasicValueFactory &BV, Factory &F) const {
+  PrimRangeSet newRanges = F.getEmptySet();
+
+  for (iterator i = begin(), e = end(); i != e; ++i) {
+    const llvm::APSInt &from = i->From(), &to = i->To();
+    const llvm::APSInt &newTo = (from.isMinSignedValue() ?
+                                 BV.getMaxValue(from) :
+                                 BV.getValue(- from));
+    if (to.isMaxSignedValue() && !newRanges.isEmpty() &&
+        newRanges.begin()->From().isMinSignedValue()) {
+      assert(newRanges.begin()->To().isMinSignedValue() &&
+             "Ranges should not overlap");
+      assert(!from.isMinSignedValue() && "Ranges should not overlap");
+      const llvm::APSInt &newFrom = newRanges.begin()->From();
+      newRanges =
+        F.add(F.remove(newRanges, *newRanges.begin()), Range(newFrom, newTo));
+    } else if (!to.isMinSignedValue()) {
+      const llvm::APSInt &newFrom = BV.getValue(- to);
+      newRanges = F.add(newRanges, Range(newFrom, newTo));
+    }
+    if (from.isMinSignedValue()) {
+      newRanges = F.add(newRanges, Range(BV.getMinValue(from),
+                                         BV.getMinValue(from)));
+    }
+  }
+
+  return newRanges;
+}
+
 void RangeSet::print(raw_ostream &os) const {
   bool isFirst = true;
   os << "{ ";
   for (iterator i = begin(), e = end(); i != e; ++i) {
     if (isFirst)
       isFirst = false;
     else
       os << ", ";
[62 lines hidden; enclosing context: public:]
   ProgramStateRef assumeSymOutsideInclusiveRange(
       ProgramStateRef State, SymbolRef Sym, const llvm::APSInt &From,
       const llvm::APSInt &To, const llvm::APSInt &Adjustment) override;

 private:
   RangeSet::Factory F;

   RangeSet getRange(ProgramStateRef State, SymbolRef Sym);
+  const RangeSet* getRangeForMinusSymbol(ProgramStateRef State,
+                                         SymbolRef Sym);

   RangeSet getSymLTRange(ProgramStateRef St, SymbolRef Sym,
                          const llvm::APSInt &Int,
                          const llvm::APSInt &Adjustment);
   RangeSet getSymGTRange(ProgramStateRef St, SymbolRef Sym,
                          const llvm::APSInt &Int,
                          const llvm::APSInt &Adjustment);
   RangeSet getSymLERange(ProgramStateRef St, SymbolRef Sym,
                          const llvm::APSInt &Int,
                          const llvm::APSInt &Adjustment);
   RangeSet getSymLERange(llvm::function_ref<RangeSet()> RS,
                          const llvm::APSInt &Int,
                          const llvm::APSInt &Adjustment);
   RangeSet getSymGERange(ProgramStateRef St, SymbolRef Sym,
                          const llvm::APSInt &Int,
                          const llvm::APSInt &Adjustment);

 };

 } // end anonymous namespace

 std::unique_ptr<ConstraintManager>
 ento::CreateRangeConstraintManager(ProgramStateManager &StMgr, SubEngine *Eng) {
   return llvm::make_unique<RangeConstraintManager>(Eng, StMgr.getSValBuilder());
 }
[...]
		static RangeSet applyBitwiseConstraints(
return Input;		return Input;
}		}

RangeSet RangeConstraintManager::getRange(ProgramStateRef State,		RangeSet RangeConstraintManager::getRange(ProgramStateRef State,
SymbolRef Sym) {		SymbolRef Sym) {
if (ConstraintRangeTy::data_type *V = State->get<ConstraintRange>(Sym))		if (ConstraintRangeTy::data_type *V = State->get<ConstraintRange>(Sym))
return *V;		return *V;

		BasicValueFactory &BV = getBasicVals();

		// If Sym is a difference of symbols A - B, then maybe we have range set
		// stored for B - A.
		if (const RangeSet *R = getRangeForMinusSymbol(State, Sym))
		return R->Negate(BV, F);

// Lazily generate a new RangeSet representing all possible values for the		// Lazily generate a new RangeSet representing all possible values for the
// given symbol type.		// given symbol type.
BasicValueFactory &BV = getBasicVals();
QualType T = Sym->getType();		QualType T = Sym->getType();

RangeSet Result(F, BV.getMinValue(T), BV.getMaxValue(T));		RangeSet Result(F, BV.getMinValue(T), BV.getMaxValue(T));
		NoQ (Not Done): With this, it sounds as if we're half-way into finally supporting the unary minus operator (: Could you add a FIXME here: "Once SValBuilder supports unary minus, we should use SValBuilder to obtain the negated symbolic expression instead of constructing the symbol manually. This will allow us to support finding ranges of not only negated SymSymExpr-type expressions, but also of other, simpler expressions which we currently do not know how to negate."
		NoQ (Done): Can we move the whole `if` into a function? E.g., `if (RangeSet *R = getRangeForMinusSymbol(Sym)) return R->Negate(BV, F);`?

// References are known to be non-zero.		// References are known to be non-zero.
if (T->isReferenceType())		if (T->isReferenceType())
return assumeNonZero(BV, F, Sym, Result);		return assumeNonZero(BV, F, Sym, Result);
		xazax.hun (Not Done): Could you give an example of why you need this (probably as a test), or constrain the transformation to when all the types are the same?

// Known constraints on ranges of bitwise expressions.		// Known constraints on ranges of bitwise expressions.
if (const SymIntExpr* SIE = dyn_cast<SymIntExpr>(Sym))		if (const SymIntExpr* SIE = dyn_cast<SymIntExpr>(Sym))
return applyBitwiseConstraints(BV, F, Result, SIE);		return applyBitwiseConstraints(BV, F, Result, SIE);
		NoQ (Not Done): I'm quite sure that types of `A - B` and `B - A` are always equal when it comes to integer promotion rules. Also, due to our broken `SymbolCast` (which is often missing where it ideally should be), the type of the `A - B` symbol may not necessarily be the same as the type that you obtain by applying integer promotion rules to types of `A` and `B`. So i think you should always take the type of `A - B` and expect to find `B - A` of the same type in the range map, otherwise give up.

return Result;		return Result;
}		}
		xazax.hun (Not Done): I think it would be better to describe why we don't want to do that rather than describing what the code does.
		NoQ (Done): `ConstraintRangeTy::data_type` ~> `RangeSet` should be easier to read.

		// FIXME: Once SValBuilder supports unary minus, we should use SValBuilder to
		// obtain the negated symbolic expression instead of constructing the
		// symbol manually. This will allow us to support finding ranges of not
		NoQ (Not Done): Pointer types are currently treated as unsigned, so i'm not sure you want them here.
		baloghadamsoftware (Author, Not Done): To me it seems that pointer differences are still pointer types and they are signed. (The range becomes negative upon a negative assumption.) From test `ptr-arith.c`:
		    void use_symbols(int *lhs, int *rhs) {
		      clang_analyzer_eval(lhs < rhs); // expected-warning{{UNKNOWN}}
		      if (lhs < rhs)
		        return;
		      clang_analyzer_eval(lhs < rhs); // expected-warning{{FALSE}}

		      clang_analyzer_eval(lhs - rhs); // expected-warning{{UNKNOWN}}
		      if ((lhs - rhs) != 5)
		        return;
		      clang_analyzer_eval((lhs - rhs) == 5); // expected-warning{{TRUE}}
		    }
		If I put `clang_analyzer_printState()` into the empty line in the middle, I get the following range for the difference: `[-9223372036854775808, 0]`. If I replace `int` with `unsigned`, this range becomes `[0, 0]`, so `int` is handled as a signed type here.
		NoQ (Not Done): Umm, yeah, i was wrong. *looks closer* `T` is the type of the difference, right? I don't think i'd expect pointer type as the type of the difference. Could you add test cases for pointers if you intend to support them (and maybe for unsigned types)?
		baloghadamsoftware (Author, Not Done): I do not know the exact type, but if I remove the `T->isPointerType()` condition, the test in `ptr-arith.c` fails with `UNKNOWN`. So the type of the difference is a type for which `T->isPointerType()` returns `true`. Pointer tests are already there in `ptr-arith.c`. Should I duplicate them?
		NoQ (Not Done): I don't see any failing tests when i remove `T->isPointerType()`. Also this shouldn't be system-specific, because the target triple is hardcoded in the `ptr-arith.c` run line. Could you point out which test is failing and dump the type in question (`-ast-dump`, or `Type->dump()`, or `llvm::errs() << QualType::getAsString()`, or whatever)?
		// only negated SymSymExpr-type expressions, but also of other, simpler
		// expressions which we currently do not know how to negate.
		const RangeSet*
		RangeConstraintManager::getRangeForMinusSymbol(ProgramStateRef State,
		SymbolRef Sym) {
		if (const SymSymExpr *SSE = dyn_cast<SymSymExpr>(Sym)) {
		if (SSE->getOpcode() == BO_Sub) {
		QualType T = Sym->getType();
		SymbolManager &SymMgr = State->getSymbolManager();
		SymbolRef negSym = SymMgr.getSymSymExpr(SSE->getRHS(), BO_Sub,
		SSE->getLHS(), T);
		if (const RangeSet *negV = State->get<ConstraintRange>(negSym)) {
		// Unsigned range set cannot be negated, unless it is [0, 0].
		if ((negV->getConcreteValue() &&
		(*negV->getConcreteValue() == 0)) ||
		T->isSignedIntegerOrEnumerationType())
		return negV;
		}
		}
		}
		return nullptr;
		}
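
The lookup performed by `getRangeForMinusSymbol` can be illustrated with a toy model. This is only a sketch under stated assumptions: symbols are plain strings, a difference `A - B` is modeled as the pair `("A", "B")`, and `ConstraintMap`, `SymPair` and `rangeForMinusSymbol` are hypothetical names rather than analyzer APIs. It mirrors the guard in the diff: a stored range for `B - A` is reused only if the type is signed or the range is exactly `[0, 0]`. With an unsigned type, the negation of `[1, 5]` wraps around to large values instead of giving `[-5, -1]`, whereas `[0, 0]` negates to itself in either case.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the analyzer's data structures.
using Range = std::pair<long long, long long>;        // closed [first, second]
using SymPair = std::pair<std::string, std::string>;  // (A, B) models A - B
using ConstraintMap = std::map<SymPair, Range>;

// Returns the range stored for the reversed difference B - A, or nullptr if
// there is none or if it cannot be negated safely.
const Range *rangeForMinusSymbol(const ConstraintMap &constraints,
                                 const SymPair &diff, bool isSignedType) {
  // Build the key for B - A from the key for A - B.
  auto it = constraints.find(SymPair(diff.second, diff.first));
  if (it == constraints.end())
    return nullptr;

  const Range &stored = it->second;
  assert(stored.first <= stored.second && "stored range must be non-empty");

  // An unsigned range cannot be negated unless it is exactly [0, 0].
  const bool isZero = stored.first == 0 && stored.second == 0;
  if (isZero || isSignedType)
    return &stored;
  return nullptr;
}
```

Returning a pointer keeps "nothing usable stored" distinct from an empty range, which is presumably why the patch returns `const RangeSet *` and lets the caller decide whether to call `Negate`.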

//===------------------------------------------------------------------------===		//===------------------------------------------------------------------------===
// assumeSymX methods: protected interface for RangeConstraintManager.		// assumeSymX methods: protected interface for RangeConstraintManager.
//===------------------------------------------------------------------------===/		//===------------------------------------------------------------------------===/

// The syntax for ranges below is mathematical, using [x, y] for closed ranges		// The syntax for ranges below is mathematical, using [x, y] for closed ranges
// and (x, y) for open ranges. These ranges are modular, corresponding with		// and (x, y) for open ranges. These ranges are modular, corresponding with
// a common treatment of C integer overflow. This means that these methods		// a common treatment of C integer overflow. This means that these methods
// do not have to worry about overflow; RangeSet::Intersect can handle such a		// do not have to worry about overflow; RangeSet::Intersect can handle such a
[...]

test/Analysis/constraint_manager_negate_difference.c

				// RUN: %clang_analyze_cc1 -analyzer-checker=debug.ExprInspection,core.builtin -analyzer-config aggressive-relational-comparison-simplification=true -verify %s

				void clang_analyzer_eval(int);

				void exit(int);

				#define UINT_MAX (~0U)
				#define INT_MAX (UINT_MAX & (UINT_MAX >> 1))
				#define INT_MIN (UINT_MAX & ~(UINT_MAX >> 1))

				extern void __assert_fail (__const char *__assertion, __const char *__file,
				unsigned int __line, __const char *__function)
				__attribute__ ((__noreturn__));
				#define assert(expr) \
				((expr) ? (void)(0) : __assert_fail (#expr, __FILE__, __LINE__, __func__))

				void assert_in_range(int x) {
				assert(x <= ((int)INT_MAX / 4));
				assert(x >= -(((int)INT_MAX) / 4));
				}

				void assert_in_wide_range(int x) {
				assert(x <= ((int)INT_MAX / 2));
				assert(x >= -(((int)INT_MAX) / 2));
				}

				void assert_in_range_2(int m, int n) {
				assert_in_range(m);
				assert_in_range(n);
				}

				void equal(int m, int n) {
				assert_in_range_2(m, n);
				if (m != n)
				return;
				assert_in_wide_range(m - n);
				clang_analyzer_eval(n == m); // expected-warning{{TRUE}}
				}

				void non_equal(int m, int n) {
				assert_in_range_2(m, n);
				if (m == n)
				return;
				assert_in_wide_range(m - n);
				clang_analyzer_eval(n != m); // expected-warning{{TRUE}}
				}

				void less_or_equal(int m, int n) {
				assert_in_range_2(m, n);
				if (m < n)
				return;
				assert_in_wide_range(m - n);
				clang_analyzer_eval(n <= m); // expected-warning{{TRUE}}
				}

				void less(int m, int n) {
				assert_in_range_2(m, n);
				if (m <= n)
				return;
				assert_in_wide_range(m - n);
				clang_analyzer_eval(n < m); // expected-warning{{TRUE}}
				}

				void greater_or_equal(int m, int n) {
				assert_in_range_2(m, n);
				if (m > n)
				return;
				assert_in_wide_range(m - n);
				clang_analyzer_eval(n >= m); // expected-warning{{TRUE}}
				}

				void greater(int m, int n) {
				assert_in_range_2(m, n);
				if (m >= n)
				return;
				assert_in_wide_range(m - n);
				clang_analyzer_eval(n > m); // expected-warning{{TRUE}}
				}

				void negate_positive_range(int m, int n) {
				if (m - n <= 0)
				return;
				clang_analyzer_eval(n - m < 0); // expected-warning{{TRUE}}
				clang_analyzer_eval(n - m > INT_MIN); // expected-warning{{TRUE}}
				clang_analyzer_eval(n - m == INT_MIN); // expected-warning{{FALSE}}
				}

				void negate_int_min(int m, int n) {
				if (m - n != INT_MIN)
				return;
				clang_analyzer_eval(n - m == INT_MIN); // expected-warning{{TRUE}}
				}

				void negate_mixed(int m, int n) {
				if (m -n > INT_MIN && m - n <= 0)
				return;
				NoQ (Not Done): Whitespace (:
				clang_analyzer_eval(n - m <= 0); // expected-warning{{TRUE}}
				}

test/Analysis/ptr-arith.c

[...]
		#endif
clang_analyzer_eval(lhs == rhs); // expected-warning{{FALSE}}		clang_analyzer_eval(lhs == rhs); // expected-warning{{FALSE}}
#ifdef ANALYZER_CM_Z3		#ifdef ANALYZER_CM_Z3
clang_analyzer_eval(lhs < rhs); // expected-warning{{UNKNOWN}}		clang_analyzer_eval(lhs < rhs); // expected-warning{{UNKNOWN}}
#else		#else
clang_analyzer_eval(lhs < rhs); // expected-warning{{TRUE}}		clang_analyzer_eval(lhs < rhs); // expected-warning{{TRUE}}
#endif		#endif
clang_analyzer_eval((rhs - lhs) > 0); // expected-warning{{TRUE}}		clang_analyzer_eval((rhs - lhs) > 0); // expected-warning{{TRUE}}
}		}

//-------------------------------
// False positives
//-------------------------------

void zero_implies_reversed_equal(int *lhs, int *rhs) {		void zero_implies_reversed_equal(int *lhs, int *rhs) {
		NoQ (Not Done): The tests that are now supported should be moved above this comment.
clang_analyzer_eval((rhs - lhs) == 0); // expected-warning{{UNKNOWN}}		clang_analyzer_eval((rhs - lhs) == 0); // expected-warning{{UNKNOWN}}
if ((rhs - lhs) == 0) {		if ((rhs - lhs) == 0) {
#ifdef ANALYZER_CM_Z3
clang_analyzer_eval(rhs != lhs); // expected-warning{{FALSE}}		clang_analyzer_eval(rhs != lhs); // expected-warning{{FALSE}}
clang_analyzer_eval(rhs == lhs); // expected-warning{{TRUE}}		clang_analyzer_eval(rhs == lhs); // expected-warning{{TRUE}}
#else
clang_analyzer_eval(rhs != lhs); // expected-warning{{UNKNOWN}}
clang_analyzer_eval(rhs == lhs); // expected-warning{{UNKNOWN}}
#endif
return;		return;
}		}
clang_analyzer_eval((rhs - lhs) == 0); // expected-warning{{FALSE}}		clang_analyzer_eval((rhs - lhs) == 0); // expected-warning{{FALSE}}
#ifdef ANALYZER_CM_Z3
clang_analyzer_eval(rhs == lhs); // expected-warning{{FALSE}}		clang_analyzer_eval(rhs == lhs); // expected-warning{{FALSE}}
clang_analyzer_eval(rhs != lhs); // expected-warning{{TRUE}}		clang_analyzer_eval(rhs != lhs); // expected-warning{{TRUE}}
#else
clang_analyzer_eval(rhs == lhs); // expected-warning{{UNKNOWN}}
clang_analyzer_eval(rhs != lhs); // expected-warning{{UNKNOWN}}
#endif
}		}

void canonical_equal(int *lhs, int *rhs) {		void canonical_equal(int *lhs, int *rhs) {
clang_analyzer_eval(lhs == rhs); // expected-warning{{UNKNOWN}}		clang_analyzer_eval(lhs == rhs); // expected-warning{{UNKNOWN}}
if (lhs == rhs) {		if (lhs == rhs) {
#ifdef ANALYZER_CM_Z3
clang_analyzer_eval(rhs == lhs); // expected-warning{{TRUE}}		clang_analyzer_eval(rhs == lhs); // expected-warning{{TRUE}}
#else
clang_analyzer_eval(rhs == lhs); // expected-warning{{UNKNOWN}}
#endif
return;		return;
}		}
clang_analyzer_eval(lhs == rhs); // expected-warning{{FALSE}}		clang_analyzer_eval(lhs == rhs); // expected-warning{{FALSE}}

#ifdef ANALYZER_CM_Z3
clang_analyzer_eval(rhs == lhs); // expected-warning{{FALSE}}		clang_analyzer_eval(rhs == lhs); // expected-warning{{FALSE}}
#else
clang_analyzer_eval(rhs == lhs); // expected-warning{{UNKNOWN}}
#endif
}		}

void compare_element_region_and_base(int *p) {		void compare_element_region_and_base(int *p) {
int *q = p - 1;		int *q = p - 1;
clang_analyzer_eval(p == q); // expected-warning{{FALSE}}		clang_analyzer_eval(p == q); // expected-warning{{FALSE}}
}		}

struct Point {		struct Point {
[...]