This is an archive of the discontinued LLVM Phabricator instance.


Differential D55007

[Analyzer] Constraint Manager - Calculate Effective Range for Differences
Closed, Public

Authored by baloghadamsoftware on Nov 28 2018, 7:48 AM.

Details

Reviewers

NoQ
george.karpenkov

Commits

rGa19c985f8ab0: [Analyzer] Constraint Manager - Calculate Effective Range for Differences
rC357167: [Analyzer] Constraint Manager - Calculate Effective Range for Differences
rL357167: [Analyzer] Constraint Manager - Calculate Effective Range for Differences

Summary

Since rL335814, if the constraint manager cannot find a range set for A - B (where A and B are symbols), it looks for a range set for B - A and, if one exists, returns it negated. However, if range sets are stored for both A - B and B - A, it returns only the first one. If we use both A - B and B - A, these expressions therefore behave as two totally unrelated symbols. This way we miss some useful deductions, which may lead to false negatives or false positives.

This tiny patch changes this behavior: if the symbolic expression the constraint manager is looking for is a difference A - B, it tries to retrieve the range sets for both A - B and B - A, and if both exist it returns the intersection of the range set of A - B and the negated range set of B - A. This way, every time a checker applies new constraints to the symbolic difference or to its negation, they affect both the original difference and its negation.
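The lookup described in the summary can be sketched outside the analyzer as follows. This is only an illustration under stated assumptions: `Range`, `RangeSet`, `negate`, `intersect`, `effectiveRange`, `kMin` and `kMax` are hypothetical stand-ins rather than the analyzer's own classes, the symbol type is modelled as a signed 8-bit integer, and adjacent ranges are not merged.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of a signed 8-bit symbol type; bounds are held in int so
// that the arithmetic below cannot overflow.
constexpr int kMin = -128;
constexpr int kMax = 127;

struct Range { int from, to; };        // closed interval [from, to]
using RangeSet = std::vector<Range>;   // sorted, pairwise disjoint

// Negate every value with wraparound, so that -kMin is kMin again.
RangeSet negate(const RangeSet &in) {
  RangeSet out;
  for (const Range &r : in) {
    assert(kMin <= r.from && r.from <= r.to && r.to <= kMax);
    int from = r.from;
    if (from == kMin) {
      out.push_back(Range{kMin, kMin});   // kMin maps to itself
      if (r.to == kMin)
        continue;
      from = kMin + 1;
    }
    out.push_back(Range{-r.to, -from});   // [A, B] becomes [-B, -A]
  }
  std::sort(out.begin(), out.end(),
            [](const Range &a, const Range &b) { return a.from < b.from; });
  return out;
}

// Pairwise intersection of two sorted, disjoint range sets.
RangeSet intersect(const RangeSet &a, const RangeSet &b) {
  RangeSet out;
  for (const Range &x : a)
    for (const Range &y : b) {
      int lo = std::max(x.from, y.from);
      int hi = std::min(x.to, y.to);
      if (lo <= hi)
        out.push_back(Range{lo, hi});
    }
  return out;
}

// ab: range set stored for A - B (or nullptr); ba: range set stored for B - A.
RangeSet effectiveRange(const RangeSet *ab, const RangeSet *ba) {
  if (ab && ba)
    return intersect(*ab, negate(*ba));
  if (ab)
    return *ab;
  if (ba)
    return negate(*ba);
  return RangeSet{Range{kMin, kMax}};     // nothing known: the full range
}
```

With `[0, 127]` stored for both differences the sketch yields the single range `[0, 0]`. With `[-128, 0]` stored for both it yields `{-128} U {0}`, because the minimal value is its own negation.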

Diff Detail

Repository: rL LLVM

Event Timeline

baloghadamsoftware created this revision. Nov 28 2018, 7:48 AM

Herald added a reviewer: george.karpenkov. Nov 28 2018, 7:48 AM

Herald added subscribers: donat.nagy, mikhail.ramalho, a.sidorin, szepet.

Is it an option to canonicalize the expression so that B - A is never stored in the first place? I.e., do this range intersection at the moment of writing the range, not at the moment of reading the range.

This could be implemented by, say, comparing symbol IDs for A and B and making sure that in every stored SymSymExpr the first symbol's ID is greater than the second symbol's ID.
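A minimal sketch of this write-time canonicalization, assuming a simplified store that keeps one closed interval per difference and ignores wraparound at the minimal value. `DiffStore`, `Interval` and `SymbolID` are hypothetical names for illustration, not analyzer APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical simplification: a single closed interval per difference, with
// bounds far from the limits of long so that negation cannot overflow.
struct Interval { long lo, hi; };

using SymbolID = unsigned;
using Key = std::pair<SymbolID, SymbolID>;   // (minuend ID, subtrahend ID)

class DiffStore {
  std::map<Key, Interval> ranges;

public:
  // Record "a - b lies in r". The key is canonicalized so that the first ID
  // is the greater one; the interval is negated when the operands are swapped.
  void constrain(SymbolID a, SymbolID b, Interval r) {
    assert(a != b && r.lo <= r.hi);
    if (a < b) {
      std::swap(a, b);
      r = Interval{-r.hi, -r.lo};
    }
    Key k(a, b);
    auto res = ranges.insert(std::make_pair(k, r));
    if (!res.second) {
      // A range already exists for this key: intersect at write time.
      Interval &stored = res.first->second;
      stored.lo = std::max(stored.lo, r.lo);
      stored.hi = std::min(stored.hi, r.hi);
    }
  }

  // Look up the range known for a - b; returns false if nothing is stored.
  bool lookup(SymbolID a, SymbolID b, Interval &out) const {
    bool flipped = a < b;
    if (flipped)
      std::swap(a, b);
    auto it = ranges.find(Key(a, b));
    if (it == ranges.end())
      return false;
    out = flipped ? Interval{-it->second.hi, -it->second.lo} : it->second;
    return true;
  }

  std::size_t size() const { return ranges.size(); }
};
```

In this sketch constraining both orders of the same two symbols leaves a single map entry, so a read needs one lookup. An interval whose `lo` exceeds its `hi` after intersection would indicate an infeasible state, which the sketch does not handle.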

My original idea was that we only ever store either A - B or B - A. Thus, if we already have A - B stored, we do not store a range for B - A but negate both the difference and the range. I can think of two ways to implement this:

1) Create a separate function, e.g. setRange(), to store the range. This function checks whether the symbol is a difference and whether we already have a range for its negation. If so, it negates the difference and the range as well. (We do not need to intersect them because the caller already did it.) However, in this case we negate twice: once in getRange(), then once in setRange().

2) Move the negation out of getRange() and check for a stored negated difference before calling it. If it exists, call the appropriate assume function for the negated difference (for == and != it is the same function, but reverse the operator for the rest).

Your idea (store either A - B or B - A based on their symbol IDs) is also feasible, but then we face the same question. So 1) or 2)?

I just generally wish for a function that would return the most specific known range for the given arbitrary symbol. This is something we can put, for example, into visitor path notes.

Eg., imagine "Assuming x is greater than 5" -> "Tainted array index constrained to [5, 6] U {8} U [10, 12]" -> "Potential buffer overflow: accessing array of 11 elements with a tainted index".

getRange() is a great name for such function. For now i think it kinda works for atomic symbols with a few extra goodies on top of that. If we generally move towards that goal, i think it would be great.
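To make the path-note example concrete, here is a small sketch of how such a range set could be rendered and checked against an array size. `Range`, `formatRangeSet` and `mayIndexOutOfBounds` are hypothetical helpers invented for illustration; nothing here is taken from the analyzer.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Range { long from, to; };   // closed interval, hypothetical stand-in

// Render singletons as {v} and wider ranges as [from, to], joined by " U ".
std::string formatRangeSet(const std::vector<Range> &set) {
  std::ostringstream os;
  for (std::size_t i = 0; i < set.size(); ++i) {
    if (i != 0)
      os << " U ";
    if (set[i].from == set[i].to)
      os << '{' << set[i].from << '}';
    else
      os << '[' << set[i].from << ", " << set[i].to << ']';
  }
  return os.str();
}

// True if some value in the set is not a valid index into an array of `size`
// elements.
bool mayIndexOutOfBounds(const std::vector<Range> &set, long size) {
  for (const Range &r : set) {
    assert(r.from <= r.to);
    if (r.from < 0 || r.to >= size)
      return true;
  }
  return false;
}
```

For the index set from the example, the sketch reports a possible overflow for an array of 11 elements because the last range reaches 12.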

MTC added a subscriber: MTC. Dec 6 2018, 6:19 PM

In D55007#1322335, @NoQ wrote:

I just generally wish for a function that would return the most specific known range for the given arbitrary symbol. This is something we can put, for example, into visitor path notes.

Eg., imagine "Assuming x is greater than 5" -> "Tainted array index constrained to [5, 6] U {8} U [10, 12]" -> "Potential buffer overflow: accessing array of 11 elements with a tainted index".

getRange() is a great name for such function. For now i think it kinda works for atomic symbols with a few extra goodies on top of that. If we generally move towards that goal, i think it would be great.

I think my patch is exactly about this for symbol differences: it returns the most specific known range for the difference by intersecting the range stored for the difference with the negated range stored for the negated difference.

It may be better not to store the range for the negated difference at all, but that is a bigger change in the code. (See my options 1 and 2.)

Ping!

Herald added a subscriber: Charusso. Mar 18 2019, 12:43 AM

What i was trying to say with my last comment is that i guess i'd rather go for option (1) because with that getRange() remains the single source of truth, which is comfy.

I agree this shouldn't really be blocking the patch - sorry for stalling! - i'm hopefully slowly getting better at not stalling.

Generally i would have gone for saving some memory and expensive ImmutableMap lookups by canonicalizing the key as much as possible.

Do we want to add the opposite test:

void effective_range_2(int m, int n) {
  assert(m - n <= 0);
  assert(n - m <= 0);
  clang_analyzer_eval(m - n == 0); // expected-warning{{TRUE}} expected-warning{{FALSE}}
  clang_analyzer_eval(n - m == 0); // expected-warning{{TRUE}} expected-warning{{FALSE}}
}

...where the FALSE case corresponds to m - n == INT_MIN?
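Why both assertions can hold without the difference being zero can be checked with a short sketch. It assumes the wraparound model of a 32-bit signed type and routes the arithmetic through unsigned integers, because signed overflow is undefined behavior in C++. The conversion back to `int32_t` is modular since C++20 and implementation-defined, though modular in practice, before that. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation modelled through unsigned arithmetic.
int32_t wrappingNegate(int32_t v) {
  return static_cast<int32_t>(0u - static_cast<uint32_t>(v));
}

// Two's-complement subtraction modelled through unsigned arithmetic.
int32_t wrappingSub(int32_t a, int32_t b) {
  return static_cast<int32_t>(static_cast<uint32_t>(a) -
                              static_cast<uint32_t>(b));
}

// True if both m - n <= 0 and n - m <= 0 hold under wraparound.
bool bothNonPositive(int32_t m, int32_t n) {
  int32_t d = wrappingSub(m, n);
  int32_t negD = wrappingSub(n, m);
  assert(negD == wrappingNegate(d));   // n - m is the wrapped negation of m - n
  return d <= 0 && negD <= 0;
}
```

Under this model only two differences satisfy both constraints: zero and the minimal value, which is its own negation. That matches the TRUE and FALSE outcomes expected by the proposed test.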

lib/StaticAnalyzer/Core/RangeConstraintManager.cpp, line 483 (on Diff #175693): `sotred` -> `stored` :)

This revision is now accepted and ready to land. Mar 27 2019, 3:34 PM

Closed by commit rL357167: [Analyzer] Constraint Manager - Calculate Effective Range for Differences (authored by baloghadamsoftware). Mar 28 2019, 6:04 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Mar 28 2019, 6:04 AM

Herald added a subscriber: llvm-commits.

Revision Contents

Path (Size)
cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h (3 lines)
cfe/trunk/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp (33 lines)
cfe/trunk/test/Analysis/constraint_manager_negate_difference.c (14 lines)

Diff 192621

cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h

@@ private: @@
   const llvm::APSInt &getMinValue() const;

   bool pin(llvm::APSInt &Lower, llvm::APSInt &Upper) const;

 public:
   RangeSet Intersect(BasicValueFactory &BV, Factory &F, llvm::APSInt Lower,
                      llvm::APSInt Upper) const;
+  RangeSet Intersect(BasicValueFactory &BV, Factory &F,
+                     const RangeSet &Other) const;
   RangeSet Negate(BasicValueFactory &BV, Factory &F) const;

   void print(raw_ostream &os) const;

   bool operator==(const RangeSet &other) const {
     return ranges == other.ranges;
   }
 };

cfe/trunk/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp

@@ else { @@
     // Therefore, the lower range most be handled first.
     IntersectInRange(BV, F, BV.getMinValue(Upper), Upper, newRanges, i, e);
     IntersectInRange(BV, F, Lower, BV.getMaxValue(Lower), newRanges, i, e);
   }

   return newRanges;
 }

+// Returns a set containing the values in the receiving set, intersected with
+// the range set passed as parameter.
+RangeSet RangeSet::Intersect(BasicValueFactory &BV, Factory &F,
+                             const RangeSet &Other) const {
+  PrimRangeSet newRanges = F.getEmptySet();
+
+  for (iterator i = Other.begin(), e = Other.end(); i != e; ++i) {
+    RangeSet newPiece = Intersect(BV, F, i->From(), i->To());
+    for (iterator j = newPiece.begin(), ee = newPiece.end(); j != ee; ++j) {
+      newRanges = F.add(newRanges, *j);
+    }
+  }
+
+  return newRanges;
+}
+
 // Turn all [A, B] ranges to [-B, -A]. Ranges [MIN, B] are turned to range set
 // [MIN, MIN] U [-B, MAX], when MIN and MAX are the minimal and the maximal
 // signed values of the type.
 RangeSet RangeSet::Negate(BasicValueFactory &BV, Factory &F) const {
   PrimRangeSet newRanges = F.getEmptySet();

   for (iterator i = begin(), e = end(); i != e; ++i) {
     const llvm::APSInt &from = i->From(), &to = i->To();

@@ static RangeSet applyBitwiseConstraints( @@
   if (Operator == BO_And && (IsUnsigned || RHS >= Zero))
     return Input.Intersect(BV, F, BV.getMinValue(T), RHS);

   return Input;
 }

 RangeSet RangeConstraintManager::getRange(ProgramStateRef State,
                                           SymbolRef Sym) {
-  if (ConstraintRangeTy::data_type *V = State->get<ConstraintRange>(Sym))
-    return *V;
-
-  BasicValueFactory &BV = getBasicVals();
+  ConstraintRangeTy::data_type *V = State->get<ConstraintRange>(Sym);

   // If Sym is a difference of symbols A - B, then maybe we have range set
   // stored for B - A.
-  if (const RangeSet *R = getRangeForMinusSymbol(State, Sym))
+  BasicValueFactory &BV = getBasicVals();
+  const RangeSet *R = getRangeForMinusSymbol(State, Sym);
+
+  // If we have range set stored for both A - B and B - A then calculate the
+  // effective range set by intersecting the range set for A - B and the
+  // negated range set of B - A.
+  if (V && R)
+    return V->Intersect(BV, F, R->Negate(BV, F));
+  if (V)
+    return *V;
+  if (R)
     return R->Negate(BV, F);

   // Lazily generate a new RangeSet representing all possible values for the
   // given symbol type.
   QualType T = Sym->getType();

   RangeSet Result(F, BV.getMinValue(T), BV.getMaxValue(T));

cfe/trunk/test/Analysis/constraint_manager_negate_difference.c

@@ void negate_int_min(int m, int n) { @@
   clang_analyzer_eval(n - m == INT_MIN); // expected-warning{{TRUE}}
 }

 void negate_mixed(int m, int n) {
   if (m -n > INT_MIN && m - n <= 0)
     return;
   clang_analyzer_eval(n - m <= 0); // expected-warning{{TRUE}}
 }
+
+void effective_range(int m, int n) {
+  assert(m - n >= 0);
+  assert(n - m >= 0);
+  clang_analyzer_eval(m - n == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(n - m == 0); // expected-warning{{TRUE}}
+}
+
+void effective_range_2(int m, int n) {
+  assert(m - n <= 0);
+  assert(n - m <= 0);
+  clang_analyzer_eval(m - n == 0); // expected-warning{{TRUE}} expected-warning{{FALSE}}
+  clang_analyzer_eval(n - m == 0); // expected-warning{{TRUE}} expected-warning{{FALSE}}
+}