This is an archive of the discontinued LLVM Phabricator instance.

Differential D22862

[analyzer] Fix for PR15623: eliminate unwanted ProgramState checker data propagation.
ClosedPublic

Authored by ayartsev on Jul 27 2016, 6:43 AM.

Details

Reviewers

dcoughlin
zaks.anna

Summary

The attached patch eliminates unneeded checker data propagation from one of the operands of a logical operation to the operation result. The result of a logical operation is calculated from the logical values of its operands and is independent of the operands' nature.

One of the tests changed its result (misc-ps-region-store.m, rdar_7275774). I did not manage to understand the test; something is definitely wrong with it. At the very least, the comment inside the test does not correspond to reality, and the old test result seems to be wrong.

The patch fixes https://llvm.org/bugs/show_bug.cgi?id=15623.

Please review!

Event Timeline

ayartsev updated this revision to Diff 65716.Jul 27 2016, 6:43 AM

ayartsev retitled this revision from to [analyzer] Fix for PR15623: eliminate unwanted ProgramState checker data propagation..

ayartsev updated this object.

ayartsev added reviewers: zaks.anna, krememek.

ayartsev added a subscriber: cfe-commits.

NoQ added a subscriber: NoQ.Jul 27 2016, 11:05 AM

I am not sure it's the right way of fixing this problem and it introduces a regression. The bug is probably specific to "&&".

+ Devin as we might have seen something similar.

test/Analysis/misc-ps-region-store.m
332 ↗	(On Diff #65716)	Try substituting 'p' with null and you will see that n must be zero in that case because, otherwise, we would take the early return branch. Since p is not null, we should not warn here. This is a regression.

zaks.anna edited reviewers, added: dcoughlin; removed: krememek.Jul 27 2016, 10:13 PM

ayartsev added inline comments.Jul 28 2016, 12:11 AM

test/Analysis/misc-ps-region-store.m
332 ↗	(On Diff #65716)	If we reached the line "unsigned short p = (unsigned short) data;" then "data" is definitely null and "n" is definitely >0, otherwise we would take the early return branch. Then "p" is definitely null and "q" is either equal to "p" (if n == 1) or greater than "p". In case of n > 1 we definitely have a null dereference. Please tell me what I'm missing.

NoQ added inline comments.Jul 28 2016, 1:49 AM

test/Analysis/misc-ps-region-store.m
332 ↗	(On Diff #65716)	Re '"data" is definitely null and "n" is definitely >0': it is the other way around, "data" is definitely non-null or "n" is definitely =0. We return on 'not-or', which means we continue on plain 'or'. I also agree that the easiest way to understand that is to substitute `data` with null and see what happens.
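The argument in this inline thread can be checked mechanically. The following is only a sketch under assumptions: the body of rdar_7275774 is not quoted on this page, so the guard `!(data || n == 0)`, the offset `n / 2`, and the `p < q` check are inferred from the comments above, and both function names are hypothetical. Addresses are modeled as plain integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the guard: the test returns early on
// !(data || n == 0), so execution continues only if (data || n == 0) holds.
bool continues_past_guard(bool data_is_null, unsigned n) {
    bool data_nonnull = !data_is_null;
    return data_nonnull || n == 0;
}

// Assumed shape of the rest of the test: p = data, q = p + n / 2, and the
// dereference is guarded by `if (p < q)`.
// Returns true if a null 'p' could reach the guarded dereference.
bool null_p_reaches_dereference(unsigned n) {
    if (!continues_past_guard(/*data_is_null=*/true, n))
        return false;              // early return branch taken
    assert(n == 0);                // null data past the guard forces n == 0
    std::uintptr_t p = 0;          // data is null, so p is null
    std::uintptr_t q = p + n / 2;  // offset counted in elements
    return p < q;                  // false: q == p when n == 0
}
```

Under these assumptions, `null_p_reaches_dereference(n)` is false for every `n`, which matches the claim that the dereference is only reachable with a non-null `p`.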

xazax.hun added a subscriber: xazax.hun.Jul 29 2016, 3:23 AM

zaks.anna requested changes to this revision.Jul 29 2016, 9:52 AM

zaks.anna edited edge metadata.

This revision now requires changes to proceed.Jul 29 2016, 9:52 AM

As PR15623 shows, returning the existing cast is not correct. But rather than replace it with an unknown, here is a proposal for how to address this without regressing in precision.

Instead of using assumeDual() in ExprEngine::VisitLogicalExpr() on the RHSVal and returning 0, 1, or Unknown() depending on whether the true and false states are empty, we can return a symbolic expression representing "RHSVal != 0":

nonloc::ConcreteInt Zero(getBasicVals().getValue(0, B->getType()));
X = evalBinOp(N->getState(), BO_NE, getSValBuilder().evalCast(RHSVal, B->getType(), RHS->getType()), Zero, B->getType());

Now, evalBinOp() is ultimately doing its own assumeDual() under the hood, so at a high level this should do the same thing as we had before for the 0 and 1 cases -- and it will return a new symbol rather than the casted RHSVal (which is what is causing the false positive leaks) when the assumption is unknown.
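The difference between the two behaviors can be pictured with a toy model. This is a sketch, not analyzer code: values are plain strings, and `Feasible`, `old_logical_result`, and `new_logical_result` are hypothetical names introduced only for illustration.

```cpp
#include <string>

// Which outcomes are feasible when the right-hand operand is assumed
// to be true and to be false.
enum class Feasible { OnlyTrue, OnlyFalse, Both };

// Behavior before the change, as described above: when both outcomes are
// feasible, the result is the (casted) operand itself, so it still names
// the operand and any checker data attached to it.
std::string old_logical_result(Feasible f, const std::string &rhs) {
    switch (f) {
    case Feasible::OnlyTrue:  return "1";
    case Feasible::OnlyFalse: return "0";
    case Feasible::Both:      return rhs;
    }
    return rhs;
}

// Proposed behavior: the unknown case becomes a new expression "rhs != 0"
// instead of the operand itself.
std::string new_logical_result(Feasible f, const std::string &rhs) {
    switch (f) {
    case Feasible::OnlyTrue:  return "1";
    case Feasible::OnlyFalse: return "0";
    case Feasible::Both:      return "(" + rhs + ") != 0";
    }
    return rhs;
}
```

In this model only the `Both` case differs: the old result is the operand, the new one is a comparison built on top of it.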

Unfortunately, this can easily create symbolic expressions of the form "((reg_$1<n>) == 0U) != 0", which the analyzer currently doesn't handle very well. Right now, when assumed, that symbolic expression gets translated into a range constraint saying the symbol ((reg_$1<n>) == 0U) is in [1, UINT_MAX] rather than a range constraint saying that reg_$1<n> is in [0, 0], as we might expect. This means that we lose precision in cases like:

if ((n == 0) != 0) {
  clang_analyzer_eval(n == 0); // This should be TRUE but we really currently get UNKNOWN
}

So my proposal is to teach SimpleConstraintManager::assumeSymRel() to simplify an assume of a constraint of the form: "(exp comparison_op expr) != 0" to true into an assume of "exp comparison_op expr" to true. (And similarly, an assume of the form "(exp comparison_op expr) == 0" to true as an assume of exp comparison_op expr to false.) This should be quite a small change.

This approach would improve precision overall, simplify ExprEngine::VisitLogicalExpr(), and avoid the leak false positive from PR15623 without causing the regression in misc-ps-region-store.m that Anna is worried about.

Does this seem reasonable?
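The proposed rewrite can be sketched outside the analyzer as a small standalone function. Everything here is a toy under assumptions: `Op`, `Cmp`, `Assume`, and `simplify` are hypothetical names, and the real change would operate on SymExpr objects inside SimpleConstraintManager::assumeSymRel() rather than on these structs.

```cpp
#include <string>

enum class Op { EQ, NE, LT, GT, LE, GE, Add };

bool isComparison(Op op) { return op != Op::Add; }

// Toy expression "sym op constant", e.g. {"n", Op::EQ, 0} for "n == 0".
struct Cmp {
    std::string sym;
    Op op;
    long rhs;
};

// Result of the rewrite: if 'applies' is set, assume 'inner' to be
// true (truth == true) or false (truth == false).
struct Assume {
    bool applies;
    Cmp inner;
    bool truth;
};

// Models assuming "(inner) outer outerRhs" to be true.
// "(a cmp b) != 0" becomes "assume (a cmp b) is true";
// "(a cmp b) == 0" becomes "assume (a cmp b) is false".
Assume simplify(const Cmp &inner, Op outer, long outerRhs) {
    bool outerIsZeroTest =
        outerRhs == 0 && (outer == Op::EQ || outer == Op::NE);
    if (!outerIsZeroTest || !isComparison(inner.op))
        return Assume{false, inner, false};  // leave the constraint as is
    return Assume{true, inner, outer == Op::NE};
}
```

For example, `simplify(Cmp{"n", Op::EQ, 0}, Op::NE, 0)` yields an assumption that "n == 0" is true, so the constraint lands on `n` itself rather than on the compound expression.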

Hmm. The test in unwanted-programstate-data-propagation.c passes on my machine even without the patch, and the return value on the respective path is correctly represented as (conj_$6{void *}) != 0U, which comes from the evalCast() call in VisitLogicalExpr() and is the default behavior of evalCast() for Loc to pointer casts. There seems to be something amiss.

@zaks.anna, sorry for the noise about the "misc-ps-region-store.m" test, my mistake.

In D22862#508674, @NoQ wrote:

Hmm. The test in unwanted-programstate-data-propagation.c passes on my machine even without the patch, and the return value on the respective path is correctly represented as (conj_$6{void *}) != 0U, which comes from the evalCast() call in VisitLogicalExpr() and is the default behavior of evalCast() for Loc to pointer casts. There seems to be something amiss.

Hm, updated to trunk, now the test passes without the patch. Changing "_Bool" to "int" in the test reproduces the issue.

In D22862#501315, @dcoughlin wrote:

Does this seem reasonable?

Thanks for the idea, working on the solution.

@dcoughlin, @NoQ, could you, please, tell, how you get dumps of symbolic expressions and constraints like "(conj_$6{void *}) != 0U"? Tried different debug.* checkers and clang_analyzer_explain() but failed.

@dcoughlin, @NoQ, could you, please, tell, how you get dumps of symbolic expressions and constraints like "(conj_$6{void *}) != 0U"? Tried different debug.* checkers and clang_analyzer_explain() but failed.

That's SVal.dump(), SymbolRef->dump(), MemRegion.dump() (you can also push those directly into llvm::errs()), ProgramStateRef->dump(), and ultimately ExprEngine.ViewExplodedGraph() (the last one can be activated at run time with the debug.ViewExplodedGraph checker or with -analyzer-viz-egraph-graphviz, which was the one I used; see also http://clang-analyzer.llvm.org/checker_dev_manual.html#commands). In fact, clang_analyzer_explain() was an attempt to make these dumps a bit more understandable.

Hm, updated to trunk, now the test passes without the patch. Changing "_Bool" to "int" in the test reproduces the issue.

Aha, I see! The cast to int gets represented as a nonloc::LocAsInteger value:

&element{SymRegion{conj_$6{void *}},0 S64b,char} [as 32 bit integer]

which is, of course, incorrect, and Devin's fix makes perfect sense here in my opinion :)

The updated patch implements Devin's solution. Please review.

Overall, this looks good to me. Thanks for tackling this!

One request, though. Could you move the tests into already existing test files? We're trying to avoid test files that only test a single issue. This makes it easier for people who are new to the project to determine where a new test should go. I think test1 could go in perhaps malloc.c and test2 could go in misc-ps.c. It would also be good to have both test functions mention the PR so a future maintainer can tell they are related even though they are in different files.

ddcc mentioned this in D26061: [analyzer] Refactor and simplify SimpleConstraintManager.Feb 18 2017, 8:00 PM

This has been committed in r290505.

This revision is now accepted and ready to land.Mar 8 2017, 1:40 PM

zaks.anna closed this revision.Mar 8 2017, 1:40 PM

Revision Contents

Path	Size
lib/StaticAnalyzer/Core/ExprEngineC.cpp	24 lines
lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp	16 lines
test/Analysis/unwanted-programstate-data-propagation.c	26 lines

Diff 78810

lib/StaticAnalyzer/Core/ExprEngineC.cpp

     else {
       assert(!SrcBlock->empty());
       CFGStmt Elem = SrcBlock->rbegin()->castAs<CFGStmt>();
       const Expr *RHS = cast<Expr>(Elem.getStmt());
       SVal RHSVal = N->getState()->getSVal(RHS, Pred->getLocationContext());

       if (RHSVal.isUndef()) {
         X = RHSVal;
       } else {
-        DefinedOrUnknownSVal DefinedRHS = RHSVal.castAs<DefinedOrUnknownSVal>();
-        ProgramStateRef StTrue, StFalse;
-        std::tie(StTrue, StFalse) = N->getState()->assume(DefinedRHS);
-        if (StTrue) {
-          if (StFalse) {
-            // We can't constrain the value to 0 or 1.
-            // The best we can do is a cast.
-            X = getSValBuilder().evalCast(RHSVal, B->getType(), RHS->getType());
-          } else {
-            // The value is known to be true.
-            X = getSValBuilder().makeIntVal(1, B->getType());
-          }
-        } else {
-          // The value is known to be false.
-          assert(StFalse && "Infeasible path!");
-          X = getSValBuilder().makeIntVal(0, B->getType());
-        }
+        // We evaluate "RHSVal != 0" expression which result in 0 if the value is
+        // known to be false, 1 if the value is known to be true and a new symbol
+        // when the assumption is unknown.
+        nonloc::ConcreteInt Zero(getBasicVals().getValue(0, B->getType()));
+        X = evalBinOp(N->getState(), BO_NE,
+                      svalBuilder.evalCast(RHSVal, B->getType(), RHS->getType()),
+                      Zero, B->getType());
       }
     }
     Bldr.generateNode(B, Pred, state->BindExpr(B, Pred->getLocationContext(), X));
   }

 void ExprEngine::VisitInitListExpr(const InitListExpr *IE,
                                    ExplodedNode *Pred,
                                    ExplodedNodeSet &Dst) {

lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp

 ProgramStateRef SimpleConstraintManager::assumeSymRel(ProgramStateRef state,
                                                       const SymExpr *LHS,
                                                       BinaryOperator::Opcode op,
                                                       const llvm::APSInt& Int) {
   assert(BinaryOperator::isComparisonOp(op) &&
          "Non-comparison ops should be rewritten as comparisons to zero.");

+  SymbolRef Sym = LHS;
+
+  // Simplification: translate an assume of a constraint of the form
+  // "(exp comparison_op expr) != 0" to true into an assume of
+  // "exp comparison_op expr" to true. (And similarly, an assume of the form
+  // "(exp comparison_op expr) == 0" to true into an assume of
+  // "exp comparison_op expr" to false.)
+  if (Int == 0 && (op == BO_EQ || op == BO_NE)) {
+    if (const BinarySymExpr *SE = dyn_cast<BinarySymExpr>(Sym)) {
+      BinaryOperator::Opcode Op = SE->getOpcode();
+      if (BinaryOperator::isComparisonOp(Op))
+        return assume(state, nonloc::SymbolVal(Sym), (op == BO_NE ? true : false));
+    }
+  }
+
   // Get the type used for calculating wraparound.
   BasicValueFactory &BVF = getBasicVals();
   APSIntType WraparoundType = BVF.getAPSIntType(LHS->getType());

   // We only handle simple comparisons of the form "$sym == constant"
   // or "($sym+constant1) == constant2".
   // The adjustment is "constant1" in the above expression. It's used to
   // "slide" the solution range around for modular arithmetic. For example,
   // x < 4 has the solution [0, 3]. x+2 < 4 has the solution [0-2, 3-2], which
   // in modular arithmetic is [0, 1] U [UINT_MAX-1, UINT_MAX]. It's up to
   // the subclasses of SimpleConstraintManager to handle the adjustment.
-  SymbolRef Sym = LHS;
   llvm::APSInt Adjustment = WraparoundType.getZeroValue();
   computeAdjustment(Sym, Adjustment);

   // Convert the right-hand side integer as necessary.
   APSIntType ComparisonType = std::max(WraparoundType, APSIntType(Int));
   llvm::APSInt ConvertedInt = ComparisonType.convert(Int);

   // Prefer unsigned comparisons.

test/Analysis/unwanted-programstate-data-propagation.c

// RUN: %clang_cc1 -analyze -analyzer-checker=core,debug.ExprInspection,unix.Malloc -verify %s

// test for PR15623
#include "Inputs/system-header-simulator.h"

void clang_analyzer_eval(int);

typedef __typeof(sizeof(int)) size_t;
void *malloc(size_t);
void free(void *);

int test1(void) {
  char *param = malloc(10);
  char *value = malloc(10);
  int ok = (param && value);
  free(param);
  free(value);
  // Previously we ended up with 'Use of memory after it is freed' on return.
  return ok; // no warning
}

void test2(int n) {
  if ((n == 0) != 0) {
    clang_analyzer_eval(n == 0); // expected-warning{{TRUE}}
  }
}