This is an archive of the discontinued LLVM Phabricator instance.

Funny facts about the Environment:

(a) Environment allows taking SVals of ReturnStmt, which is not an expression, by transparently converting it into its sub-expression. In fact, it only stores expressions.

Having just noticed (a), i also understand that (a) is not of direct importance to the question, however:

(b) With a similar trick, Environment allows taking SVal of literal expressions, but never stores literal expressions. Instead, it reconstructs the constant value by looking at the literal and returns the freshly constructed value when asked.

This leads to "return 0;" and "return 1;" branches having the same program state. The difference should have been within Environment (return values are different), however return values are literals, and they aren't stored in the Environment, and hence the Environment is equal in both states. If only the function returned, say, 0 and i, the problem wouldn't have been there.
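
To make (b) concrete, here is a minimal toy sketch of an environment that never stores literals and rebuilds their values on lookup. Everything in it (ToyExpr, ToyEnvironment, environmentAfterReturn, literalReturnsCollide) is a hypothetical stand-in written for illustration, not the real clang::ento API, and it assumes C++17 for std::optional.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy stand-ins for illustration; these are NOT the real clang::ento classes.
struct ToyExpr {
  std::string Spelling;             // e.g. "0", "1", "i"
  std::optional<int> LiteralValue;  // engaged only for integer literals
};

class ToyEnvironment {
  std::map<std::string, int> Bindings; // holds non-literal expressions only

public:
  // Mimics behaviour (b): binding a literal is a no-op.
  void bind(const ToyExpr &E, int Value) {
    if (!E.LiteralValue)
      Bindings[E.Spelling] = Value;
  }

  // Literals are rebuilt from the expression itself; everything else is
  // looked up. An empty result plays the role of an unknown value.
  std::optional<int> getSVal(const ToyExpr &E) const {
    if (E.LiteralValue)
      return E.LiteralValue;
    auto It = Bindings.find(E.Spelling);
    if (It == Bindings.end())
      return std::nullopt;
    return It->second;
  }

  bool operator==(const ToyEnvironment &Other) const {
    return Bindings == Other.Bindings;
  }
};

// Environment contents after evaluating "return <E>;" with value V.
ToyEnvironment environmentAfterReturn(const ToyExpr &E, int V) {
  ToyEnvironment Env;
  Env.bind(E, V);
  return Env;
}

// True when "return 0;" and "return 1;" leave indistinguishable environments.
bool literalReturnsCollide() {
  ToyExpr Zero{"0", 0}, One{"1", 1};
  ToyEnvironment A = environmentAfterReturn(Zero, 0);
  ToyEnvironment B = environmentAfterReturn(One, 1);
  assert(A.getSVal(Zero) != B.getSVal(One)); // the looked-up values differ
  return A == B;                              // but the stored state does not
}
```

In this model literalReturnsCollide() is true, while returning a literal on one branch and a non-literal such as i on the other yields different environments, which matches the "0 and i" remark above.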

This leaves us with two options:

Tell the Environment to store everything. This would make things heavy. However, i do not completely understand the consequences of this bug at the moment - perhaps there would be more problems due to this state merging unless we start storing everything.

Rely on the ProgramPoint to remember where we are. After all, why do we merge when program points should clearly be different? The program point that fails us is CallExitBegin - we could construct it with ReturnStmt and its block count to discriminate between different returns. It should fix this issue, and is similar to the approach taken in this patch (just puts things to their own place), however, as i mentioned, there might be more problems with misbehavior (b) of the Environment, need to think.
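
The second option can be illustrated with a toy graph in which a node is identified by its (program point, state) pair, so two paths arriving with an equal pair collapse into one node. This is only a sketch under that assumption: ToyProgramPoint, ToyExplodedGraph, ReturnStmtId and distinctExitNodes are hypothetical names, and the state is reduced to a string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Toy model only; the names echo the discussion, not Clang's real classes.
struct ToyProgramPoint {
  std::string Kind;  // e.g. "CallExitBegin"
  int ReturnStmtId;  // hypothetical extra member; -1 when not recorded

  bool operator<(const ToyProgramPoint &O) const {
    return std::tie(Kind, ReturnStmtId) < std::tie(O.Kind, O.ReturnStmtId);
  }
};

class ToyExplodedGraph {
  // Key: (program point, state). Value: node id.
  std::map<std::pair<ToyProgramPoint, std::string>, int> Nodes;

public:
  // Returns the existing node for an already-seen pair, otherwise a new one.
  int getNode(const ToyProgramPoint &P, const std::string &State) {
    auto Key = std::make_pair(P, State);
    auto It = Nodes.find(Key);
    if (It != Nodes.end())
      return It->second; // same point and same state: the paths merge
    int Id = static_cast<int>(Nodes.size());
    Nodes.emplace(Key, Id);
    assert(Nodes.count(Key) == 1);
    return Id;
  }
};

// Number of distinct exit nodes produced by the "return 0;" and "return 1;"
// paths, which arrive with equal states because literals are not stored.
int distinctExitNodes(bool RecordReturnStmt) {
  ToyExplodedGraph G;
  int A = G.getNode({"CallExitBegin", RecordReturnStmt ? 1 : -1}, "same-state");
  int B = G.getNode({"CallExitBegin", RecordReturnStmt ? 2 : -1}, "same-state");
  return A == B ? 1 : 2;
}
```

Without the extra member the two returns share one node (distinctExitNodes(false) is 1); once the return statement is part of the program point they stay apart (distinctExitNodes(true) is 2).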

Finally, i'm not quite sure why CallExitBegin is at all necessary. I wonder if we could just remove it and jump straight to Bind Return Value, also need to think.

adding reviewers.

In D25326#564082, @NoQ wrote:

Funny facts about the Environment:

(a) Environment allows taking SVals of ReturnStmt, which is not an expression, by transparently converting it into its sub-expression. In fact, it only stores expressions.

Having just noticed (a), i also understand that (a) is not of direct importance to the question, however:

(b) With a similar trick, Environment allows taking SVal of literal expressions, but never stores literal expressions. Instead, it reconstructs the constant value by looking at the literal and returns the freshly constructed value when asked.

This leads to "return 0;" and "return 1;" branches having the same program state. The difference should have been within Environment (return values are different), however return values are literals, and they aren't stored in the Environment, and hence the Environment is equal in both states. If only the function returned, say, 0 and i, the problem wouldn't have been there.

I did not know "a" and "b", thanks for info.

This leaves us with two options:

Tell the Environment to store everything. This would make things heavy. However, i do not completely understand the consequences of this bug at the moment - perhaps there would be more problems due to this state merging unless we start storing everything.

Rely on the ProgramPoint to remember where we are. After all, why do we merge when program points should clearly be different? The program point that fails us is CallExitBegin - we could construct it with ReturnStmt and its block count to discriminate between different returns. It should fix this issue, and is similar to the approach taken in this patch (just puts things to their own place), however, as i mentioned, there might be more problems with misbehavior (b) of the Environment, need to think.

Yes, sounds heavy.
I assume you are right. However, as I understand it, the ProgramPoint when CallExitBegin is called is the same (in the exit block). Do you suggest that we take the ProgramPoint for the exit block's predecessor?

Finally, i'm not quite sure why CallExitBegin is at all necessary. I wonder if we could just remove it and jump straight to Bind Return Value, also need to think.

Me too. However, there is much I wonder about when it comes to ExplodedGraph. :-)

In D25326#564185, @danielmarjamaki wrote:

as I understand it the ProgramPoint when CallExitBegin is called is the same (in the exit block). do you suggest that we take the ProgramPoint for the exit block's predecessor?

CallExitBegin is the program point. I propose to make different exits from the same function be different program points. This could be achieved by adding more members to the CallExitBegin class - return statement and block count would probably be sufficient.

In fact, not sure if we need block count. If we're, say, returning from a loop with the same statement, then we're either returning different values, or returning the same literal expression, so Environment would do the job for us well in both cases.

OK. As far as I can see, it's not trivial to know which ReturnStmt there was when CallExitBegin is created. Do you suggest that I move it, or that I try to look up the ReturnStmt? I guess it can be found by looking at the predecessors in the ExplodedGraph?

Finally, i'm not quite sure why CallExitBegin is at all necessary. I wonder if we could just remove it and jump straight to Bind Return Value, also need to think.

.. unless that is a better approach

In D25326#564239, @danielmarjamaki wrote:

ok. As far as I see it's not trivial to know which ReturnStmt there was when CallExitBegin is created.

We're in HandleBlockEdge, just pass down the statement from CFG here?

In D25326#564239, @danielmarjamaki wrote:

.. unless that is a better approach

It seems to have something to do with separation of duties between CoreEngine and ExprEngine. Kind of, CoreEngine explores the CFG, ExprEngine models effects of statements, and noticing end of function is CoreEngine's duty, while binding the return value is ExprEngine's duty, and CallExitBegin acts like a message from CoreEngine to ExprEngine so that they could work together. That's how it seems to me, but i'm not sure of the original intention here.

In D25326#564283, @NoQ wrote:

In D25326#564239, @danielmarjamaki wrote:

ok. As far as I see it's not trivial to know which ReturnStmt there was when CallExitBegin is created.

We're in HandleBlockEdge, just pass down the statement from CFG here?

I don't directly see what you mean. The code is:

void CoreEngine::HandleBlockEdge(const BlockEdge &L, ExplodedNode *Pred) {

  const CFGBlock *Blk = L.getDst();

The Blk->dump() says:

[B0 (EXIT)]
   Preds (2): B1 B2

In D25326#564291, @danielmarjamaki wrote:
The Blk->dump() says:
[B0 (EXIT)]
   Preds (2): B1 B2

What about the source block?

In D25326#564291, @danielmarjamaki wrote:
In D25326#564283, @NoQ wrote:

In D25326#564239, @danielmarjamaki wrote:

ok. As far as I see it's not trivial to know which ReturnStmt there was when CallExitBegin is created.

We're in HandleBlockEdge, just pass down the statement from CFG here?

I don't directly see what you mean. The code is:
void CoreEngine::HandleBlockEdge(const BlockEdge &L, ExplodedNode *Pred) {

  const CFGBlock *Blk = L.getDst();
The Blk->dump() says:
[B0 (EXIT)]
   Preds (2): B1 B2

Sorry... I think I see. L.getSrc() will give me the cfg block I am interested in.
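
What looking at the source block gives us can be sketched with a toy CFG: the edge into the EXIT block remembers which block it came from, and the last statement of that block tells the returns apart. The types and the helper findReturnStmt below are hypothetical and far simpler than the real CFGBlock and BlockEdge interfaces; only the getSrc()/getDst() idea is taken from the discussion.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy CFG types for illustration; not the real Clang classes.
struct ToyStmt {
  bool IsReturn;
  std::string Text;
};

struct ToyCFGBlock {
  std::string Name;
  std::vector<ToyStmt> Stmts;
};

struct ToyBlockEdge {
  const ToyCFGBlock *Src;
  const ToyCFGBlock *Dst;
  const ToyCFGBlock *getSrc() const { return Src; }
  const ToyCFGBlock *getDst() const { return Dst; }
};

// Returns the return statement that ends the edge's source block, or nullptr
// if the source block is empty or ends with some other statement.
const ToyStmt *findReturnStmt(const ToyBlockEdge &L) {
  assert(L.getDst() && "an edge always has a destination block");
  const ToyCFGBlock *Src = L.getSrc();
  if (!Src || Src->Stmts.empty())
    return nullptr;
  const ToyStmt &Last = Src->Stmts.back();
  return Last.IsReturn ? &Last : nullptr;
}
```

For the dump above, the edges B1 -> B0 and B2 -> B0 share the destination B0 (EXIT) but have different sources, so findReturnStmt would hand back a different statement for each of them.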

Refactoring.

Yay, this is great. Thanks for investigating all those loss of coverage issues!

lib/StaticAnalyzer/Core/CoreEngine.cpp
610 ↗	(On Diff #73926)	Space after ",".
639 ↗	(On Diff #73926)	Space after ",".
lib/StaticAnalyzer/Core/ExprEngine.cpp
1792 ↗	(On Diff #73926)	Space after ",".

This revision is now accepted and ready to land. Oct 7 2016, 7:17 AM

Please, fix the style issues before committing.

include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
266 ↗	(On Diff #73926)	Add spaces around '='.

Can you also add a test that tests this more directly (i.e., with clang_analyzer_warnIfReached)? I don't think it is good to have the only test for this core coverage issue to be in tests for an alpha checker. Adding the direct test would also make it easier to track down any regression if it happens. The 'func.c' test file might be a good place for such a test.

a.sidorin added inline comments. Oct 7 2016, 10:13 AM

test/Analysis/unreachable-code-path.c
201	I have a small question. Is it possible to simplify this sample by removing the table[] array, for example by putting something like `i != 0` into the condition? As I understand, the problem is not array-related.

NoQ added inline comments. Oct 7 2016, 10:15 AM

test/Analysis/unreachable-code-path.c
201	Any `UnknownVal` in the condition would trigger this issue.

seurer added a subscriber: seurer. Oct 7 2016, 10:36 AM

Closed by commit rL283554: [analyzer] Don't merge different return nodes in ExplodedGraph (authored by danielmarjamaki). Oct 7 2016, 10:45 AM

This revision was automatically updated to reflect the committed changes.

danielmarjamaki added inline comments. Oct 10 2016, 12:44 AM

test/Analysis/unreachable-code-path.c
201	"Is it possible to simplify this sample with removing of table[] array?" I tried to simplify as much as possible, but as NoQ says, an UnknownVal is required here for this test.

In D25326#564584, @zaks.anna wrote:

Please, fix the style issues before committing.

Sorry I missed that.

Ideally it would be possible to run clang-format on the files before committing, but currently I get lots of unrelated changes then.

Would it be ok to run clang-format on some files to clean up the formatting? At least some minor changes like removing trailing spaces.

In D25326#565919, @danielmarjamaki wrote:

Ideally it would be possible to run clang-format on the files before committing, but currently I get lots of unrelated changes then.

Would it be ok to run clang-format on some files to clean up the formatting? At least some minor changes like removing trailing spaces.

clang-format should be able to format small sections of the code, keeping in mind the rest of the file but not touching it. In vim, i use a hotkey for formatting the statement under cursor. I've seen a plugin for Xcode that clang-format's selected text. Also, there's this git clang-format thingy, which should be able to clang-format git changes and only them, i didn't try, but it might be universally useful.

dcoughlin added inline comments. Oct 10 2016, 8:23 AM

test/Analysis/unreachable-code-path.c
201	This is worth a comment in the test then, so that if we ever add symbolic reasoning we'll know how to adjust the test. You could also try to add a canary with clang_analyzer_eval after the if statement to force the test to fail if we do add this symbolic reasoning.

In D25326#565919, @danielmarjamaki wrote:

In D25326#564584, @zaks.anna wrote:

Please, fix the style issues before committing.

Would it be ok to run clang-format on some files to clean up the formatting? At least some minor changes like removing trailing spaces.

We generally try to avoid large-scale fixes of formatting and spacing issues if they are not on lines already modified by a patch. This makes it a bit easier to track down the original commit for a line and also reduces merge conflicts.

I agree with your comments, dcoughlin, but I am not sure how to do it.

Can you also add a test that tests this more directly (i.e., with clang_analyzer_warnIfReached)? I don't think it is good to have the only test for this core coverage issue to be in tests for an alpha checker. Adding the direct test would also make it easier to track down any regression if it happens. The 'func.c' test file might be a good place for such a test.

I totally agree.

In func.c there are comments like this:
// expected-warning{{FALSE|TRUE|UNKNOWN}}

What do those FALSE|TRUE|UNKNOWN do?

I don't see what this will do:

clang_analyzer_eval(!f);

I want both returns to be reached, and I want to ensure that the result from the function is both 1 and 0.

You could also try to add a canary with clang_analyzer_eval after the if statement to force the test to fail if we do add this symbolic reasoning.

Sounds good. Sorry, but I don't see how to do it.

In D25326#569061, @danielmarjamaki wrote:

You could also try to add a canary with clang_analyzer_eval after the if statement to force the test to fail if we do add this symbolic reasoning.

Sounds good. Sorry, but I don't see how to do it.

The trick is to not first store the UnknownVal into a local (the analyzer will automatically create a symbol for that when read).

Instead you can do something like:

if (table[i] != 0) { }
clang_analyzer_eval((table[i] != 0)); // expected-warning {{UNKNOWN}}

This way if the analyzer ever starts generating symbols for a read from table[i] then the case split will change the eval to emit TRUE on one path and FALSE on the other.

dcoughlin mentioned this in D25728: Test ExprEngine handling of unknown values. Mar 6 2017, 10:21 AM

NoQ mentioned this in D58392: [analyzer] MIGChecker: Fix false negatives for releases in automatic destructors. Feb 19 2019, 10:47 AM

Revision Contents

Path	Size
lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp	5 lines
test/Analysis/inlining/InlineObjCClassMethod.m	4 lines
test/Analysis/unreachable-code-path.c	11 lines

Diff 73796

lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp

[17 lines not shown]
 #include "clang/AST/ParentMap.h"
 #include "clang/Analysis/Analyses/LiveVariables.h"
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/Support/SaveAndRestore.h"

+REGISTER_TRAIT_WITH_PROGRAMSTATE(LastStmt, const void *)
+
 using namespace clang;
 using namespace ento;

 #define DEBUG_TYPE "ExprEngine"

 STATISTIC(NumOfDynamicDispatchPathSplits,
   "The # of times we split the path due to imprecise dynamic dispatch info");

[947 lines not shown; context: void ExprEngine::VisitReturnStmt(const ReturnStmt *RS, ExplodedNode *Pred,]
   ExplodedNodeSet dstPreVisit;
   getCheckerManager().runCheckersForPreStmt(dstPreVisit, Pred, RS, *this);

   StmtNodeBuilder B(dstPreVisit, Dst, *currBldrCtx);

   if (RS->getRetValue()) {
     for (ExplodedNodeSet::iterator it = dstPreVisit.begin(),
                                   ei = dstPreVisit.end(); it != ei; ++it) {
-      B.generateNode(RS, *it, (*it)->getState());
+      ProgramStateRef State = (*it)->getState();
+      B.generateNode(RS, *it, State->set<LastStmt>((const void *)RS));
     }
   }
 }

test/Analysis/inlining/InlineObjCClassMethod.m

[168 lines not shown; context: else]
     return 1;
 }
 @end
 @interface MyClassSelf : MyParentSelf
 @end
 @implementation MyClassSelf
 + (int)testClassMethodByKnownVarDecl {
   int y = [MyParentSelf testSelf];
-  return 5/y; // Should warn here.
+  return 5/y; // expected-warning{{Division by zero}}
 }
 @end
 int foo2() {
   int y = [MyParentSelf testSelf];
-  return 5/y; // Should warn here.
+  return 5/y; // expected-warning{{Division by zero}}
 }

 // TODO: We do not inline 'getNum' in the following case, where the value of
 // 'self' in call '[self getNum]' is available and evaualtes to
 // 'SelfUsedInParentChild' if it's called from fooA.
 // Self region should get created before we call foo and yje call to super
 // should keep it live.
 @interface SelfUsedInParent : NSObject
[46 lines not shown]

test/Analysis/unreachable-code-path.c

[188 lines not shown; context: void test12(int x) {]
   switch (x) {
   case 1:
     break; // not unreachable
   case 2:
     do { } while (0);
     break;
   }
 }
+
+extern int table[];
+static int inlineFunction(const int i) {
+  if (table[i] != 0) // <- SVal for table[0] is unknown
+    return 1;
+  return 0;
+}
+void test13(int i) {
+  int x = inlineFunction(i);
+  x && x < 10;
+}

[StaticAnalyser] Don't merge different returns in ExplodedGraph (Closed, Public)
