This is an archive of the discontinued LLVM Phabricator instance.


Differential D69662

[Checkers] Avoid using evalCall in StreamChecker.
Abandoned · Public

Authored by balazske on Oct 31 2019, 8:08 AM.


Details

Reviewers

NoQ
Charusso
Szelethus

Summary

Use of evalCall is replaced by preCall and postCall.

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 40341
Build 40447: arc lint + arc unit

Event Timeline

balazske created this revision. Oct 31 2019, 8:08 AM

Herald added a project: Restricted Project. Oct 31 2019, 8:08 AM

Herald added subscribers: cfe-commits, gamesh411, Szelethus, dkrupp.

Harbormaster completed remote builds in B40341: Diff 227280. Oct 31 2019, 8:09 AM

balazske added reviewers: NoQ, Charusso, Szelethus. Oct 31 2019, 8:09 AM

But why?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
161	You're not allowed to do this in `checkPostCall` because other post-call checkers may have already read the return value.

I wanted to remove eval::Call because only one checker can do this; otherwise it is undefined behavior (according to the not very new "Analyzer Guide"). If it is really needed in this checker, it will remain.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
161	Is it possible to do this in `check::PreCall`? The value of the call expression is not used, only a state split is done.

On multiple evaluation of the same call the Analyzer will warn and prevent you from doing so.

In D69662#1731974, @balazske wrote:

I wanted to remove eval::Call because only one checker can do this; otherwise it is undefined behavior (according to the not very new "Analyzer Guide"). If it is really needed in this checker, it will remain.

Let's get the direct quote from the book:

Usage of this callback is discouraged because only one checker may evaluate any call event; if two or more checkers, probably developed by different people, accidentally evaluate the same function, behavior of the analyzer is undefined. So, if possible, check::PreCall and check::PostCall should be considered, and most of the time they are flexible enough to model effects of the call on the program state.

I think this statement just doesn't reflect our current stance on the issue, mostly because

In D69662#1731987, @Charusso wrote:

On multiple evaluation of the same call the Analyzer will warn and prevent you from doing so.

I think modeling core library functions is more fitting to be handled by eval::Calls. Do I understand correctly that a function modeled by eval::Call won't be inlined?

From the same book:

There are multiple preconditions required for inlining to happen, including:
— Source code of the callee function body needs to be available;
— No checker should evaluate the function call via eval::Call;

Abandon this change?

Yes, indeed, evalCall can only be performed by one checker. But if any, it is this checker that's fully responsible for stream functions. So i recommend doing evalCall in this particular checker and falling back to PreCall/PostCall in other checkers that model other aspects of streams.

Yes, indeed, evalCall prevents inlining from happening. This is fine, however, as long as you manually model all the effects of the call on the Environment (namely, set the return value) and on the Store (invalidate regions that may be touched by the function - in most cases it's none).
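The contract described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyState`, `toyEvalCall`, `toyConservativeEval` and `toyModelCall` are hypothetical names, not Clang APIs, and the "state" is reduced to a return-value binding and a single invalidation flag. It shows that a checker which takes over a call has to produce the return value itself and decide about invalidation, while an unclaimed call falls back to a conservative default.

```cpp
#include <cassert>
#include <string>

// Toy model only; none of these names are Clang APIs.
struct ToyState {
  bool HasReturnValue = false;  // stands in for the Environment binding
  std::string ReturnValue;      // name of the symbol bound to the call
  bool HeapInvalidated = false; // stands in for Store invalidation
};

// A checker-style evaluator: returns true when it has modeled the whole call.
inline bool toyEvalCall(const std::string &Callee, ToyState &State) {
  if (Callee != "fopen")
    return false; // not ours; leave the state untouched
  // Taking over the call means binding the return value ourselves.
  State.HasReturnValue = true;
  State.ReturnValue = "conj_fopen";
  // This toy assumes the function touches no regions, so nothing is
  // invalidated.
  return true;
}

// Fallback used when no checker evaluates the call: be conservative.
inline void toyConservativeEval(const std::string &Callee, ToyState &State) {
  State.HasReturnValue = true;
  State.ReturnValue = "conj_" + Callee;
  State.HeapInvalidated = true;
}

// Models one call: the checker gets the first chance, then the fallback.
inline ToyState toyModelCall(const std::string &Callee) {
  ToyState State;
  if (!toyEvalCall(Callee, State))
    toyConservativeEval(Callee, State);
  return State;
}
```

In this sketch, `toyModelCall("fopen")` keeps the heap intact because the evaluator knows what the function does, whereas any other callee pays for the conservative invalidation.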

Generally, inlining shouldn't be thought of as an ideal solution because it has a downside of being extremely expensive. Instead, it should be thought of as a poor man's solution when it comes to any sort of API functions that have well-documented behavior. In particular, evalCall is more precise when it comes to state splits (which is why StdCLibraryFunctionsChecker uses evalCall, see D20811 for details) (also "inlined defensive checks").

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
161	Nope, because `ExprEngine` will overwrite the value when modeling the function call, regardless of whether it will be inlined or not. The only valid reason to use `BindExpr` in a checker is to bind the return value in `evalCall`. Generally, values in the Environment are not meant to be overwritten. Any active expression can only have one value. The only way to obtain an environment with a different value for the same expression is to wait until the expression dies and reach the same expression later in the same analysis. In this case the value is first removed from the Environment during dead symbols collection, and then added back later. For quite some time i wanted to prevent these mistakes by adding an assertion "Environment values are never mutated". Unfortunately, there are multiple existing violations of this rule, which are bugs in `ExprEngine`, and i didn't have time to fix them all. I highly recommend this little project as an exercise :)
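The "Environment values are never mutated" rule can be sketched with a toy container. `ToyEnvironment` is a hypothetical stand-in, not Clang's `Environment`; expressions are plain strings and values are integers. The sketch rejects rebinding a live expression to a different value and accepts a new binding only after the old one has been removed, which mirrors the dead-symbols scenario described above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only; ToyEnvironment is hypothetical, not Clang's Environment.
class ToyEnvironment {
  std::map<std::string, int> Values; // expression -> value

public:
  // Binds a value to an expression. Rebinding a live expression to a
  // different value is rejected; the existing value stays in place.
  bool bind(const std::string &Expr, int Value) {
    auto Result = Values.insert({Expr, Value});
    return Result.second || Result.first->second == Value;
  }

  // Models dead-symbol collection: the expression is no longer live.
  void removeDead(const std::string &Expr) { Values.erase(Expr); }

  // Returns the bound value, or nullptr if the expression has none.
  const int *lookup(const std::string &Expr) const {
    auto It = Values.find(Expr);
    return It == Values.end() ? nullptr : &It->second;
  }
};
```

A second `bind` for the same expression fails while the first value is alive and succeeds once `removeDead` has run.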

balazske added a comment. Nov 5 2019, 1:28 AM

This comment was removed by balazske.

Given that i roughly remember what the previous comment was about and i wanted to comment on this: you should totally move evalCall for any functions you need from StdCLibraryFunctionsChecker to this checker, given that your checker is more specialized.

That said, you shouldn't do that until your checker is out of alpha, because that'd disable modeling for users who don't mess with alpha checkers. Working around that would be a moderately interesting problem. Of course you can always disable StdCLibraryFunctionsChecker when testing your checker. We could also come up with an inter-checker API function that tells StdCLibraryFunctionsChecker to disable certain parts of itself and call this function upon StreamChecker's registration. We could also go for a more principled solution by introducing a global CallDescriptionMap<const Checker *> that coordinates who evalCalls what (instead of polling all checkers in a loop on every call and crashing whenever more than one checker responds).
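The idea of a global table of evalCall owners might look roughly like the toy below. Everything here is an assumption made for illustration: `ToyEvalRegistry` is hypothetical, functions and checkers are identified by plain strings, and no such registry exists in the analyzer as far as this thread says. The point is only that a conflict is detected once, at claim time, instead of on every call.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only; ToyEvalRegistry is hypothetical and merely illustrates a
// table that records which checker evaluates which function.
class ToyEvalRegistry {
  std::map<std::string, std::string> Owner; // function name -> checker name

public:
  // Claims Function for Checker; returns false if another checker owns it.
  bool claim(const std::string &Function, const std::string &Checker) {
    auto Result = Owner.insert({Function, Checker});
    return Result.second || Result.first->second == Checker;
  }

  // Returns the owning checker, or an empty string if nobody claimed it.
  std::string ownerOf(const std::string &Function) const {
    auto It = Owner.find(Function);
    return It == Owner.end() ? std::string() : It->second;
  }
};
```

With such a table, dispatch becomes a single lookup per call, and a second claim for `fopen` fails up front rather than crashing during analysis.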

I removed the previous comment because I realized that StdCLibraryFunctionsChecker does not use evalCall for fread (returns false because "non-pure" evaluation).

Anyway the checks that do not use BindExpr (all except the open functions) could be moved into a PreCall or PostCall callback?

In D69662#1736601, @balazske wrote:

Anyway the checks that do not use BindExpr (all except the open functions) could be moved into a PreCall or PostCall callback?

Moving from evalCall to PreCall/PostCall has the additional effect of not giving you control over invalidation of the heap (unless you do evalCall in a checker, it ends up being the normal behavior of conservativeEvalCall() most of the time). For that reason ideally every library function should be evalCall'ed by a checker.

Also if you're making updates to the program state that other checkers should see immediately (say, writing out-parameter values into the Store or updating a state trait that other checkers will read in the same callback), you should either use evalCall for that, or make sure your dependencies are set up correctly (@Szelethus, our callback invocation order is now affected by checker dependencies, right?).

checkArgNullStream() should definitely be at PreCall.

evalFseek() doesn't have a BindExpr but it should have it; looks like a bug. If you're evalCall-ing a non-void function you must bind a return value (we should add an assertion for this; there's never a reason to bind an UnknownVal in evalCall because there generally never is a good reason to bind UnknownVal to anything because it shouldn't have been present in our SVal hierarchy in the first place because conjuring a value is always strictly better).
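One way to picture why a conjured value is preferable to an unknown one is the toy below. The types are hypothetical and far simpler than the analyzer's SVal hierarchy: a value is either a fresh symbol with an identity or an unknown with none. Only the symbol can carry an assumption forward, which is the property a checker relies on when it later asks about the return value.

```cpp
#include <cassert>
#include <map>

// Toy model only; these types are hypothetical, not Clang's SVal hierarchy.
struct ToyVal {
  bool IsSymbol; // false models an unknown value
  unsigned Id;   // meaningful only when IsSymbol is true
};

class ToyValueFactory {
  unsigned NextId = 0;

public:
  // Models conjuring: every request yields a fresh, distinct symbol.
  ToyVal conjure() { return ToyVal{true, NextId++}; }

  // Models an unknown value: it has no identity to attach facts to.
  static ToyVal unknown() { return ToyVal{false, 0}; }
};

class ToyConstraints {
  std::map<unsigned, bool> NonNull; // symbol id -> assumed non-null?

public:
  // Records the assumption if the value has an identity; returns whether
  // anything was recorded.
  bool assumeNonNull(ToyVal V, bool IsNonNull) {
    if (!V.IsSymbol)
      return false;
    NonNull[V.Id] = IsNonNull;
    return true;
  }

  // True only for a symbol that was assumed to be non-null.
  bool isKnownNonNull(ToyVal V) const {
    if (!V.IsSymbol)
      return false;
    auto It = NonNull.find(V.Id);
    return It != NonNull.end() && It->second;
  }
};
```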

In D69662#1744479, @NoQ wrote:

In D69662#1736601, @balazske wrote:

Anyway the checks that do not use BindExpr (all except the open functions) could be moved into a PreCall or PostCall callback?

Moving from evalCall to PreCall/PostCall has the additional effect of not giving you control over invalidation of the heap (unless you do evalCall in a checker, it ends up being the normal behavior of conservativeEvalCall() most of the time). For that reason ideally every library function should be evalCall'ed by a checker.

Also if you're making updates to the program state that other checkers should see immediately (say, writing out-parameter values into the Store or updating a state trait that other checkers will read in the same callback), you should either use evalCall for that, or make sure your dependencies are set up correctly (@Szelethus, our callback invocation order is now affected by checker dependencies, right?).

Sorry for the slack :)

One should never count on the invocation order of callback functions between checkers. In fact, I'm not too sure that my patches affect this, but I suspect that they do, as the container of choice for checker objects is std::vector.

checkArgNullStream() should definitely be at PreCall.

evalFseek() doesn't have a BindExpr but it should have it; looks like a bug. If you're evalCall-ing a non-void function you must bind a return value (we should add an assertion for this; there's never a reason to bind an UnknownVal in evalCall because there generally never is a good reason to bind UnknownVal to anything because it shouldn't have been present in our SVal hierarchy in the first place because conjuring a value is always strictly better).

Herald added a subscriber: martong. Feb 24 2020, 8:59 AM

In D69662#1889545, @Szelethus wrote:

Sorry for the slack :)

One should never count on the invocation order of callback functions between checkers. In fact, I'm not too sure that my patches affect this, but I suspect that they do, as the container of choice for checker objects is std::vector.

With checker dependencies introduced, i think it's not an unreasonable guarantee to make. Like, if you rely on your dependency to model things for you, you probably want to have your callbacks called after everything is set up by the dependency.

That said, it's not always obvious what "after" means. I wouldn't be shocked if it turns out that the correct order is different in pre-stmt and post-stmt (i.e., dependent - dependency - actual event - dependency - dependent).
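The mirrored order speculated about above can be written down as a toy trace. This is a sketch of the hypothesis only, not of how the analyzer actually schedules callbacks: `toyCallbackTrace` is a hypothetical helper, and its input is assumed to list dependencies before the checkers that depend on them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only. Checkers is assumed to list dependencies before the
// checkers that depend on them.
inline std::vector<std::string>
toyCallbackTrace(const std::vector<std::string> &Checkers) {
  std::vector<std::string> Trace;
  // Pre-statement callbacks: dependents first, dependencies last.
  for (auto It = Checkers.rbegin(); It != Checkers.rend(); ++It)
    Trace.push_back("pre:" + *It);
  Trace.push_back("event");
  // Post-statement callbacks: dependencies first, dependents last.
  for (const std::string &Name : Checkers)
    Trace.push_back("post:" + Name);
  return Trace;
}
```

For the input `{"Dependency", "Dependent"}` the trace reads dependent, dependency, event, dependency, dependent, matching the order given in parentheses.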

In D69662#1890007, @NoQ wrote:

In D69662#1889545, @Szelethus wrote:

Sorry for the slack :)

One should never count on the invocation order of callback functions between checkers. In fact, I'm not too sure that my patches affect this, but I suspect that they do, as the container of choice for checker objects is std::vector.

With checker dependencies introduced, i think it's not an unreasonable guarantee to make. Like, if you rely on your dependency to model things for you, you probably want to have your callbacks called after everything is set up by the dependency.

That said, it's not always obvious what "after" means. I wouldn't be shocked if it turns out that the correct order is different in pre-stmt and post-stmt (i.e., dependent - dependency - actual event - dependency - dependent).

Well, you raise a valid point. While I do think that implementing complex checkers that have strong interaction should be left to the bit-more-experienced, maybe it'd be better to make the interface a bit more intuitive. I like to point at IteratorChecker, which is spread out across multiple files, despite it packing a lot of knowledge.

I'm afraid that I too have more questions than possible solutions to this problem. My patches related to MallocChecker were (are) research of some sort to come up with one.

StreamChecker will be updated in other changes, see:
https://reviews.llvm.org/D75158

Szelethus mentioned this in D77012: [analyzer] Fix StdLibraryFunctionsChecker NotNull Constraint Check. Mar 30 2020, 7:19 AM

Szelethus mentioned this in D82288: [analyzer][StdLibraryFunctionsChecker] Add POSIX file handling functions. Jun 24 2020, 4:48 AM

Revision Contents

Path	Size
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp	41 lines

Diff 227280

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp

[43 lines not shown; enclosing context: struct StreamState {]
   static StreamState getOpenFailed() { return StreamState(OpenFailed); }
   static StreamState getEscaped() { return StreamState(Escaped); }

   void Profile(llvm::FoldingSetNodeID &ID) const {
     ID.AddInteger(K);
   }
 };

-class StreamChecker : public Checker<eval::Call,
-                                     check::DeadSymbols > {
+class StreamChecker
+    : public Checker<check::PreCall, check::PostCall, check::DeadSymbols> {
   mutable std::unique_ptr<BuiltinBug> BT_nullfp, BT_illegalwhence,
       BT_doubleclose, BT_ResourceLeak;

 public:
-  bool evalCall(const CallEvent &Call, CheckerContext &C) const;
+  void checkPreCall(const CallEvent &Call, CheckerContext &C) const;
+  void checkPostCall(const CallEvent &Call, CheckerContext &C) const;
   void checkDeadSymbols(SymbolReaper &SymReaper, CheckerContext &C) const;

 private:
   using FnCheck = std::function<void(const StreamChecker *, const CallEvent &,
                                      CheckerContext &)>;

-  CallDescriptionMap<FnCheck> Callbacks = {
-      {{"fopen"}, &StreamChecker::evalFopen},
-      {{"tmpfile"}, &StreamChecker::evalFopen},
+  CallDescriptionMap<FnCheck> PreCallbacks = {
       {{"fclose", 1}, &StreamChecker::evalFclose},
       {{"fread", 4},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 3)},
       {{"fwrite", 4},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 3)},
       {{"fseek", 3}, &StreamChecker::evalFseek},
       {{"ftell", 1},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"rewind", 1},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"fgetpos", 2},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"fsetpos", 2},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"clearerr", 1},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"feof", 1},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"ferror", 1},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
       {{"fileno", 1},
        std::bind(&StreamChecker::checkArgNullStream, _1, _2, _3, 0)},
   };

+  CallDescriptionMap<FnCheck> PostCallbacks = {
+      {{"fopen"}, &StreamChecker::evalFopen},
+      {{"tmpfile"}, &StreamChecker::evalFopen},
+  };
+
   void evalFopen(const CallEvent &Call, CheckerContext &C) const;
   void evalFclose(const CallEvent &Call, CheckerContext &C) const;
   void evalFseek(const CallEvent &Call, CheckerContext &C) const;

+  void checkCall(const CallEvent &Call, CheckerContext &C,
+                 const CallDescriptionMap<FnCheck> &Callbacks) const;
   void checkArgNullStream(const CallEvent &Call, CheckerContext &C,
                           unsigned ArgI) const;
   bool checkNullStream(SVal SV, CheckerContext &C,
                        ProgramStateRef &State) const;
   void checkFseekWhence(SVal SV, CheckerContext &C,
                         ProgramStateRef &State) const;
   bool checkDoubleClose(const CallEvent &Call, CheckerContext &C,
                         ProgramStateRef &State) const;
 };

 } // end anonymous namespace

 REGISTER_MAP_WITH_PROGRAMSTATE(StreamMap, SymbolRef, StreamState)

+void StreamChecker::checkPreCall(const CallEvent &Call,
+                                 CheckerContext &C) const {
+  checkCall(Call, C, PreCallbacks);
+}
+
+void StreamChecker::checkPostCall(const CallEvent &Call,
+                                  CheckerContext &C) const {
+  checkCall(Call, C, PostCallbacks);
+}
+
-bool StreamChecker::evalCall(const CallEvent &Call, CheckerContext &C) const {
+void StreamChecker::checkCall(
+    const CallEvent &Call, CheckerContext &C,
+    const CallDescriptionMap<FnCheck> &Callbacks) const {
   const auto *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());
   if (!FD || FD->getKind() != Decl::Function)
-    return false;
+    return;

   // Recognize "global C functions" with only integral or pointer arguments
   // (and matching name) as stream functions.
   if (!Call.isGlobalCFunction())
-    return false;
+    return;
   for (auto P : Call.parameters()) {
     QualType T = P->getType();
     if (!T->isIntegralOrEnumerationType() && !T->isPointerType())
-      return false;
+      return;
   }

   const FnCheck *Callback = Callbacks.lookup(Call);
   if (!Callback)
-    return false;
+    return;

   (*Callback)(this, Call, C);
-
-  return C.isDifferent();
 }

 void StreamChecker::evalFopen(const CallEvent &Call, CheckerContext &C) const {
   ProgramStateRef state = C.getState();
   SValBuilder &svalBuilder = C.getSValBuilder();
   const LocationContext *LCtx = C.getPredecessor()->getLocationContext();
   auto *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
   if (!CE)
     return;

   DefinedSVal RetVal =
       svalBuilder.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())
           .castAs<DefinedSVal>();
   state = state->BindExpr(CE, C.getLocationContext(), RetVal);
[Inline comments on the line above: NoQ, balazske, NoQ; full text in the Event Timeline.]

   ConstraintManager &CM = C.getConstraintManager();
   // Bifurcate the state into two: one with a valid FILE* pointer, the other
   // with a NULL.
   ProgramStateRef stateNotNull, stateNull;
   std::tie(stateNotNull, stateNull) = CM.assumeDual(state, RetVal);

   SymbolRef Sym = RetVal.getAsSymbol();
[157 lines not shown]
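The table-driven dispatch used in the diff (look up the call by name and argument count, then run a callback with a pre-bound argument index) can be tried out in isolation with the toy below. It is a sketch under stated assumptions: `ToyCall`, `toyCheckArgNull`, `toyPreCallbacks` and `toyCheckPreCall` are hypothetical, a `std::map` keyed by name and arity stands in for `CallDescriptionMap`, and a null stream is reduced to a boolean flag per argument.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model only; none of these names are Clang APIs.
struct ToyCall {
  std::string Name;
  std::vector<bool> ArgIsNull; // one flag per argument
};

// A check returns true when it would report a null stream argument.
using ToyCheck = std::function<bool(const ToyCall &)>;
using ToyTable = std::map<std::pair<std::string, unsigned>, ToyCheck>;

inline bool toyCheckArgNull(const ToyCall &Call, unsigned ArgI) {
  return ArgI < Call.ArgIsNull.size() && Call.ArgIsNull[ArgI];
}

// Name and argument count select the callback; std::bind fixes the index of
// the stream argument, as the real table does with checkArgNullStream.
inline const ToyTable &toyPreCallbacks() {
  using namespace std::placeholders;
  static const ToyTable Table = {
      {{"fread", 4u}, std::bind(toyCheckArgNull, _1, 3u)},
      {{"ftell", 1u}, std::bind(toyCheckArgNull, _1, 0u)},
  };
  return Table;
}

// Runs the matching check, if any; unknown calls are ignored.
inline bool toyCheckPreCall(const ToyCall &Call) {
  const ToyTable &Table = toyPreCallbacks();
  auto It =
      Table.find({Call.Name, static_cast<unsigned>(Call.ArgIsNull.size())});
  if (It == Table.end())
    return false;
  return It->second(Call);
}
```

A call to `fread` whose fourth argument is flagged as null is reported, while a call that is not in the table is skipped without touching any state.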