This is an archive of the discontinued LLVM Phabricator instance.

[Checkers] Avoid using evalCall in StreamChecker.
AbandonedPublic

Authored by balazske on Oct 31 2019, 8:08 AM.

Details

Summary

Use of evalCall is replaced by preCall and postCall.

Event Timeline

balazske created this revision.Oct 31 2019, 8:08 AM
NoQ added a comment.Nov 1 2019, 1:03 PM

But why?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
161

You're not allowed to do this in checkPostCall because other post-call checkers may have already read the return value.

balazske marked an inline comment as done.Nov 4 2019, 1:07 AM

I wanted to remove eval::Call because only one checker can do this otherwise it is undefined behavior (according to the not very new "Analyzer Guide"). If it is essentially needed in this checker it will remain.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
161

Is it possible to do in check::PreCall? The value of the call expression is not used, only a state split is done.

On multiple evaluation of the same call the Analyzer will warn and prevent you to do so.

I wanted to remove eval::Call because only one checker can do this otherwise it is undefined behavior (according to the not very new "Analyzer Guide"). If it is essentially needed in this checker it will remain.

Lets get the direct quote from the book:

Usage of this callback is discouraged because only one checker may evaluate any call event; if two or more checkers, probably developed by different people, accidentally evaluate the same function, behavior of the analyzer is undefined. So, if possible, check::PreCall and check::PostCall should be considered, and most of the time they are flexible enough to model effects of the call on the program state.

I think this statement just doesn't reflect our current stance on the issue, mostly because

On multiple evaluation of the same call the Analyzer will warn and prevent you to do so.

I think modeling core library functions is more fitting to be handled by eval::Calls. Do I understand correctly that a function modeled by eval::Call won't be inlined?

From the same book:

There are multiple preconditions required for inlining to happen, including:
— Source code of the callee function body needs to be available;
— No checker should evaluate the function call via eval::Call;

Abandon this change?

NoQ added a comment.Nov 4 2019, 10:46 AM

Yes, indeed, evalCall can only be performed by one checker. But if any, it is this checker that's fully responsible for stream functions. So i recommend doing evalCall in this particular checker and falling back to PreCall/PostCall in other checkers that model other aspects of streams.

Yes, indeed, evalCall prevents inlining from happening. This is fine, however, as long as you manually model all the effects of the call on the Environment (namely, set the return value) and on the Store (invalidate regions that may be touched by the function - in most cases it's none).

Generally, inlining shouldn't be thought of as an ideal solution because it has a downside of being extremely expensive. Instead, it should be thought of as a poor man's solution when it comes to any sort of API functions that have well-documented behavior. In particular, evalCall is more precise when it comes to state splits (which is why StdCLibraryFunctionsChecker uses evalCall, see D20811 for details) (also "inlined defensive checks").

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
161

Nope, because ExprEngine will overwrite the value when modeling the function call, regardless of whether it will be inlined or not. The only valid reason to use BindExpr in a checker is to bind the return value in evalCall.

Generally, values in the Environment are not meant to be overwritten. Any active expression can only have one value.

The only way to obtain an environment with a different value for the same expression is to wait until the expression dies and reach the same expression later in the same analysis. In this case the value is first removed from the Environment during dead symbols collection, and then added back later.

For quite some time i wanted to prevent these mistakes by adding an assertion "Environment values are never mutated". Unfortunately, there are multiple existing violations of this rule, which are bugs in ExprEngine, and i didn't have time to fix them all. I highly recommend this little project as an exercise :)

This comment was removed by balazske.
NoQ added a comment.Nov 5 2019, 4:54 PM

Given that i roughly remember what the previous comment was about and i wanted to comment on this: you should totally move evalCall for any functions you need from StdCLibraryFunctionsChecker to this checker, given that your checker is more specialized.

That said, you shouldn't do that until your checker is out of alpha, because that'd disable modeling for users who don't mess with alpha checkers. Working around that would be a moderately interesting problem. Of course you can always disable StdCLibraryFunctionsChecker when testing your checker. We could also come up with an inter-checker API function that tells StdCLibraryFunctionsChecker to disable certain parts of itself and call this function upon StreamChecker's registration. We could also go for a more principled solution by introducing a global CallDescriptionMap<const Checker *> that coordinates who evalCall what (instead of polling all checkers in a loop on every call and crashing whenever more than one checker responds).

I removed the previous comment because I realized that StdCLibraryFunctionsChecker does not use evalCall for fread (returns false because "non-pure" evaluation).

Anyway the checks that do not use BindExpr (all except the open functions) could be moved into a PreCall or PostCall callback?

NoQ added a comment.Nov 13 2019, 11:43 AM

Anyway the checks that do not use BindExpr (all except the open functions) could be moved into a PreCall or PostCall callback?

Moving from evalCall to PreCall/PostCall has the additional effect of not giving you control over invalidation of the heap (unless you do evalCall in a checker, it ends up being the normal behavior of conservativeEvalCall() most of the time). For that reason ideally every library function should be evalCall'ed by a checker.

Also if you're making updates to the program state that other checkers should see immediately (say, writing out-parameter values into the Store or updating a state trait that other checkers will read in the same callback), you should either use evalCall for that, or make sure your dependencies are set up correctly (@Szelethus, our callback invocation order is now affected by checker dependencies, right?).

checkArgNullStream() should definitely be at PreCall.

evalFseek() doesn't have a BindExpr but it should have it; looks like a bug. If you're evalCall-ing a non-void function you must bind a return value (we should add an assertion for this; there's never a reason to bind an UnknownVal in evalCall because there generally never is a good reason to bind UnknownVal to anything because it shouldn't have been present in our SVal hierarchy in the first place because conjuring a value is always strictly better).

In D69662#1744479, @NoQ wrote:

Anyway the checks that do not use BindExpr (all except the open functions) could be moved into a PreCall or PostCall callback?

Moving from evalCall to PreCall/PostCall has the additional effect of not giving you control over invalidation of the heap (unless you do evalCall in a checker, it ends up being the normal behavior of conservativeEvalCall() most of the time). For that reason ideally every library function should be evalCall'ed by a checker.

Also if you're making updates to the program state that other checkers should see immediately (say, writing out-parameter values into the Store or updating a state trait that other checkers will read in the same callback), you should either use evalCall for that, or make sure your dependencies are set up correctly (@Szelethus, our callback invocation order is now affected by checker dependencies, right?).

Sorry for the slack :)

One should never count on the invocation order of callback funcions in between checkers. In fact, I'm not too sure that my patches affect this, but I suspect that it does, as the container of choice for checker objects is std::vector.

checkArgNullStream() should definitely be at PreCall.

evalFseek() doesn't have a BindExpr but it should have it; looks like a bug. If you're evalCall-ing a non-void function you must bind a return value (we should add an assertion for this; there's never a reason to bind an UnknownVal in evalCall because there generally never is a good reason to bind UnknownVal to anything because it shouldn't have been present in our SVal hierarchy in the first place because conjuring a value is always strictly better).

NoQ added a comment.Feb 24 2020, 12:57 PM

Sorry for the slack :)

One should never count on the invocation order of callback funcions in between checkers. In fact, I'm not too sure that my patches affect this, but I suspect that it does, as the container of choice for checker objects is std::vector.

With checker dependencies introduced, i think it's not an unreasonable guarantee to make. Like, if you rely on your dependency to model things for you, you probably want to have your callbacks called after everything is set up by the dependency.

That said, it's not always obvious what does "after" mean. I wouldn't be shocked if it turns out that the correct order is different in pre-stmt and post-stmt (i.e., dependent - dependency - actual event - dependency - dependent).

In D69662#1890007, @NoQ wrote:

Sorry for the slack :)

One should never count on the invocation order of callback funcions in between checkers. In fact, I'm not too sure that my patches affect this, but I suspect that it does, as the container of choice for checker objects is std::vector.

With checker dependencies introduced, i think it's not an unreasonable guarantee to make. Like, if you rely on your dependency to model things for you, you probably want to have your callbacks called after everything is set up by the dependency.

That said, it's not always obvious what does "after" mean. I wouldn't be shocked if it turns out that the correct order is different in pre-stmt and post-stmt (i.e., dependent - dependency - actual event - dependency - dependent).

Well, you raise a valid point. While I do think that implementing complex checkers that have strong interaction should be left to the bit-more-experienced, maybe it'd be better to make the interface a bit more intuitive. I like to point at IteratorChecker, which is spread out across multiple files, despite it packing a lot of knowledge.

I'm afraid that I too have more questions then possible solution to this answer. My patches related to MallocChecker was (is) a research of some sort to come up with one.

balazske abandoned this revision.Feb 26 2020, 7:49 AM

StreamChecker will be updated in other changes, see:
https://reviews.llvm.org/D75158