Recognization of function names is done now with the CallDescription
class instead of using IdentifierInfo. This means function name and
argument count is compared too.
A new check for filtering not global-C-functions was added.
Test was updated.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 38660 Build 38659: arc lint + arc unit
Event Timeline
Thanks! Woohoo, tests!
You can go one step further to have a CallDescriptionMap, like in D62557. This would replace the whole chain of ifs with a map lookup (which is currently still a chain of ifs under the hood, but a lot less code anyway and we'll probably optimize it under the hood later).
It seems like this patch is diffed against your latest commit, not the master branch.
Yeah, seems so. The code looks great tho, thanks!
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
64 | Feel free to omit the constructor entirely. We almost never have constructors in our checkers. | |
74–87 | Are you sure these functions are actually sometimes implemented as builtins? I think they're required to be regular functions. | |
90–106 | For most purposes it's more convenient to pass CallEvent around. The only moment when you actually need the CallExpr is when you're doing the binding for the return value, which happens once, so it's fine to call .getOriginExpr() directly at this point. | |
125–127 | (not your fault but before i forget) This check is actually redundant. | |
273 | (not your fault but before i forget) This condition is actually always true here. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
90–106 | The CallExpr is used at many places to get arguments and other data. There is a CallEvent::getArgSVal that can be used instead but at some places CallExpr is used in other ways. I do not see much benefit of change the passed CallExpr to CallEvent. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
90–106 |
Can we add more convenient getters to CallEvent to help with those? 'Cause CallEvent has strictly more information than CallExpr (combining syntactic information with path-sensitive information), it's all about convenience. I also see that CallExpr is stored in StreamState (which is something you can't do with a CallEvent because it's short-lived) but i suspect that it's actually never read from there and i doubt that it was the right solution anyway. I.e., no other checker needs that and i don't have a reason to believe that StreamChecker is special. |
- Various code cleanups. Eval functions use CallEvent, CallExpr is removed from state.
A few nits inline, otherwise the patch is awesome, thank you!!
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
113–114 | Prefer using. When I wrote D68165, I spent about 10 minutes figuring out how to do it... Ah, the joys of the C++ syntax. using FnCheck = void (StreamChecker::*)(const CallEvent &, CheckerContext &) const; | |
122–123 | Isn't this redundant with my other inline about parameter types? | |
144–148 | I'm not sure why we need this, is it true that *all* stream related functions return a pointer or a numerical value? Are we actually checking whether this really is a library function? If so, this looks pretty arbitrary. | |
176 | Why the need for (void)? CheckNullSteam doesn't seem to have an LLVM_NODISCARD attribute. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
122–123 | Probably change to isInSystemHeader or use both? | |
144–148 | This comes from code of CStringChecker: // Pro-actively check that argument types are safe to do arithmetic upon. // We do not want to crash if someone accidentally passes a structure // into, say, a C++ overload of any of these functions. We could not check // that for std::copy because they may have arguments of other types. Still I am not sure that the checker works correct with code that contains similar named but "arbitrary" functions. | |
176 | I wanted to be sure to get no buildbot compile errors (where -Werror is used). |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
144–148 | Oops, meant to write that ", is it true that *all* stream related functions have only pointer or a numerical parameters?". |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
122–123 | Should be the "stream functions" always in system header? If no isInSystemHeader call is used any global function called "fopen" with 2 arguments is recognized as stream opening function. If it returns a pointer and no "fclose" is called on that we will get a resource leak warning for that code. (But for this case the user will probably not enable the checker?). | |
144–148 | This can not be an assert because it is possible to have global functions with same name but different signature. The check can be kept to filter out such cases. The wanted stream functions have only integral or enum or pointer arguments. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
113–114 | It's actually very easy to remember, it's just an alien kiss smiley ::*) | |
144–148 | Clang shouldn't crash even when the library definition is incorrect. | |
176 | I'd rather not discard the return value, but allow callbacks indicate an error when something goes wrong, so that to allow them to abort evalCall(). But more importantly, note how CheckNullStream actually returns a ProgramStateRef that was meant to be transitioned into. Which means that the primary execution path actually gets cut off. So even if buildbots didn't warn on this, i wish they did. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
268–270 | Aha, so we're no longer checking whether Sym is null, because why would we, we literally made created it as a symbol a couple lines up. Is it worth assert()ing though? | |
285–286 | Why is there a need to use an Optional? Why not just return a nullptr? As I saw it, each time we check both whether the optional has a value and whether the state within it is null. | |
310–311 | And here too? |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
268–270 | Probably better to have the assert. | |
285–286 | The idea was that there are 3 kind of return values: Change to a new state, do not change to new state, or error was found. In the first and last case we should return true from evalCall but not if there is no state to change into and the analysis can proceed in default way. |
Could you please mark resolved issues as resolved? I would like to see what we miss, and what has been done.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
114 | I prefer std::function, because it is modern. using StreamCheck = std::function<void(const StreamChecker *, const CallEvent &, CheckerContext &)>; I think it is fine with pointers, but at some point we need to modernize this. | |
117 | Because evalFopen() is basically the OpenFileAux(Call, C);, I think we could simplify the API by removing unnecessary one-liners so here you could write {{"fopen"}, &StreamChecker::OpenFileAux, that is why the CallEvent parameter is so generic and useful. | |
132–133 | Why do you inject this? | |
140 | I would move this tiny identifyCall() into evalCall() as the purpose of evalCall() is the validation of the Callback in this case. | |
152 | This Optional is not necessary. When the state is changed, you can rely on CheckerContext::isDifferent() to see whether the modeling was succeeded. Therefore you could revert the bool return values as well. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
199 | This is getting pretty messy, given that you've already made a transition in this function. If you do addTransition and then generateNonFatalErrorNode, then you'll get two parallel successor nodes, but you only need one. You should either delay the transition that happens immediately after CheckNullStream, or chain the two transitions together sequentially (as opposed to in parallel). |
Its becoming a bit difficult to navigate inlines, could you please mark them as done as you address them?
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
114 | But its also a lot more expensive. https://blog.demofox.org/2015/02/25/avoiding-the-performance-hazzards-of-stdfunction/ std::function is able to wrap lambdas with different captures and all sorts of things like that, which comes at a cost. |
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
90–106 | In the new code CallEvent is passed. | |
114 | Now std::function and std::bind is used. Probably more expensive but it is called once in a evalCall, that should be no problem? | |
140 | identifyCall is removed now. | |
152 | Optional is not used now. The functions only return if the passed state was modified (if applicable). To detect if error was generated the isDifferent is used. | |
199 | evalFseek has now a transition to new (non-null stream) state or transition to (non-fatal) error. |
Let's land this patch, given that it grew way too big for a refactoring pass, and let @balazske further clean things up later in his follow-up patches. This is already a large improvement and i'm very grateful!
For the Optional vs. bool debate: let's see. For now we have roughly the following:
bool checkNullStream(..., ProgramStateRef *State) { if (!NonBuggyState && BuggyState) { C.emitReport(..., C.generateErrorNode(BuggyState)); return false; } if (NonBuggyState) { *State = NonBuggyState; return true; } return false; } void checkFseekWhence(..., ProgramStateRef *State) const { if (!isBuggy(State)) return; C.emitReport(..., C.generateNonFatalErrorNode(State)); } void evalFseek(...) { ProgramStateRef State = C.getState(); bool StateChanged = checkNullStream(..., &State); if (C.isDifferent()) return; checkFseekWhence(..., &State); if (!C.isDifferent() && StateChanged) C.addTransition(State); }
In this code functions checkNullStream and checkFseekWhence are very similar. Both of them do some checking and potentially emit a report. In high-level terms, all you're trying to do is conduct two checks at the same program point.
How did these two functions end up being so completely different? What will you do if you have not 2 but, say, 10 different checks that you need to conduct? That's not a random question; checkers do tend to grow this way, so i'm pretty sure you'll eventually need to answer this question.
In ExprEngine, the class that's responsible for modeling all statements that have effect on the program state, we have a very common idiom for chaining pieces of work together. It looks roughly like this:
void modelSomethingPart1(ExplodedNode *Pred, set<ExplodedNode> *Succs); void modelSomethingPart2(ExplodedNode *Pred, set<ExplodedNode> *Succs); void modelSomethingPart3(ExplodedNode *Pred, set<ExplodedNode> *Succs); ... void modelSomethingPartK(ExplodedNode *Pred, set<ExplodedNode> *Succs); void modelSomething(ExplodedNode *Pred) { set<ExplodedNode> Set0 = { Pred }; set<ExplodedNode> Set1; for (P : Set0) modelSomethingPart1(P, &Set1); set<ExplodedNode> Set2; for (P : Set1) modelSomethingPart2(P, &Set2); set<ExplodedNode> Set3; for (P : Set2) modelSomethingPart3(P, &Set3); ... set<ExplodedNode> SetK; for (P : SetKMinus1) modelSomethingPartK(P, &SetK); for (N : SetK) addTransition(N); }
As you see, it scales nicely for any number of sub-tasks and you don't need to juggle boolean flags or optionals when you add another sub-task. Moreover, you can further split each of your sub-tasks into even smaller sub-sub-tasks in a similar manner.
In ExprEngine this idiom is completely essential due to how arbitrary most of the effects-of-statements tend to be. In checkers, however, this level of complexity has never been necessary so far because most sub-methods either produce exactly one new state, or sink completely. For checkers like this one, which do multiple checks, i recommend passing a single node around:
ExplodedNode *checkNullStream(..., ExplodedNode *Pred) { if (!NonBuggyState && BuggyState) C.emitReport(..., C.generateErrorNode(BuggyState, Pred)); return C.addTransition(NonBuggyState, Pred); } ExplodedNode *checkFseekWhence(..., ExplodedNode *Pred) const { State = Pred->getState(); if (!isBuggy(State)) return Pred; ExplodedNode *N = C.generateNonFatalErrorNode(State); if (N) C.emitReport(..., N); return N; } void evalFseek(...) { ExplodedNode *N = C.getPredecessor(); N = checkNullStream(..., N); if (!N) return; N = checkFseekWhence(..., N); if (!N) return; N = checkSomethingElse1(..., N); if (!N) return; N = checkSomethingElse2(..., N); if (!N) return; ... checkSomethingElseK(..., N); }
This wastes exploded nodes a little bit but that's fine. It's much less scary than accidentally adding unnecessary state splits.
My concern was the too heavy Optional and bool usage. Cool patch!
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | ||
---|---|---|
114 | There is no real overhead, yes. |
Let's make it 3! Thank you so much for sticking by, I guess one of the reasons why a patch so obviously great and desired took so long is that we still didn't figure out how we imagine the CallDescriptionMap to be integrated into larger checkers :) In any case, this is a major step in the right direction, MallocChecker and some of the others should take notes.
Feel free to omit the constructor entirely. We almost never have constructors in our checkers.