This patch introduced additional PointerEscape callbacks after conservative calls for output parameters.
This should not really affect the current checkers but the upcoming FuchsiaHandleChecker relies on this heavily.
Details
Diff Detail
Event Timeline
clang/include/clang/StaticAnalyzer/Core/CheckerManager.h | ||
---|---|---|
409 | Actually, it is possible I went too far with plumbing RegionAndSymbolInvalidationTraits. |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
740 | After some offline conversation it is very likely that we want to move the runCheckersForPointerEscape here. The main question is, how should we get all the data? We should know about:
Is there anything else? |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
740 |
That should be obvious from the AST. Like, parameters of non-const pointer/reference types.
Output parameter regions (as TopLevelInvalidated) and whatever's reachable from them.
Currently the only trait that affects escaping (as opposed to invalidation) is TK_SuppressEscape and it is never applied to out-parameters. |
Also, it looks like for example CStringChecker calls ProgramState::InvalidateRegions directly when modeling function calls. So it looks like if checkers are using evalCall, pointerEscape will not be triggered automatically. I am not sure if that would be a bug or feature :D
I'd say it's a feature, because evalCall is the official way for checkers to opt out of babysitting.
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
621 | How much do we care about the escape kind? For each symbol we need to check if it was directly passed to the callee. It is not too bad I guess, but I was wondering. | |
740 |
Nope, we do not want to move it here. We want the pointerEscape to happen AFTER postCall callbacks. | |
1152 | Hmm. Should I introduce a very short lived trait or do we have a better way to deal with stuff like this? A short lived map? |
I think you don't need to smuggle WasConservative all the way up. It's implied that if the evaluation was not conservative, then the respective ExplodedNodeSet is going to be empty, as all nodes will be put directly into the worklist instead. Eg., checkPostCall isn't going to be invoked immediately after inlineCall, but only after enqueueEndOfFunction.
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
621 | Dunno, just introduce a new PSK_ item and use it here. It isn't supposed to be per-symbol, it's just to notify checkers that we're in this new post-call invocation for out-parameters, so that they could opt out of the whole callback. |
- Added a test. More rigorous tests will come in the FuchsiaHandleChecker.
- Added a new PSK_ entry.
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
603 | WDYT about re-using ExprEngine::escapeValue() by changing it to accept ArrayRef<SVal> instead of a single SVal? |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
603 | It is not entirely the same. Here we only collect symbols from non-stack regions. There (as far as I understand) we collect all symbols. I could add a flag but at that point I wonder if it is worth the change. |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
603 | Umm, wait, right, so what are you doing in this callback to begin with? The code says "gather all symbols that are encountered as immediate values stored in traversed regions". Why not simply "gather all symbols that are traversed from parameter regions"? | |
648 | I guess technically, for our own sanity, it's worth it to-rescan the symbols for every node in dstPostCall. But i'll be very surprised if they are actually going to yield different results for every predecessor node. |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
603 | My understanding is the following but correct me if I am wrong: int *getConjuredSymbol(); call(getConjuredSymbol()); So we have can talk about two symbols here. One symbol is what was returned by getConjuredSymbol and the other is the pointee, the symbol that we get when we dereference the result of getConjuredSymbol. And my understanding is that we want to invoke escape for the latter and not the former. ExprEngine::escapeValue invalidates the former but not the latter. The visitor here invalidates the latter but not the former. This behavior solves most of my test cases in FuchsiaHandleChecker. |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
648 | I think it is possible. We use the state to get the pointee of some pointers, so in case the PostCall splits the state on the value of the outputs it would be reasonable to get different results. |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
603 | How about escapeValue(getSVal(getArgSVal()))? (the type for getSVal() could be obtained from the AST). |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
603 | Hmm, I believe that could work, thanks! |
clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h | ||
---|---|---|
632 ↗ | (On Diff #232982) | Dunno, should we rename to escapeValues()? |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
614–616 | I encourage Call.parameters(). This way you won't need to obtain a FuncDecl. In fact you won't even need it to be a FunctionDecl. | |
624 | Ok, so this patch interacts with D71152 in a non-trivial manner. We should re-use the logic that decides whether an escape on bind occurs. |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
624 | Yeah, I just started to look into it and some skeletons fell out. Let's consider the following example: void leak() { zx_handle_t sa, sb; if (zx_channel_create(0, &sa, &sb)) return; zx_handle_close(sb); } After using the same logic we can no longer detect the leak. The reason is, after zx_channel_create both sa and sb regions are escaped (and they become escaped during the conservative call) and then after the PostCall callback, when we attached the traits to our symbols, we will trigger pointer escape immediately, and lose the traits we just added. I am not immediately sure what to do here. I feel like the main source of the problem is the fact that zx_channel_create should not escape anything and the checker has now way to communicate that to the analyzer core. Any ideas? |
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
624 | Suddenly, I see four ways forward, none of which is optimal. Hopefully, you can come up with something better.
|
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp | ||
---|---|---|
624 | Or:
|
In any case, every checker is allowed to make their own decisions about escaping. Escape on its own is not material, it's all about how the checker reacts to escapes. Say, it's up to MallocChecker to decide whether the function may or may not release memory that escapes on call.
I think a valid approach would be to simply look up the function in your CallDescriptionMap and then abort the checkPointerEscape callback when it's found.
Yet, it annoys me a bit that we didn't make everything magically work in an "out of the box" manner. Can we eliminate the first pointer escape (that happens before PostCall) but only keep the secondary escape?
I don't think this is a good enough model currently. The problem is that, it does not play well with annotations. E.g. the checker can see a symbol escaping, but it does not have a whole lot of information how. For example, currently, there is no way to check if the output parameter through which the escape happened was annotated somehow.
Hmm. If the function is annotated, it is hopefully "fully" annotated, or at least the programmer doesn't mind adding more annotations to it. Given that you have your CallEvent structure in checkPointerEscape, i hope you can easily see if there are any annotations at all on the function, and if so, suppress the current escape entirely. Or at least scan the annotated parameters and suppress the escape for them.
I guess it's still a problem if the *same* handle is also passed through a parameter that *cannot* be annotated (eg., as part of a structure passed into the call) and then actually getting released inside the call, but is it a real problem for you?
Yeah, this was one of my idea as well. I think one of my main concerns is that I would except the majority of the escapes are simply being output parameters and only a minority are legitimate. So I was wondering if we got the default right. Maybe a checker should do more work to get the escaping rather than more work preventing it?
But that's exactly how it works right now(?) If you don't define checkPointerEscape you get no escaping, if you do extra work of defining it you get the exact amount of escaping that you want.
So basically what I am wonder/worrying about is the following:
The analyzer core will decide that the stack region is escaped and the checkers has no word about this. And from that time on the checkers have to do extra work each time there is a store or conservative call to find out if this escape corresponds to a region that was escaped earlier unwillingly (from the checker's point of view).
I think I found the main problem with the current model, at least for the FuchsiaHandleCheck.
Consider the following two snippets:
zx_handle_t *get_handle_address(); void escape_store_to_escaped_region01() { zx_handle_t sb; if (zx_channel_create(0, get_handle_address(), &sb)) return; zx_handle_close(sb); }
void leak() { zx_handle_t sa, sb; if (zx_channel_create(0, &sa, &sb)) return; zx_handle_close(sb); }
In the first one I want the first handle to be escaped in the second one I do not want it to be escaped.
With my current proposed changes the checker will receive a pointer escape callback for both but it does not have enough info to differentiate between the two cases.
If I do not act upon this kind of escape I end up reporting a false positive in the first case. If I act on this escape I end up missing a true positive in the second case.
Yup, you got me. Pre-escaped locals are indeed material and beyond the checker's control. I don't seem to have any immediate solutions. I think we could postpone the work on pre-escaped locals (until we figure out how to do them correctly) if they're not immediately necessary to you (after all i was the one who suggested it). Or ignore the problem (depending on how we do our FP vs. FN trade-off).
Mm, these snippets don't have pre-escaped locals. Like, they accidentally do, but above i proposed to work around this by removing the first escape invocation (that happens during the call) and only doing it after the call. This way these locals don't have time to become pre-escaped. I think these are not a problem.
Thanks!! Nothing controversial remains here, right? :)
clang/include/clang/StaticAnalyzer/Core/CheckerManager.h | ||
---|---|---|
91 | All three escape-on-calls are on conservatively evaluated calls only, so this name isn't very descriptive. Maybe PSK_EscapeOutParameters? | |
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | ||
2736 ↗ | (On Diff #233235) | Dunno, make_pair? |
All three escape-on-calls are on conservatively evaluated calls only, so this name isn't very descriptive.
Maybe PSK_EscapeOutParameters?