This check is based on https://reviews.llvm.org/D36022 but it takes a bit different approach. It does less state splitting and tries to avoid the evalCall callback. The state machine is also a bit different, now the escaped and untracked states are merged.
Details
Diff Detail
Event Timeline
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
---|---|---|
68 | Whoops, the spelling of the attribute is wrong here, will fix in a followup patch. |
Yay, i see the "Schrödinger handle" pattern: until the return value of release is checked, the handle is believed to be both dead and alive at the same time.
We usually use this pattern because a lot of code that's already written doesn't check their return values. But do you really need this for a brand-new API? Maybe aggressively assume that the release may always fail and fix all the places in your code where the return value is not checked before use?
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
---|---|---|
100–101 | So you plan to unhardcode these as well eventually? | |
124 | For now it doesn't seem to be used at all. | |
142–147 | This macro has literally saved exactly as many lines as it has caused :) | |
157–169 | We've recently dumbed down this idiom to class FuchsiaHandleChecker : ... { LeakBugType{this, "Fuchsia handle leak", "Fuchsia Handle Error", /*SuppressOnSink=*/true}; ... }; :) | |
208 | This is never necessary, just call all the methods directly on QT - it has an overloaded operator ->() for this. | |
209 | Ok, so what's the deal with arrays here? If the function receives an array of handles, do you ultimately plan to return multiple symbols - one for each element of the array? Also, in the generic non-Fuchsia case, what if the handle itself is a pointer? | |
213 | Here QT->getAs<TypedefType>() respectively. | |
303–304 | Why not generate a fatal error instead? | |
337–339 | Ok, so you're basically saying that whenever the code under analysis does anything to check the return value, this callback will fire immediately? Thus guaranteeing that whenever the Schrödinger state is accessed, it's either already collapsed or indeed not checked yet? This sounds right but i wish i had your confidence :D My usual approach to Schrödinger states will be to do the same thing but in checkDeadSymbols(). I.e., when the return value symbol dies, see if it's constrained and collapse the state accordingly. If the access occurs earlier than symbol death, update the state according to the current constraints and forget about the symbol. If current constraints are insufficient, conservatively mark it as a successful release (as that's how non-paranoid code under analysis would behave). Regardless of the approach, you still need to update the state upon access as if the release has succeeded (when the current state is a Schrödinger state). But with your approach you can avoid checking the constraints upon access, and instead blindly assume that the symbol is underconstrained. I like this! Can we prove that it works by adding an assertion that the return value symbol is underconstrained upon every access to the potentially Schrödinger state? | |
344 | If you want this check to be universally applicable, you will probably also need an annotation that explains which statuses indicate an error. | |
349 | I request comments about this path. When does this happen? | |
367 | I strongly recommend testing this (i.e., by annotating an overloaded member operator with the use-annotation). This code looks correct to me but it's just too easy to make a mistake here :/ | |
clang/test/Analysis/fuchsia_handle.c | ||
1 ↗ | (On Diff #230160) | core,! |
76 ↗ | (On Diff #230160) | This test doesn't really test the pointer escape mechanism because both handles are closed correctly, so there's no leak in this test regardless of escapes. |
Thanks for the review!
I am not very well versed in Fuchsia's syscalls yet but my understanding is that not all of the syscalls can fail so we do not need all the users to check for errors. But this is something I will verify with the rest of the team, so please treat my answer with a grain of salt for now. An alternative would be to introduce an annotation to tell which APIs will never fail, but I am afraid that is also sometimes subjective. For example it is pretty much possible to have no available port in a system but is very unlikely to happen so some non mission critical applications might not check for errors there. But this is also something that I will follow up with the team.
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
---|---|---|
100–101 | I have no specific plans at this point. I do not see these types changing frequently, so I have no urge to unhardcode them. Are there any downsides I am not being aware of? | |
124 | Wow, indeed! This is from an eariler iteration :) | |
142–147 | Good point! Initially I had a separate escaped state, but now I guess I should just expand them :) | |
157–169 | +1 :) | |
208 | getPointeeOrArrayElementType returns a Type *, so I cannot really continue to use QualType and I am not interested in the qualifiers at all. | |
209 | I do not have a real plan with arrays just yet. Creating eagerly a symbol for all the elements might look a bit wasteful but also users are probably not expected to have large arrays of handles? Probably I should remove that code for now. I would expect non-Fuchsia checkers to introduce their own getHandleSymbol logic. Does that make sense? Maybe I should rename this to getFuchsiaHandleSymbol. | |
303–304 | The motivation was to not to reduce coverage, but I have no strong feelings about making this a sink. But in case of leaks I certainly prefer non-fatal errors. | |
337–339 | Yeah, you got my intention right! I was also thinking about doing this in checkDeadSymbols, but I was wondering if that is ultimately the right thing to do. It might be possible to write code where the symbol for the error code becomes dead too late, so we keep the Schrödinger state longer than necessary. Redoing this when we access the handle makes sense though! I think that could probably perform better compared to my approach with evalAssume. My only problem at this point do not really know when is a handle in a Schrödinger state. Do you recommend specifying a distinct state for that? | |
344 | I think the main reason why we use annotations for release/acquire/use because of the potentially large number of calls that deals with handles. The status for success, however, is not a big amount of information, so I think this is easier to be hard coded. It is also less prone to changes. | |
349 | Oh, this is to model that maybe a handle allocation failed. I will definitely add a comment and also make the code more narrow to that specific case! | |
367 | I do agree, I also hate this part. | |
clang/test/Analysis/fuchsia_handle.c | ||
1 ↗ | (On Diff #230160) | Whoops! :) |
76 ↗ | (On Diff #230160) | Indeed! I will make one of the handles "leak". |
- Explicitly model "maybe" states.
- Fix some escaping issues.
- Address most review comments.
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
---|---|---|
20–21 | Btw, this state machine is fairly common. Both MallocChecker and SimpleStreamChecker already follow this same model. Do you have any thoughts on re-using it in an abstract manner? Something like this maybe?: template <typename TraitID> class SimpleStreamStateMachine { public: SimpleStreamStateMachine(void(*pointerEscapePolicy)(CheckerContext C, Other Customizations)); ProgramStateRef makeOpened(ProgramStateRef State, SymbolRef Key) { return State->set<TraitID>(Key, TraitID::ValueTy::makeOpened()); } // makeReleased and so on pre-defined for all users, // allowing customization when necessary. }; class SimpleStreamStateMachineBookkeeping : Checker { public: checkDeadSymbols() { // Perform the state cleanup for all concrete machines // ever instantiated, in the only possible way, probably // invoke callbacks for leaks. } checkPointerEscape() { // Invoke the passed-down policy for each concrete // state machine. } // Other callbacks are implemented in the dependent checker. }; And then: REGISTER_MAP_WITH_PROGRAMSTATE(MyTrait, SymbolRef, MyState); class MyChecker : Checker { SimpleStreamStateMachine<MyStateTy> StM { getStateMachine<SimpleStreamStateMachine<MyStateTy>>( &MyChecker::checkPointerEscapeImpl, /*Other Customizations*/)}; void checkPointerEscapeImpl(...); public: checkPostCall(...) { C.addTransition(StM.makeOpened(C.getState(), Call.getRetVal())); } }; 'Cause i have this pipe dream that we make a lot of such abstract state machines and then we'll never need to write more of them and that'll make it cheaper to introduce non-trivial operations over the program state such as replacing values or advanced widening because we'll only have to implement them in the few state machines rather than in many checkers. |
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
---|---|---|
20–21 | Hmm, indeed, lots of those checks following a very similar idea. I think a good litmus test would be to check if the customization points of such abstractions are strong enough to handle all sorts of error handling mechanisms that would potentially be handled by the checkers including: return values (like your example), output arguments, errno/other global error state. Also, checkers sometimes have slightly different state machines see the AllocatedOfSizeZero in MallocChecker, and sometimes they also have more companion data in the state like allocation family (this one should be easy to solve though). That being said I would prefer such experiments to happen in a separate patch :) |
- Fix member operator modeling.
- Added new lines to the end of files.
- Added documentation.
- Minor typo fixes in tests.
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
---|---|---|
337–339 | Hmm, evalAssume does not get a CheckerContext, so it cannot really use SValBuilder to do more complicated queries, like check if a symbol is greater or equal to zero rather than just equal to. I wonder, is this intentional? Should I just use the checkDeadSymbol method instead or should I add CheckerContext to evalAssume? |
Ok, now I have some real world experience with the results of the check.
The false positive ratio for double free and use after free seems to be quite good but the handle leak part is almost unusable at this point. The main problem is somewhat anticipated, we are not doing a great job notifying the checkers about escaped non-pointer symbols. The question is, what should we do about this? Should the PointerEscape callback be renamed? Currently, the list of symbols will also contain non-pointer symbols, so in some cases that callback does solve the escaping issue for integers. But it is not consistent.
So I see two ways forward:
- Have a separate SymbolEscape callback.
- Rename PointerEscape, and fix it to be triggered for all kinds of symbols.
I feel like the 2. is a better solution. Of course, that change might have a performance impact as well.
Any thoughts?
I feel like the 2. is a better solution. Of course, that change might have a performance impact as well.
Yes, i'm all for '2.'. There's no need to make this callback more complicated than it already is.
As for performance, it's messy and suffers from a deeper problem: the number of escaped symbols is potentially infinite. The following false positive illustrates that well:
void invalidate(int **x); void foo(int **x) { int *y = *x; if (*y == 0) { // **x should be invalidated here! invalidate(x); } // Should not warn about division by zero! 1 / *y; }
Therefore one does not simply compose a list of escaped symbols. We need something similar to SymbolReaper but for invalidation/escapes. And then we'll talk about performance.
- Fix some problems discovered during some real world stress test
- Add more tests
- The ASCII art is not updated yet, but will do so at some point.
LGTM! Yeah, please update the ASCII-art, it's great and every checker should have it.
clang/docs/analyzer/checkers.rst | ||
---|---|---|
1341 | Maybe explain what Fuchsia is so that people don't wonder if they should turn it on? (: | |
clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp | ||
345 | Let's add a high-level comment about what's going on here. |
Maybe explain what Fuchsia is so that people don't wonder if they should turn it on? (: