This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Add MagentaHandleChecker for the Magenta kernel
AbandonedPublic

Authored by haowei on Jun 27 2017, 4:25 PM.

Details

Summary

This patch adds a new Static Analyzer checker for the correct use of handle in Magenta kernel. The concept of handle is very similar to file descriptor in Unix. This checker checks leaks, use after release and double release issues in Magenta source code. We have tested this checker internally and it has detects several issues in our code. We are still improving and adding new features to this checker so any comments or suggestions are appreciated.

Diff Detail

Event Timeline

haowei created this revision.Jun 27 2017, 4:25 PM
NoQ edited edge metadata.Jul 8 2017, 3:29 AM

Thanks for publishing this. It seems that your code is already huge, and you have already added a lot of workarounds for various problems (known and new) in our APIs, while it'd sound great to actually solve these problems and simplify the API to make writing checkers easier (as in http://lists.llvm.org/pipermail/cfe-dev/2017-June/054457.html ). This technical debt seems to slow down any progress on the checkers dramatically - and i'm totally sorry about that.

lib/StaticAnalyzer/Checkers/MagentaHandleChecker.cpp
23–50

I wish all checkers had these :3

220

We're trying to use this CallDescription thing for this purpose recently.

298–304

Uhm, yet another unobvious boilerplate that shows us that checker API needs to be made a lot easier. Not many checkers need this, but i suspect that PthreadLockChecker may suffer from the same problem.

327–340

This code can be simplified to return !State->assume(ExprSVal, true);, which also works for symbolic SVals (eg. as in CheckSymbolConstraintToNotZero).

374–377

This may work as well, yeah.

457–458

This should be fixed by allowing checkPointerEscape report non-pointer escapes, or making a separate callback for non-pointer escapes. This huge boilerplate shouldn't be repeated in every checker that needs it.

Annotations are still great though.

607–608

While this is the easiest thing to do, it halves the analyzer performance every time it happens, exploding the complexity exponentially - because the remaining path would be split up into two paths, which are analyzed independently.

So it is really really rarely a good idea to split the state in the checker.

The alternative approach would be to delay splitting the state now, and only do that when it starts to matter. This is annoying to implement, but this may work. An example of this would be how we handled failed pthread_mutex_destroy in http://lists.llvm.org/pipermail/cfe-dev/2017-April/053567.html / https://reviews.llvm.org/D32449 .

haowei updated this revision to Diff 107201.EditedJul 18 2017, 5:22 PM

Thanks for reviewing this patch. I have modified the checker according NoQ's suggestions and refactored some long functions.

We're trying to use this CallDescription thing for this purpose recently.

I took a look at CallDescption, but it can only be used when a CallEvent object is present, which is not available in evalCall callbacks. So we cannot use it in our checker. BTW, we have been working on a function matching based on annotation attributes to avoid string comparison as much as possible. But the patch is relatively huge so we would like to land this patch first before publishing the annotation attribute stuff.

This should be fixed by allowing checkPointerEscape report non-pointer escapes, or making a separate callback for non-pointer escapes. This huge boilerplate shouldn't be repeated in every checker that needs it.

I agree. For example following code is not covered by the workaround I used in the checker:
struct A {
mx_handle_t data,
mx_status_t status,
};
escapeFunction(A arg1);

And it would be hard to implement a workaround in the checker to cover this case. I would like to help but I quite new to the Clang and Clang Static Analyzer. Currently I don't know where to start to fix this issue in the static analyzer. If you have any suggestions on how to fix it in the Clang, let me know.

While this is the easiest thing to do, it halves the analyzer performance every time it happens, exploding the complexity exponentially - because the remaining path would be split up into two paths, which are analyzed independently. So it is really really rarely a good idea to split the state in the checker.

When I first worked on this checker, I actually use ProgramState to track return values of syscalls, which is very similar to the approach used by D32449. But after I read the implementation of StreamChecker, I noticed that it uses program state bifurcation which is more straightforward. So I changed my checker to use this feature as well.

After receiving your suggestion, I conducted a benchmark to compare the performance of my previous return value approach with current state bifurcation approach. It took 11m46.383s to analyze magenta kernel code base by tracking the return value of syscalls compare to 11m39.034s by using state bifurcation. So in my case, the state bifurcation is actually faster, though I am not quite sure about the reason.

Another reason that we would like to keep the state bifurcation approach is that for magenta handle acquire and release syscalls. Both handle acquire syscall and handle release syscall may fail, so the return value of both types of syscalls should be tracked and should be tracked separately. The PthreadLockChecker only tracked one. It would be a lot of harder to get things right if we use return value tracking approach when adding more syscalls support.

Let me know if you have further suggestions or concerns.

Thanks,
Haowei

haowei abandoned this revision.Jul 27 2017, 5:12 PM

Superseded by D35968