Page MenuHomePhabricator

[analyzer] Add MutexChecker for the Magenta kernel
Needs ReviewPublic

Authored by khazem on Nov 6 2016, 8:02 PM.

Details

Summary

This patch adds a new Static Analyzer checker for the correct use of mutexes in the Magenta kernel.The checker analyzes uses of mutex_init, mutex_acquire, mutex_release, and mutex_destroy. It performs the core checks on usage of these APIs on mutex to make sure they are correct. For instance, we want to make sure mutexes are always initialized in the constructor and destroyed in the destructor. Also, we do not want to have paths where a mutex is released while not having been acquired first.

The checker was authored by Farid Molazem Tabrizi.

Diff Detail

Event Timeline

khazem updated this revision to Diff 77004.Nov 6 2016, 8:02 PM
khazem retitled this revision from to [analyzer] Add MutexChecker for the Magenta kernel.
khazem updated this object.
khazem added subscribers: phosek, smklein.
NoQ added a subscriber: NoQ.Nov 7 2016, 9:17 AM
NoQ added a comment.Nov 7 2016, 4:59 PM

It's great to see more domain-specific checks coming in! We're glad to be useful. The code is also well-commented, and a lot of tests are provided, which is great.

This checker seems to be relatively similar to the existing alpha.unix.PthreadLock checker (which also handles XNU kernel locks already). In fact, the part that checks init-lock-unlock-destroy paths might actually fit in there by simply adding more function identifiers; it also shares the same problems, like not supporting pointer escapes (if a pointer to a mutex escapes to an external function, this function may lock or unlock the mutex without us knowing), or never cleaning up the program state maps (even when regions are dead and could no longer be referenced - useful for performance, and also may find more defects, eg. mutex region becoming dead in a locked state is probably a problem). So i think there's an opportunity for some code re-use, if you're interested (it wouldn't immediately benefit any of the checkers, but any later changes would naturally land into both checkers). And the escape and cleanup problems should be easy to address.

I see you also have a feature that's not covered by the existing checker, which is checking the constructor-destructor lock-unlock match. I suspect that this sub-check needs a bit more testing. My first concern is that checkEndFunction is a callback that fires once per path through end of function, not once per function. Here's a test i could come up with, which currently fails - i'm deliberately tricking the analyzer into analyzing branches in different order by alternating between if-branch of a condition and else-branch of an inverted condition; i think there may be more false positives of this kind if there's more of the complicated control flow in constructors/destructors:

bool glob;

class Foo {
  mutex_t lock;

public:
  Foo() {
    if (glob) {
    } else {
      mutex_init(&lock);
    }
  }
  ~Foo() {
    if (!glob) {
      mutex_destroy(&lock);
    } else {
    }
  }
};

void foo() {
  Foo F; // Warns "Mutex was not destroyed in the destructor", perhaps shouldn't.
}

I also have another kind of concern for this sub-check: it relies on the core analyzing all paths through the constructor/destructor. However, the analyzer never promises to cover all paths during analysis, and in fact reserves the right to drop paths arbitrarily - this is a kind of fundamental limitation for the symbolic execution technique, otherwise the analysis would never terminate on most code. So additional hacks might be needed to avoid false positives that may arise if, for example, the path on which the mutex_destroy call in the destructor was completely skipped during analysis. It should be possible to hang up to checkEndAnalysis to see if there is extra stuff in the worklist at the end, which would indicate that some paths were dropped.

It also might be a good idea to make this check path-insensitive instead, which should be more reliable and transparent for this case - simply scan the AST of constructor and destructor bodies for references to mutex variables within init/destroy call expressions, compose the sets of variable declarations referenced that way, and compare those.

To summarize, depending on how much time you'd like to invest into this checker, and if any of the thoughts above sounded as a useful feedback, in no particular order and with no urgency and without really blocking this patch from landing you may want to:

  1. Try to merge with PthreadLock checker,
  2. Handle mutex pointer escapes,
  3. Handle dead symbols,
  4. Move constructor-destructor init-destroy sets to the program state to make them path-aware,
  5. Or otherwise move the checks to checkEndOfTranslationUnit to make sure all paths were covered;
  6. Check if any paths were skipped during analysis (not sure if it'd be necessary after 4.),
  7. Maybe make the constructor-destructor init-destroy check AST-based (instead of 4.-6.).

Generally, i think this checker is good in its current shape, and issues above can anyway be addressed later, in an incremental manner. Unless you prefer to make major changes to this patch at this point, i should be able to follow up with minor comments.

lib/StaticAnalyzer/Checkers/MutexChecker.cpp
228

Before i forget: this API is a bit tricky, and there's a common but completely unobvious mistake that you're making here. reportMutexError(..., false) is producing a transition to a non-fatal error node. The whole point of the non-fatal error node is that analysis continues past this node. After adding the transition on this line, you effectively split the analysis into two paths, one following the non-fatal error node, another containing the MutexMap change.

This doubles up the time it takes to analyze the rest of the code (growing exponentially if encountered multiple times; probably rises to the node limit very soon, destroying coverage), and the additional path through the error node is not expected here (but it might be for some checkers, so the API is not quite at fault) - the behavior of the checker on this path might also be unexpected (it essentially skips the initialization of the mutex).

The right way to do this is to remember the exploded node generated by generateNonFatalErrorNode(), and put the new transition on top of that node (if the report has actually been thrown) by using the addTransition(St, N) override (or make the transition normally if it isn't).

I really wish there was a good and universal solution to make the transition API more obvious, feel sorry about that every time i see one of those problems.

Thanks for the detailed review Artem!

I think that it's sensible to try merging this with PthreadLockChecker. In fact, I could also try merging the SpinLockChecker [1] that I've posted for review with PthreadLockChecker also, since they all implement similar functionality. This would be in line with what we discussed at the developer meeting, where there was a consensus that there could be too many domain-specific checkers that were only different in what function names they look for.

Here is my proposal: I could try refactoring the PthreadLockChecker into a LockChecker class, which would be instantiated as various subclasses (for Pthread, XNU, Magenta, and others in the future). The common functionality would be implemented in LockChecker, but the user would ask for a specific checker using command-line flags. Using subclasses would hopefully prevent the class from becoming a mess of if/switch statements, while allowing future LockCheckers to be implemented for other lock APIs. Does this seem reasonable?

I could then have a go at the other improvements that you mentioned, and they would hopefully apply to all LockCheckers and not just our Magenta one.

[1] https://reviews.llvm.org/D26340

NoQ added a comment.Nov 9 2016, 3:49 PM

I think that it's sensible to try merging this with PthreadLockChecker. In fact, I could also try merging the SpinLockChecker [1] that I've posted for review with PthreadLockChecker also, since they all implement similar functionality.

That'd be great!

Here is my proposal: I could try refactoring the PthreadLockChecker into a LockChecker class, which would be instantiated as various subclasses (for Pthread, XNU, Magenta, and others in the future). The common functionality would be implemented in LockChecker, but the user would ask for a specific checker using command-line flags. Using subclasses would hopefully prevent the class from becoming a mess of if/switch statements, while allowing future LockCheckers to be implemented for other lock APIs. Does this seem reasonable?

We usually go with the if/switch approach (in this case we instantly see what's the difference between the two APIs), and i think we don't do checker inheritance anywhere, but i don't have good reasons to instantly prefer one over the other, so it's be nice to know if inheritance is actually better here.

Note that if all you want is to enable/disable platform-specific checks separately with -analyzer-checker options, or have separate lines for the checks in Checkers.td (and auto-enable magenta checkers if the magenta target platform is detected), then you don't necessarily need to produce separate Checker<> classes/instances in your C++ code for that. We can have different checkers point to the same C++ checker object. See unix.Malloc and cplusplus.NewDelete as an example (they're essentially the same checker) - the tricks are in registerChecker procedure.