I've seen a few false positives that appear because we construct C++11 std::initializer_list objects with brace initializers, and such construction is not properly modeled. For instance, if a new object is constructed on the heap only to be put into a brace-initialized STL container, the object is reported to be leaked.
Approach (0): This can be trivially fixed by this patch, which causes pointers passed into initializer list expressions to immediately escape.
This fix is overly conservative though. So i did a bit of investigation as to how model std::initializer_list better.
According to the standard, std::initializer_list<T> is an object that has methods begin(), end(), and size(), where begin() returns a pointer to continous array of size() objects of type T, and end() is equal to begin() plus size(). The standard does hint that it should be possible to implement std::initializer_list<T> as a pair of pointers, or as a pointer and a size integer, however specific fields that the object would contain are an implementation detail.
Ideally, we should be able to model the initializer list's methods precisely. Or, at least, it should be possible to explain to the analyzer that the list somehow "takes hold" of the values put into it. Initializer lists can also be copied, which is a separate story that i'm not trying to address here.
The obvious approach to modeling std::initializer_list in a checker would be to construct a SymbolMetadata for the memory region of the initializer list object, which would be of type T* and represent begin(), so we'd trivially model begin() as a function that returns this symbol. The array pointed to by that symbol would be bindLoc()ed to contain the list's contents (probably as a CompoundVal to produce less bindings in the store). Extent of this array would represent size() and would be equal to the length of the list as written.
So this sounds good, however apparently it does nothing to address our false positives: when the list escapes, our RegionStoreManager is not magically guessing that the metadata symbol attached to it, together with its contents, should also escape. In fact, it's impossible to trigger a pointer escape from within the checker.
Approach (1): If only we enabled ProgramState::bindLoc(..., notifyChanges=true) to cause pointer escapes (not only region changes) (which sounds like the right thing to do anyway) such checker would be able to solve the false positives by triggering escapes when binding list elements to the list. However, it'd be as conservative as the current patch's solution. Ideally, we do not want escapes to happen so early. Instead, we'd prefer them to be delayed until the list itself escapes.
So i believe that escaping metadata symbols whenever their base regions escape would be the right thing to do. Currently we didn't think about that because we had neither pointer-type metadatas nor non-pointer escapes.
Approach (2): We could teach the Store to scan itself for bindings to metadata-symbolic-based regions during scanReachableSymbols() whenever a region turns out to be reachable. This requires no work on checker side, but it sounds performance-heavy.
Approach (3): We could let checkers maintain the set of active metadata symbols in the program state (ideally somewhere in the Store, which sounds weird but causes the smallest amount of layering violations), so that the core knew what to escape. This puts a stress on the checkers, but with a smart data map it wouldn't be a problem.
Approach (4): We could allow checkers to trigger pointer escapes in arbitrary moments. If we allow doing this within checkPointerEscape callback itself, we would be able to express facts like "when this region escapes, that metadata symbol attached to it should also escape". This sounds like an ultimate freedom, with maximum stress on the checkers - still not too much stress when we have smart data maps.
Does anybody like/hate any of these solutions or have anything better in mind? I'm personally liking the approach (2) - it should be possible to avoid performance overhead, and clarity seems nice.
explicit ?