Page MenuHomePhabricator

[analyzer] Track multiple raw pointer symbols in DanglingInternalBufferChecker

Authored by rnkovacs on Jul 8 2018, 10:06 AM.



Previously, the checker only tracked one raw pointer symbol for each container object.
But member functions returning a pointer to the object's inner buffer may be called on the object several times.
These pointer symbols are now collected in a set inside the program state map and thus all of them is checked for use-after-free problems.

Diff Detail

Event Timeline

rnkovacs created this revision.Jul 8 2018, 10:06 AM
xazax.hun accepted this revision.Jul 8 2018, 10:21 AM

Overall looks good, some nits inline. Let's run it on some projects to exercise this change.


I think we should use using now instead of typedef.


Using both contains and get will result in two lookups. Maybe it would be better to just use get and check if the result is nullptr?


Is it possible here the set to be empty? We just added a new element to it above. Maybe turn this into an assert or just omit this if it is impossible?


Same comment as above.


We started to use lowercase letters for variable names. Maybe this is not the best, since it is not following the LLVM coding guidelines. So I think it would be better to rename this Cond to start with a lowercase letter in this patch for consistency, and update the tests to follow the LLVM coding styleguide in a separate patch later.

This revision is now accepted and ready to land.Jul 8 2018, 10:21 AM
rnkovacs updated this revision to Diff 154519.Jul 8 2018, 10:57 AM
rnkovacs marked 5 inline comments as done.
rnkovacs edited the summary of this revision. (Show Details)

Addressed comments.


I'm not sure whether add() can fail. I turned it into an assert now and will see if it ever fails.

Thanks! The changes look good, I forgot to mark one double lookup though in my previous review.


Oh, there is also a double lookup here, sorry I did not spot it in the initial review.

rnkovacs updated this revision to Diff 154520.Jul 8 2018, 11:13 AM
rnkovacs marked an inline comment as done.
rnkovacs added inline comments.

Thanks for noticing. I should have seen it in the first place :)

NoQ added a comment.Jul 8 2018, 12:25 PM

Much symbols!


Please add a comment on how this template is useful. This trick is used by some checkers, but it's a veeeery unobvious trick. We should probably add a macro for that, i.e. REGISTER_FACTORY_WITH_PROGRAMSTATE or something like that.


Maybe turn the FIXME into a NOTE or something like that. It indeed seems fine, but let's still make sure the reader realizes that there's a subtle potential problem here.


I'd much rather unwrap the symbol here, i.e. if (SymbolRef Sym = RawPtr.getAsSymbol()). A lot of other cornercases may occur if the implementation is accidentally inlined (Undefined, concrete regions). Also, speculatively, .getAsSymbol(/* search for symbolic base = */ true) for the same reason*.

If we didn't care about inlined implementations, the Unknown check would have been redundant. So it should also be safe to straightforwardly ignore inlined implementations by consulting C.wasInlined, then the presence of the symbol can be asserted. But i'd rather speculatively care about inlined implementations as long as it seems easy.

* In fact your code relies on a very wonky implementation detail of our SVal hierarchy: namely, pointer-type return values of conservatively evaluated functions are always expressed as &SymRegion{conj_$N<pointer type>} and never as &element{SymRegion{conj_$N<pointer type>}, 0 S32b, pointee type}. Currently nobody knows the rules under which zero element regions are added in different parts of the analyzer, i.e. what is the "canonical" representation of the symbolic pointer, though i made a few attempts to speculate.


My favorite idiom for such cases, looks a bit cleaner:

const PtrSet *SetPtr = State->get<RawPtrMap>(ObjRegion)
PtrSet Set = SetPtr ? *SetPtr : F.getEmptySet()

This contains check is very unlikely to yield true because if the implementation is not inlined, the symbol is newly born. I'd much rather skip the check; it doesn't sound like it'd make things any faster.

Even better, let's assert that: assert(C.wasInlined || !Set.contains(Sym)). It'll probably help us catch some "reincarnated symbol" bugs in the core.


It totally can't. We're just adding an object to a set. What could possibly go wrong?

rnkovacs updated this revision to Diff 154556.Jul 9 2018, 2:33 AM
rnkovacs marked 5 inline comments as done.

Thanks very much for your review!


Should I do that then? Maybe in CheckerContext.h?


I wasn't aware of much of this. Thanks for the detailed explanation :)

NoQ accepted this revision.Jul 10 2018, 11:06 AM

Looks great, thanks!

This revision was automatically updated to reflect the committed changes.