This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][UninitializedObjectChecker] Correct dynamic type is acquired for record pointees
ClosedPublic

Authored by Szelethus on Aug 17 2018, 5:16 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

Szelethus created this revision.Aug 17 2018, 5:16 AM
Szelethus added inline comments.Aug 17 2018, 5:17 AM
test/Analysis/cxx-uninitialized-object-inheritance.cpp
787 ↗(On Diff #161215)

The checker should be able to catch this one -- for some reason it is regarded as an unknown region. Odd, as the test case right after this one works perfectly.

NoQ added inline comments.Aug 17 2018, 12:01 PM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
187–191 ↗(On Diff #161215)

This looks like one more situation where we dereference a location to get a value and then struggle to get back to the location that we've dereferenced by looking at the value. Can we just use V?

test/Analysis/cxx-uninitialized-object-inheritance.cpp
787 ↗(On Diff #161215)

There's a variety of problems we have with empty base classes, might be one of those, and they are usually easy to fix because, well, yes it's a special case, but it's also an extremely simple case.

I encourage you to open up the Exploded Graph and study it carefully to see what and where goes wrong (not for this revision).

Szelethus added inline comments.Aug 21 2018, 12:48 PM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
187–191 ↗(On Diff #161215)

I've struggled with derefencing for months now -- I'm afraid I just don't really get what you'd like to see here.

Here's what I attempted to implement:
I'd like to obtain the pointee's region of a Loc region, even if it has to be casted to another type, like through void pointers and nonloc::LocAsInteger, and continue analysis on said region as usual.

The trickiest part I can't seem to get right is the acquisition of the pointee region. For the problem this patch attempts to solve, even though DynT correctly says that the dynamic type is DynTDerived2 *, DerefdV contains a region for DynTBase.

I uploaded a new patch, D51057, which hopefully settles derefence related issues. Please note that it does not replace this diff, as the acquired region is still of type DynTBase.

I find understanding these intricate details of the analyzer very difficult, as I found very little documentation about how this works, which often left me guessing what the proper way to do this is. Can you recommend some literature for me on this field?

NoQ added inline comments.Aug 21 2018, 1:32 PM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
187–191 ↗(On Diff #161215)

Can you recommend some literature for me on this field?

This is pretty specific to our analyzer. SVal/SymExpr/MemRegion hierarchy is tightly coupled to implementation details of the RegionStore, which is our memory model. There's a paper on it [1]. We have some in-tree documentation of the RegionStore [2] (other docs there are also interesting to read). And there's my old workbook [3]. And i guess that's it.

[1] Xu, Zhongxing & Kremenek, Ted & Zhang, Jian. (2010). A Memory Model for Static Analysis of C Programs. 535-548. 10.1007/978-3-642-16558-0_44.
[2] https://github.com/llvm-mirror/clang/blob/master/docs/analyzer/RegionStore.txt
[3] https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf

Szelethus added inline comments.Aug 23 2018, 10:17 AM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
187–191 ↗(On Diff #161215)

Thank you! I'm aware of these works, and have read them already.

The actual implementation of the analyzer is most described in your book, and I've often got the best ideas from that. However, some very well described things in there have absolutely no documentation in the actual code -- for example, it isn't obvious at all to me what CompundValue is, and I was surprised to learn what it does from your guide. Other examples are the difference between TypedValueRegions getLocationType and getValueType, SymExpr in general, and so on.

I think it would also be very valuable to have a link in docs/analyzer to your book. On another note, would you mind if I were to put some of the information from there into the actual code?

test/Analysis/cxx-uninitialized-object-inheritance.cpp
787 ↗(On Diff #161215)

I'd love to learn about other parts of the analyzer (besides chasing pointers), but for now, this seems to have fixed itself ;)

NoQ accepted this revision.Aug 24 2018, 4:09 PM
NoQ added inline comments.
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedPointee.cpp
187–191 ↗(On Diff #161215)

I guess a link on the website would be good. We should also convert a lot of our comments to doxygen comments, because some people do read doxygen (including myself).

I don't advice copy-paste-ing texts from the workbook to the code because i picked a weird licensing scheme (no idea why i picked a -BY license) and wasn't straight enough about authorship (so i don't feel i can change it anymore). But actually documenting things that are currently not documented is something i'd greatly appreciate :)

This revision is now accepted and ready to land.Aug 24 2018, 4:09 PM
Szelethus updated this revision to Diff 162658.Aug 27 2018, 4:53 AM
Szelethus marked 5 inline comments as done.

Actually, the note messages were incorrect as in this case the static and dynamic type of the object differs.

NoQ added inline comments.Aug 28 2018, 1:01 PM
test/Analysis/cxx-uninitialized-object-inheritance.cpp
802 ↗(On Diff #162658)

Mmm, what's the value of casting to derived type and then specifying that we access the field of the base type anyway? Isn't this->bptr->x exactly what the user needs to know(?)

Szelethus added inline comments.Aug 28 2018, 2:13 PM
test/Analysis/cxx-uninitialized-object-inheritance.cpp
802 ↗(On Diff #162658)

True, but it's a one tough job to write this->bptr->x here and also a correct note message for...

805 ↗(On Diff #162658)

...this one. I'll see how I could achieve it without disrupting FieldNode's interface too much.

NoQ added inline comments.Aug 29 2018, 11:42 AM
test/Analysis/cxx-uninitialized-object-inheritance.cpp
802 ↗(On Diff #162658)

I guess don't try too hard, eg. say if it requires something of non-linear complexity it's probably not worth it (not because it'd be slow but because it'd be an indication that it might be not worth the effort).

Szelethus added inline comments.Aug 29 2018, 12:21 PM
test/Analysis/cxx-uninitialized-object-inheritance.cpp
802 ↗(On Diff #162658)

I actually invested some effort into this and I'm fairly certain I could pull it off with O(n) complexity, but I'll probably just place a TODO in the code for now, as I have other things I really want to get fixed first.

NoQ accepted this revision.Aug 29 2018, 12:23 PM

Let's commit then?

Szelethus updated this revision to Diff 163479.EditedAug 31 2018, 1:39 AM

Fixed a crash, where the super region was symbolic.

In D50892#1218060, @NoQ wrote:

Let's commit then?

I'd be much more comfortable landing D51057 before modifying dereferencing any further (hence the dependency).

Szelethus marked an inline comment as done.Sep 10 2018, 11:08 AM
Szelethus added inline comments.
test/Analysis/cxx-uninitialized-object-inheritance.cpp
802 ↗(On Diff #162658)

I think I'll leave this as is, it could be done, but it wouldn't be nice, and it wouldn't make anyone's life significantly easier. At least this report tells you the dynamic type of this->bptr, so I guess that's something.

Szelethus marked 8 inline comments as done.Sep 10 2018, 11:09 AM
This revision was automatically updated to reflect the committed changes.