This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Remove the "postponed" hack, deal with derived symbols using an extra map
ClosedPublic

Authored by george.karpenkov on Aug 28 2018, 5:44 PM.

Details

Summary

The "derived" symbols indicate children fields of a larger symbol.
As parents do not have pointers to their children, the garbage collection algorithm the analyzer currently uses adds such symbols into a "postponed" category, and then keeps running through the worklist until the fixed point is reached.

The current patch rectifies that by instead using a helper map which stores pointers from parents to children, so that no fixed point calculation is necessary.

The current patch yields ~5% improvement in running time on sqlite.

Diff Detail

Repository
rL LLVM

Event Timeline

NoQ added inline comments.Aug 29 2018, 2:24 PM
clang/lib/StaticAnalyzer/Core/RegionStore.cpp
2509 ↗(On Diff #163000)

I suspect that this is unnecessary. We are guaranteed that R is live, therefore we are guaranteed that RVS is live.

2539–2541 ↗(On Diff #163000)

Loc::isLocType()?

2544 ↗(On Diff #163000)

Let's make whitespace consistent.

2548–2549 ↗(On Diff #163000)

The constructor of SymbolicRegion contains an assertion that no other symbolic regions are possible.

Let's add comment to that assertion that RegionStore now relies on that and needs to be updated if more cases are introduced.

george.karpenkov marked 4 inline comments as done.
NoQ accepted this revision.Sep 6 2018, 3:20 PM
NoQ added inline comments.
clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h
773 ↗(On Diff #163238)

Actually, let's assert even more: in the HeapSpaceRegion case, s is always SymbolConjured.

Also maybe let's point directly to populateWorklistFromSymbol() in the comment, otherwise nobody would have an idea what to look for, but if we point to a function then even if the function gets renamed the reader would be able to pick it up from the history.

clang/lib/StaticAnalyzer/Core/RegionStore.cpp
2527 ↗(On Diff #163238)

Formatting.

This revision is now accepted and ready to land.Sep 6 2018, 3:20 PM
This revision was automatically updated to reflect the committed changes.
NoQ added inline comments.Jan 30 2019, 3:32 PM
cfe/trunk/lib/StaticAnalyzer/Core/RegionStore.cpp
2539–2543

Argh, this isn't enough.

In a nutshell, this code says "Uhm, is this symbol $x (say, reg_$N<x>) now live and it is a pointer? Ok, RegionStore, re-check the symbolic region *$x (aka SymRegion{reg_$N<x>})." Which is good. By "re-check" we mean "add it to the RegionStore's worklist", which would cause re-exploration of *bindings* within it. However, not every value stored in a region is a binding within the region! For instance, all values within **$x (aka SymRegion{reg_$M<SymRegion{reg_$N<x>}} are also kept alive, which we will fail to mark live when, say, *x has no bindings at all - SymbolRegionValue is still presumed to be there.

It would have been fine if it was just sub-regions, but in fact an infinite tree of *base* regions (namely, symbolic regions of non-assigned/invalidated pointer-type sub-regions) also become live every time a region becomes live, while only a finite amount of base regions become reachable when the traversal method implemented here gets applied.