This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] A fix for symbolic element region index lifetime.
ClosedPublic

Authored by NoQ on Sep 9 2015, 5:47 AM.

Details

Summary

In Clang Static Analyzer, when the symbol is referenced by an index value of an element region, it does not prevent garbage collection (reaping) of that symbol. Hence, if the element index value is the only place where the symbol is stored, the symbol gets reaped, and range information is removed from the constraint manager.

This time, i'm not sure if it's the right place to put the fix; probably markLive() itself should handle it transparently, or something like that.

I managed to construct a regression test with the ReturnPtrRange checker. The second test shows that not only the store, but also the environment needs to be fixed.

Diff Detail

Repository
rL LLVM

Event Timeline

NoQ updated this revision to Diff 34322.Sep 9 2015, 5:47 AM
NoQ retitled this revision from to [analyzer] A fix for symbolic element region index lifetime..
NoQ updated this object.
NoQ added reviewers: zaks.anna, krememek.
NoQ added subscribers: cfe-commits, dergachev.a.
NoQ updated this object.Sep 9 2015, 6:14 AM
NoQ added a reviewer: jordan_rose.
jordan_rose edited edge metadata.Sep 14 2015, 8:33 PM

Hm, interesting. I'm not sure this is even sufficient, though: what if I have a FieldRegion that's a sub-region of an ElementRegion with a symbolic index? RegionStore does everything by base regions, so we won't ever see that intermediate region with the symbolic index.

NoQ updated this revision to Diff 35067.Sep 18 2015, 12:57 AM
NoQ edited edge metadata.

Thanks for the quick reply, sorry for the delay! Was afk for a couple of days.

Yeah, right, in fact i didn't even fix the issue for store keys at all; only for store values and environment values.

It also seems much harder to test store keys, because it's quite a problem to guess the symbolic key once the symbol is not present anywhere else, though i can imagine an artificial checker that would rely on that. A test like...

int a[1];
{
  int x = conjure_index();
  a[x] = 0;
  if (x != 0)
    return;
  clang_analyzer_eval(a[0] == 0); // expected-warning{{TRUE}}
}
clang_analyzer_eval(a[0] == 0); // expected-warning{{TRUE}}

...should have exposed such problem, but this kind of lookup doesn't seem to be supported by the store yet (that is, the first expected-warning{{TRUE}} fails as well).

Hmm, what if i expand the debug.ExprInspection checker to allow testing SymbolReaper directly? Updated the diff with a proof of concept, which fixes the issue for the store keys and adds a test. I can split the ExprInspection change into a separate commit/review if necessary. It might be useful for testing other SymbolReaper-related patches as well.

zaks.anna edited edge metadata.Oct 8 2015, 2:31 PM

Now that we have a way to test symbol reaper, please, add more coverage to the symbol-reaper.c test, including the test that Jordan mentioned. Even if it is not fixed, it's good to include it with a FIXME note.

What is the performance impact of this change?

The changes to the RegionStore and Environment seem inconsistent and repetitive.. For example, we can remove a lot of this just by changing SymbolReaper::markLive(const MemRegion *region), see below. How is this patch different from your solution? Can it be generalized to handle more cases?

diff --git a/lib/StaticAnalyzer/Core/RegionStore.cpp b/lib/StaticAnalyzer/Core/RegionStore.cpp
index 49b5ac3..55db545 100644
--- a/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -2320,8 +2320,14 @@ void removeDeadBindingsWorker::VisitCluster(const MemRegion *baseR,
   if (const SymbolicRegion *SymR = dyn_cast<SymbolicRegion>(baseR))
     SymReaper.markLive(SymR->getSymbol());
 
-  for (ClusterBindings::iterator I = C->begin(), E = C->end(); I != E; ++I)
+  for (ClusterBindings::iterator I = C->begin(), E = C->end(); I != E; ++I) {
+    // Mark the key of each binding as live.
+    const BindingKey &K = I.getKey();
+    if (auto subR = dyn_cast<SubRegion>(K.getRegion()))
+      SymReaper.markLive(subR);
+
     VisitBinding(I.getData());
+  }
 }
 
 void removeDeadBindingsWorker::VisitBinding(SVal V) {
@@ -2342,6 +2348,7 @@ void removeDeadBindingsWorker::VisitBinding(SVal V) {
   // If V is a region, then add it to the worklist.
   if (const MemRegion *R = V.getAsRegion()) {
     AddToWorkList(R);
+    SymReaper.markLive(R);
 
     // All regions captured by a block are also live.
     if (const BlockDataRegion *BR = dyn_cast<BlockDataRegion>(R)) {
diff --git a/lib/StaticAnalyzer/Core/SymbolManager.cpp b/lib/StaticAnalyzer/Core/SymbolManager.cpp
index df4d22a..793f53e 100644
--- a/lib/StaticAnalyzer/Core/SymbolManager.cpp
+++ b/lib/StaticAnalyzer/Core/SymbolManager.cpp
@@ -391,6 +391,16 @@ void SymbolReaper::markLive(SymbolRef sym) {
 
 void SymbolReaper::markLive(const MemRegion *region) {
   RegionRoots.insert(region);
+  
+  // Mark the element index as live.
+  if (const ElementRegion *ER = dyn_cast<ElementRegion>(region)) {
+    SVal Idx = ER->getIndex();
+    for (SymExpr::symbol_iterator SI = Idx.symbol_begin(),
+         SE = Idx.symbol_end();
+         SI != SE; ++SI) {
+      markLive(*SI);
+    }
+  }
 }
lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp
177 ↗(On Diff #35067)

Thank you for adding this!!! This can be part of this commit or separate; up to you.

Please, update the debugger checkers document in the docs folder with information on how this should be used.

lib/StaticAnalyzer/Core/RegionStore.cpp
2241 ↗(On Diff #35067)

Please, remove unnecessary spacing here and in a couple of other places.

2247 ↗(On Diff #35067)

Why symbol_iterator is not used here?

2249 ↗(On Diff #35067)

Could you add test cases that explain why we need the while loop here?

NoQ updated this revision to Diff 36946.Oct 9 2015, 7:42 AM
NoQ edited edge metadata.
NoQ marked 4 inline comments as done.

Thanks for the review! Fixed all comments. I don't notice any significant performance hit with this patch (+/- 2% on the whole Android codebase runs).

I guess i won't break this into separate reviews, because less work =)

The loop through super-regions is necessary for supporting the test proposed by Jordan Rose, which is now added. Not sure what other tests are worth adding; together with specific tests on ReturnPtrRange, we're covering all three places that were patched; maybe i could move these tests to symbol-reaper.c as well, but i guess it's more worth it to test the actual analyzer output than artificially dumped internals(?) Not sure.

The change in Environment.cpp was removed because it's now handled automatically by markLive().

I don't notice any significant performance hit with this
patch (+/- 2% on the whole Android codebase runs).

What did you test it on?

we're covering all three places that were patched; maybe i could move these tests
to symbol-reaper.c as well, but i guess it's more worth it to test the actual analyzer output than
artificially dumped internals(?) Not sure.

I'd add all necessary tests to the symbol-reaper.c and duplicate one of them in return-ptr-range.cpp to see that it has an effect on bug reporting.

The change in Environment.cpp was removed because it's now handled automatically by markLive().

This patch is different from what I sent. In my patch, I was marking the region as live and your latest patch is marking the indexes live without marking the region in some cases. What are the implications in each case?

docs/analyzer/DebugChecks.rst
143 ↗(On Diff #36946)

What does subscribe mean? How do you decide when to subscribe/start tracking the symbol? Are you supposed to do that when you know the symbol is live?

NoQ updated this revision to Diff 37094.Oct 12 2015, 4:24 AM
NoQ marked an inline comment as done.

What are the implications in each case?

Uhm, sorry, right, I guess i have to admit i don't quite understand what exactly is going on, so it'd take more time for me to conduct a deeper audit of the reaping code to answer this question. In fact, right now i'm not even sure that

// If the block expr's value is a memory region, then mark that region.
if (Optional<loc::MemRegionVal> R = X.getAs<loc::MemRegionVal>())
  SymReaper.markLive(R->getRegion());

in EnvironmentManager::removeDeadBindings() does anything at all (at least, it's not covered by tests).

For now i just leave the patch here that fixes the tests with minimal intrusion (instead of patching markLive(), this diff patches only specific places), other issues you pointed out are fixed as well. Hopefully, will have time to dig into this in the nearest future :)

I don't notice any significant performance hit with this
patch (+/- 2% on the whole Android codebase runs).

What did you test it on?

On the source code of the whole Android Open Source Project (version 5); i record analyzer commands after the first scan-build, so that not to rebuild Android for every analysis (and only substitute clang executable and set of checkers), and the observed time of analysis is 1 hour 6 minutes (expensive hardware...) both before and after the patch is applied, so the difference in time seems to be less than one minute, i.e. less than 2%. And i guess it should be large and varied enough(?)

so it'd take more time for me to conduct a deeper audit of the reaping code to answer this question

That would be great, thanks!

And i guess it should be large and varied enough(?)

Yes.

NoQ updated this revision to Diff 37263.Oct 13 2015, 9:53 AM

Hmm, i think i'm ready to explain most of the stuff.

  • First of all, the piece of code in EnvironmentManager::removeDeadBindings() i mentioned above is truly useless; everything would be marked as live anyway on the next line.
  • Then, in RegionStore, in VisitBinding(), markLive() of the whole region value inside the binding is indeed correct, and i provide a pretty straightforward test;
  • Finally, in VisitCluster(), it's a bit more complicated, and i suggest that markLive() is unnecessary here, and in fact does very little if anything, so it's hard to test even with stuff we have now:
    • i guess we shouldn't think of a region as live, simply because it has bindings, in a procedure designed for removing bindings;
    • however, marking the region as live here wouldn't save the bindings; they would later be removed simply because they weren't visited;
    • liveness of the region would also extend lifetime of a few symbols that depend on it, such as its SymbolRegionValue, SymbolExtent, and SymbolMetadata, however because the binding would be dead anyway, these symbols would die on the next pass;
      • note that SymbolRegionValue should explicitly be dead when the region has other bindings, so we certainly shouldn't try to preserve the region only to save SymbolRegionValue;
    • another place that relies on region liveness information collected on these passes is Dynamic Type Propagation, however the same arguments apply; it would only delay the inevitable.

So the real question is whether (or rather how) the Store should maintain correct region liveness information after completing its internal garbage collection pass, because there are, in fact, other users of this information later in the chain, but this seems to be a larger problem without instantly noticeable effects.

The new diff removes dead code in Environment and tests and fixes the separate bug in the Store that caused reaping of constraints on SymbolRegionValue (and a few other kinds of SymbolData) too early when the only place where the related region is stored is a store value.

NoQ updated this revision to Diff 37266.Oct 13 2015, 10:05 AM

Whoops accidentally left out a dead code line in tests.

NoQ updated this revision to Diff 37267.Oct 13 2015, 10:08 AM

Awwww, another mis-submit. Sorry for the noise.

zaks.anna accepted this revision.Dec 4 2015, 4:43 PM
zaks.anna edited edge metadata.

LGTM.

(Feel free to add comments to the existing code!)

So the real question is whether (or rather how) the Store should maintain correct region liveness information
after completing its internal garbage collection pass, because there are, in fact, other users of
this information later in the chain, but this seems to be a larger problem without instantly noticeable effects.

Could you give specific (possibly potential) examples of this problem?

We assume that the symbols that are collected do not need to stick around, so no-one "later in the chain" should need them. Except for the cases where we artificially extend liveness by calling addSymbolDependency().

This revision is now accepted and ready to land.Dec 4 2015, 4:43 PM
NoQ added a comment.Dec 7 2015, 2:02 AM

So the real question is whether (or rather how) the Store should maintain correct region liveness information
after completing its internal garbage collection pass, because there are, in fact, other users of
this information later in the chain, but this seems to be a larger problem without instantly noticeable effects.

Could you give specific (possibly potential) examples of this problem?

We assume that the symbols that are collected do not need to stick around, so no-one "later in the chain" should need them. Except for the cases where we artificially extend liveness by calling addSymbolDependency().

What I was trying to say is that after StoreManager iterates over the Store, there are also checkers that may want to make decisions on liveness of their symbols based on liveness of certain regions, by checking SymbolReaper::isLiveRegion(), but currently such behavior is very rare, and thus barely tested. But eventually, with more checkers and further development, we'd need to figure out how this should work (eg. http://reviews.llvm.org/D14277 is a step in this direction - this patch can only work as long as isLiveRegion() is reliable after GC'ing the Store). Theoretically, it still makes sense to me to say that, for example, the Store shouldn't hold its keys as live - for the reasons stated above.

So this is mostly a hand-waving thing, and i think the patch is not immediately affected by this potential debate, and it should be OK to commit it.

This revision was automatically updated to reflect the committed changes.