This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Revert D51397 "Remove the "postponed" hack...".
ClosedPublic

Authored by NoQ on Jan 31 2019, 4:28 PM.

Details

Summary

Not much to see in the code since it's a revert, and i already spoiled it a bit in D51397#1377975 but i guess there are a few more things to explain here, so let's start from the beginning.

The new MallocChecker test case:

1799   struct A {
1800     int *buf;
1801   };
1802   struct B {
1803     struct A *a;
1804   };
1805   void livenessBugRealloc(struct A *a) {
1806     a->buf = realloc(a->buf, sizeof(int)); // false use-after-free
1807   }
1808   void testLivenessBug(struct B *in_b) {
1809     struct B *b = in_b;
1810     livenessBugRealloc(b->a);
1811    ((void) 0);
1812     livenessBugRealloc(b->a);
1813   }

Here we have our Store suddenly forget that in_b->a->buf is overwritten by the relocated value &HeapSymRegion{conj_$6<void *>}, i.e. drops the respective binding, and during the second call it still thinks that it contains its initial value that, by definition, is &SymRegion{reg_$2<SymRegion{reg_$1<SymRegion{reg_$0<in_b>}.a>}.buf>}. At the same time, the symbol reaper does not perceive any of reg_$2, reg_$1, reg_$0 as dead, which is kinda weird, but we'll get to that later. Because none of these are dead, MallocChecker still keeps track of reg_$2 as released (and reallocated to conj_$6) and diagnoses a use-after-free of memory at SymRegion{reg_$2} during the second function call.

What does D51397 have to do with this? Well, the main thing here is that the whole chain of bindings

(in_b, 0, direct) : &SymRegion{reg_$0<in_b>}
(SymRegion{<reg_$0>}, 0, direct) : &SymRegion{<reg_$1<SymRegion{reg_$0<in_b>}.a>}
(SymRegion{<reg_$1<SymRegion{reg_$0<in_b>}.a>}, 0, direct) : SymRegion{reg_$2<SymRegion{reg_$1<SymRegion{reg_$0<in_b>}.a>}.buf>}

is never explicitly written down in the Store. It is simply implied to be there, and if we tried to always write these down, we would actually have to write down infinite chains of regions in some cases, which isn't very productive.

But it's bad that these bindings are implicit because Store's mark-and-sweep garbage collection worklist wasn't designed to walk through these implicit bindings - instead they were handled by implicit assumptions that if region R is live, it means that reg_$k<R> is also live, which means that SymRegion{reg_$k<R> is also live, then repeat the same process for R' = SymRegion{reg_$k<R>} etc.

Previously, the UpdatePostponed mechanism kept track of every Store binding explicitly and made sure every binding is queried for being live at least once, so that all of these implicit keep-alive relationships had a chance to kick in. Once D51397 has landed, instead for every region R that contains a value of a pointer-type we put SymRegion{reg_$k<R>} into the worklist, believing that we handle all the implicit assumptions that way. No, we don't, not quite.

When D51397 puts SymRegion{<reg_$0>} into the worklist, it assumes that there is at least one explicit binding within SymRegion{reg_$k<R>} that would cause the process to continue. However, in our case there are no bindings in SymRegion{reg_$0<in_b>}, so the worklist keeps reg_$0<in_b> around and then exits without ever visiting the third binding in our list. For that reason, even though the base region for the third binding is live, it gets dropped from the Store.

I don't immediately see how to fix that without re-inventing the postponed list, so i guess we'd have to revert for now. Well, that's the traditional outcome of trying to optimize removeDeadBindings().


The other test demonstrates the problem a bit more straghtforwardly, but note that the SYMBOL DEAD part of it did not fail before. Only the part with warnIfReached was failing. So the point here is, we're not fixing liveness analysis as such here - we're only fixing the RegionStore.

Now, let's ask how come that we never bothered to ask whether the key is live before removing the binding from the Store? At the very least, we could have put an assert that the binding we're removing isn't live, right? This would indeed be a great assert to have, but we still cannot put it because other things don't work the way we expect.

Namely, SymbolReaper::isLiveRegion() in fact makes quite a bit of errors, leaning towards telling that something is live more often than necessary. At least, have a look at those return true; lines that would have caused bindings to this-region in C++ never get cleaned up, even after the method call has ended. SymbolReaper may actively contradict decisions of the Store, and the Store doesn't consult it while removing bindings; instead, it only keeps bindings that it has manually visited while running its worklist. Usually the Store ends up doing the right thing (well, in this example it didn't), but it'd be great to get these two in sync. Once we do, we might actually try to re-apply D51397 - not in its original shape but i suspect that overall idea might still work somehow.

Diff Detail

Repository
rL LLVM

Event Timeline

NoQ created this revision.Jan 31 2019, 4:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2019, 4:28 PM
george.karpenkov accepted this revision.Jan 31 2019, 4:31 PM
This revision is now accepted and ready to land.Jan 31 2019, 4:31 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2019, 3:57 PM