This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/StaticAnalyzer/Core/
-
StaticAnalyzer/
-
Core/
-
RegionStore.cpp
-
test/Analysis/
-
Analysis/
-
trivial-copy-struct.cpp

Differential D132236

[analyzer] Fix liveness of LazyCompoundVals
AbandonedPublic

Authored by steakhal on Aug 19 2022, 8:20 AM.

Download Raw Diff

Details

Reviewers

NoQ
xazax.hun
martong
ASDenysPetrov
Szelethus
isuckatcs
vabridgers

Summary

The SymbolReaper should consider a region live if that appears in
the store bindings anywhere - even if the region is wrapped by
RegionSymbolVals, SymbolicRegions or LazyCompoundVals.

This mistake by the SymbolReaper materialized in false-positives like
in the following example:

void ptr1(List* n) {
  List* n2 = new List(*n); // cctor
  if (!n->next) {
    if (n2->next) {
      clang_analyzer_warnIfReached(); // FP! This should be unreachable.
    }
  }
  delete n2;
}

The store entry of the pointee of n2 looks something like this:

HeapSymRegion{conj_$1{List *, LC1, S2540, #1}}  0(Default)  lazyCompoundVal{0x5641f1ed8830,Element{SymRegion{reg_$2<List * n>},0 S64b,struct List}}

So, any constraints referring to the VarRegion n supposed to be kept alive by that store binding.
Hence, the analyzer should be able to see that reg_$3<List * Element{SymRegion{reg_$2<List * n>},0 S64b,struct List}.next> { [0, 0] } when it reaches the n2->next.

This patch fixes the issue by reusing the SymbolReaper::markLive(MemRegion) for doing the appropriate traversal.

I'm yet to do the performance testing, given that this is actually in
the hot path.

Co-authored-by: Tomasz Kamiński <tomasz.kaminski@sonarsource.com>

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

steakhal created this revision.Aug 19 2022, 8:20 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2022, 8:20 AM

Herald added subscribers: manas, dkrupp, donat.nagy and 5 others. · View Herald Transcript

steakhal requested review of this revision.Aug 19 2022, 8:20 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2022, 8:20 AM

Herald added a subscriber: cfe-commits. · View Herald Transcript

steakhal added a parent revision: D132143: [analyzer] LazyCompoundVals should be always bound as default bindings.Aug 19 2022, 8:20 AM

Harbormaster completed remote builds in B182229: Diff 454011.Aug 19 2022, 8:59 AM

rebase

Harbormaster completed remote builds in B182863: Diff 454861.Aug 23 2022, 11:32 AM

Nice catch but I don't think this is the right solution.

Symbol liveness corresponds to a concrete runtime property of the program: can the program obtain this value by executing further runtime instructions? The program cannot obtain a pointer to variable n, or even a pointer to a region *n, just by looking at the data in a struct that was copied from region *n previously. Therefore neither n nor *n should not be marked live just because they are mentioned in a base region of a lazy compound value. In particular, memory leaks should be reported against n if the last place it's mentioned is a base region of a lazy compound value, as the program cannot possibly free that memory. You can most likely write a MallocChecker test for that, which will become a false negative after your patch.

For the same reason, for any region R, if reg<R> or derived<R, $x> is live, does not mean R is live. The program cannot guess a memory address by only looking at a value loaded from that address.

If you look at how a simpler example works:

void clang_analyzer_warnIfReached();

void foo(int **n) {
  int *n2 = *n;
  if (!**n) {
    if (*n2) {
      clang_analyzer_warnIfReached();
    }
  }
}

You will notice that it does not try to keep VarRegion n or even the symbol reg_$0<int ** n> alive. It is *sufficient* to keep reg_$1<int * Element{SymRegion{reg_$0<int ** n>},0 S64b,int *}> alive in order to keep the test passing, and it does match the actual definition of live symbol (reg_$0 can no longer be retrieved by the program but reg_$1 can).

The original code tries to do the same for lazy compound values. I suspect that you'll find the actual bug when you descend into getInterestingValues().

What looks fishy about getInterestingValues() is that it assumes that the amount of interesting values is finite. This sounds incredibly wrong to me. If a lazy compound value contains any pointer symbol $p, then all values in the following infinite series are interesting:

$p,  *$p,  **$p,  ***$p,  ...

So I think the correct solution would be to maintain a list of explicitly-live compound values. Just like we already maintain a list of explicitly live symbols - TheLiving, and explicitly live regions - RegionRoots; neither of these lists enumerates all live symbols or regions as there are infinitely many of them, but each of these lists enumerates the ones that we've learned to be definitely live, and they're sufficient to determine liveness of any other symbol through recursive descent into the identity of that symbol. So with a list of live compound value regions, we can ask a question like "is this symbol stored in any of the known-live compound values?" to determine that symbol's liveness. This will keep the stored symbol alive without making the lazy compound value's base region alive.

So to be clear, I think your solution is probably an ok stop-gap solution. False negatives aren't that bad, I'm more worried that existing leak reports may become less understandable because they'll be reported later than necessary, which may obscure the reason why we think they're finally leaking. But I really want us to agree on how this facility is supposed to work, and explore the perfect solution, before implementing an imperfect solution.

Maybe we should introduce "weak region roots" that aren't necessarily live themselves but anything derived from them (any SymbolRegionValue<R'> or SymbolDerived<R', $x> for arbitrary sub-region R' of a weak-region-root R) is live. This could be pretty straightforward.

In D132236#3747652, @NoQ wrote:
Nice catch but I don't think this is the right solution.

Symbol liveness corresponds to a concrete runtime property of the program: can the program obtain this value by executing further runtime instructions? The program cannot obtain a pointer to variable n, or even a pointer to a region *n, just by looking at the data in a struct that was copied from region *n previously. Therefore neither n nor *n should not be marked live just because they are mentioned in a base region of a lazy compound value. In particular, memory leaks should be reported against n if the last place it's mentioned is a base region of a lazy compound value, as the program cannot possibly free that memory. You can most likely write a MallocChecker test for that, which will become a false negative after your patch.

For the same reason, for any region R, if reg<R> or derived<R, $x> is live, does not mean R is live. The program cannot guess a memory address by only looking at a value loaded from that address.

If you look at how a simpler example works:
void clang_analyzer_warnIfReached();

void foo(int **n) {
  int *n2 = *n;
  if (!**n) {
    if (*n2) {
      clang_analyzer_warnIfReached();
    }
  }
}
You will notice that it does not try to keep VarRegion n or even the symbol reg_$0<int ** n> alive. It is *sufficient* to keep reg_$1<int * Element{SymRegion{reg_$0<int ** n>},0 S64b,int *}> alive in order to keep the test passing, and it does match the actual definition of live symbol (reg_$0 can no longer be retrieved by the program but reg_$1 can).

Nice explanation, thank you very much! I'll need some time to digest it to the full extent.
It makes sense so far, but I imagine it will become clear when I continue the investigation.

The original code tries to do the same for lazy compound values. I suspect that you'll find the actual bug when you descend into getInterestingValues().

What looks fishy about getInterestingValues() is that it assumes that the amount of interesting values is finite. This sounds incredibly wrong to me. If a lazy compound value contains any pointer symbol $p, then all values in the following infinite series are interesting:
$p,  *$p,  **$p,  ***$p,  ...

Well, there two places where I suspected this bug. The place I changed and the other was getInterestingValues(). The first looked more probable, hence we sticked to that.
We'll continue the investigation with the getInterestingValues() next time.

So I think the correct solution would be to maintain a list of explicitly-live compound values. Just like we already maintain a list of explicitly live symbols - TheLiving, and explicitly live regions - RegionRoots; neither of these lists enumerates all live symbols or regions as there are infinitely many of them, but each of these lists enumerates the ones that we've learned to be definitely live, and they're sufficient to determine liveness of any other symbol through recursive descent into the identity of that symbol. So with a list of live compound value regions, we can ask a question like "is this symbol stored in any of the known-live compound values?" to determine that symbol's liveness. This will keep the stored symbol alive without making the lazy compound value's base region alive.

Okay, I'll continue from here. Thanks for the

In D132236#3748017, @NoQ wrote:

So to be clear, I think your solution is probably an ok stop-gap solution. False negatives aren't that bad, I'm more worried that existing leak reports may become less understandable because they'll be reported later than necessary, which may obscure the reason why we think they're finally leaking. But I really want us to agree on how this facility is supposed to work, and explore the perfect solution, before implementing an imperfect solution.

Unfortunately, we found many FPs introduced by this change; so this should be definitely blocked.

Maybe we should introduce "weak region roots" that aren't necessarily live themselves but anything derived from them (any SymbolRegionValue<R'> or SymbolDerived<R', $x> for arbitrary sub-region R' of a weak-region-root R) is live. This could be pretty straightforward.

So, what you are suggesting here is an alternative solution to the one you already proposed in your last comment?

For transparency, we, unfortunately, reached the end of our timebox, and I'll work on something else for a couple of weeks.
I plan to come back to this in ~3 weeks if everything works well.

So, what you are suggesting here is an alternative solution to the one you already proposed in your last comment?

I was just expanding the initial suggestion but also symbol reaper is too complicated for me to make such suggestions confidently. So I'm just sharing some ideas but you'll probably have to dig deeper to understand whether they're correct.

Unfortunately, we found many FPs introduced by this change; so this should be definitely blocked.

Interesting, I didn't expect that. It could be insanely valuable to make tests out of those so that future generations didn't make the same mistake! Symbol reaper seems to be very under-tested for its complexity.

In D132236#3752209, @NoQ wrote:

So, what you are suggesting here is an alternative solution to the one you already proposed in your last comment?

I was just expanding the initial suggestion but also symbol reaper is too complicated for me to make such suggestions confidently. So I'm just sharing some ideas but you'll probably have to dig deeper to understand whether they're correct.

I see. I very much appreciate any suggestions. Many thanks.

Unfortunately, we found many FPs introduced by this change; so this should be definitely blocked.

Interesting, I didn't expect that. It could be insanely valuable to make tests out of those so that future generations didn't make the same mistake! Symbol reaper seems to be very under-tested for its complexity.

I agree. We were also surprised. I havent spent the time to understand the root cause, as only a single path resulted in a 160Mb+ html. So far I can tell that it messed up something including lambdas capturing by reference. I find it always difficult to catch "what is missing" from the puzzle in such cases. Its much easier to catch if something has a wrong binding etc. We'll see how much effort it takes to reduce. For FPs I dont know how to automate this process. :/

Lets see when we get back to this.

For FPs I dont know how to automate this process. :/

Given that our initial hypothesis was that there should be zero new false positives, it could probably work with a creduce predicate "this code has a warning after the patch but not before the patch". The initial hypothesis is probably not entirely correct, given that even a slight change in coverage could result in false positives, but I doubt that if you pick 4-5 examples, all of them would reduce to a situation where it's just a change of coverage. Especially given how common you say the problem appears to be. So I think it's worth a try.

tomasz-kaminski-sonarsource mentioned this in D134941: [analyzer][NFC] Add tests for D132236.Sep 30 2022, 1:58 AM

tomasz-kaminski-sonarsource added a child revision: D134941: [analyzer][NFC] Add tests for D132236.Sep 30 2022, 2:00 AM

tomasz-kaminski-sonarsource removed a parent revision: D132143: [analyzer] LazyCompoundVals should be always bound as default bindings.Sep 30 2022, 2:04 AM

tomasz-kaminski-sonarsource removed a child revision: D134941: [analyzer][NFC] Add tests for D132236.

tomasz-kaminski-sonarsource added a parent revision: D134941: [analyzer][NFC] Add tests for D132236.

tomasz-kaminski-sonarsource removed a parent revision: D134941: [analyzer][NFC] Add tests for D132236.Sep 30 2022, 3:05 AM

I'm abandoning this change in favor of D134947.
I'll leave the patch summary and the discussion here for the history.

In D132236#3753238, @NoQ wrote:

For FPs I dont know how to automate this process. :/

Given that our initial hypothesis was that there should be zero new false positives, it could probably work with a creduce predicate "this code has a warning after the patch but not before the patch". The initial hypothesis is probably not entirely correct, given that even a slight change in coverage could result in false positives, but I doubt that if you pick 4-5 examples, all of them would reduce to a situation where it's just a change of coverage. Especially given how common you say the problem appears to be. So I think it's worth a try.

D134941 addresses this comment. I'm also about to investigate other cases with differences - but no promises about that.

What looks fishy about getInterestingValues() is that it assumes that the amount of interesting values is finite. This sounds incredibly wrong to me. If a lazy compound value contains any pointer symbol $p, then all values in the following infinite series are interesting:
$p,  *$p,  **$p,  ***$p,  ...

We have also looked into this, and indeed the getInterestingValues() produces the first level of indirection for storage. However, the code in RemoveDeadBindingsWorker is recursively visiting each of the bindings found, so, at least per our understanding, we should visit all regions that are reachable through indirection:

const RegionStoreManager::SValListTy &Vals = RM.getInterestingValues(*LCS);
for (RegionStoreManager::SValListTy::const_iterator I = Vals.begin(),
                                                    E = Vals.end();
     I != E; ++I)
  VisitBinding(*I);

Also, visiting these regions in a current snapshot of the storage seems correct from the temporal perspective - we can reach current state in region, via the pointer to it, that was present at the time of copy.

tomasz-kaminski-sonarsource mentioned this in rG73716baa30eb: [analyzer][NFC] Add tests for D132236.Oct 3 2022, 6:43 AM

Revision Contents

Path

Size

clang/

lib/

StaticAnalyzer/

Core/

RegionStore.cpp

18 lines

test/

Analysis/

trivial-copy-struct.cpp

25 lines

Diff 454861

clang/lib/StaticAnalyzer/Core/RegionStore.cpp

Show First 20 Lines • Show All 2,823 Lines • ▼ Show 20 Lines	for (ClusterBindings::iterator I = C->begin(), E = C->end(); I != E; ++I) {
// Element index of a binding key is live.		// Element index of a binding key is live.
SymReaper.markElementIndicesLive(I.getKey().getRegion());		SymReaper.markElementIndicesLive(I.getKey().getRegion());

VisitBinding(I.getData());		VisitBinding(I.getData());
}		}
}		}

void RemoveDeadBindingsWorker::VisitBinding(SVal V) {		void RemoveDeadBindingsWorker::VisitBinding(SVal V) {
// Is it a LazyCompoundVal? All referenced regions are live as well.		const MemRegion *R = V.getAsRegion();
if (Optional<nonloc::LazyCompoundVal> LCS =
V.getAs<nonloc::LazyCompoundVal>()) {

const RegionStoreManager::SValListTy &Vals = RM.getInterestingValues(*LCS);

for (RegionStoreManager::SValListTy::const_iterator I = Vals.begin(),
E = Vals.end();
I != E; ++I)
VisitBinding(*I);

return;		// Is it a LazyCompoundVal? All referenced regions are live as well.
}		if (auto LCS = V.getAs<nonloc::LazyCompoundVal>())
		R = LCS->getRegion();

// If V is a region, then add it to the worklist.		// If V is a region, then add it to the worklist.
if (const MemRegion *R = V.getAsRegion()) {		if (R) {
AddToWorkList(R);		AddToWorkList(R);
SymReaper.markLive(R);		SymReaper.markLive(R);

// All regions captured by a block are also live.		// All regions captured by a block are also live.
if (const BlockDataRegion *BR = dyn_cast<BlockDataRegion>(R)) {		if (const BlockDataRegion *BR = dyn_cast<BlockDataRegion>(R)) {
BlockDataRegion::referenced_vars_iterator I = BR->referenced_vars_begin(),		BlockDataRegion::referenced_vars_iterator I = BR->referenced_vars_begin(),
E = BR->referenced_vars_end();		E = BR->referenced_vars_end();
for ( ; I != E; ++I)		for ( ; I != E; ++I)
▲ Show 20 Lines • Show All 76 Lines • Show Last 20 Lines

clang/test/Analysis/trivial-copy-struct.cpp

Show All 28 Lines	void copy_on_heap(Node* n1) {
clang_analyzer_dump(n1->ptr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<int * Element{SymRegion{reg_${{[0-9]+}}<Node * n1>},0 S64b,struct Node}.ptr>}}}		clang_analyzer_dump(n1->ptr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<int * Element{SymRegion{reg_${{[0-9]+}}<Node * n1>},0 S64b,struct Node}.ptr>}}}
clang_analyzer_dump(n2->ptr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<int * Element{SymRegion{reg_${{[0-9]+}}<Node * n1>},0 S64b,struct Node}.ptr>}}}		clang_analyzer_dump(n2->ptr); // expected-warning-re {{&SymRegion{reg_${{[0-9]+}}<int * Element{SymRegion{reg_${{[0-9]+}}<Node * n1>},0 S64b,struct Node}.ptr>}}}

if (n1->ptr != n2->ptr)		if (n1->ptr != n2->ptr)
clang_analyzer_warnIfReached(); // unreachable		clang_analyzer_warnIfReached(); // unreachable
(void)(n1->ptr);		(void)(n1->ptr);
(void)(n2->ptr);		(void)(n2->ptr);
}		}

		struct List {
		List* next;
		};

		void ptr1(List* n) {
		List* n2 = new List(*n); // cctor
		if (!n->next) {
		if (n2->next) {
		clang_analyzer_warnIfReached(); // unreachable
		}
		}
		delete n2;
		}

		void ptr2(List* n) {
		List* n2 = new List(); // ctor
		n2 = n; // assignment
		if (!n->next) {
		if (n2->next) {
		clang_analyzer_warnIfReached(); // unreachable
		}
		}
		delete n2;
		}