This is an archive of the discontinued LLVM Phabricator instance.

Differential D134947

[analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal
Closed, Public

Authored by tomasz-kaminski-sonarsource on Sep 30 2022, 3:25 AM.

Details

Reviewers

NoQ
martong
nicolasvasilache
jdoerfert
xazax.hun
int3

Group Reviewers

Restricted Project

Commits

rGa6b42040ad30: [analyzer] Fix the liveness of Symbols for values in regions referred by…

Summary

To illustrate our current understanding, let's start with the following program:
https://godbolt.org/z/33f6vheh1

#include <cassert>

void clang_analyzer_printState();

struct C {
   int x;
   int y;
   int more_padding;
};

struct D {
   C c;
   int z;
};

C foo(D d, int new_x, int new_y) {
   d.c.x = new_x;       // B1
   assert(d.c.x < 13);  // C1

   C c = d.c;           // L

   assert(d.c.y < 10);  // C2
   assert(d.z < 5);     // C3

   d.c.y = new_y;       // B2

   assert(d.c.y < 10);  // C4

   return c;  // R
}

In the code, we create a few bindings to subregions of the root region d (B1, B2), place constraints on the values (C1, C2, ...), and create a lazyCompoundVal for part of the region d at point L, which is returned at point R.

Now, the question is which of these should remain live as long as the return value of the foo call is live. In a perfect world we should preserve:

1. Only the bindings of the subregions of d.c that were created before the copy at L. In our example, this includes B1 but not B2. In other words, new_x should be live but new_y shouldn't.
2. Constraints on the values of d.c that are reachable through c. These can be created either before the copy is made (L) or after it. In our case, that would be C1 and C2, but not C3 (the value of d.z is not reachable through c) or C4 (the original value of `d.c.y` was overwritten at B2 after c was created).
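The two rules above can be checked against ordinary runtime behaviour. The following is a minimal sketch, not part of the patch: it reuses the structs from the summary, drops the asserts on the inputs, and adds a hypothetical helper `returned_copy_sees_only_pre_copy_state` that verifies what a caller can observe through the returned copy.

```cpp
#include <cassert>

struct C { int x; int y; int more_padding; };
struct D { C c; int z; };

// Same shape as foo() in the summary, without the asserts on the inputs.
C foo(D d, int new_x, int new_y) {
  d.c.x = new_x;  // B1: happens before the copy
  C c = d.c;      // L: snapshot of d.c
  d.c.y = new_y;  // B2: happens after the copy, so it is invisible through c
  return c;       // R
}

// Hypothetical helper (an assumption for illustration, not from the patch).
bool returned_copy_sees_only_pre_copy_state() {
  D d{{1, 2, 3}, 4};
  C r = foo(d, 10, 20);
  assert(r.x == 10);            // new_x reached the copy (B1)
  assert(r.y == 2);             // the original d.c.y, not new_y (B2 came too late)
  assert(r.more_padding == 3);  // untouched field is copied as-is
  return true;
}
```

Nothing in `C` leads back to `d.z`, which is why a constraint such as C3 is irrelevant to the caller.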

The current code in the RegionStore covers use case (1) by using getInterestingValues() to extract the bindings to parts of the referred region that are present in the store at the point of the copy. This also partially covers point (2) when constraints are applied to a location that has a binding at the point of the copy (in our case d.c.x in C1, which has the value new_x), but it fails to preserve constraints that require creating a new symbol for the location (d.c.y in C2).

We introduce the concept of lazily copied locations (regions) to the SymbolReaper, i.e. locations for which a program can access the value stored at that location, but not its address. These locations are constructed as the set of regions referred to by a lazyCompoundVal. A readable location (region) is a location that is live or lazily copied. Symbols that refer to values in regions are alive if the region is readable.

For simplicity, we follow the current approach to live regions: we mark the base region as lazily copied and consider any of its subregions readable. This makes some symbols falsely live (d.z in our example) and keeps the corresponding constraints alive.
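The definitions above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the analyzer's code: `Region` and `ToyReaper` are hypothetical stand-ins, regions are plain strings, and the extra fallbacks of the real `isLiveRegion()` (symbolic regions, variables, and so on) are left out.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a memory region: its own name plus the name of its base region.
struct Region {
  std::string name;
  std::string base;
};

// Toy stand-in for the region bookkeeping described in the summary.
class ToyReaper {
  std::set<std::string> liveRoots;
  std::set<std::string> lazilyCopiedRoots;

public:
  // Both kinds of marking record only the base region.
  void markLive(const Region &r) { liveRoots.insert(r.base); }
  void markLazilyCopied(const Region &r) { lazilyCopiedRoots.insert(r.base); }

  bool isLive(const Region &r) const { return liveRoots.count(r.base) != 0; }
  bool isLazilyCopied(const Region &r) const {
    return lazilyCopiedRoots.count(r.base) != 0;
  }
  // Readable = live or lazily copied.
  bool isReadable(const Region &r) const {
    return isLive(r) || isLazilyCopied(r);
  }
};

bool toy_reaper_demo() {
  Region d_c{"d.c", "d"}, d_z{"d.z", "d"}, e_x{"e.x", "e"};
  ToyReaper reaper;
  reaper.markLazilyCopied(d_c);     // a lazyCompoundVal refers to d.c
  assert(!reaper.isLive(d_c));      // the address of d.c is not kept alive
  assert(reaper.isReadable(d_c));   // but its value is still readable
  assert(reaper.isReadable(d_z));   // over-approximation: same base region
  assert(!reaper.isReadable(e_x));  // unrelated region stays unreadable
  return true;
}
```

The third assertion is the "falsely live" case the summary mentions: because only the base region `d` is recorded, `d.z` becomes readable even though it cannot be reached through the copy.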

The rename of Regions to LiveRegions inside RegionStore is an NFC change, made to clarify the difference between the regions stored in these two sets.

Regression Test: https://reviews.llvm.org/D134941
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tomasz-kaminski-sonarsource created this revision. Sep 30 2022, 3:25 AM

Herald added a reviewer: NoQ. Sep 30 2022, 3:25 AM

Herald added a project: Restricted Project.

Herald added subscribers: steakhal, manas, ASDenysPetrov and 9 others.

tomasz-kaminski-sonarsource requested review of this revision. Sep 30 2022, 3:25 AM

Herald added a project: Restricted Project. Sep 30 2022, 3:25 AM

Herald added a subscriber: cfe-commits.

tomasz-kaminski-sonarsource edited the summary of this revision. Sep 30 2022, 3:26 AM

tomasz-kaminski-sonarsource edited the summary of this revision.

tomasz-kaminski-sonarsource added a parent revision: D134941: [analyzer][NFC] Add tests for D132236.

tomasz-kaminski-sonarsource added a reviewer: martong. Sep 30 2022, 3:29 AM

Herald added a subscriber: rnkovacs. Sep 30 2022, 3:29 AM

steakhal mentioned this in D132236: [analyzer] Fix liveness of LazyCompoundVals. Sep 30 2022, 3:34 AM

Harbormaster completed remote builds in B189640: Diff 464208. Sep 30 2022, 4:03 AM

tomasz-kaminski-sonarsource edited the summary of this revision. Sep 30 2022, 4:33 AM

tomasz-kaminski-sonarsource edited the summary of this revision.

I like the approach of this patch and I think this is somewhat aligned with @NoQ's ideas about

a list of explicitly-live compound values

and

"weak region roots" that aren't necessarily live themselves but anything derived from them ... is live

Coupled with the new tests for regression cases in D134941, I think this is really good.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
586	Could you please incorporate the definition of lazily copied locations (regions) from the summary to here as a comment?
657	Could you please incorporate the definition of readable locations (regions) from the summary to here as a comment?
clang/lib/StaticAnalyzer/Core/RegionStore.cpp
2841–2842	Just a nit, I wonder if you might have a test case for this (which should fail for now).
clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
461	Just out of curiosity, do you have plans to tackle this todo sometime?

Included additional tests that correspond to the TODO.

Harbormaster completed remote builds in B189966: Diff 464659. Oct 3 2022, 6:32 AM

Applied review suggestions.

Harbormaster completed remote builds in B189974: Diff 464673. Oct 3 2022, 7:07 AM

Applied all review suggestions.

clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
461	We do not plan to tackle it in the near future.

Updated diff to be mergeable.

Herald added a reviewer: nicolasvasilache. Oct 3 2022, 7:17 AM

Herald added projects: Restricted Project, Restricted Project, Restricted Project.

Herald added a reviewer: Restricted Project.

Herald added subscribers: openmp-commits, zero9178, bzcheeseman and 24 others.

Harbormaster completed remote builds in B189978: Diff 464677. Oct 3 2022, 7:18 AM

Herald added a reviewer: jdoerfert. Oct 3 2022, 7:18 AM

Herald added a subscriber: sstefan1.

Fighting with arcanist.

Harbormaster completed remote builds in B189979: Diff 464678. Oct 3 2022, 7:19 AM

tomasz-kaminski-sonarsource removed a parent revision: D134941: [analyzer][NFC] Add tests for D132236. Oct 3 2022, 7:23 AM

tomasz-kaminski-sonarsource edited the summary of this revision. Oct 3 2022, 7:26 AM

Thanks for the updates. I am okay with it now. LGTM. But please wait for NoQ's approval. So, this is a gentle ping for you @NoQ :)

clang/test/Analysis/trivial-copy-struct.cpp
61
97

This revision is now accepted and ready to land. Oct 3 2022, 7:31 AM

Applied suggested comment updates.

Harbormaster completed remote builds in B189984: Diff 464684. Oct 3 2022, 8:30 AM

If we end up going with this approach, I wonder if it would be a great time to update some of the docs here: https://clang.llvm.org/docs/analyzer/developer-docs/RegionStore.html
Usually, we are not doing a great job of keeping this documentation up to date. I think the logic to determine which symbols and regions are live and how that logic interacts with the different types of memory regions might be important enough to have some documentation on it.

(Also, I think we should add a link to the SA talks at the LLVM Dev Conf., but that is completely unrelated to this change.)

I also like the approach, but wait for @NoQ, he has the most experience in this area :)

Wow thanks!!

Yeah this matches my understanding of the problem. I still encourage you to test it on real-world code before committing, and carefully investigate every change in diagnostics, because symbol reaper is very convoluted and filled with insane corner cases.

clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
461	Could you add a negative/FIXME test for it? At a glance I suspect that this TODO is significantly less important for `isLiveRegion()` than it is for your new function, so I encourage you to explore the possibility of dropping `getBaseRegion()`, even if just a little bit and doesn't have to be in this patch. If a smaller subregion is truly live, any value inside of the base region can theoretically be accessed through safe pointer arithmetic. It's very difficult to prove that it can't be accessed anymore. Every pointer escape will be a potential access. In your case, however, if the superregion is neither live nor lazily copied, the information outside of the lazily copied subregion is truly lost, there's literally nothing the program can do to recover it.
clang/test/Analysis/trivial-copy-struct.cpp
97	Do you know what's causing this to not work? Is this a regression or just never worked?
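The pointer-arithmetic argument in the comment on line 461 can be illustrated with the structs from the summary. This is a sketch under assumptions, not analyzer code: the helper names are hypothetical, and the cast is only valid because `c` is the first member of the standard-layout struct `D` and the pointer really points into a `D` object.

```cpp
#include <cassert>
#include <type_traits>

struct C { int x; int y; int more_padding; };
struct D { C c; int z; };

static_assert(std::is_standard_layout<D>::value,
              "the cast below relies on D being standard-layout");

// With the address of d.c, the enclosing object is still reachable:
// &d.c and &d are pointer-interconvertible because c is D's first member.
// Precondition (assumption): pc points to the c member of a D object.
int read_z_through_pointer(C *pc) {
  return reinterpret_cast<D *>(pc)->z;
}

// With only a copy of d.c, nothing leads back to d.z.
int sum_of_copy(C copy) { return copy.x + copy.y; }

bool pointer_vs_copy_demo() {
  D d{{1, 2, 3}, 42};
  assert(read_z_through_pointer(&d.c) == 42);  // live subregion: d.z is reachable
  assert(sum_of_copy(d.c) == 3);               // lazily copied: only d.c's values
  return true;
}
```

A live subregion therefore has to keep the whole base region alive, whereas for a lazily copied subregion the rest of the base region is unreachable unless something else refers to it.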

int3 resigned from this revision. Oct 3 2022, 4:50 PM

tomasz-kaminski-sonarsource added inline comments. Oct 3 2022, 11:29 PM

clang/test/Analysis/trivial-copy-struct.cpp
97	This example never worked. We have an in-progress fix that we are testing now.

steakhal added a child revision: D135136: [analyzer] Make directly bounded LazyCompoundVal as lazily copied. Oct 4 2022, 2:27 AM

First of all, thanks for the feedback!

In D134947#3830995, @xazax.hun wrote:

If we end up going with this approach, I wonder if it would be a great time to update some of the docs here: https://clang.llvm.org/docs/analyzer/developer-docs/RegionStore.html
Usually, we are not doing a great job of keeping this documentation up to date. I think the logic to determine which symbols and regions are live and how that logic interacts with the different types of memory regions might be important enough to have some documentation on it.

Yes, I'll post a patch addressing this. Thanks for noting.

In D134947#3832130, @NoQ wrote:

Wow thanks!!

Yeah this matches my understanding of the problem. I still encourage you to test it on real-world code before committing, and carefully investigate every change in diagnostics, because symbol reaper is very convoluted and filled with insane corner cases.

That's true. We did a careful investigation and the numbers are promising even at large scale.
The upside is that even if it broke something, it does not have a significant impact. The downside is that we wished for greater improvement/impact by fixing this.

[...] carefully investigate every change in diagnostics, [...]

I investigated multiple cases, and I believe all of them were intentionally affected, hence improved.
Note, however, that I did not investigate all the changes but only a (representative) handful, because collecting, minimizing, and understanding the reports is really time-consuming.

I'd like to proceed with this patch as-is. And possibly land further incremental step(s) on top of this, such as D135136.
Other than D135136 though, we don't plan to push this area any further for the time being.

clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
461	NoQ wrote: "Could you add a negative/FIXME test for it? At a glance I suspect that this TODO is significantly less important for `isLiveRegion()` than it is for your new function, so I encourage you to explore the possibility of dropping `getBaseRegion()`, even if just a little bit and doesn't have to be in this patch." So far we could not come up with a test case demonstrating this case. Right now we don't plan to investigate this area either in the foreseeable future.
clang/test/Analysis/trivial-copy-struct.cpp
97	Fixed by D135136.

steakhal added inline comments. Oct 4 2022, 4:39 AM

clang/test/Analysis/trivial-copy-struct.cpp
58

tomasz-kaminski-sonarsource edited the summary of this revision. Oct 5 2022, 6:05 AM

As a result of our internal test on ~170 projects (~20 Windows, ~150 Linux) comprising several hundred million lines of code, the impact on the files that parsed correctly was: 5 issues disappearing and 4 new issues. We investigated all of the reports, and the changes seemed justified:

removing issues from impossible paths, which are now correctly recognized due to preserved constraints
the value of a location not being reported as garbage
undefined behavior for an overflowing shift operation, as we preserve constraints on the bits
recognition of null pointer values

tomasz-kaminski-sonarsource retitled this revision from [analyzer] Fix liveness of Symbols for values in regions reffered by LazyCompoundVal to [analyzer] Fix the liveness of Symbols for values in regions referred by LazyCompoundVal. Oct 6 2022, 7:34 AM

tomasz-kaminski-sonarsource marked an inline comment as done. Oct 6 2022, 9:48 AM

tomasz-kaminski-sonarsource added inline comments.

clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
461	@NoQ We were banging our heads against this question, and we haven't been able to create or find any example where using the base region would cause a problem. Moreover, we concluded that constructing an example where the current approach would differ in reported issues is probably impossible. To illustrate this, I'll refer back to the example in the summary of this patch. The side effect of our change is that for the symbol `reg_<d.z>`, which represents the falsely readable location and has no binding, `isLive(reg_<d.z>)` will return `true`. However, to observe the effect, either (a) the code needs to be able to read `reg_<d.z>`, or (b) we check whether we should preserve a constraint on `reg_<d.z>`. If we consider option (a), that means the code still has a reference/pointer to the objects `d` or `d.z`, or to a copy of either of them. In these situations, the presence of such a pointer/copy should make `reg_<d.z>` live regardless of the existence of a `lazyCompoundVal` for subregions of `d`, so `isLive(reg_<d.z>)` would return `true` anyway. In the case of (b), we will preserve all constraints that refer to `reg_<d.z>`. If `reg_<d.z>` were reachable/accessible, reasoning similar to option (a) would conclude that it must be live anyway. In contrast, when the program can no longer reach/access the value of `d.z`, the presence of this constraint cannot impact the result of the analysis, hence it does no harm. Given the tradeoff between additional dormant constraints and the complexity (and cost) of additional checking in `SymbolReaper`, we believe that using the base region is the right choice, and we should simply replace the `TODO` with an appropriate explanation.

What do you think @NoQ?

It seems like we are all aligned.
I'll land this tomorrow.

Closed by commit rGa6b42040ad30: [analyzer] Fix the liveness of Symbols for values in regions referred by… (authored by tomasz-kaminski-sonarsource). Oct 19 2022, 7:06 AM

This revision was automatically updated to reflect the committed changes.

tomasz-kaminski-sonarsource added a commit: rGa6b42040ad30: [analyzer] Fix the liveness of Symbols for values in regions referred by….

Revision Contents

Path	Size
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h	18 lines
clang/lib/StaticAnalyzer/Core/RegionStore.cpp	4 lines
clang/lib/StaticAnalyzer/Core/SymbolManager.cpp	20 lines
clang/test/Analysis/trivial-copy-struct.cpp	50 lines

Diff 468903

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h

[... 576 lines not shown; enclosing context: class SymbolReaper { ...]

   using SymbolSetTy = llvm::DenseSet<SymbolRef>;
   using SymbolMapTy = llvm::DenseMap<SymbolRef, SymbolStatus>;
   using RegionSetTy = llvm::DenseSet<const MemRegion *>;

   SymbolMapTy TheLiving;
   SymbolSetTy MetadataInUse;

-  RegionSetTy RegionRoots;
+  RegionSetTy LiveRegionRoots;
+  // The lazily copied regions are locations for which a program
+  // can access the value stored at that location, but not its address.
+  // These regions are constructed as a set of regions referred to by
+  // lazyCompoundVal.
+  RegionSetTy LazilyCopiedRegionRoots;

   const StackFrameContext *LCtx;
   const Stmt *Loc;
   SymbolManager& SymMgr;
   StoreRef reapedStore;
   llvm::DenseMap<const MemRegion *, unsigned> includedRegionCache;

 public:
[... 29 lines not shown ...]
   /// this will keep the symbol alive as long as its associated region is also
   /// live. For other symbols, this has no effect; checkers are not permitted
   /// to influence the life of other symbols. This should be used before any
   /// symbol marking has occurred, i.e. in the MarkLiveSymbols callback.
   void markInUse(SymbolRef sym);

   using region_iterator = RegionSetTy::const_iterator;

-  region_iterator region_begin() const { return RegionRoots.begin(); }
-  region_iterator region_end() const { return RegionRoots.end(); }
+  region_iterator region_begin() const { return LiveRegionRoots.begin(); }
+  region_iterator region_end() const { return LiveRegionRoots.end(); }

   /// Returns whether or not a symbol has been confirmed dead.
   ///
   /// This should only be called once all marking of dead symbols has completed.
   /// (For checkers, this means only in the checkDeadSymbols callback.)
   bool isDead(SymbolRef sym) {
     return !isLive(sym);
   }

   void markLive(const MemRegion *region);
+  void markLazilyCopied(const MemRegion *region);
   void markElementIndicesLive(const MemRegion *region);

   /// Set to the value of the symbolic store after
   /// StoreManager::removeDeadBindings has been called.
   void setReapedStore(StoreRef st) { reapedStore = st; }

 private:
+  bool isLazilyCopiedRegion(const MemRegion *region) const;
+  // A readable region is a region that live or lazily copied.
+  // Any symbols that refer to values in regions are alive if the region
+  // is readable.
+  bool isReadableRegion(const MemRegion *region);
+
   /// Mark the symbols dependent on the input symbol as live.
   void markDependentsLive(SymbolRef sym);
 };

 class SymbolVisitor {
 protected:
   ~SymbolVisitor() = default;

[... 18 lines not shown ...]

clang/lib/StaticAnalyzer/Core/RegionStore.cpp

[... 2,832 lines not shown ...]
   for (ClusterBindings::iterator I = C->begin(), E = C->end(); I != E; ++I) {
     VisitBinding(I.getData());
   }
 }

 void RemoveDeadBindingsWorker::VisitBinding(SVal V) {
   // Is it a LazyCompoundVal? All referenced regions are live as well.
   if (Optional<nonloc::LazyCompoundVal> LCS =
           V.getAs<nonloc::LazyCompoundVal>()) {
+    // TODO: Make regions referred to by `lazyCompoundVals` that are bound to
+    // subregions of the `LCS.getRegion()` also lazily copied.
+    if (const MemRegion *R = LCS->getRegion())
+      SymReaper.markLazilyCopied(R);
+
     const RegionStoreManager::SValListTy &Vals = RM.getInterestingValues(*LCS);

     for (RegionStoreManager::SValListTy::const_iterator I = Vals.begin(),
                                                         E = Vals.end();
          I != E; ++I)
       VisitBinding(*I);

[... 89 lines not shown ...]

clang/lib/StaticAnalyzer/Core/SymbolManager.cpp

[... 405 lines not shown ...]
 }

 void SymbolReaper::markLive(SymbolRef sym) {
   TheLiving[sym] = NotProcessed;
   markDependentsLive(sym);
 }

 void SymbolReaper::markLive(const MemRegion *region) {
-  RegionRoots.insert(region->getBaseRegion());
+  LiveRegionRoots.insert(region->getBaseRegion());
   markElementIndicesLive(region);
 }

+void SymbolReaper::markLazilyCopied(const clang::ento::MemRegion *region) {
+  LazilyCopiedRegionRoots.insert(region->getBaseRegion());
+}
+
 void SymbolReaper::markElementIndicesLive(const MemRegion *region) {
   for (auto SR = dyn_cast<SubRegion>(region); SR;
        SR = dyn_cast<SubRegion>(SR->getSuperRegion())) {
     if (const auto ER = dyn_cast<ElementRegion>(SR)) {
       SVal Idx = ER->getIndex();
       for (auto SI = Idx.symbol_begin(), SE = Idx.symbol_end(); SI != SE; ++SI)
         markLive(*SI);
     }
   }
 }

 void SymbolReaper::markInUse(SymbolRef sym) {
   if (isa<SymbolMetadata>(sym))
     MetadataInUse.insert(sym);
 }

 bool SymbolReaper::isLiveRegion(const MemRegion *MR) {
   // TODO: For now, liveness of a memory region is equivalent to liveness of its
   // base region. In fact we can do a bit better: say, if a particular FieldDecl
   // is not used later in the path, we can diagnose a leak of a value within
   // that field earlier than, say, the variable that contains the field dies.
   MR = MR->getBaseRegion();
-  if (RegionRoots.count(MR))
+  if (LiveRegionRoots.count(MR))
     return true;

   if (const auto *SR = dyn_cast<SymbolicRegion>(MR))
     return isLive(SR->getSymbol());

   if (const auto *VR = dyn_cast<VarRegion>(MR))
     return isLive(VR, true);

   // FIXME: This is a gross over-approximation. What we really need is a way to
   // tell if anything still refers to this region. Unlike SymbolicRegions,
   // AllocaRegions don't have associated symbols, though, so we don't actually
   // have a way to track their liveness.
   return isa<AllocaRegion, CXXThisRegion, MemSpaceRegion, CodeTextRegion>(MR);
 }

+bool SymbolReaper::isLazilyCopiedRegion(const MemRegion *MR) const {
+  // TODO: See comment in isLiveRegion.
+  return LazilyCopiedRegionRoots.count(MR->getBaseRegion());
+}
+
+bool SymbolReaper::isReadableRegion(const MemRegion *MR) {
+  return isLiveRegion(MR) || isLazilyCopiedRegion(MR);
+}
+
 bool SymbolReaper::isLive(SymbolRef sym) {
   if (TheLiving.count(sym)) {
     markDependentsLive(sym);
     return true;
   }

   bool KnownLive;

   switch (sym->getKind()) {
   case SymExpr::SymbolRegionValueKind:
-    KnownLive = isLiveRegion(cast<SymbolRegionValue>(sym)->getRegion());
+    KnownLive = isReadableRegion(cast<SymbolRegionValue>(sym)->getRegion());
     break;
   case SymExpr::SymbolConjuredKind:
     KnownLive = false;
     break;
   case SymExpr::SymbolDerivedKind:
     KnownLive = isLive(cast<SymbolDerived>(sym)->getParentSymbol());
     break;
   case SymExpr::SymbolExtentKind:
[... 99 lines not shown ...]

clang/test/Analysis/trivial-copy-struct.cpp

[... 43 lines not shown ...]

 void deadCode(List orig) {
   List c = orig;
   clang_analyzer_dump(c.value);
   // expected-warning-re@-1 {{reg_${{[0-9]+}}<int orig.value>}}
   if (c.value == 42)
     return;
   clang_analyzer_value(c.value);
-  // expected-warning@-1 {{32s:{ [-2147483648, 2147483647] }}}
+  // expected-warning@-1 {{32s:{ [-2147483648, 41], [43, 2147483647] }}}
-  // The symbol was garbage collected too early, hence we lose the constraints.
+  // Before symbol was garbage collected too early, and we lost the constraints.
   if (c.value != 42)
     return;
   // Dead code should be unreachable
   clang_analyzer_warnIfReached(); // no-warning: Dead code.
 };

steakhal, inline comment on line 58 (Not Done), suggested edit:

   clang_analyzer_warnIfReached(); // no-warning: Dead code.
- };
+ }

 void ptr1(List* n) {
   List* n2 = new List(*n); // ctor
   if (!n->next) {
     if (n2->next) {
       clang_analyzer_warnIfReached(); // unreachable
     }
   }
   delete n2;
 }

martong, inline comment on line 61 (Not Done), suggested edit:

   void ptr1(List* n) {
-    List* n2 = new List(*n); // cctor
+    List* n2 = new List(*n); // ctor

 void ptr2(List* n) {
   List* n2 = new List(); // ctor
   *n2 = *n; // assignment
   if (!n->next) {
     if (n2->next) {
       clang_analyzer_warnIfReached(); // unreachable
     }
   }
   delete n2;
 }

 struct Wrapper {
   List head;
   int count;
 };

 void nestedLazyCompoundVal(List* n) {
   Wrapper* w = 0;
   {
     Wrapper lw;
     lw.head = *n;
     w = new Wrapper(lw);
   }
   if (!n->next) {
     if (w->head.next) {
       // FIXME: Unreachable, w->head is a copy of *n, therefore
       // w->head.next and n->next are equal
       clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}
     }
   }
   delete w;
 }

martong, inline comment on line 97 (Done), suggested edit:

   // w->head.next and n->next are equal
+  // FIXME Should not be reachable!
   clang_analyzer_warnIfReached(); // expected-warning {{REACHABLE}}

NoQ, inline comment on line 97 (Done): Do you know what's causing this to not work? Is this a regression or just never worked?

tomasz-kaminski-sonarsource (author), inline comment on line 97 (Done): This example never worked. We have an in-progress fix, that we are testing now.

steakhal, inline comment on line 97 (Done): Fixed by D135136.
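The FIXME in nestedLazyCompoundVal states the behaviour the analyzer should eventually model: after the nested copy, `w->head.next` and `n->next` are equal. A runtime sketch of that invariant follows. The definition of `List` is an assumption, because the first 43 lines of the test file are not shown here, and the helper names are hypothetical.

```cpp
#include <cassert>

// Assumed shape of List; the real definition is in the part of the test file not shown.
struct List {
  List *next;
  int value;
};

struct Wrapper {
  List head;
  int count;
};

// Mirrors the copy chain in nestedLazyCompoundVal: *n -> lw.head -> *w.
bool nested_copy_preserves_next(List *n) {
  Wrapper *w = nullptr;
  {
    Wrapper lw{};  // value-initialized here so that count is not indeterminate
    lw.head = *n;
    w = new Wrapper(lw);
  }
  bool same = (w->head.next == n->next);
  delete w;
  return same;
}

bool nested_copy_demo() {
  List tail{nullptr, 2};
  List head{&tail, 1};
  assert(nested_copy_preserves_next(&head));  // non-null next survives both copies
  assert(nested_copy_preserves_next(&tail));  // null next survives as well
  return true;
}
```

At runtime the inner branch of the test can never be taken when `n->next` is null. According to the review thread, teaching the analyzer this for the nested case was left to the follow-up D135136.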
