This is an archive of the discontinued LLVM Phabricator instance.

Differential D53754

[Analyzer] Skip symbolic regions based on conjured symbols in comparison of the containers of iterators
ClosedPublic

Authored by baloghadamsoftware on Oct 26 2018, 4:54 AM.

Details

Reviewers

NoQ
george.karpenkov

Commits

rGd703305e404d: [Analyzer] Skip symbolic regions based on conjured symbols in comparison of the…
rL356049: [Analyzer] Skip symbolic regions based on conjured symbols in comparison of the…
rC356049: [Analyzer] Skip symbolic regions based on conjured symbols in comparison of the…

Summary

Checking whether two regions are the same is a partially decidable problem: either we know for sure that they are the same or we cannot decide. A typical example is a symbolic region based on a conjured symbol: two different conjured symbols may denote the same region or different ones, and we cannot tell which. Since we want to reduce false positives as much as possible, we exclude these regions when checking whether two containers are the same in the iterator mismatch check.
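
To make the problem concrete, here is a minimal standalone sketch (not part of the patch). The names shared_vector and erase_front_via_two_refs are hypothetical. The function returns a reference to the same vector on every call; an analyzer that sees only its declaration would conjure a fresh symbol per call and could not tell that both results alias.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: always returns a reference to the same container.
// If only its declaration were visible, each call's return value would be
// modeled by a separate conjured symbol.
std::vector<int> &shared_vector() {
  static std::vector<int> V{1, 2, 3};
  return V;
}

// Mirrors the shape of the ignore_conjured1 test: V1 and V2 alias the same
// container, so erasing through V2 with an iterator of V1 is valid here.
std::size_t erase_front_via_two_refs() {
  std::vector<int> &V1 = shared_vector();
  std::vector<int> &V2 = shared_vector();
  assert(&V1 == &V2 && "both calls return the same object");
  if (V1.empty())
    return 0;
  V2.erase(V1.cbegin());
  return V2.size();
}
```

Reporting a mismatch for code like this would be a false positive, which is the case the patch aims to suppress.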

Diff Detail

Repository: rC Clang

Event Timeline

baloghadamsoftware created this revision. Oct 26 2018, 4:54 AM

Herald added a reviewer: george.karpenkov. Oct 26 2018, 4:54 AM

Herald added subscribers: donat.nagy, Szelethus, mikhail.ramalho and 4 others.

I wonder whether a method in MemRegion called isSameRegion or isSurelySameRegion would be better. I think it's likely that there are (or will be) checkers that would do similar things.

Maybe something like this?

bool MemRegion::isSurelySameRegion(const MemRegion *Other) const {
  // We can't reason about symbolic regions.
  if (getSymbolicBase() || Other->getSymbolicBase())
    return false;
  return this == Other;
}

MTC added a subscriber: MTC. Oct 28 2018, 7:39 PM

In D53754#1277253, @Szelethus wrote:

I wonder whether a method in MemRegion called isSameRegion or isSurelySameRegion would be better. I think it's likely that there are (or will be) checkers that would do similar things.

Unfortunately, the question of whether two regions are the same is only partially decidable. We can never be sure whether two regions are "surely" the same. We can only exclude some highly unreliable region types, such as symbolic regions based on conjured symbols, to reduce false positives.
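
The decision rule described here can be sketched with a toy model. ToySymbolKind, ToyRegion, Verdict and compareContainers are hypothetical stand-ins invented for illustration, not the Clang Static Analyzer classes: the check is skipped whenever either region is based on a conjured symbol, and otherwise falls back to comparing region identity.

```cpp
#include <cassert>

// Toy stand-ins for the analyzer's classes; these are NOT the Clang types.
enum class ToySymbolKind { None, Conjured, RegionValue };

struct ToyRegion {
  ToySymbolKind BaseKind; // kind of symbol the region is based on, if any
};

enum class Verdict { Same, Mismatch, Skipped };

// Skip the check when either region is based on a conjured symbol;
// otherwise compare the regions by identity.
Verdict compareContainers(const ToyRegion *A, const ToyRegion *B) {
  assert(A && B);
  if (A->BaseKind == ToySymbolKind::Conjured ||
      B->BaseKind == ToySymbolKind::Conjured)
    return Verdict::Skipped;
  return A == B ? Verdict::Same : Verdict::Mismatch;
}
```

In this model two distinct regions based on parameter values still yield Mismatch, while any pair involving a conjured symbol yields Skipped, so no report would be emitted for it.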

Herald added a subscriber: gamesh411. Nov 26 2018, 1:01 AM

What makes you think that conjured symbols are special?

Doesn't the same problem appear in the following example?:

void foo(std::vector<int> &v1, std::vector<int> &v2) {
  v2.erase(v1.cbegin());
}

In this example if foo() is analyzed as a top level function, the respective symbols would be of SymbolRegionValue kind. It is also easy to come up with a test case that involves SymbolDerived.

In D53754#1315247, @NoQ wrote:
What makes you think that conjured symbols are special?

Doesn't the same problem appear in the following example?:
void foo(std::vector<int> &v1, std::vector<int> &v2) {
  v2.erase(v1.cbegin());
}
In this example if foo() is analyzed as a top level function, the respective symbols would be of SymbolRegionValue kind. It is also easy to come up with a test case that involves SymbolDerived.

I think the main difference here is that if we analyze this function as top level, we find a true positive: the regions for v1 and v2 may be the same, but generally they are different (hence the separate parameters). However, we know nothing about the sameness of regions based on different SymbolDeriveds, so those findings may be false positives as well as true ones.

Herald added a subscriber: jdoerfert. Feb 18 2019, 6:46 AM

In D53754#1401162, @baloghadamsoftware wrote:

I think that here the main difference is that if we analyze this function as top level, then we find a true positive: the regions for v1 and v2 may be the same but generally they are different (hence the different parameters).

Aha, right, that's an interesting heuristic. I guess that the developer may also add a specific check (e.g., if (&v1 == &v2) ...), but that's a separate story of aliasing and renaming, and i do admit that i don't see this sort of code being written sensibly.

Well, you can't really rely on my imagination, because i can still say the same about the SymbolConjured examples. I'm really curious how this originally looked, i.e. even if the user knows that a certain function always returns the same container, why would they call it twice? Was this happening in some sort of loop? Is there a more realistic test case that we can add?

Anyway, let's add a huge comment that explains why SymbolConjureds are special and commit. I mean, this definitely deserves a comment :)
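
For reference, the explicit aliasing check mentioned in this comment could look like the sketch below. The function name erase_front_if_aliased is hypothetical; it is a guarded variant of the foo() example from the discussion, not code from the patch or its tests.

```cpp
#include <cassert>
#include <vector>

// Hypothetical guarded variant of foo(): erase only when both parameters
// refer to the same vector, so the iterator is known to belong to V2.
bool erase_front_if_aliased(std::vector<int> &V1, std::vector<int> &V2) {
  if (&V1 != &V2 || V1.empty())
    return false;
  const auto OldSize = V2.size();
  V2.erase(V1.cbegin());
  assert(V2.size() + 1 == OldSize);
  return true;
}
```

Calling it with the same vector for both arguments erases the first element; calling it with two distinct vectors leaves both untouched.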

This revision is now accepted and ready to land. Mar 6 2019, 5:54 PM

Herald added a subscriber: Charusso. Mar 6 2019, 5:54 PM

Closed by commit rC356049: [Analyzer] Skip symbolic regions based on conjured symbols in comparison of the… (authored by baloghadamsoftware). Mar 13 2019, 6:54 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                               Size
lib/StaticAnalyzer/Checkers/IteratorChecker.cpp    49 lines
test/Analysis/mismatched-iterator.cpp              14 lines

Diff 190407

lib/StaticAnalyzer/Checkers/IteratorChecker.cpp

(1,092 lines not shown; hunk context: void IteratorChecker::verifyRandomIncrOrDecr(CheckerContext &C,)

   }
 }

 void IteratorChecker::verifyMatch(CheckerContext &C, const SVal &Iter,
                                   const MemRegion *Cont) const {
   // Verify match between a container and the container of an iterator
   Cont = Cont->getMostDerivedObjectRegion();

+  if (const auto *ContSym = Cont->getSymbolicBase()) {
+    if (isa<SymbolConjured>(ContSym->getSymbol()))
+      return;
+  }
+
   auto State = C.getState();
   const auto *Pos = getIteratorPosition(State, Iter);
-  if (Pos && Pos->getContainer() != Cont) {
+  if (!Pos)
+    return;
+
+  const auto *IterCont = Pos->getContainer();
+
+  // Skip symbolic regions based on conjured symbols. Two conjured symbols
+  // may or may not be the same. For example, the same function can return
+  // the same or a different container but we get different conjured symbols
+  // for each call. This may cause false positives so omit them from the check.
+  if (const auto *ContSym = IterCont->getSymbolicBase()) {
+    if (isa<SymbolConjured>(ContSym->getSymbol()))
+      return;
+  }
+
+  if (IterCont != Cont) {
     auto *N = C.generateNonFatalErrorNode(State);
     if (!N) {
       return;
     }
-    reportMismatchedBug("Container accessed using foreign iterator argument.", Iter, Cont, C, N);
+    reportMismatchedBug("Container accessed using foreign iterator argument.",
+                        Iter, Cont, C, N);
   }
 }

 void IteratorChecker::verifyMatch(CheckerContext &C, const SVal &Iter1,
                                   const SVal &Iter2) const {
   // Verify match between the containers of two iterators
   auto State = C.getState();
   const auto *Pos1 = getIteratorPosition(State, Iter1);
+  if (!Pos1)
+    return;
+
+  const auto *IterCont1 = Pos1->getContainer();
+
+  // Skip symbolic regions based on conjured symbols. Two conjured symbols
+  // may or may not be the same. For example, the same function can return
+  // the same or a different container but we get different conjured symbols
+  // for each call. This may cause false positives so omit them from the check.
+  if (const auto *ContSym = IterCont1->getSymbolicBase()) {
+    if (isa<SymbolConjured>(ContSym->getSymbol()))
+      return;
+  }
+
   const auto *Pos2 = getIteratorPosition(State, Iter2);
-  if (Pos1 && Pos2 && Pos1->getContainer() != Pos2->getContainer()) {
+  if (!Pos2)
+    return;
+
+  const auto *IterCont2 = Pos2->getContainer();
+  if (const auto *ContSym = IterCont2->getSymbolicBase()) {
+    if (isa<SymbolConjured>(ContSym->getSymbol()))
+      return;
+  }
+
+  if (IterCont1 != IterCont2) {
     auto *N = C.generateNonFatalErrorNode(State);
     if (!N)
       return;
     reportMismatchedBug("Iterators of different containers used where the "
                         "same container is expected.", Iter1, Iter2, C, N);
   }
 }

(1,292 lines not shown)

test/Analysis/mismatched-iterator.cpp

(183 lines not shown; hunk context: void bad_move_find3(std::vector<int> &v1, std::vector<int> &v2, int n) {)

   std::find(v1.cbegin(), i0, n); // expected-warning{{Iterators of different containers used where the same container is expected}}
 }

 void bad_comparison(std::vector<int> &v1, std::vector<int> &v2) {
   if (v1.cbegin() != v2.cend()) { // expected-warning{{Iterators of different containers used where the same container is expected}}
     *v1.cbegin();
   }
 }
+
+std::vector<int> &return_vector_ref();
+
+void ignore_conjured1() {
+  std::vector<int> &v1 = return_vector_ref(), &v2 = return_vector_ref();
+
+  v2.erase(v1.cbegin()); // no-warning
+}
+
+void ignore_conjured2() {
+  std::vector<int> &v1 = return_vector_ref(), &v2 = return_vector_ref();
+
+  if (v1.cbegin() == v2.cbegin()) {} //no-warning
+}