This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Skip symbolic regions based on conjured symbols in comparison of the containers of iterators
ClosedPublic

Authored by baloghadamsoftware on Oct 26 2018, 4:54 AM.

Details

Summary

Checking whether two regions are the same is a partially decidable problem: either we know for sure that they are the same or we cannot decide. A typical case for this are the symbolic regions based on conjured symbols. Two different conjured symbols are either the same or they are different. Since we cannot decide this and want to reduce false positives as much as possible we exclude these regions whenever checking whether two containers are the same at iterator mismatch check.

Diff Detail

Event Timeline

I wonder whether a method in MemRegion called isSameRegion or isSurelySameRegion would be better. I think it's likely that there are (or will be) checkers that would do similar things.

Maybe something like this?

bool MemRegion::isSurelySameRegion(const MemRegion *Other) const {
  // We can't reason about symbolic regions.
  if (/* this or Other is symbolic*/)
    return;
  return this == Other;
}
MTC added a subscriber: MTC.Oct 28 2018, 7:39 PM

I wonder whether a method in MemRegion called isSameRegion or isSurelySameRegion would be better. I think it's likely that there are (or will be) checkers that would do similar things.

Unfortunately the question whether two regions are the same is a partially decidable problem. We can never be sure whether two regions are "surely" the same. We can only disclose some highly unreliable region types such as symbolic regions based on conjured symbols to reduce false positives.

NoQ added a comment.Nov 30 2018, 2:08 PM

What makes you think that conjured symbols are special?

Doesn't the same problem appear in the following example?:

void foo(std::vector<int> &v1, std::vector<int> &v2) {
  v2.erase(v1.cbegin());
}

In this example if foo() is analyzed as a top level function, the respective symbols would be of SymbolRegionValue kind. It is also easy to come up with a test case that involves SymbolDerived.

In D53754#1315247, @NoQ wrote:

What makes you think that conjured symbols are special?

Doesn't the same problem appear in the following example?:

void foo(std::vector<int> &v1, std::vector<int> &v2) {
  v2.erase(v1.cbegin());
}

In this example if foo() is analyzed as a top level function, the respective symbols would be of SymbolRegionValue kind. It is also easy to come up with a test case that involves SymbolDerived.

I think that here the main difference is that if we analyze this function as top level, then we find a true positive: the regions for v1 and v2 may be the same but generally they are difference (hence the different parameters). However, we do not know anything about the sameness of the regions of different SymbolDeriveds, thus those findings may be false or true positives as well.

NoQ accepted this revision.Mar 6 2019, 5:54 PM

I think that here the main difference is that if we analyze this function as top level, then we find a true positive: the regions for v1 and v2 may be the same but generally they are difference (hence the different parameters).

Aha, right, that's an interesting heuristic. I guess that the developer may also add a specific check (eg., if (&v1 == &v2) ..., but that's a separate story of aliasing and renaming, and i do admit that i don't see this sort of code being written sensibly.

Well, you can't really rely on my imagination, because i can still say the same about the SymbolConjured examples. I'm really curious how did this originally look, i.e. even if the user knows that a certain function always returns the same container, why would they call it twice? Was this happening in some sort of loop? Is there a more realistic test case that we can add?

Anyway, let's add a huge comment that explains why SymbolConjureds are special and commit. I mean, this definitely deserves a comment :)

This revision is now accepted and ready to land.Mar 6 2019, 5:54 PM
This revision was automatically updated to reflect the committed changes.