It turns out that in certain cases SymbolRegions are wrapped by
ElementRegions; in others, it's not. This discrepancy can cause the
analyzer not to recognize if the two regions are actually referring to
the same entity, which then can lead to unreachable paths discovered.
Consider this example:
struct Node { int* ptr; }; void with_structs(Node* n1) { Node c = *n1; // copy Node* n2 = &c; clang_analyzer_dump(*n1); // lazy... clang_analyzer_dump(*n2); // lazy... clang_analyzer_dump(n1->ptr); // rval(n1->ptr): reg_$2<int * SymRegion{reg_$0<struct Node * n1>}.ptr> clang_analyzer_dump(n2->ptr); // rval(n2->ptr): reg_$1<int * Element{SymRegion{reg_$0<struct Node * n1>},0 S64b,struct Node}.ptr> clang_analyzer_eval(n1->ptr != n2->ptr); // UNKNOWN, bad! (void)(*n1); (void)(*n2); }
The copy of n1 will insert a new binding to the store; but for doing
that it actually must create a TypedValueRegion which it could pass to
the LazyCompoundVal. Since the memregion in question is a
SymbolicRegion - which is untyped, it needs to first wrap it into an
ElementRegion basically implementing this untyped -> typed conversion
for the sake of passing it to the LazyCompoundVal.
So, this is why we have Element{SymRegion{.}, 0,struct Node} for n1.
The problem appears if the analyzer evaluates a read from the expression
n1->ptr. The same logic won't apply for SymbolRegionValues, since
they accept raw SubRegions, hence the SymbolicRegion won't be
wrapped into an ElementRegion in that case.
Later when we arrive at the equality comparison, we cannot prove that
they are equal.
For more details check the corresponding thread on discourse:
https://discourse.llvm.org/t/are-symbolicregions-really-untyped/64406
In this patch, I'm eagerly wrapping each SymbolicRegion by an
ElementRegion; basically canonicalizing to this form.
It seems reasonable to do so since any object can be thought of as a single
array of that object; so this should not make much of a difference.
The tests also underpin this assumption, as only a few were broken by
this change; and actually fixed a FIXME along the way.
About the second example, which does the same copy operation - but on
the heap - it will be fixed by the next patch.
I'm not sure how much explanation I need to do here, but first, we should settle on the name of this member function if we decide to have it.