[analyzer] Model implied cast around operator new().
ClosedPublic

Authored by NoQ on Dec 14 2017, 11:08 AM.

Details

Summary

C++ overridable operator new() has the following prototype:

void *operator new(size_t size, user-defined arguments...);

The return value is void *. However, before passing it to constructor, we need to make a cast to the respective object pointer type. Hence an implicit cast is present here, which is not represented in the current AST or CFG. Modeling this cast is straightforward though. This is the change i mentioned in D40939.

I also noticed that evalCast from void * to T * is uncomfortable to use because sometimes it transforms &SymRegion{$x} into &element{T, 0S32b, SymRegion{$x}} even when $x is already of type T *. The form &SymRegion{$x} seems to be the canonical form of this symbolic pointer value in the rest of the analyzer, so i decided to change evalCast to preserve it.

The problem of how to represent memregion value casts better still stands - it wouldn't add much to the analyzer's quality, but we just keep running into it over and over again.

Diff Detail

Repository
rC Clang
NoQ created this revision.Dec 14 2017, 11:08 AM
NoQ updated this revision to Diff 127202.Dec 15 2017, 4:21 PM

VisitCXXNewExpr is too late. We need to perform cast before calling the constructor. Otherwise bad things happen, for instance performTrivialCopy would construct another void region :)

Move the cast to pushCXXNewAllocatorValue(). This way we perform the cast before putting this-value into our temporary storage (the top of CXXNewAllocatorValueStack, or _this in terms of http://lists.llvm.org/pipermail/cfe-dev/2017-December/056314.html), which seems correct. And this affects all two code paths on which we exit the allocator call - both the conservativeEvalCall path and the processCallExit path (and ideally the future evalCall path).

xazax.hun accepted this revision.Dec 16 2017, 10:14 AM

The code looks good to me. But the best way to verify these kinds of changes to see how the results change on large projects after applying the patch.

This revision is now accepted and ready to land.Dec 16 2017, 10:14 AM
NoQ updated this revision to Diff 127549.Dec 19 2017, 10:29 AM

Rebase.

I also noticed that evalCast from void * to T * is uncomfortable to use because sometimes it transforms &SymRegion{$x} into &element{T, 0S32b, SymRegion{$x}} even when $x is already of type T *. The form &SymRegion{$x} seems to be the canonical form of this symbolic pointer value in the rest of the analyzer, so i decided to change evalCast to preserve it.

Suddenly it turns out that this is not needed anymore. I'm struggling quite a bit to get the casts right, and still failing to identify the actual system we're trying to follow when representing pointer casts. I'd love to get to the bottom of it eventually.

NoQ updated this revision to Diff 129212.Jan 9 2018, 7:32 PM
In D41250#959755, @NoQ wrote:

I also noticed that evalCast from void * to T * is uncomfortable to use because sometimes it transforms &SymRegion{$x} into &element{T, 0S32b, SymRegion{$x}} even when $x is already of type T *. The form &SymRegion{$x} seems to be the canonical form of this symbolic pointer value in the rest of the analyzer, so i decided to change evalCast to preserve it.

Suddenly it turns out that this is not needed anymore. I'm struggling quite a bit to get the casts right, and still failing to identify the actual system we're trying to follow when representing pointer casts. I'd love to get to the bottom of it eventually.

Model the cast only around allocators that were inlined. Additionally, produce array element for array new[] allocator calls. This completely reverts the rather-accidental-than-intended change in behavior in the conservative case which i described above: we no longer get this

&element{T, 0S32b, SymRegion{$x}} even when $x is already of type T *

thing in the conservative case, while still behaving reasonably in the inlined case, without touching any other behavior of the analyzer (i.e. not touching the whole evalCast thing). So i believe this is the right thing to do, at least for this patch.

Eventually it might be better to return &element{T, 0S32b, SymRegion{$x}} where $x is a conjured void pointer - this would express the semantics and the origins of that SVal better. However, we cannot do that, because we cannot conjure a symbol without a void-pointer-type expression, and we don't have the expression that represents the call site for the allocator call.

NoQ updated this revision to Diff 129365.Jan 10 2018, 4:06 PM

Rebase.

Add a FIXME to bring back the cast in the conservative case.

dcoughlin accepted this revision.Jan 12 2018, 4:54 PM

This looks good to me, as well.

This revision was automatically updated to reflect the committed changes.