This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fix a crash on C++17 AST for non-trivial construction into a trivial brace initializer.
ClosedPublic

Authored by NoQ on Dec 5 2017, 9:08 AM.

Details

Summary

Yet another situation where we cannot easily guess, by looking at the CFG, the target region for C++ constructor call.

Consider the following brace initialization:

struct A {
  A();
 };

struct B : public A {
  int x;
};

void foo() {
  B b = {};
}

AST in C++14:

`-VarDecl 0x7ff07884eab8 <col:3, col:10> col:5 b 'struct B' cinit
  `-CXXConstructExpr 0x7ff07887a940 <col:9, col:10> 'struct B' 'void (void) noexcept(false)' list zeroing

AST in C++17:

`-VarDecl 0x7fb1cf012b18 <col:3, col:10> col:5 b 'struct B' cinit
  `-InitListExpr 0x7fb1cf012f38 <col:9, col:10> 'struct B'
    |-CXXConstructExpr 0x7fb1cf012fd0 <col:10> 'struct A' 'void (void)' list
    `-ImplicitValueInitExpr 0x7fb1cf013000 <<invalid sloc>> 'int'

CFG in C++14:

[B1]
  1: {} (CXXConstructExpr, struct B)
  2: B b = {};

CFG in C++17:

[B1]
  1: {} (CXXConstructExpr, struct A)
  2: /*implicit*/(int)0
  3: {}
  4: B b = {};

So, essentially, in C++17 we don't have the trivial constructor call for B present anywhere at all, neither in AST, nor in CFG.

This causes a crash in the analyzer because he tries to find the target region for CK_NonVirtualBase constructor of A by looking at the parent stack frame, expecting to find the derived class constructor (for B) here, and then taking a base region within its this-object.

This fix is a quick-and-dirty suppression for the crash. It follows the tradition of "when unsure, make up a temporary and construct into that, then discard the result" - which is wrong, but at least it gets some invalidation done.

In our example it would be correct to construct into base{A, b}. There can also be situations when such initializer-list is instead lifetime-extended by a const& or && reference (see the tests), so in this case we'd need to construct into a respective sub-region of the temporary (but not into the temporary itself, of course).

Finally, from the perspective of checker development, i believe it would be useful to actually revive the constructor call, so that PreCall/PostCall event occured.

Diff Detail

Repository
rC Clang

Event Timeline

NoQ created this revision.Dec 5 2017, 9:08 AM
NoQ updated this revision to Diff 125547.Dec 5 2017, 9:09 AM
NoQ edited the summary of this revision. (Show Details)

Add a forgotten //no-warning.

a.sidorin added inline comments.Dec 5 2017, 9:41 AM
lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
278

Could we try to make another lookup to see if we're initializing a variable of non-reference type? If so, we can make MemRegionManager use the region of the variable; otherwise, we can always fallback to temporary (it seems like sometimes this situation can happen out of DeclStmts). Or am I missing something?

NoQ added a subscriber: rsmith.Dec 5 2017, 2:33 PM

A reply from @rsmith was accidentally lost:

http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20171204/211785.html

(seems that phabricator cuts away all text below the first "On ..., Such-and-such wrote:" line)

NoQ updated this revision to Diff 125629.Dec 5 2017, 3:09 PM

Note that there is no constructor call here. This is aggregate initialization. And there's not really any part of this that's new, except that a class with base classes is now an a aggregate. You'll see the same kind of AST formed in all C++ language modes with a slightly modified example:

struct A {
  A();
};

struct B {
  A a;
  int x;
};

void foo() {
  B b = {};
}

The analyzer should presumably treat the new example the exact same way it treats that case.

Thank you Richard! This sheds light on what situations do we need to cover.

For now it seems that with the patch we'd indeed be treating the base class case the exact same way as the field case: by bailing out. So both cases would need to be fixed. And the field case is very important because it's not C++17-specific.

I guess i'd finish the operator new adventure first, but this issue is definitely on the list of important-to-fix C++ stuff. And for now I guess i'd just stick something in there so that it didn't crash.

Updated comments to explain the situation.

NoQ added inline comments.Dec 5 2017, 3:13 PM
lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
278

Yeah, that's what we do in other cases, i guess that's the shortest way to go. As usual, it would be way better if we just knew this stuff from, like, CFG. Maybe transform the constructor CFGElement from StmtElement to a better element that has a pointer to the parent expression that represents the object that's being constructed.

dcoughlin accepted this revision.Dec 13 2017, 9:54 PM

This looks good to me for a quick fix that we plan to address in a more principled fashion later.

However, I'd like you to add a comment at each of the places where you use the parent map to note something like "This is a quick dirty fix and we really shouldn't be using the parent map to model behavior." I understand why you're doing it, but I'd like to make sure that someone reading this code doesn't think that it is a good pattern to use the parent map and decide to use it elsewhere!

Using the parent map is slow and is a sign that our reasoning isn't compositional. The analyzer should be able to figure out the correct behavior based on the expression and its state/context alone. The fact that we need the parent map here is an indication that we don't have the right representation for constructors in the analyzer. (I know you're planning on tackling this in the future!)

This revision is now accepted and ready to land.Dec 13 2017, 9:54 PM
This revision was automatically updated to reflect the committed changes.