This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] ReturnVisitor: Bypass everything to see inlined calls
ClosedPublic

Authored by Charusso on Jun 5 2019, 12:30 PM.

Details

Summary

When we traversed backwards on ExplodedNodes to see where we processed the
given statement we break too early. With the current approach we do not
miss the CallExitEnd ProgramPoint which stands for an inlined call.

Diff Detail

Repository
rL LLVM

Event Timeline

Charusso created this revision.Jun 5 2019, 12:30 PM
Szelethus added inline comments.Jun 5 2019, 2:50 PM
clang/test/Analysis/new-ctor-null-throw.cpp
1–3 ↗(On Diff #203220)

Hmm, how come you removed -analyzer-config c++-allocator-inlining=true?

NoQ added a comment.Jun 5 2019, 7:44 PM

Aha, that's something! And nice to see we've already had this bug covered with tests. Because, of course, i added these tests a year ago without even thinking about what the correct behavior should look like :/

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
835–842 ↗(On Diff #203220)

This iteration may now take us straight to the root of the graph. I don't think it's supposed to be that slow; i think it's supposed to only skip within maybe a full-expression at most.

Which statements in the AST are getting peeled off here that weren't before? Which statements are supposed to get peeled off?

Might it be that we should simply add one more case to peelOffOuterExpr() or something like that?

clang/test/Analysis/new-ctor-null-throw.cpp
1–3 ↗(On Diff #203220)

It's on by default these days.

29 ↗(On Diff #203220)

Let's add a // no-warning here, probably with a comment that this the warning is intentionally suppressed.

clang/test/Analysis/new-ctor-null.cpp
29 ↗(On Diff #203220)

Let's add a // no-warning here as well.

Charusso updated this revision to Diff 203324.Jun 6 2019, 3:45 AM
Charusso marked 4 inline comments as done.
Charusso retitled this revision from [analyzer] ReturnVisitor: Bypass everything to see inlined calls to [analyzer] ReturnVisitor: Bypass constructing objects to see inlined calls.
  • More explicit approach.
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
835–842 ↗(On Diff #203220)

We have the following sequence in the loop:

"stmt_kind": "DeclStmt", "stmt_point_kind": "PostStmt"
"stmt_kind": "DeclStmt", "stmt_point_kind": "PostStore"
"stmt_kind": "DeclStmt",  "stmt_point_kind": "PreStmtPurgeDeadSymbols"
"pretty": "S *s = new S [10];"

"stmt_kind": "CXXNewExpr", "stmt_point_kind": "PostStmt"
"stmt_kind": "CXXNewExpr", "stmt_point_kind": "PreStmtPurgeDeadSymbols"
"pretty": "new S [10]"

"stmt_kind": "CXXConstructExpr", "stmt_point_kind": "PostStmt"
"stmt_kind": "CXXConstructExpr", "stmt_point_kind": "PreStmt"
"stmt_kind": "CXXConstructExpr", "stmt_point_kind": "PreStmtPurgeDeadSymbols"
"pretty": null

"kind": "CallExitEnd" - found the ReturnStmt, all good.

This ReturnVisitor is totally ProgramPoint-based and I wanted to be super generic. Do you know about more problematic constructing-object?

Charusso updated this revision to Diff 203326.Jun 6 2019, 3:49 AM
  • Better wording.
Charusso marked an inline comment as done.Jun 6 2019, 4:18 AM
Charusso added inline comments.
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
835–842 ↗(On Diff #203220)

Btw: this cool dump based on D62946

If I run scan-build on LLVM this is the only non-bypassed case since the first diff:

I think it is working well.

NoQ added a comment.Jun 6 2019, 9:00 PM

Aha, nice, that's much cleaner! Indeed, we have to skip the construct-expression, which is going to be on our way to the program point of new-expression which corresponds to calling the allocator, despite being nested within the new-expression. An annoying corner-case.

I'd like to have a test in which the constructor is inlined. I suspect that we'll have to skip the whole inlined call in this case, and it'll be a lot of various program points. We'll most likely have to consult location contexts to skip everything within the constructor's stack frame.

In the test cases that you've updated the constructor isn't inlined because it's an array-new operator and we don't bother inlining 10 default constructors (@Szelethus: a nice application of data flow analysis might be to find out if the default constructor always initializes the structure in the same way, say all nulls, so we could evaluate one constructor and default-bind it to the whole array in this case; @baloghadamsoftware: this in turn reminds me that i still need to do something about D22374).

Charusso updated this revision to Diff 203635.Jun 7 2019, 3:31 PM
Charusso marked 2 inline comments as done.
  • Added CallExpr as being purged away
  • More test
In D62926#1533560, @NoQ wrote:

I'd like to have a test in which the constructor is inlined.

What do you mean by that test?

NoQ added inline comments.Jun 7 2019, 10:13 PM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
842–849 ↗(On Diff #203635)

Comparing statements is usually insufficient because the same statement may appear multiple times due to recursion. When recursion occurs, you may reach the same statement in a different location context. You should think in terms of (statement, location context) pairs to avoid these problems. Your aim here is to find the CallExitEnd node that corresponds to returning from an inlined operator new to the current location context. You should stop searching when you find an unrelated statement in the current location context or when you exit the current location context entirely.

Charusso updated this revision to Diff 203874.Jun 10 2019, 12:37 PM
Charusso retitled this revision from [analyzer] ReturnVisitor: Bypass constructing objects to see inlined calls to [analyzer] ReturnVisitor: Bypass everything to see inlined calls.
  • The most generic approach.
Charusso marked 2 inline comments as done.Jun 10 2019, 12:43 PM
Charusso added inline comments.
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
842–849 ↗(On Diff #203635)

I have made a little test when we have a 25-second long Static Analyzer run with predefined names and checker. The loop ran 500 times in summary and we have some serious performance impacts at other places.

We exit the current context to see inlined calls, so that could not work sadly. If you remove that nonsense second condition we run at the same time, so if we have not got any problem since 7 years ago I think it is good to go.

NoQ added inline comments.Jun 10 2019, 6:47 PM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
842–849 ↗(On Diff #203635)

You should break the loop as soon as we go past the new-expression that we've started with in the stack frame that we've started with. That is, you should at most go through the constructor within the new-expression, and then break. You shouldn't explore the whole graph to the root every time the operator-new call isn't inlined.

This might still be slow in some rare cases, but it should be much faster.

Charusso updated this revision to Diff 204111.Jun 11 2019, 10:58 AM
Charusso marked an inline comment as done.
  • I have used LocationContext::isParentOf()which is not worked well, so I thought we are out of our context.
  • With that I made some misleading assumptions about our code.
Charusso marked 2 inline comments as done.Jun 11 2019, 10:59 AM
Charusso added inline comments.
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
842–849 ↗(On Diff #203635)

Sorry, I was dumb and I have used isParentOf() before, I think it is the most simple approach now truly.

NoQ added a comment.Jun 11 2019, 5:33 PM

All right, it seems that i'm completely misunderstanding this problem and we've been talking past each other this whole time.

The problem is not that we need to skip the CXXConstructExpr. The problem is that we need to skip CXXNewExpr (!!). CFG elements for an operator-new call go like this:

  1. CFGAllocatorCall
  2. CXXConstructExpr
  3. CXXNewExpr

We're staring at element 3. The original loop says "hey, that's our statement! we've found it!". However, even though this is the statement that's taken as the call site, the respective ExplodedNode does not correspond to the end of the call. Instead, it corresponds to a moment of time after construction has finished and the new-expression is fully evaluated. So we need to ignore element 3. Then, element 2 is ignored automatically because the statements don't match. Then, we need to learn to recognize that element 1 is the right element, which is going to be either a CallExitEnd (if the allocator call was inlined) or dunno-but-definitely-not-PostStmt if the allocator was evaluated conservatively.

Bonus: the constructor is never inlined in our scenario, so it's easy to skip element 2, as it's going to be just PreStmt/PostStmt and no inlined call within it. Why? Because how the hell do you inline a constructor of a null pointer. The Standard guarantees that when the allocator returns a null pointer, the constructor is gracefully omitted. We currently have a FIXME in prepareForObjectConstruction() to implement this behavior:

179         // TODO: Detect when the allocator returns a null pointer.
180         // Constructor shall not be called in this case.

The current incorrect behavior is to instead evaluate the constructor conservatively, with a fake temporary region instead of the null pointer (this is generally the conservative approach to evaluating the constructor when object-under-construction is not known).

So my request to come up with a test case in which the constructor is going to be inlined was completely pointless. But in the general case we aren't really sure that it's going to be a null pointer (we may want to track an arbitrary value), so we shouldn't rely on that.

clang/test/Analysis/new-ctor-null.cpp
37 ↗(On Diff #204111)

This is essentially the same FIXME: the constructor should not have been evaluated, so in particular we should not invalidate globals.

Charusso updated this revision to Diff 204331.Jun 12 2019, 11:01 AM
Charusso marked an inline comment as done.
Charusso added a comment.EditedJun 12 2019, 11:06 AM
In D62926#1539191, @NoQ wrote:

All right, it seems that i'm completely misunderstanding this problem and we've been talking past each other this whole time.

The problem is not that we need to skip the CXXConstructExpr. The problem is that we need to skip CXXNewExpr (!!). CFG elements for an operator-new call go like this:

I really wanted to create a generic approach without any overhead. I have added the proper test case what is going on: we enter into another contexts and that should be modeled properly. I was not sure about how to do so, but after a while it has been born. It contains all of your ideas from all the comments and now we only bypassing CXXNewExprs.

NoQ added a comment.Jun 12 2019, 9:00 PM

Yup, this makes sense now! I'll do some nit-picking for a little longer.

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
839 ↗(On Diff #204331)

For now this hack is clearly only for CXXNewExpr. And it happens because our AST is weird and there's no separate sub-expression dedicated entirely to invoking the allocator. Therefore let's give this variable a better name, maybe BypassCXXNewExprEvaluation or something like that.

856–860 ↗(On Diff #204331)

Branches here are unnecessary because there's no performance win here. Let's just

CurrentSFC = Node->getStackFrame();
862–866 ↗(On Diff #204331)

This code would currently bypass the call site for the conservatively evaluated allocator. It doesn't seem to be immediately important but let's add a FIXME.

Charusso updated this revision to Diff 204625.Jun 13 2019, 2:23 PM
Charusso marked 3 inline comments as done.
  • Fixed comments.
NoQ added inline comments.Jun 13 2019, 10:31 PM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
855 ↗(On Diff #204625)

Mmm, wait a sec. This way the loop condition is trivially true. The second check on line 862 is also trivially true. I think you want to make it a separate variable.

Charusso updated this revision to Diff 204698.Jun 13 2019, 10:39 PM
Charusso marked an inline comment as done.
  • Whoops.
Charusso marked an inline comment as done.Jun 13 2019, 10:40 PM
Charusso added inline comments.
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
855 ↗(On Diff #204625)

I was never cool in algorithms, thanks!

NoQ accepted this revision.Jun 13 2019, 10:47 PM

Great, thanks!! Let's commit this.

This revision is now accepted and ready to land.Jun 13 2019, 10:47 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2019, 3:02 AM

This test fails to compile on Windows 64 bit builds. Please see http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/77

cfe/trunk/test/Analysis/inlining/placement-new-fp-suppression.cpp
16

I don't believe this is really ever portable, but definitely not on 64 bit Windows where a long is 32 bits.

Can't you just #include <stdint.h> instead?

Szelethus added inline comments.Jun 16 2019, 10:30 AM
cfe/trunk/test/Analysis/inlining/placement-new-fp-suppression.cpp
16

In the test, I'm pretty sure that he can't.

I grepped test/Analysis, and found that malloc.cpp does the following:

typedef unsigned __INTPTR_TYPE__ uintptr_t

Maybe that's worth a shot?

Szelethus added inline comments.Jun 16 2019, 10:32 AM
cfe/trunk/test/Analysis/inlining/placement-new-fp-suppression.cpp
16

I seem to have been proven wrong. :)

Charusso marked an inline comment as done.Jun 16 2019, 11:11 AM

This test fails to compile on Windows 64 bit builds. Please see http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/77

Thanks you! I have not got a mail, now it passes on my side: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/80/steps/stage%201%20check/logs/stdio
But some unrelated error occurs: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/80/steps/stage%202%20build/logs/stdio

Sorry for all the inconveniences it caused.

Charusso marked 3 inline comments as done.Jun 16 2019, 11:15 AM

@NoQ, thanks for the review! Now everything is working by rL363515.

cfe/trunk/test/Analysis/inlining/placement-new-fp-suppression.cpp
16

I was a little-bit surprised as well. Thanks for the idea!