This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] ConditionBRVisitor: Unknown condition evaluation support
ClosedPublic

Authored by Charusso on Jan 29 2019, 12:23 PM.

Details

Summary

If the constraint information is not changed between two program states the analyzer has not learnt new information and made no report. But it is possible to happen because we have no information at all. The new approach evaluates the condition to determine if that is the case and let the user know we just 'Assuming...' some value.

Diff Detail

Repository
rL LLVM

Event Timeline

Charusso created this revision.Jan 29 2019, 12:23 PM
NoQ added a comment.Jan 31 2019, 7:10 PM

These new notes are fantastic, we totally need them :) Changes in edges-new.mm and dtors.cpp look like a regression, i guess one of us would have to investigate it - feel free to start and poke me if it turns out to be hard, because i really appreciate where this is going!

test/Analysis/Inputs/expected-plists/edges-new.mm.plist
7348–7351 ↗(On Diff #184150)

This message shouldn't have appeared. The condition here is (0 || 0) and the Static Analyzer handles it correctly, so there isn't much assumption going on.

There wasn't a change in the constraints here, because there's just constants, so there's nothing to constrain.

I'm not sure why does the expression evaluate to an unknown or undefined value. I suspect it might be due to the short-circuit nature of operator ||. For now the Static Analyzer handles them in a fairly convoluted manner.

7732–7735 ↗(On Diff #184150)

Same as the one above.

8287–8290 ↗(On Diff #184150)

Same as the one above. The condition on line 276 of edges-new.mm is ((0 || (5 && 0)) ? 0 : ((0 || 4) ? call(1 && 5) : 0)) and column 7 corresponds to the (0 || (5 && 0) part.

8418–8421 ↗(On Diff #184150)

Same as the one above.

test/Analysis/diagnostics/dtors.cpp
20 ↗(On Diff #184150)

Hmm, this one's wrong, but for a fairly complicated reason.

The test involves calling .get() twice from the bar() function below.

Path notes looks like this:

dtors.cpp:19:5: warning: Use of memory after it is freed
    return (x || 0) ? nullptr : s;
    ^
dtors.cpp:27:3: note: Memory is released
  delete p.get();
  ^~~~~~~~~~~~~~
dtors.cpp:28:3: note: Calling 'smart_ptr::get'
  p.get()->foo();
  ^~~~~~~
dtors.cpp:19:13: note: Left side of '||' is false
    return (x || 0) ? nullptr : s;
            ^
dtors.cpp:19:12: note: '?' condition is false
    return (x || 0) ? nullptr : s;
           ^
dtors.cpp:19:5: note: Use of memory after it is freed
    return (x || 0) ? nullptr : s;
    ^      ~~~~~~~~~~~~~~~~~~~~~~

In the first .get() call within operator delete, "Assuming..." indeed happens over field x. However, this "Assuming..." piece is prunable: it resides within a stack frame that does not contain any interesting path notes. Therefore diagnostic notes for the whole stack frame are dropped entirely. The notes that you're seeing all correspond to the *second* .get() call (the one that has .foo() attached to it), which is a different stack frame within the same function .get().

Such path pruning is a fairly important feature, albeit counter-intuitive. The stack frame is considered to be interesting if it contains either a checker-specific diagnostic (eg., "Memory is released", "File descriptor is closed", etc., or the warning itself) or a normal diagnostic (such as any "Assuming..." note) for a value that's marked as "interesting" when the report was thrown. In this case x is not interesting (i.e., it's not the pointer that we're dereferencing) and there's no delete of s happening (it happens within bar()), so we decide that it is not interesting. If the stack frame is not interesting, all events within it are hidden from the user. It sounds weird at first ("Why are you hiding information from the user???", "The user needs more information in order to understand bug reports!!!", etc.), but in practice the Static Analyzer is almost unusable on any real-world code without path pruning, as even simple bugs suddenly turn out to contain hundreds of path notes, rendering them effectively unreadable.

If you want to see for yourself, you can always add -analyzer-config prune-paths=false to your -cc1 run-line. For this example the unpruned report is fairly readable (after all, it's not real-world code) and looks like this:

dtors.cpp:19:5: warning: Use of memory after it is freed
    return (x || 0) ? nullptr : s;
    ^
dtors.cpp:27:10: note: Calling 'smart_ptr::get'
  delete p.get();
         ^~~~~~~
dtors.cpp:19:13: note: Assuming the condition is false
    return (x || 0) ? nullptr : s;
            ^
dtors.cpp:19:13: note: Left side of '||' is false
dtors.cpp:19:12: note: '?' condition is false
    return (x || 0) ? nullptr : s;
           ^
dtors.cpp:27:10: note: Returning from 'smart_ptr::get'
  delete p.get();
         ^~~~~~~
dtors.cpp:27:3: note: Memory is released
  delete p.get();
  ^~~~~~~~~~~~~~
dtors.cpp:28:3: note: Calling 'smart_ptr::get'
  p.get()->foo();
  ^~~~~~~
dtors.cpp:19:13: note: Left side of '||' is false
    return (x || 0) ? nullptr : s;
            ^
dtors.cpp:19:12: note: '?' condition is false
    return (x || 0) ? nullptr : s;
           ^
dtors.cpp:19:5: note: Use of memory after it is freed
    return (x || 0) ? nullptr : s;
    ^      ~~~~~~~~~~~~~~~~~~~~~~

As you see, the "Assuming..." note for x should indeed logically be somewhere around there, but it is in fact already there, just pruned. Therefore the "Assuming the condition is false" piece that appeared after your patch is unnecessary here - all the expected information is already provided. Moreover, the new "Assuming the condition is false" piece appears during the second call of .get() when all assumptions have already been made.

It is likely that the same problem that caused edges-new.mm to regress also kicks in here for the operator ||, causing an extra note to appear during the second call to .get(). So even though this test is hard to understand, i believe it'll get fixed automatically if you fix other regressions.

test/Analysis/diagnostics/macros.cpp
33 ↗(On Diff #184150)

Whoa! This one's great.

test/Analysis/uninit-vals.m
167 ↗(On Diff #184150)

Yup, this one's correct as well! For a bad reason though - if the rest of the Static Analyzer worked correctly, there would have not been an assumption here. But for now it should totally be there.

222 ↗(On Diff #184150)

Yup, these are also correct.

Charusso updated this revision to Diff 184992.Feb 3 2019, 9:55 PM
Charusso marked 8 inline comments as done.
  • Support for short-circuit operator.
  • Side effect: ninja check-clang-analyzer-z3 has + 1 failure: use-after-move.cpp with exit code 70.
Failing Tests (2):
    Clang :: Analysis/plist-macros.cpp
    Clang :: Analysis/use-after-move.cpp

  Expected Passes    : 564
  Expected Failures  : 2
  Unsupported Tests  : 3
  Unexpected Failures: 2
Charusso added a comment.EditedFeb 3 2019, 9:56 PM

Excuse me @NoQ, I had to publish it and I was sure it is going well so I have not checked these.

I wanted to put the new patch to SValBuilder::evalBinOp(), so ExprEngine::VisitBinaryOperator() could call it, but the latter does not operates with short-circuit operators. That is why there is no binding found in Environment::lookupExpr() called by Environment::getSVal(), so we just returned UnknownVal().

Charusso updated this revision to Diff 185503.Feb 6 2019, 1:08 AM
  • Revert previous patch.
  • IgnoreParens().

Working as expected. ninja check-clang-analyzer-z3 output:

Failing Tests (1):
    Clang :: Analysis/plist-macros.cpp

  Expected Passes    : 565
  Expected Failures  : 2
  Unsupported Tests  : 3
  Unexpected Failures: 1

All the problematic code were ParenExprs and I saw the same problem yesterday in ConditionBRVisitor::VisitConditionVariable where we write out 'true\false' on integers, because of the same behaviour.

NoQ accepted this revision.Mar 6 2019, 5:37 PM

Looks great now! It's surprising how little we were missing.

lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1843–1855 ↗(On Diff #185503)

I think you can write it just once:

switch (Term->getStmtClass()) {
  ...
}
Cond = Cond->IgnoreParens();
This revision is now accepted and ready to land.Mar 6 2019, 5:37 PM
Charusso updated this revision to Diff 189920.Mar 8 2019, 1:51 PM
Charusso marked an inline comment as done.
Charusso retitled this revision from [analyzer] Add support to unknown condition evaluations in the ConditionBRVisitor to [analyzer] ConditionBRVisitor: unknown condition evaluation support.
  • Rebased and deduplicated the code.
Charusso retitled this revision from [analyzer] ConditionBRVisitor: unknown condition evaluation support to [analyzer] ConditionBRVisitor: Unknown condition evaluation support.Mar 16 2019, 2:16 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 16 2019, 2:23 AM