This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fix constraint being dropped when analyzing a program without taint tracking enabled
ClosedPublic

Authored by mikhail.ramalho on Jun 27 2018, 8:15 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

+Adam because that's a lot of stuff he's interested in and his tests are affected.

Do we need a complexity threshold like in the original revision?:

// Use a lower bound to avoid recursive blowup, e.g. on PR24184.cpp
if (V.getSymbol()->computeComplexity() > 10)
  ...

I.e., could the same test be rewritten to cause performance problems on the current patch?

ddcc added a comment.EditedJun 27 2018, 12:56 PM

I don't see any #ifdef for the tests. Do they pass both with and without z3? I recall needing to case out the expected test output before.

Because he's not extending the SValBuilder visitors, it's possible the complexity threshold won't be a problem. But regardless, I think it would be good practice to allow runtime configuration of the complexity threshold.

For reference, my stale version is D35450. I haven't rebased yet to remove the APSInt fixes that have been merged.

baloghadamsoftware added a comment.EditedJun 28 2018, 6:29 AM

Do the tests pass now because of rL333670?

Do the tests pass now because of rL333670?

Actually, one test is crashing because of D32642. The following cast in lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:1168:

return assumeNoOverflow(NewState, cast<SymIntExpr>(CompSym)->getLHS(), 2);

A SymExpr object is reaching that call and the cast fails.

Actually, one test is crashing because of D32642. The following cast in lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:1168:

return assumeNoOverflow(NewState, cast<SymIntExpr>(CompSym)->getLHS(), 2);

A SymExpr object is reaching that call and the cast fails.

Thank you for pointing me this! The iterator checker may have worked incorrectly (but did not crash) without your patch. Now I fixed that, it works correctly and does not crash with your patch. I added my patch as parent revision for yours.

Actually, one test is crashing because of D32642. The following cast in lib/StaticAnalyzer/Checkers/IteratorChecker.cpp:1168:

return assumeNoOverflow(NewState, cast<SymIntExpr>(CompSym)->getLHS(), 2);

A SymExpr object is reaching that call and the cast fails.

Thank you for pointing me this! The iterator checker may have worked incorrectly (but did not crash) without your patch. Now I fixed that, it works correctly and does not crash with your patch. I added my patch as parent revision for yours.

Nice, thanks for fixing it.

Just found a test where this patch removes a true positive: test/Analysis/PR37855.c.

The path that leads to the second bug:

n[1].node->s; // expected-warning {{Dereference of undefined pointer value}}

is removed because the CSA is removing the branch where the loop termination condition node != l is false, which is the one that leads to the bug.

Just found a test where this patch

@mikhail.ramalho I don't really follow your description, could you clarify?

Just found a test where this patch

@mikhail.ramalho I don't really follow your description, could you clarify?

Ok, this is the test:

typedef struct o p;
struct o {
  struct {
  } s;
};

void q(*r, p2) { r < p2; }

void k(l, node) {
  struct {
    p *node;
  } * n, *nodep, path[sizeof(void)];
  path->node = l;
  for (n = path; node != l;) {
    q(node, n->node);
    nodep = n;
  }
  if (nodep)
    n[1].node->s; // warning: Dereference of undefined pointer value
}

Without this patch, the graph shows that the bugs is found when the loop is unrolled once and terminates. When this happens, nodep contains a valid value n (which is path), so a new core.NullDereference node is created to check the expression n[1].node->s;.

With this patch, since the constraints are not being dropped, it knows that neither node nor l change, so the loop doesn't terminate after the first iteration and the CSA doesn't evaluate anything after the loop.

However, there is another path that triggers the bug, when node != l holds, the loop body is not evaluated and, if nodep happens to be not null, n[1].node->s is a real bug. This path is not evaluated by the CSA, I only see a core.uninitialized.branch node and the analysis stops there. Am I correct to assume that, since if (nodep) is undefined behaviour in this path (nodep is uninitialized), the static analyzer doesn't evaluate it?

Am I correct to assume that, since if (nodep) is undefined behaviour in this path (nodep is uninitialized), the static analyzer doesn't evaluate it?

Yeah, so once we reach if (nodep) we are already reading from an uninitialized variable, and that should be reported as a bug.
So you see it in the exploded graph, but not in the analyzer report?

Am I correct to assume that, since if (nodep) is undefined behaviour in this path (nodep is uninitialized), the static analyzer doesn't evaluate it?

Yeah, so once we reach if (nodep) we are already reading from an uninitialized variable, and that should be reported as a bug.
So you see it in the exploded graph, but not in the analyzer report?

The CSA correctly reports:

arena.c:18:7: warning: Branch condition evaluates to a garbage value
  if (nodep)
      ^~~~~

but not:

arena.c:19:5: warning: Dereference of undefined pointer value
    n[1].node->s; // warning: Dereference of undefined pointer value
    ^~~~~~~~~~~~

but not:

That's expected, since the above error message is fatal, and the path is killed.
I'm fine with changing the test.
@NoQ any objections?

test/Analysis/svalbuilder-rearrange-comparisons.c
563 ↗(On Diff #153094)

Those FIXME's could probably be dropped.
@baloghadamsoftware Comments?

test/Analysis/svalbuilder-rearrange-comparisons.c
563 ↗(On Diff #153094)

I left them there because there are several cases in this file that are simplified to either TRUE or FALSE.

@mikhail.ramalho But they are clearly neither true nor false, right?

@mikhail.ramalho But they are clearly neither true nor false, right?

Indeed, most of them are something like conj_$2{int} == conj_$2{int} or conj_$2{int} < conj_$2{int}.

NoQ added a comment.Jun 29 2018, 1:40 PM

Am I correct to assume that, since if (nodep) is undefined behaviour in this path (nodep is uninitialized), the static analyzer doesn't evaluate it?

Yep. Even if it wasn't undefined behavior but only unspecified value, we would still interrupt the analysis because otherwise we'd just get a bunch of duplicate warnings about the same uninitialized value used all over the place.

test/Analysis/svalbuilder-rearrange-comparisons.c
563 ↗(On Diff #153094)

I guess the ultimate purpose of the test is to simplify "(x + 1) == x" to "1 == 0".

test/Analysis/svalbuilder-rearrange-comparisons.c
563 ↗(On Diff #153094)

Ah right, I see. Z3 should be able to do that here.

NoQ added inline comments.Jun 29 2018, 2:08 PM
test/Analysis/svalbuilder-rearrange-comparisons.c
563 ↗(On Diff #153094)

Not here, but ultimately yeah it'd help with the original problem.

Update test case

This revision was not accepted when it landed; it landed in state Needs Review.Jul 3 2018, 5:42 PM
This revision was automatically updated to reflect the committed changes.
george.karpenkov reopened this revision.Jul 3 2018, 5:53 PM

Oh sorry I've referenced wrong revision in the commit. Re-opening.

Revert back to the previous diff.

Update diff with -U99999

mikhail.ramalho retitled this revision from [analyzer] [WIP] Fix constraint being dropped when analyzing a program without taint tracking enabled to [analyzer] Fix constraint being dropped when analyzing a program without taint tracking enabled.

Decreased default complexity threshold to 25 as it finds the same amount of bugs as 10000 in the following projects: tmux, git, openssl, postgresql, redis, sqlite3, tmux and twin. Going < 25, and a number of bugs stop being reported.

Also, removed the FIXMEs from the test.

@mikhail.ramalho Could you also state how different values affect the effectivity of refutation?

@mikhail.ramalho Could you also state how different values affect the effectivity of refutation?

Hi, the same bugs are removed, with threshold 25 and 10000:

  • redis: 1 removed
  • openssl: 2 removed
  • git: 11 removed
  • postgresql: 5 removed
  • sqlite3: 14 removed

With complexity 10000, however, the verification time of openSSL went from 260s to 1500s. The other were roughly the same (~20s slower at most).

Fix expected warning locations

LGTM with a nit. We'll watch the bots after this is merged.

test/Analysis/bitwise-ops.c
11 ↗(On Diff #154861)

I think this comment can be dropped.

This revision is now accepted and ready to land.Jul 11 2018, 11:57 AM

Removed comment.

george.karpenkov accepted this revision.Jul 11 2018, 3:26 PM
This revision was automatically updated to reflect the committed changes.
NoQ added a comment.EditedJul 17 2018, 5:34 PM

This commit causes an assertion failure on void foo(int a, int *b) { (int)b < a; }

Run-line: clang -cc1 -analyze -analyzer-checker=core -analyzer-eagerly-assume test.c

So it kinda frustrates RangeConstraintManager with a SymSymExpr that expresses a comparison between a Loc and a NonLoc symbol.

I'll think a bit more on how do i want to address that.

Ka-Ka added a subscriber: Ka-Ka.Jul 23 2018, 4:47 AM
In D48650#1165966, @NoQ wrote:

This commit causes an assertion failure on void foo(int a, int *b) { (int)b < a; }

Run-line: clang -cc1 -analyze -analyzer-checker=core -analyzer-eagerly-assume test.c

So it kinda frustrates RangeConstraintManager with a SymSymExpr that expresses a comparison between a Loc and a NonLoc symbol.

I'll think a bit more on how do i want to address that.

Earlier today I created an issue in bugzilla for this problem.
https://bugs.llvm.org/show_bug.cgi?id=38273
However, the reproducer presented here is smaller and much nicer. I will update the issue in bugzilla.