This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Implement pointer arithmetic on constants
ClosedPublic

Authored by r.stahl on Sep 5 2017, 8:09 AM.

Details

Summary

The "Multiplicand" variable in SimpleSValBuilder::evalBinOpLN was always initialized to zero, causing all pointer arithmetic on constant values to be no-ops.

This patch enables pointer arithmetic as long as the pointer is not a null pointer. The exception preserves useful warnings from the DereferenceChecker.

Diff Detail

Event Timeline

r.stahl created this revision.Sep 5 2017, 8:09 AM
xazax.hun accepted this revision.Sep 6 2017, 4:08 AM

One nit, otherwise LGTM! Thanks for fixing this!

lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
938

The rest of the code does not abbreviate the Type. I would prefer to name this pointeeType.

This revision is now accepted and ready to land.Sep 6 2017, 4:08 AM
NoQ edited edge metadata.Sep 6 2017, 4:55 AM

I've seen this recently, and while i agree that the fix is correct, i'm not entirely sure that the test cases are correct. As weird as this may sound, null dereference is not an attempt to read from or write to memory address 0. Instead, it is about using a null pointer as if it was pointing to an actual object in memory, even if accessing it by a non-zero offset. For example, in

struct S {
  int x, y;
};

void foo() {
  struct S *s = NULL;
  s->y = 1;
}

we're in fact writing into *(0x4), not *(0x0), however it's intuitive that this code has a null pointer dereference, because we use a null pointer s as if it points to an actual object of type struct S. In this sense, i'd actually want the analyzer to warn in test4: it seems that the author of this code was expecting to find something useful by offset 1 to the pointer, so he must have made a mistake. Also i'm not entirely sure if i want the analyzer to warn in test1, test2, test3 (also this code pattern doesn't look widespread/idiomatic).

So the analyzer does this really really weird thing by treating many operations with concrete pointers as no-ops, keeping the pointer null when it was null before, and keeping it non-null when it was non-null before - not just in the place that you've fixed, but also when computing field or element or base-class offsets for null pointers. These are marked as FIXME all over the place, but taking up the fix would require to provide another heuristic to distinguish null dereferences from other fixed-address dereferences (i.e. the experimental FixedAddressChecker).

I guess there should be more comments on this issue near the FIXMEs you fixed, and more tests covering the intended behavior; adding them is definitely a good thing to do. I do not have any immediate ideas on how to fix the issue as a whole.

lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
938

Also resultTy->getPointeeType(). Note the fancy operator->() in QualType.

To be honest I was quite surprised that this change in behavior didn't cause more test failures, because for detecting null dereferences the old behavior is definitely more useful. Since it did not, I was convinced that this change is desired.

We use the analyzer for finding dereferences to fixed addresses - very similar to the FixedAddressChecker. For this purpose it is crucial that the execution engine works as perfect as possible, without "swallowing" any arithmetic.

For the struct example you mentioned you can still get the final address by asking the ASTContext if needed, but with pointer arithmetic the information is lost forever. Information is lost either way here. Either you forget that the arithmetic was based on a null pointer or you lose whatever was added to or subtracted from it.

So unless you can somehow tag the information in the SVal when an operation was based on a null pointer, this is pretty difficult. You also could introduce a heuristic that defines all dereferences around zero as null dereferences, but it would be very arbitrary and platform dependent. Or maybe the DereferenceChecker should explicitly break early on all statements that do arithmetic on pointers constrained to null. Overall I don't know enough about the analyzer to suggest more here.

Thanks for the comments, I will address them soon.

r.stahl updated this revision to Diff 114126.Sep 7 2017, 12:48 AM

addressed the review comments

r.stahl marked 2 inline comments as done.Sep 18 2017, 2:54 AM
dcoughlin requested changes to this revision.Sep 29 2017, 5:16 PM
dcoughlin added a subscriber: zaks.anna.

Rafael: Thanks for the patch! @NoQ, @zaks.anna, and I spoke about this off-line yesterday.

While this patch improves the modeling of pointer arithmetic, we're worried about losing valuable alarms that rely on the existing behavior.

Here is a case where the analyzer would warn before your patch but doesn't with it:

void foo() {
  int *p = 0;
  int q = *(p + 5); // expected-warning {{Dereference of null pointer}}
}

The existing diagnostic machinery relies on the fact that the analyzer treats p + 5 as 0 to report the bad dereference. This comes up more often than you might think because the analyzer is sometimes quite aggressive about promoting a symbol constrained to 0 to be a concrete value of 0. For example:

void bar(int *p) {
  if (p)
    return;

  int q = *(p + 5); // expected-warning {{Dereference of null pointer}}
}

It would be good to add test cases for these diagnostics!

I think you can preserve the existing (although missing from the test suite) good diagnostics and still improve the modeling by skipping the addition/subtraction if the LHS is a concrete int with value 0. Doing so would be a very minor change to this patch.

Modeling the pointer arithmetic when the LHS is 0 while still keeping the diagnostics will likely be a more involved effort, with ramifications in multiple parts of the analyzer. We could discuss that, if you'd like to tackle it! But it would probably be good for you to get a couple more patches under your belt before taking that on.

test/Analysis/inlining/inline-defensive-checks.c
144

The analyzer doesn't warn on these on purpose.

Throughout the analyzer, we have a broad heuristic that says: "if the programmer compares a pointer to NULL, then the analyzer should explicitly consider the case that the pointer is NULL". It will perform a case split in the symbolic execution: one case for when the value is definitely NULL and one case for one it is definitely not NULL. As a heuristic this works reasonably well: if the programmer bothered to add a check for NULL then they likely though the value could be NULL.

However, when context-sensitive analysis via inlining was added, this heuristic broke down for functions that called other functions with what we call "inlined defensive checks" or null.

Here is an example:

void hasInlinedDefensiveCheck(int *p) {
  if (!p)
    return;

  // Do something useful
}

void foo(int *param) {
  hasInlinedDefensiveCheck(param);
  
  *param = 7;
}

In this case the warning about *param = 7 is a false positive from foo's point of view because foo may have a strong invariant that param is not null; it doesn't care that hasInlinedDefensiveCheck() may have other callers that might call it with null. For this reason, we suppress reports about null pointer dereferences when we can detect an inlined defense check.

This revision now requires changes to proceed.Sep 29 2017, 5:16 PM
r.stahl updated this revision to Diff 118182.Oct 9 2017, 12:18 AM
r.stahl edited edge metadata.
r.stahl marked an inline comment as done.
r.stahl edited the summary of this revision. (Show Details)

addressed review comments. updated summary.

dcoughlin accepted this revision.Oct 9 2017, 9:14 PM

This looks good to me! Thanks for adding this. Do you have commit access, or do you need someone to commit it for you?

This revision is now accepted and ready to land.Oct 9 2017, 9:14 PM

Since I do not have commit access, it would be nice if someone committed this for me. Thanks!

This revision was automatically updated to reflect the committed changes.