This is an archive of the discontinued LLVM Phabricator instance.

calculate extent size for memory regions allocated by C++ new expression
ClosedPublic

Authored by dkrupp on Sep 7 2016, 11:06 AM.

Details

Summary

ArrayBoundChecker did not detect out of bounds memory access errors in case an array was allocated by the new expression.

  1. MallocChecker.cpp was updated to calculate the extent size in Bytes similarly how it was done for memory regions allocated by malloc. The size constraint is added for arrays and non-arrays allocated by new.
  1. ArrayBoundCheckerV2.cpp was updated to better handle accessing locations preceding a symbolic memory region (such as buf[-1] in test2(..) in out-of-bounds.cpp). So computeExtentBegin(..) was updated to assume that the extent of a symbolic region starts at 0 if we know the size of the extent (as is the case in case of malloc or new).
  1. out-of-bounds.cpp contains the relevant test cases for C++.

Diff Detail

Repository
rL LLVM

Event Timeline

dkrupp updated this revision to Diff 70562.Sep 7 2016, 11:06 AM
dkrupp retitled this revision from to calculate extent size for memory regions allocated by C++ new expression.
dkrupp updated this object.
dkrupp added reviewers: xazax.hun, NoQ, dcoughlin, zaks.anna.
dkrupp added a subscriber: cfe-commits.
zaks.anna added inline comments.Sep 7 2016, 1:41 PM
lib/StaticAnalyzer/Checkers/MallocChecker.cpp
1003 ↗(On Diff #70562)

I am not sure this code belongs to the malloc checker since it only supports the array bounds checker. Is there a reason it's not part of that checker?

xazax.hun added inline comments.Sep 7 2016, 2:06 PM
lib/StaticAnalyzer/Checkers/MallocChecker.cpp
1003 ↗(On Diff #70562)

I think it is part of the malloc checker because it already does something very very similar to malloc, see the MallocMemAux function. So in fact, for the array bounds checker to work properly, the malloc checker should be turned on.

dkrupp added inline comments.Sep 8 2016, 12:58 AM
lib/StaticAnalyzer/Checkers/MallocChecker.cpp
1003 ↗(On Diff #70562)

Extent size is used by ArrayBoundChecker, ArrayBoundCheckerV2 and CStringChecker checkers currently. New expression in case of simple allocations (0 allocation) was already handled in Malloc checker , that's why I implemented it there. But I agree it feels odd that one has to switch on unix.Malloc checker to get the size of new allocated heap regions. Should I move this to ArrayBoundChecker or ArrayBoundCheckerV2?

NoQ edited edge metadata.Sep 8 2016, 2:40 AM

Thanks for the patch! Not sure why, but i always have warm feelings for the MallocChecker and wish it to improve.

Added minor style comments, joined the "where to put the code" debate.

lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
69 ↗(On Diff #70562)

For ease of use, you no longer need to pass svalBuilder here - you can take it from the state->getStateManager().

81 ↗(On Diff #70562)

Whitespace slightly off.

83 ↗(On Diff #70562)

Perhaps you could consider the memory space of the region, it would look a bit less hacky to me.

In my dreams, i wish heap regions were no longer symbolic regions, and this hack would go away then.

Also, i recall there is a bug in isNull(): in the ConstraintManager class (this time i actually mean the abstract base class of RangeConstraintManager) this function boils down to assume(), but in RangeConstraintManager it is overridden to do a direct lookup into the constraint map; which means that in fact this function does not simplify symbolic expressions before answering. This code is probably unaffected because extents are always either concrete or atomic symbols, but i think i'd make a patch for that.

lib/StaticAnalyzer/Checkers/MallocChecker.cpp
995 ↗(On Diff #70562)

This code could use a bit more spaces around "=" and ",".

1003 ↗(On Diff #70562)
  1. Well, in my dreams, this code should be in the silent operator-new modelling checker, which would be separate from the stateless operator-new bug-finding checker. Then all the checkers that rely on extent would automatically load the modelling checker as a dependency.
  1. Yeah, i think many checkers may consider extents, not just array bound; other checkers may appear in the future. This info is very useful, which is why we have the whole symbol class just for that (however, checker communication through constraints on this symbol class wasn't IMHO a good idea, because it's harder for the constraint manager to deal with symbols and constraints rather than with concrete values).

Just wanted to speak out, thoughts below might perhaps be more on-topic

  1. The MallocChecker is not just unix.Malloc, but also cplusplus.NewDelete, etc., which makes a lot more sense to leave it here.
  1. Consider another part of MallocChecker's job - modeling the memory spaces for symbolic regions (heap vs. unknown). For malloc(), this is done in the checker. For operator-new, it is done in the ExprEngine(!), because this is part of the language rather than a library function. Perhaps ExprEngine would be the proper place for that code as well.
1010 ↗(On Diff #70562)

C.getLocationContext() is a bit shorter.

1017 ↗(On Diff #70562)

You can castAs<> here. Generally, there's no point in using getAs<> unless you're going to check later.

1026 ↗(On Diff #70562)

I think you're not using the fact that R is a SymbolicRegion. You could have as well written if (!isa<SymbolicRegion>(Region)) return.

This would also make variable names a bit more confusing.

1048 ↗(On Diff #70562)

Here also you can extract the SValBuilder from the state.

test/Analysis/out-of-bounds.cpp
143 ↗(On Diff #70562)

I think you should specify the target triple in the test run-line. Otherwise some buildbot with weird int size might fail (as the triple defaults to the current machine's triple).

zaks.anna added inline comments.Sep 8 2016, 3:47 PM
lib/StaticAnalyzer/Checkers/MallocChecker.cpp
1003 ↗(On Diff #70562)

Well, in my dreams, this code should be in the silent operator-new modelling checker, which would be
separate from the stateless operator-new bug-finding checker. Then all the checkers that rely on extent
would automatically load the modelling checker as a dependency.

I agree. This sounds like the best approach! (Though it's not a must have to get the patch in.)

Consider another part of MallocChecker's job - modeling the memory spaces for symbolic regions (heap vs.
unknown). For malloc(), this is done in the checker. For operator-new, it is done in the ExprEngine(!), because
this is part of the language rather than a library function. Perhaps ExprEngine would be the proper place for
that code as well.

Interesting point. Can you clarify the last sentence?

1020 ↗(On Diff #70562)

Please, be more explicit that this is not a size of allocation (as in not 1 byte)? Maybe call this ElementCount or something like that.

1021 ↗(On Diff #70562)

A bit of code repetition from above. Please, add comments explaining why we need subregion in one case and super region in the other.

Are there test cases for this branch?

1049 ↗(On Diff #70562)

This should probably be inline/have different name since it talks about ArrayIndexType and is not reused elsewhere.

test/Analysis/out-of-bounds.cpp
1 ↗(On Diff #70562)

Let's use a more specific file name.

dkrupp added inline comments.Sep 9 2016, 1:44 AM
lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
83 ↗(On Diff #70562)

Good point!
region->getMemorySpace() does a very similar recursion as the while loop in this function. So I guess the while loop can be refactored like this:

static SVal computeExtentBegin(SValBuilder &svalBuilder, 
const MemRegion *region) {
  const MemSpaceRegion *SR = region->getMemorySpace();
  if (SR->getKind() == MemRegion::UnknownSpaceRegionKind)
    return UnknownVal();
  else
    return svalBuilder.makeZeroArrayIndex();
 }

All test cases pass. Particularly it filters out this false positive from out-of-bounds.c :

// Don't warn when indexing below the start of a symbolic region's whose 
// base extent we don't know.
int *get_symbolic();
void test_index_below_symboloc() {
  int *buf = get_symbolic();
  buf[-1] = 0; // no-warning;
}
dkrupp updated this revision to Diff 70821.Sep 9 2016, 6:16 AM
dkrupp edited edge metadata.

I tried to address all your comments.

  1. computeExtentBegin() is greatly simplified.
  2. addExtendSize() is simplified (scaleValue() function inlined)
  3. new testcases added a) allocation and indexing of non-array element (int *ip =new int;) b) allocation of array with run-time size
dkrupp added inline comments.Sep 9 2016, 6:23 AM
lib/StaticAnalyzer/Checkers/MallocChecker.cpp
1011 ↗(On Diff #70821)

MemRegion has now method called castAs<>, only getAs<>, so I stayed with it.

1020 ↗(On Diff #70821)

I changed the type of Region to SubRegion, hope this is clearer this way.

1043 ↗(On Diff #70821)

now inlined

dkrupp marked 11 inline comments as done.Sep 12 2016, 9:36 AM

issues fixed

zaks.anna edited edge metadata.Sep 16 2016, 4:26 PM

I do not have any more comments; however, let's wait for @NoQ to review this as well.
Thanks!

NoQ accepted this revision.Sep 17 2016, 2:23 AM
NoQ edited edge metadata.

Looks good!

lib/StaticAnalyzer/Checkers/MallocChecker.cpp
291 ↗(On Diff #70821)

Whitespace a bit strange.

997 ↗(On Diff #70821)

Perhaps ExprEngine would be the proper place for that code as well.

Interesting point. Can you clarify the last sentence?

I'm thinking that the standard operator new() should be properly modeled by the analyzer core; we are already doing this with respect to memory space of the region it returns, why not do that for extent as well, somewhere at the same place.

We could probably make a refactoring pass over MallocChecker to move things around and return it to a readable state.

1011 ↗(On Diff #70821)

Ouch, i meant, cast<SubRegion>(State->getSVal(NE, LCtx).getAsRegion()) etc.

This revision is now accepted and ready to land.Sep 17 2016, 2:23 AM

Thanks. Gabor, could you please merge this? I don't have commit right.

This revision was automatically updated to reflect the committed changes.