Page MenuHomePhabricator

[analyzer][ReturnPtrRangeChecker] Fix a false positive on end() iterator
ClosedPublic

Authored by bruntib on Jul 13 2020, 6:19 AM.

Details

Summary

ReturnPtrRange checker emits a report if a function returns a pointer which points out of the buffer. However, end() iterator of containers is always such a pointer, so this always results a false positive report. This false positive case is now eliminated.

This patch resolves these tickets:
https://bugs.llvm.org/show_bug.cgi?id=20929
https://bugs.llvm.org/show_bug.cgi?id=25226
https://bugs.llvm.org/show_bug.cgi?id=27701

Diff Detail

Event Timeline

bruntib created this revision.Jul 13 2020, 6:19 AM
Szelethus retitled this revision from ReturnPtrRange checker false positive on end() iterator to [analyzer] ReturnPtrRange checker false positive on end() iterator.Jul 13 2020, 6:24 AM
Szelethus added reviewers: NoQ, vsavchenko.
Szelethus added a project: Restricted Project.
Szelethus retitled this revision from [analyzer] ReturnPtrRange checker false positive on end() iterator to [analyzer] Move ReturnPtrRange checker out of alpha by fixing a false positive on end() iterator.Jul 13 2020, 6:25 AM
vsavchenko added a comment.EditedJul 13 2020, 6:41 AM

Hi @bruntib!

Nice work! I like the functionality change of it, but I'm a bit worried about the category change. When we move a checker out of alpha, I would prefer to have more information and numbers on how this checker performs on a good variety of projects.
If you ask me, the numbers I'm the most interested in are the number of warnings on different projects and how this number corresponds to the overall number of warnings that the analyser emits for all non-alpha checkers together and individually. It will help us to see how noisy the new checker is (it is not a lint check, it should be pretty rare), and see if it belongs in non-alpha.

Of course, it would be better to have some stats on TP rate, but it can be pretty tedious to gather that info.

Szelethus added a comment.EditedJul 13 2020, 6:56 AM

Please know that the criteria for moving checkers (or anything, really) out of alpha includes testing on open source code bases. Could you create a public CodeChecker link with some evaluations? How does this, if at all, interact with other alpha array bound checkers?

I almost feel like this should play a similar role to CXXSelfAssignmentChecker -- returning an out-of-range pointer isn't a bug in itself (certainly not a fatal one), just a big time code smell. We don't check whether the actual pointee of the out-of-bounds pointer is undefined, so on certain platforms under special circumstances, this might even be okay. Suppose that you have an dynamically allocated array, you realloc it, and the OS managed to extend the already allocated memory segment, instead of allocating a brand new one, and then use the pointer. How about we just leave a NoteTag here and let other checkers find actual bugs?

Fun fact, this is among the very oldest checkers out there, first commited in 2009 (rG167bce9cf148838d5aa2355e509c68145ed3e618), so its worth checking whether everything in the code reflects our current stance on checker development.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
801 ↗(On Diff #277406)

If we go through with this, you'll need to fix some code in the docs as well, and change HasAlphaDocumentation to HasDocumentation.

clang/test/Analysis/return-ptr-range.cpp
1

core is missing.

This comment was removed by Szelethus.
NoQ requested changes to this revision.Jul 13 2020, 4:33 PM

This is a good patch!

The checker is still not ready to move out of alpha though: it's missing a bug visitor.

And yes, i too want FP rates :) I'm especially worried about functions with loops where we accidentally assume that more iterations were made than it's possible at runtime.

I almost feel like this should play a similar role to CXXSelfAssignmentChecker -- returning an out-of-range pointer isn't a bug in itself (certainly not a fatal one), just a big time code smell.

I vaguely recall that any pointer is expected to point either into a valid object or immediately past its end. I don't remember when exactly is this rule applied but i think it's a lot more UB in the standard than it's UB in practice.

clang/lib/StaticAnalyzer/Checkers/ReturnPointerRangeChecker.cpp
78

Categories should be broader than that. When a category is crafted to only incorporate a single checker, something's wrong. We could try "Buffer overflow" or something like that.

82–84

One thing that's definitely necessary but missing in this checker is a bug visitor. We need to explain to the user why do we think what region is returned and where does the element index is coming from and why do we think that the index is in fact out of range. Otherwise the report would be impossible to understand outside of trivial examples. We can't move out of alpha until it's implemented.

This revision now requires changes to proceed.Jul 13 2020, 4:33 PM
Szelethus requested changes to this revision.Jul 14 2020, 6:24 AM

Lets just postpone moving out-of-alpha to a standalone revision. That, without any other changes demands discussion.

In D83678#2148761, @NoQ wrote:

The checker is still not ready to move out of alpha though: it's missing a bug visitor.

What we need is "pointer set outside array's range here", and mark it interesting.

I almost feel like this should play a similar role to CXXSelfAssignmentChecker -- returning an out-of-range pointer isn't a bug in itself (certainly not a fatal one), just a big time code smell.

I vaguely recall that any pointer is expected to point either into a valid object or immediately past its end. I don't remember when exactly is this rule applied but i think it's a lot more UB in the standard than it's UB in practice.

I'd be interested to see that rule, though I can't imagine many circumstances under which one would go against it -- the only kind of programmer I would expect to come up with a valid use case would be a C programmer. :)

I almost feel like this should play a similar role to CXXSelfAssignmentChecker -- returning an out-of-range pointer isn't a bug in itself (certainly not a fatal one), just a big time code smell.

I vaguely recall that any pointer is expected to point either into a valid object or immediately past its end. I don't remember when exactly is this rule applied but i think it's a lot more UB in the standard than it's UB in practice.

I'd be interested to see that rule, though I can't imagine many circumstances under which one would go against it -- the only kind of programmer I would expect to come up with a valid use case would be a C programmer. :)

Mind that we wouldn't even need to handle the special case for past-the-end pointer if we repurposed this checker to create a NoteTag. If the pointer caused a bug, the note would be present, otherwise not.

Hi All,

Thanks for everyone for the helpful and fast review comments. Actually the review process is faster than I learn checker development :)

I was also thinking of merging the functionality of alpha.security.ReturnPtrRange into an existing out-of-bound checker. I found that the implementation of alpha.security.ArrayBound is 90% copy-paste of this ReturnPtrChecker, so this seems to be a good candidate. The main difference is that ArrayPtrRange checks all memory usages (Checker<check::Location>) and ReturnPtrRange checks return statements specifically (Checker< check::PreStmt<ReturnStmt> >). The former one is supposed to be more general, nevertheless, ReturnPtrRange emits reports on some inputs where ArrayBound remains silent (Of course I'll check the other direction later too). I inspected such a report and the reason was that the returning pointer was not used by the caller. I think it is not painful losing these (false positive?) reports of RetunPtrRange, since in practice it is safe to return the address of an undefined location as long as it's not used.

So would you agree if I'd continue the investigation in this direction, where the solution is the removal of ReturnPtrRange checker and maybe some improvement of ArrayBound checker? Moving ArrayBound out from alpha state could be a next step later.

In D83678#2148761, @NoQ wrote:

I vaguely recall that any pointer is expected to point either into a valid object or immediately past its end. I don't remember when exactly is this rule applied but i think it's a lot more UB in the standard than it's UB in practice.

AFAIK you are absolutely right - although I'm not a language lawyer yet.
The related parts of the standard draft are:

A possible reason for this restriction could be far pointers.
However, I don't expect such platforms widely used today.

Here is an interesting code snippet, which compiles on GCC but fails with clang (as it probably should though).
https://godbolt.org/z/cK3dsf

constexpr int point_out_of_range() {
  int nums[] = {1,2,3}; // 3 elements
  int sum = 0;
  // in the last iteration, 'p' will point to 'nums[4]'
  for (int *p = nums; p < &nums[3]; p += 2) {
    sum += *p;
  }
  return sum;
}

int force_constexpr_eval() {
  constexpr int val = point_out_of_range();
  return val;
}
NoQ added a comment.Jul 26 2020, 5:56 PM

I was also thinking of merging the functionality of alpha.security.ReturnPtrRange into an existing out-of-bound checker. I found that the implementation of alpha.security.ArrayBound is 90% copy-paste of this ReturnPtrChecker, so this seems to be a good candidate. The main difference is that ArrayPtrRange checks all memory usages (Checker<check::Location>) and ReturnPtrRange checks return statements specifically (Checker< check::PreStmt<ReturnStmt> >). The former one is supposed to be more general, nevertheless, ReturnPtrRange emits reports on some inputs where ArrayBound remains silent (Of course I'll check the other direction later too). I inspected such a report and the reason was that the returning pointer was not used by the caller. I think it is not painful losing these (false positive?) reports of RetunPtrRange, since in practice it is safe to return the address of an undefined location as long as it's not used.

So would you agree if I'd continue the investigation in this direction, where the solution is the removal of ReturnPtrRange checker and maybe some improvement of ArrayBound checker? Moving ArrayBound out from alpha state could be a next step later.

Loop (mis)modeling-related false positives in our array bound checkers are extremely terrible and fairly unsolvable (other than by better loop modeling). For that reason i don't envision array bound checkers moving out of alpha in the near future. As i mentioned above, the same probably applies to this checker. However i suspect that array bound + taint checking is much more reasonable. If the index is tainted, that means that it's also symbolic and therefore it couldn't have been obtained as a loop counter, so loop-related false positives can no longer hurt us. This checker should also probably move in this direction.

bruntib updated this revision to Diff 287881.Aug 26 2020, 2:24 AM

I reverted some changes so this checker remains an alpha checker. This patch now contains only the elimination of the false positive case. I wonder if this change is acceptable.

Szelethus retitled this revision from [analyzer] Move ReturnPtrRange checker out of alpha by fixing a false positive on end() iterator to [analyzer][ReturnPtrRangeChecker] Fix a false positive on end() iterator.Sep 9 2020, 7:25 AM
Szelethus edited the summary of this revision. (Show Details)
Szelethus changed the repository for this revision from rC Clang to rG LLVM Github Monorepo.
Szelethus accepted this revision.Sep 9 2020, 7:26 AM

LGTM on my end!

This revision was not accepted when it landed; it landed in state Needs Review.Nov 2 2020, 7:41 AM
This revision was automatically updated to reflect the committed changes.