This is an archive of the discontinued LLVM Phabricator instance.

[clang][analyzer] Improve bug report in alpha.security.ReturnPtrRange
ClosedPublic

Authored by balazske on Jul 29 2021, 2:47 AM.

Diff Detail

Event Timeline

balazske created this revision.Jul 29 2021, 2:47 AM
balazske requested review of this revision.Jul 29 2021, 2:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2021, 2:47 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Aside from that Returned pointer value points outside the original object with size of 10 'int' objects reads somewhat unnatural I have no objections.
A couple of nits here and there but overall I'm pleased to see more verbose bug reports.

clang/lib/StaticAnalyzer/Checkers/ReturnPointerRangeChecker.cpp
82–83
85–86
clang/test/Analysis/return-ptr-range.cpp
0–1

Is RUN1 intentional? If so, what does it do?

1

You can specify multiple match prefixes.
If you were passing -verify=expected,notes, you would not have to duplicate each warning.

1

This is not used in every test case. I would recommend moving this to the function parameter where it's actually being used.

4–6

Only a single array should be enough.
You can specify the match count for the diagnostic verifier.

// notes-note 3 {{Memory object...}} Will succeed only if exactly 3 times matched.

17–18

This is probably more of a taste.
I would prefer fewer indentations.
The same applies everywhere.

17–20

I think it's fine to leave the common suffix (potential ... out of the matched string. They give only a little value.
The same applies everywhere, except keeping a single one maybe.

One of the test files needs fixing: https://reviews.llvm.org/harbormaster/unit/view/931615/

Love the patch, though I agree that the note message needs some improvement. How about we change from this:

warning: Returned pointer value with index 11 points outside the original object with size of 10 'Data' objects (potential buffer overflow)

To this:

warning: Returned pointer points outside the original object (potential buffer overflow)
note: The original object 'Data' is an int array of size 10, the returned pointer points at the 11th element

or this:

warning: Returned pointer points to the 11th element of 'Data', but that is an int array of size 10 (potential buffer overflow)
clang/test/Analysis/return-ptr-range.cpp
0–1

We could just delete it. I guess that was the intent, to make this RUN line non-functional.

1

Don't forget core!

1

If this is a cpp file, just use namespaces, and redeclare everything inside them, like this: https://github.com/llvm/llvm-project/blob/main/clang/test/Analysis/track-control-dependency-conditions.cpp

17–18

I disagree, and prefer it as it is written now.

balazske added inline comments.Jul 30 2021, 12:50 AM
clang/test/Analysis/return-ptr-range.cpp
0–1

I wanted to make two runs, one for warnings only and one for notes only. But could not find out how to disable the warning messages and show only notes. Because the same warnings appear anyway it should be enough to use only one run with text output and remove the custom prefix.

17–18

The current style will be used, the comments are better grouped.

balazske updated this revision to Diff 362981.Jul 30 2021, 12:53 AM

Changed way of generating notes.
Re-structured the tests.
Other small changes.

balazske marked 9 inline comments as done.Jul 30 2021, 12:59 AM

First and foremost, I think this is a great change. I think the diagnostic is on the point.
BTW it seems like you missed two notes in misc-ps-region-store.m causing a test fail:

error: 'note' diagnostics seen but not expected:
  File /home/steakhal/git/llvm-project/clang/test/Analysis/misc-ps-region-store.m Line 464: Original object declared here
  File /home/steakhal/git/llvm-project/clang/test/Analysis/misc-ps-region-store.m Line 467: Original object 'test_cwe466_return_outofbounds_pointer_a' is an array of 10 'int' objects, returned pointer points at index 11
2 errors generated.

The following comments are not about your actual change, I'm rather pointing out further possibilities for improving the checker.

The checker seems to cover cases when we see the array, fine.
What about pointers/references to arrays? IMO we should trust those declarations about the size of the pointee.

using size_t = decltype(sizeof 1);
using int10 = int[10];

template <class T> void clang_analyzer_dump(T);
size_t clang_analyzer_getExtent(void *);
int conjure_index();


int *test_ptr_to_array(int10 *arr) {
  int x = conjure_index();
  if (x != 20)
    return *arr; // no-warning

  clang_analyzer_dump(clang_analyzer_getExtent(*arr));
  // expected-warning-re@-1 {{extent_${{[0-9]+}}{SymRegion{reg_${{[0-9]+}}<int10 * arr>}}}}
  // FIXME: Above should be '40 S64b' on Linux x86_64.

  int *result = *arr + x;
  clang_analyzer_dump(result);
  // expected-warning-re@-1 {{&Element{SymRegion{reg_${{[0-9]+}}<int10 * arr>},20 S64b,int}}}
  return result; // We should expect a warning here.
}

Please add this test to your patch.

It seems like the checker would catch it if MemRegionManager::getStaticSize() would consider the type of the SymbolicRegion for ConstantArray pointee types. I'll have a patch about that.

clang/test/Analysis/return-ptr-range.cpp
11

I would rather use a simple block {...} for opening a scope, but I don't know why you don't declare ptr in the original scope in the first place.
People usually use do {} while(0) constructs if they want to use break somewhere ~~ like a goto OR they implement a macro. You are doing none of these.

First and foremost, I think this is a great change. I think the diagnostic is on the point.
BTW it seems like you missed two notes in misc-ps-region-store.m causing a test fail:

error: 'note' diagnostics seen but not expected:
  File /home/steakhal/git/llvm-project/clang/test/Analysis/misc-ps-region-store.m Line 464: Original object declared here
  File /home/steakhal/git/llvm-project/clang/test/Analysis/misc-ps-region-store.m Line 467: Original object 'test_cwe466_return_outofbounds_pointer_a' is an array of 10 'int' objects, returned pointer points at index 11
2 errors generated.

The following comments are not about your actual change, I'm rather pointing out further possibilities for improving the checker.

The checker seems to cover cases when we see the array, fine.
What about pointers/references to arrays? IMO we should trust those declarations about the size of the pointee.

using size_t = decltype(sizeof 1);
using int10 = int[10];

template <class T> void clang_analyzer_dump(T);
size_t clang_analyzer_getExtent(void *);
int conjure_index();


int *test_ptr_to_array(int10 *arr) {
  int x = conjure_index();
  if (x != 20)
    return *arr; // no-warning

  clang_analyzer_dump(clang_analyzer_getExtent(*arr));
  // expected-warning-re@-1 {{extent_${{[0-9]+}}{SymRegion{reg_${{[0-9]+}}<int10 * arr>}}}}
  // FIXME: Above should be '40 S64b' on Linux x86_64.

  int *result = *arr + x;
  clang_analyzer_dump(result);
  // expected-warning-re@-1 {{&Element{SymRegion{reg_${{[0-9]+}}<int10 * arr>},20 S64b,int}}}
  return result; // We should expect a warning here.
}

Please add this test to your patch.

It seems like the checker would catch it if MemRegionManager::getStaticSize() would consider the type of the SymbolicRegion for ConstantArray pointee types. I'll have a patch about that.

It seems like we cannot implement this in a way fitting into the current memory model. To be able to do that we should be able to differentiate these two cases:

void test(int10 *arr) {
  clang_analyzer_dump((int*)(arr + 2)); // allow: `arr` might be a pointer to an array of `int10[100]`, in which case the resulting pointer is safe to use
  clang_analyzer_dump(*arr + 20);       // disallow: `*arr` :: `int[10]`, creating a pointer to the 20th element
  // Both dumps are the same:  &Element{SymRegion{reg_$0<int10 * arr>},20 S64b,int}
}

So, I'm actually puzzled about this. I think you can leave out the extra test case I proposed.

If the original memory object is not known the static size is not known too. Every pointer with unknown source can point into a bigger data structure.

clang/test/Analysis/return-ptr-range.cpp
11

I do not know why these loops are here but did not change the original code. Should we change it to simple block?

If the original memory object is not known the static size is not known too. Every pointer with unknown source can point into a bigger data structure.

You are right, but IMO pointers to arrays are so rare that we could probably trust them. At least, that was my idea.

clang/test/Analysis/return-ptr-range.cpp
11

Yes, please. The note for the loop is only noise in its current form.

balazske updated this revision to Diff 364495.Aug 5 2021, 8:36 AM

Fixes in tests.

steakhal accepted this revision.Aug 5 2021, 9:18 AM

Aside from the inline nit, I think it's good to go.
Let some time for the others to catch up, they might have objections.

clang/test/Analysis/return-ptr-range.cpp
17–20

I don't think we need this extra scope. Same for the others.

This revision is now accepted and ready to land.Aug 5 2021, 9:18 AM
balazske added inline comments.Aug 11 2021, 12:10 AM
clang/test/Analysis/return-ptr-range.cpp
17–20

The original test was used to make the x "dead" at return after the loop (at least without the fix), see D12726. In the case of loop the x is garbage-collected at end of the loop, if a block is used only at the end of the function (?) but not at end of the block. So I want to put back the loop to preserve the original code.

balazske updated this revision to Diff 365693.Aug 11 2021, 12:44 AM

Re-add empty while loops in tests, to have the original code.
These loops were intentionally created and probably still needed.

steakhal accepted this revision.Aug 11 2021, 1:59 AM
steakhal added inline comments.
clang/test/Analysis/return-ptr-range.cpp
17–20

Thank you for letting me know.

This revision was landed with ongoing or failed builds.Aug 11 2021, 3:52 AM
This revision was automatically updated to reflect the committed changes.