This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent
ClosedPublic

Authored by danix800 on Aug 23 2023, 11:10 PM.

Details

Summary

Size-type inconsistency (signedness) causes confusion and even bugs.
For example when signed compared to unsigned the result might not
be expected.

Summary of this commit:

Related APIs changes:

  1. getDynamicExtent() returns signed version of extent;
  2. Add getDynamicElementCountWithOffset() for offset version of element count;
  3. getElementExtent() could be 0, add defensive checking for getDynamicElementCount, if element is of zero-length, try ConstantArrayType::getSize() as element count;

Related checker changes:

  1. ArrayBoundCheckerV2: add testcase for signed <-> unsigned comparison from type-inconsistency results by getDynamicExtent()
  2. ExprInspection: use more general API to report more results

Fixes https://github.com/llvm/llvm-project/issues/64920

Diff Detail

Event Timeline

danix800 created this revision.Aug 23 2023, 11:10 PM
Herald added a project: Restricted Project. · View Herald Transcript
danix800 requested review of this revision.Aug 23 2023, 11:10 PM

Thanks for creating this commit, it's a useful improvement!

I added some inline comments on minor implementation details; moreover, note that "Releted checkers:" (instead of "related") is a typo in the commit message.

I also have a less concrete question about the root cause of these bugs. Does this commit fix the "root" of the issue by eliminating some misuse of correctly implemented (but perhaps surprising) SVal / APSInt arithmetic; or is there an underlying bug in the SVal / APSInt arithmetic which is just avoided (and not eliminated) by this commit? In the latter case, what obstacles prevent us from fixing the root cause?

clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
202–203 ↗(On Diff #552998)

I wonder whether it would be better to move this conversion into the definition of getDynamicExtent to ensure that it has a consistent return type. Are there any situations where it's useful that getDynamicExtent can return something that's not an ArrayIndexTy?

clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp
180–183 ↗(On Diff #552998)

I think checking the nullness of getAs() is more elegant than using a separate isUndef() check.

On a longer term / as a separate improvement, I'd also think about allowing UndefinedVal as the argument of the assert()-like functions, because the evalBinOp -> assert combination is very common in checkers and IIRC in most checkers the branch of UndefinedVal will produce the same result as UnknownVal.

186–192 ↗(On Diff #552998)

I'd use getAs<NonLoc>() and a conditional to avoid crashes in the (theoretical) case that evalBinOp returns UnknownVal; and I suspect that getAs<MemRegion>() is superfluous because MemRegion is a base class of ElementRegion.

clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
132

Are you sure that this cannot cause crashes? (E.g. did you check that there is no corner case when getElementExtent returns 0?)

I can accept this cast, especially if you have a clear proof that it's valid, but I'd prefer a more defensive solution that turns UndefinedVal into UnknownVal either here or preferably in the assert() function family that consumes the results from functions like this.

clang/test/Analysis/array-bound-v2-constraint-check.c
1

Perhaps only enable unix.Malloc to ensure that this test is not affected by changes to other checkers in the unix group. (It's enough for the testcase that you added.)

Maybe its too early for the patch, but if a nonlocal change, like changing some Core functionality, we usually measure the real world implications and compare it against some baseline to get an idea how this affects the whole system. It also helps uncovering corner cases and crashes. Have you done some measurements? If not, @donat.nagy might could help you out gathering some statistics.

Thanks for creating this commit, it's a useful improvement!

I added some inline comments on minor implementation details; moreover, note that "Releted checkers:" (instead of "related") is a typo in the commit message.

I also have a less concrete question about the root cause of these bugs. Does this commit fix the "root" of the issue by eliminating some misuse of correctly implemented (but perhaps surprising) SVal / APSInt arithmetic; or is there an underlying bug in the SVal / APSInt arithmetic which is just avoided (and not eliminated) by this commit? In the latter case, what obstacles prevent us from fixing the root cause?

For the observed signed compared to unsigned unexpectation from ArrayBoundV2,
the constraint manager does give the CORRECT result.

It's a misuse of the constraint API, mainly caused by unexpected unsigned extent
set by memory modeling. Actually ArrayBoundV2::getSimplifiedOffsets() has clear
comment that this signed <-> unsigned comparison is problematic.

// TODO: once the constraint manager is smart enough to handle non simplified
// symbolic expressions remove this function. Note that this can not be used in
// the constraint manager as is, since this does not handle overflows. It is
// safe to assume, however, that memory offsets will not overflow.
// NOTE: callers of this function need to be aware of the effects of overflows
// and signed<->unsigned conversions!
static std::pair<NonLoc, nonloc::ConcreteInt>
getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt extent,
                     SValBuilder &svalBuilder) {
danix800 added inline comments.Aug 24 2023, 8:03 AM
clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
202–203 ↗(On Diff #552998)

Use consistent return type is necessary. The problem is which version should we choose.
I didn't touch this before I can get a clear selection.

At first I tended to think that all extent/count should be unsigned, but quickly dropped this idea
since getDynamicExtentWithOffset gives reasonable note that extent/offset (even count) can
be negative.

danix800 added inline comments.Aug 24 2023, 8:05 AM
clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp
180–183 ↗(On Diff #552998)

Thanks for the advice!

danix800 added inline comments.Aug 24 2023, 8:17 AM
clang/test/Analysis/array-bound-v2-constraint-check.c
1

Correct! Thanks for reminding!

As a general comment on requiring all extents to be of unsigned APSInts.
Checkers, like the MallocChecker, binds the result of arbitrary expression's values (the argument of malloc, for example) as extents of some regions.
Usually, that argument is of an unsigned static type, thus even if some other expression is there, and implicit cast will turn it into unsigned.
However, in CSA, we don't respect such casts, thus if it was signed, we will bind a signed value as an extent.

Here is an example demonstrating this:

int n = conjure();
clang_analyzer_dump(n);  // conj
clang_analyzer_value(n); // i32

char *p = (char*)malloc(n);
clang_analyzer_dump(clang_analyzer_getExtent(p));  // conj
clang_analyzer_value(clang_analyzer_getExtent(p)); // i32

Consequently, I'm not sure it's possible to turn extents "unsigned", without fixing casts.
I was sort of expecting this, but I didn't know real world issues materialized by this in the domain of extents.
Usually, they just happen to work, so I didn't really bother with it.
This was actually what I intended to deliver with my comment at D158499#4609046.

By this comment I'm not suggesting that this is a lost cause, or we shouldn't move in this direction. I think something like this would make sense, but I'd need to delve into the change to make sure.

danix800 added inline comments.Aug 24 2023, 9:11 AM
clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
132

Wow~ Yeah! This really is a bug with zero-sized types, the original getDynamicElementCount suffers from this too!
Verified on 13.0.1 & main.

void clang_analyzer_dumpElementCount(const void *);

int a[1][0];

void var_simple_ref() {
  clang_analyzer_dumpElementCount(a); // expected-warning {{1 S64b}}
}

Dividing by zero gives undefined, casts failed:

clang: /home/danis/Sources/llvm-project-main/llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From &) [To = clang::ento::DefinedOrUnknownSVal, From = clang::ento::SVal]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

getDynamicElementCountWithOffset is the extended offset version of getDynamicElementCount, same issue.

Nice catch!

As a general comment on requiring all extents to be of unsigned APSInts.
Checkers, like the MallocChecker, binds the result of arbitrary expression's values (the argument of malloc, for example) as extents of some regions.
Usually, that argument is of an unsigned static type, thus even if some other expression is there, and implicit cast will turn it into unsigned.
However, in CSA, we don't respect such casts, thus if it was signed, we will bind a signed value as an extent.

Here is an example demonstrating this:

int n = conjure();
clang_analyzer_dump(n);  // conj
clang_analyzer_value(n); // i32

char *p = (char*)malloc(n);
clang_analyzer_dump(clang_analyzer_getExtent(p));  // conj
clang_analyzer_value(clang_analyzer_getExtent(p)); // i32

Consequently, I'm not sure it's possible to turn extents "unsigned", without fixing casts.
I was sort of expecting this, but I didn't know real world issues materialized by this in the domain of extents.
Usually, they just happen to work, so I didn't really bother with it.
This was actually what I intended to deliver with my comment at D158499#4609046.

By this comment I'm not suggesting that this is a lost cause, or we shouldn't move in this direction. I think something like this would make sense, but I'd need to delve into the change to make sure.

Wow! Really another unexpectation to me!

I think either signedness is ok, the stored data would not be lost. How clients use these values are the actual problem.

As a general comment on requiring all extents to be of unsigned APSInts.
Checkers, like the MallocChecker, binds the result of arbitrary expression's values (the argument of malloc, for example) as extents of some regions.
Usually, that argument is of an unsigned static type, thus even if some other expression is there, and implicit cast will turn it into unsigned.
However, in CSA, we don't respect such casts, thus if it was signed, we will bind a signed value as an extent.

Here is an example demonstrating this:

int n = conjure();
clang_analyzer_dump(n);  // conj
clang_analyzer_value(n); // i32

char *p = (char*)malloc(n);
clang_analyzer_dump(clang_analyzer_getExtent(p));  // conj
clang_analyzer_value(clang_analyzer_getExtent(p)); // i32

Consequently, I'm not sure it's possible to turn extents "unsigned", without fixing casts.

Wow! Really another unexpectation to me!

I think either signedness is ok, the stored data would not be lost. How clients use these values are the actual problem.

I'm not sure how could the clients, aka. checkers use extents "correctly". They cannot even explicitly wrap the extent by an "unsigned" type e.g. by modeling a cast to "size_t" because that is a noop, thus adding casts is an identity operation. See here.
Consequently, checkers using extents in comparisons or doing arithmetic in the symbolic domain e.g. when doing bounds checks under the hood, will be affected by the missing cast there.

danix800 edited the summary of this revision. (Show Details)Aug 24 2023, 6:53 PM
danix800 updated this revision to Diff 553335.Aug 24 2023, 8:10 PM
danix800 edited the summary of this revision. (Show Details)
  1. getDynamicExtent() can return both signed/unsigned results. They are converted to signed version as ArrayIndexType to keep consistency. All other APIs return this version

(even for static type size). Testcases are not affected. ArrayBoundV2 doesn't need explicit unsigned-to-signed casts to avoid signed<->unsigned comparison;

  1. getElementExtent() could be 0, add more defensive checking;
  2. MPIChecker added more defensive checking, e.g: ElementCount doesn't have to be ConcreteInt so avoid direct ElementCount.castAs<nonloc::ConcreteInt>().
danix800 edited the summary of this revision. (Show Details)Aug 24 2023, 8:18 PM
danix800 edited the summary of this revision. (Show Details)Aug 24 2023, 8:20 PM
danix800 updated this revision to Diff 553351.EditedAug 24 2023, 9:58 PM
danix800 edited the summary of this revision. (Show Details)

MPIChecker is not strictly related to this revision. Will be moved into another revison.

EDIT: See https://reviews.llvm.org/D158813

donat.nagy accepted this revision.Aug 28 2023, 4:30 AM

Thanks for the updates! I didn't spot any important issue, so I'm accepting this commit, but let's wait for the opinion of @steakhal before merging this.

[...] Does this commit fix the "root" of the issue by eliminating some misuse of correctly implemented (but perhaps surprising) SVal / APSInt arithmetic; or is there an underlying bug in the SVal / APSInt arithmetic which is just avoided (and not eliminated) by this commit? In the latter case, what obstacles prevent us from fixing the root cause?

For the observed signed compared to unsigned unexpectation from ArrayBoundV2,
the constraint manager does give the CORRECT result.

It's a misuse of the constraint API, mainly caused by unexpected unsigned extent
set by memory modeling. Actually ArrayBoundV2::getSimplifiedOffsets() has clear
comment that this signed <-> unsigned comparison is problematic.

// TODO: once the constraint manager is smart enough to handle non simplified
// symbolic expressions remove this function. Note that this can not be used in
// the constraint manager as is, since this does not handle overflows. It is
// safe to assume, however, that memory offsets will not overflow.
// NOTE: callers of this function need to be aware of the effects of overflows
// and signed<->unsigned conversions!
static std::pair<NonLoc, nonloc::ConcreteInt>
getSimplifiedOffsets(NonLoc offset, nonloc::ConcreteInt extent,
                     SValBuilder &svalBuilder) {

In fact, the "NOTE:" comment was added by my patch D135375 (which eliminated lots of false positives caused by a situation when the argument offset was unsigned), but I was still confused by this new bug. (Where the other argument extent was unsigned and this led to incorrect conclusions like "len cannot be larger than 3u, so len cannot be -1, because -1 is larger than 3u" 🙃 .) Thanks for troubleshooting this issue!

clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
34

I think it's a good convention if getDynamicExtent() will always return concrete values as ArrayIndexTy. (If I didn't miss some very obscure case, then this will be true when this commit is merged.)

49

Note that StripCasts() is overzealous and strips array indexing when the index is zero. For example if you have a rectangular matrix int m[6][8]; then this function would (correctly) say that the element count of m[1] is 8, but it would claim that the element count of m[0] is 6 (because it strips the 'cast' and calculates the element count of m).

This is caused by the hacky "solution" that casts are represented by ElementRegion{original region, 0, new type} which cannot be distinguished from a real element access where the index happens to be 0. (This is just a public service announcement, this already affects e.g. getDynamicExtent and I don't think that you can find a local solution. For more information, see also https://github.com/llvm/llvm-project/issues/42709 which plans to refactor the memory model.)

72–75

Perhaps use the method value_or of std::optional?

This revision is now accepted and ready to land.Aug 28 2023, 4:30 AM

Oh, so we would canonicalize towards a signed extent type. I see. I think I'm okay with that.
However, I think this needs a little bit of polishing and testing when the region does not point to the beginning of an array or object, but rather inside an object, or like this:

int nums[] = {1,2,3};
char *p = (char*)&nums[1];

clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
int *q = (int*)p;
clang_analyzer_dumpExtent(q);
clang_analyzer_dumpElementCount(q);
clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
47–48

I'd prefer hoisting this check to the callsite.
Usually, we assume MR is non-null. This is already the case for a sibling API, getDynamicExtent().
Let's keep these "overloads" behave the same.

56–57

Funnily enough, one must use the ASTContext::getAsConstantArrayType() to achieve this.
I'm not sure why, but I was bitten by this.
It says something like this at the ASTContext:

/// Type Query functions.  If the type is an instance of the specified class,
/// return the Type pointer for the underlying maximally pretty type.  This
/// is a member of ASTContext because this may need to do some amount of
/// canonicalization, e.g. to move type qualifiers into the element type.
const ArrayType *getAsArrayType(QualType T) const;
const ConstantArrayType *getAsConstantArrayType(QualType T) const {
  return dyn_cast_or_null<ConstantArrayType>(getAsArrayType(T));
}
58

For bool literal arguments we usually use "named parameter passing", aka. /*paramname=*/false constructs.

66–75

If ElementSize would be some symbol, constrained to null, we would pass the assertion, but still end up modeling a division by zero, resulting in Undefined, which then turned into Unknown - I see.

Looking at this, the assertion seems misleading as it won't protect the division.
That said, the early return on undef could be also dropped.

donat.nagy added a comment.EditedAug 28 2023, 7:21 AM

Oh, so we would canonicalize towards a signed extent type. I see. I think I'm okay with that.
However, I think this needs a little bit of polishing and testing when the region does not point to the beginning of an array or object, but rather inside an object, or like this:

int nums[] = {1,2,3};
char *p = (char*)&nums[1];

clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
int *q = (int*)p;
clang_analyzer_dumpExtent(q);
clang_analyzer_dumpElementCount(q);

Unfortunately the analyzer engine does not distinguish pointer arithmetic and element access, and we cannot handle these situations while that limitation exists. For example, if we have an array int nums[3];, then the element nums[1] and the incremented pointer int *ptr = nums + 1; are represented by the same ElementRegion object; so we cannot simultaneously say that (1) nums[1] is an int-sized memory region, and (2) it's valid to access the elements of the array as ptr[-1], ptr[0] and ptr[1].

The least bad heuristic is probably RegionRawOffsetV2::computeOffset() from ArrayBoundCheckerV2, which strips away all the ElementRegion layers, acting as if they are all coming from pointer arithmetic. This behavior is incorrect if we want to query the extent/elementcount of a "real" ElementRegion (e.g. a slice of a multidimensional array or (char*)&nums[1] in your example), but mostly leads to false negatives. On the other hand, if we process a pointer increment as if it were an element access, we can easily run into false positive reports -- this is why I had to abandon D150446.

I'd suggest that we should ignore pointer arithmetic in this commit (or create a testcase that documents the current insufficient behavior) and remember this as yet another reason to do through rewrite of the memory model. This is related to the plan https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but perhaps it would be enough to do a less drastic change in that direction.

Oh, so we would canonicalize towards a signed extent type. I see. I think I'm okay with that.
However, I think this needs a little bit of polishing and testing when the region does not point to the beginning of an array or object, but rather inside an object, or like this:

int nums[] = {1,2,3};
char *p = (char*)&nums[1];

clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
int *q = (int*)p;
clang_analyzer_dumpExtent(q);
clang_analyzer_dumpElementCount(q);

Unfortunately the analyzer engine does not distinguish pointer arithmetic and element access, and we cannot handle these situations while that limitation exists. For example, if we have an array int nums[3];, then the element nums[1] and the incremented pointer int *ptr = nums + 1; are represented by the same ElementRegion object; so we cannot simultaneously say that (1) nums[1] is an int-sized memory region, and (2) it's valid to access the elements of the array as ptr[-1], ptr[0] and ptr[1].

The least bad heuristic is probably RegionRawOffsetV2::computeOffset() from ArrayBoundCheckerV2, which strips away all the ElementRegion layers, acting as if they are all coming from pointer arithmetic. This behavior is incorrect if we want to query the extent/elementcount of a "real" ElementRegion (e.g. a slice of a multidimensional array or (char*)&nums[1] in your example), but mostly leads to false negatives. On the other hand, if we process a pointer increment as if it were an element access, we can easily run into false positive reports -- this is why I had to abandon D150446.

I'd suggest that we should ignore pointer arithmetic in this commit (or create a testcase that documents the current insufficient behavior) and remember this as yet another reason to do through rewrite of the memory model. This is related to the plan https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but perhaps it would be enough to do a less drastic change in that direction.

I'm aware of this, and I didn't mean to ask to fix this here.
I just want to see how these ExprInspection APIs will be affected in these edge-cases, where previously it returned "unknown", and now will return something else.
It's useful to demonstrate this if we ever come back to this commit.
Pinning down some tests and having some FIXMEs there should bring enough visibility.

Yes, adding tests that demonstrate the current behavior is a good idea.

danix800 added inline comments.Aug 29 2023, 9:05 AM
clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
34

Could you elaborate this a bit more please? Do you mean by nonloc::ConcreteInt rather than DefinedOrUnknownSVal?

danix800 updated this revision to Diff 554383.Aug 29 2023, 9:09 AM
  1. Cleanup unnecessary undef/zero checking;
  2. Use better defensive API for getting ConstantArrayType;
  3. Comment on bool argument for readability;
  4. Add more test for extent with offset cases.

Oh, so we would canonicalize towards a signed extent type. I see. I think I'm okay with that.
However, I think this needs a little bit of polishing and testing when the region does not point to the beginning of an array or object, but rather inside an object, or like this:

int nums[] = {1,2,3};
char *p = (char*)&nums[1];

clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
clang_analyzer_dumpExtent(p);
clang_analyzer_dumpElementCount(p);
++p;
int *q = (int*)p;
clang_analyzer_dumpExtent(q);
clang_analyzer_dumpElementCount(q);

Unfortunately the analyzer engine does not distinguish pointer arithmetic and element access, and we cannot handle these situations while that limitation exists. For example, if we have an array int nums[3];, then the element nums[1] and the incremented pointer int *ptr = nums + 1; are represented by the same ElementRegion object; so we cannot simultaneously say that (1) nums[1] is an int-sized memory region, and (2) it's valid to access the elements of the array as ptr[-1], ptr[0] and ptr[1].

The least bad heuristic is probably RegionRawOffsetV2::computeOffset() from ArrayBoundCheckerV2, which strips away all the ElementRegion layers, acting as if they are all coming from pointer arithmetic. This behavior is incorrect if we want to query the extent/elementcount of a "real" ElementRegion (e.g. a slice of a multidimensional array or (char*)&nums[1] in your example), but mostly leads to false negatives. On the other hand, if we process a pointer increment as if it were an element access, we can easily run into false positive reports -- this is why I had to abandon D150446.

I'd suggest that we should ignore pointer arithmetic in this commit (or create a testcase that documents the current insufficient behavior) and remember this as yet another reason to do through rewrite of the memory model. This is related to the plan https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but perhaps it would be enough to do a less drastic change in that direction.

I'm aware of this, and I didn't mean to ask to fix this here.
I just want to see how these ExprInspection APIs will be affected in these edge-cases, where previously it returned "unknown", and now will return something else.
It's useful to demonstrate this if we ever come back to this commit.
Pinning down some tests and having some FIXMEs there should bring enough visibility.

Thank you all for the above valuable information. It's helpful for me to get better understanding of CSA!

donat.nagy added inline comments.Aug 30 2023, 1:10 PM
clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
34

Sorry for the confusing comment! I just intended to summarize what you did without requesting any additional change.

My remark highlighted that (after your change) whenever the result of this function is a nonloc::ConcreteInt, the getType() method of that ConcreteInt object will return ArrayIndexTy (which is a QualType constant that represents the type long long).

There are also some situations when the result of getDynamicExtent() is not a nonloc::ConcreteInt; so it's correct that the return type should be DefinedOrUnknownSVal (which is the common supertype of all possible results [1]) and it's correct to use this return type in the getAs<> on the highlighted line.

[1] See inheritance diagram at https://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html

Thanks for clarifying!

If no further comments I'll commit this revison in a day or two!

steakhal added inline comments.Aug 31 2023, 9:08 AM
clang/test/Analysis/memory-model.cpp
167

If the array has zero extent, how can is have any elements?

donat.nagy added inline comments.Aug 31 2023, 9:46 AM
clang/test/Analysis/memory-model.cpp
167

It has five elements, each element is a 0-element array, total size is 5*0 = 0.

steakhal added inline comments.Sep 1 2023, 2:51 AM
clang/test/Analysis/memory-model.cpp
167

Okay, makes sense. Thanks.