After this patch, we no longer need the check::Location callback, as we
caught bugs earlier when forming bad references - aka. before loading or
binding anything to it.
(CWE-122 Heap Based Buffer Overflow: CWE-805-class-loop)
Depends on D159106
Differential D159107
[analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations steakhal on Aug 29 2023, 8:21 AM. Authored by
Details
After this patch, we no longer need the check::Location callback, as we (CWE-122 Heap Based Buffer Overflow: CWE-805-class-loop) Depends on D159106
Diff Detail
Event TimelineComment Actions Good direction of development, this will be useful for providing better bug reports (in addition to ensuring correct behavior some situations). Note that it's also possible to dereference pointers with the operator ->, which is represented by MemberExprs in the AST; we should probably handle that as if it was a UO_Deref. There is also a small corner case that for an array some_type arr[N] it's well-defined to form the past-the-end pointer as &arr[N] (instead of arr + N) -- while any other use of arr[N] is undefined behavior. If this occurs in practice, then we'll probably need some additional logic to handle it. (Note that the check::Location implementation dodged this question, because it didn't report anything when the program formed &arr[N], but later created a bug report when this pointer value was dereferenced.)
Comment Actions +1, +1
Ah, this disappoints me. You are right. This delayed mechanism definitely needs a more careful approach. Given these concerns and the quality of the diagnostics, I don't think it's something easy to fix. I'll abandon this for now, to focus on the low-hanging patches to upstream. Thanks for the thoughtful review! @donat.nagy
Comment Actions I don't think that the &arr[N] issue is too serious: we can just increment the array extent when the parent expression of the array subscript operator is the unary operator &. If the past-the-end pointer ends up dereferenced later, the current code is sufficient to report it as a bug (as the checker monitors all dereferences). I'd be happy to see (a slightly extended variant of) this commit merged, because I could provide much better warning messages if I can access the concrete subscript/dereference operations. Of course if you don't have time to work on this I can put this up for review myself (probably after your other commits are handled). Comment Actions My instinct suggests that there is more behind the curtain.
The thing is, that I only have this week, aka. 1.5 days before I leave for vacation. :D Comment Actions As a rough solution we can simply say that this checker ignores &arr[idx] (because it's just a different way of writing arr + idx) and only checks expressions where "real" dereference happens. This way the checker would won't emit false positives on past-the-end pointers and still report every out of bound memory access (including off-by-one errors). This can be refined by adding a check that which validates that in an expression like &arr[idx] the index satisfies 0 <= idx && idx <= array_size. This is conceptually independent (but can use the same implementation as the real memory access check) and would add some useful reports and constraints without breaking anything. Pointer arithmetic is very restricted in the standard, e.g. http://eel.is/c++draft/basic.compound says that
As this explicitly rules out before-first-element pointers and they are not too common (I don't recall any code that used them), I don't think that they deserve a special case (just report them if we see them). Of course, it's completely reasonable to focus on a limited set of changes before the vacation. In the meantime, I'll probably work on one or more commits that would (1) switch to using the concrete check::PostStmt callbacks instead of check::Location (and Bind) like this commit (2) improve the diagnostics, add more details to the messages. When will you return from the vacation? Should I wait for you with the review of these changes (if I can write and test them before your return)? Comment Actions It makes sense, but it would need to special case on the parent or child AST node, which isn't really clean. But I think we think the same on this subject.
Yes. They are really restricted, and AFAIK only past-the-end pointers are allowed, unlike before-the-first element. Such code might be written for reverse iterations.
I don't think you should wait for me. I'll be off for a week though, but I might not have the bandwidth to do reviews even after that. |
Which testcase would break without the check::Bind callback? (Not action needed, I'm just curious.)