Patch by Ilya Mirsky!
Fixes: http://llvm.org/PR44343
Differential D71714
[Sema] Fix -Warray-bounds false negative when casting an out-of-bounds array item ilya on Dec 19 2019, 9:17 AM. Authored by
Details
Diff Detail
Event TimelineComment Actions These are not the only AST nodes representing cast expressions (there is also CXXFunctionalCastExpr, ...). What about replacing the IgnoreParenImpCasts() above by IgnoreParenCasts() ? Incidentally doing this uncovers another test (Parser/cxx-ambig-decl-expr.cpp ) where this diagnostic is triggered. Comment Actions In addition to @riccibruno's comment, I found a couple of other suspicious things nearby (unrelated to the fix in this patch). No need to address them in this patch unless you feel motivated :)
Comment Actions To be rigorous, we should perform "pointer" checking for every operation that performs pointer arithmetic. Then we should perform "lvalue" checking (which doesn't allow pointers one past the end) in the following places:
This sort of piecemeal approach of recursively looking through arbitrary expressions seems likely to miss cases. For example, we currently don't perform checks inside compound literals. Comment Actions I agree that this approach is not good. I'm concerned that your direction might still miss things, though (and it seems to involve a lot of AST traversal, which would be nice to avoid). For example, __builtin_bitcast performs a load without an lvalue-to-rvalue conversion, and to be thorough we'd need to special-case it. Perhaps we could instead:
In C at least, that should be correct in all cases. I think it's correct in C++ as well; there are lots more forms of lvalue to walk into in the third step (eg, &(cond ? x[n] : y[n]) shouldn't warn), but it seems feasible to enumerate. (This would lose the warnings on *&x[n], but I don't think that's a disaster.) Comment Actions
I was thinking we'd just call into SemaChecking in appropriate places. I guess there's a little AST traversal to figure whether an expression forms an array address. Your idea seems simpler.
For C++, I think you might also need to handle discarded-value expressions? Maybe it's okay if we warn anyway in that case. :)
And more generally *(x+n), although I guess that isn't implemented now anyway. Comment Actions Thanks, using IgnoreParenCasts is even better.
Comment Actions
Comment Actions Implemented @riccibruno's and @rsmith's comments.
Comment Actions Ping. Could someone pick up review of this patch again, please? Ilya left off with a question to @rsmith , and at that point all activity fell off. Thanks.
Comment Actions Hi, this causes a warning in harfbuzz on code that looks like so: UChar decomposed[4]; ... int len = unorm2_getRawDecomposition(...decomposed, ARRAY_LENGTH (decomposed)...) if (len == 1) { U16_GET_UNSAFE (decomposed, 0, *a); where U16_GET_UNSAFE looks like so: #define U16_GET_UNSAFE(s, i, c) UPRV_BLOCK_MACRO_BEGIN { \ (c)=(s)[i]; \ if(U16_IS_SURROGATE(c)) { \ if(U16_IS_SURROGATE_LEAD(c)) { \ (c)=U16_GET_SUPPLEMENTARY((c), (s)[(i)+1]); \ } else { \ (c)=U16_GET_SUPPLEMENTARY((s)[(i)-1], (c)); \ } \ } \ } UPRV_BLOCK_MACRO_END (The BLOCK_MACRO macros are just a way to spell do...while(false) as far as I understand.) It's true that the else block in the macro evaluates to decomposed[0-1], which is out of bounds -- but it's also in a conditional, and in a macro. That seems like this can cause quite a few false positives. Maybe this warning should be suppressed in conditionals, or macros, or some such? (50k/70k files of chromium built with this patch, this so far is the only instance it flagged, and that's a false positive.) |
Hmm, don't we need to do different things for dot and arrow in this case?