Differential D92634
[Analyzer] Diagnose signed integer overflow
Authored by danielmarjamaki on Dec 3 2020, 11:43 PM.

Summary:
Diagnose signed integer overflows; the core.UndefResultChecker will warn. This is a bit unfinished, and I wonder if you like the approach. It only checks addition right now, not underflow.
Event Timeline

Comment:
This would be a serious change, especially if we internally already depend on modulo arithmetic. Furthermore, the readability of the bug reports in their current form is not quite enough.

Comment:
I think it could be better to implement this check with a checker on PreStmt<BinaryOperator> and so on. And IMO, checkers have enough functionality to report these problems. Besides, the return value should be the exact value computed from the two integers, or even Unknown, rather than Undefined, as developers may overflow an integer on purpose.

Comment:
Hi, thanks for the patch! I am terribly sorry that we could not pick this up earlier. You can always ping the community with a simple 'Ping' message. Unfortunately, it often happens that a non-trivial patch stalls in the review queue; a ping might push that a bit.

Comment:
+1
Comment:
I am not sure what you mean. If there is undefined behavior, then the value should be undefined and nothing else, right?

Comment:
Exactly, it is undefined behavior in the C++ standard. However, mainstream compilers like GCC and Clang implement this as the overflowed value, and some programmers use this feature to do tricky things. Therefore I suggest the computed value should be "the exact value computed from the two integers". Or it could be the Unknown SVal rather than the Undefined SVal, as the Undefined SVal is used to represent what is read from an uninitialized variable. But I do not favour the Unknown solution, as it could also trigger other problems in the engine, as steakhal has already mentioned. Or maybe it would no longer be a problem if you implement this in a checker: since this is a non-fatal error, you can just leave the overflowed value as it is and report the problem without terminating the symbolic execution on this path. There is no need to report this problem all the time.
Comment:
Hmm, you mean if some -fwrapv flag is used, right? Yes, I should disable this checking then. For information, I am not very worried about signed integer overflow itself. I am mostly worried about the compiler optimisations: if the compiler determines that there is signed integer overflow in an execution path, it can remove all code in that path. I was inspired by this blog post: https://www.airs.com/blog/archives/120

This function:

    int f(int x) {
        return 0x7ffffff0 < x && x + 32 < 0x7fffffff;
    }

might be optimized to:

    int f(int x) {
        return 0;
    }

Comment:
Yes, you are right. Currently, in the CSA, the symbolic execution engine will return true for this function. The condition 0x7ffffff0 < x will constrain x to [2147483632, 2147483647], then x + 32 will be evaluated to [-2147483632, -2147483617], which makes the check return true. But for the example you mentioned, I prefer implementing this check as a coding-style checker for Clang-Tidy, although the integer overflow checker in the CSA is also necessary. BTW, I cannot optimize function f to return zero directly with GCC-10.2.1 or Clang-10.0.1 under -O3. Should I add any other flags? Or is it version specific?

Comment:
This patch adds a new core checker. I wouldn't be comfortable enabling it by default without a much more careful evaluation (than a single lit test). In particular, I'm worried about false positives due to constraint solving issues and our incorrect cast representation; it sounds like both of these may disproportionately affect this checker. I also still wonder about "intentional" use of signed overflows; it's likely that some developers would prefer to silence the check. At least we should have a rough idea about how loud it is. Not necessarily a new checker, but bug reports from this checker are already lacking a lot of information. In this new check both operands should be tracked with visitors. Also, in other checks there's a bailout path that produces generic diagnostics ("the result...
is undefined" without explaining why); we should probably silence those instead, because it's clear that we can't explain what's going on.
Comment:
Yeah, I can't reproduce that with the latest GCC/Clang either. Try GCC 4.x and GCC 5.x.
Yes, if that was all we wanted to detect. I'd rather also flag more complex examples: whenever the optimizer can see that there is overflow and has an opportunity to destroy the code. Here is a more complex example: https://stackoverflow.com/questions/57650026/optimized-clang-handling-overflow
I agree 1000%. The patch I provided so far was only a proof of concept. Thanks for all the comments; I feel I will remake this.
Yes, I really agree about that. The current warning message is unacceptable and must be clarified. Sorry, I am not making much progress right now; I hope to have some new code to show soon.

Comment:
I disagree on this.
I don't think we should depend on implementation details. I also agree with @NoQ's D92634#2478703 comment.
Comment:
Well, maybe it's better that I stop programming for now, take the code I have, and test it out.

Comment:
I have run the Clang static analyzer on random open source projects. The very first finding that I looked at seems (to me) to be a false positive. :-( My code seems to think that the variable print_count has the value INT_MAX for some reason, and to me that seems impossible. I'll investigate this. Analyzed package:

    libcerror_error.c:426:45: warning: The result of the '+' expression is undefined [core.UndefinedBinaryOperatorResult]
            error_string_size = (size_t) print_count + 1;
Comment:
Typically in such cases, bug visitors should be added/improved until it is clear from the user-facing report why the analyzer thinks so. They'd highlight the important events, prevent path pruning, and potentially suppress reports if the reason is discovered to not be valid.

Comment:
Here is a link to our results on a few more projects. It might be useful for you.
Comment:
Thanks! But in this case I do not think that my code works well. I have reduced the test case:

    #include <stdio.h>

    #define LIBCERROR_MESSAGE_INCREMENT_SIZE 64
    #define LIBCERROR_MESSAGE_MAXIMUM_SIZE 4096

    int get_print_count();

    void foo()
    {
        size_t message_size = 0;
        size_t next_message_size = LIBCERROR_MESSAGE_INCREMENT_SIZE;
        int print_count = 0;

        while (1) {
            if (next_message_size >= LIBCERROR_MESSAGE_MAXIMUM_SIZE) {
                next_message_size = LIBCERROR_MESSAGE_MAXIMUM_SIZE;
            }
            message_size = next_message_size;
            print_count = get_print_count();
            if (print_count <= -1) {
                next_message_size += LIBCERROR_MESSAGE_INCREMENT_SIZE; // <- Assigned value is garbage or undefined
            }
            else if ((size_t) print_count >= message_size) {
                next_message_size = (size_t) (print_count + 1);
                print_count = -1;
            }
            else {
                int error_string_size = (size_t) print_count + 1; // <- The result of the '+' expression is undefined
            }
            if (message_size >= LIBCERROR_MESSAGE_MAXIMUM_SIZE) {
                break;
            }
        }
    }

I guess it's best that I rewrite my logic in a new check. But I have the feeling it could take a couple of days before I have something new.
Inline comment:
Bldr is always equal to *this here; there's no need to obtain it separately.