This patch fixes some issues for the IteratorPastEnd checkers. There are basically two main issues this patch targets: one is the handling of assignments between iterators, the other one is correct handling of the +, +=, - and -= operators of random-access iterators. The handling of these operators also checks the sign of the argument, e.g. a negative number for + or += is handled as decrementation.
Diff Detail
Event Timeline
Thanks for the update, most welcome!
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
229 | We usually write these as const auto *InstCall = cast<CXXInstanceCall>(&Call); Mostly because our custom cast<> has an assertion inside. | |
476 | I'd suggest renaming the parameters to "LHS" and "RHS" or something like that, so that not to confuse those with "lvalue" and "rvalue". | |
492 | Is there a test case for this hack? I'd also consider inspecting the AST (probably before passing the values to handleRandomIncrOrDecr()) and making the decision based on that. Because even though this pattern ("if a value is a loc and we expect a nonloc, do an extra dereference") is present in many places in the analyzer, in most of these places it doesn't work correctly (what if we try to discriminate between int* and int*&?). | |
515 | I think incrementing end() by 0 is not a bug (?) |
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 | I just want to get the sign of the integer value (if it is available). It turned out that I cannot do comparison between loc and nonloc. (Strange, because I can do anything else). After I created this hack, the Analyzer did not crash anymore on the llvm/clang code. I do not fully understand what I should fix here and how? In this particular place we expect some integer, thus no int* or int*&. | |
515 | I think it is not a bug, but how to solve it properly? If I chose just greaterThanZero, then we have the same problem for decrementing end() by 0. Is it worth to create three states here? |
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 | Loc value, essentially, *is* a pointer or reference value. If you're getting a Loc, then your expectations of an integer are not met in the actual code. In this case you *want* to know why they are not met, otherwise you may avoid the crash, but do incorrect things and run into false positives. So i'd rather have this investigated carefully. You say that you are crashing otherwise - and then it should be trivial for you to attach a debugger and dump() the expression for which you expect to take the integer value, and see why it suddenly has a pointer type in a particular case. From that you'd easily see what to do. Also, crashes are often easy to auto-reduce using tools like creduce. Unlike false positives, which may turn into true positives during reduction. If you still don't see the reason why your workaround is necessary and what exactly it does, could you attach a preprocessed file and an analyzer runline for the crash, so that we could have a look together? | |
515 | Yep, i believe that indeed, we need three states here. There are three possible cases. |
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 | Just to be clear: I know why it crashes without the hack: I simply cannot compare loc and nonloc. Since concrete 0 is nonloc I need another nonloc. I suppose this happens if an integer reference is passed to the operator +, +=, - or -=. So I thought that dereferencing it by getting the raw SVal is the correct thing to do. | |
515 | OK, I will do it. |
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 | Yep, in this case the correct thing to do would be to check AST types rather than SVal types. Eg., if (Arg->getType()->isReferenceType()) value = State->getRawSVal(*loc); (you might need to do it in the caller function, which still has access to the expressions) It is better this way because expectations are explicitly stated, and the assertion would still catch the situation when expectations are not met. Also, please still add a test case to cover this branch :) |
checkBind replaces checking of DeclStmt, CXXConstructExpr and assignment operators. Incremention by 0 is not a bug anymore regardless the position of the iterator.
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 | I tried it and failed in std::vector::back(). It seems that the problem is not the reference, but loc::ConcreteInt. I added a test case, but in our mocked vector the integer 1 in *(end()-1) is nonloc::ConcreteInt, but in the real one it is loc::ConcreteInt. I do not see why is there a difference, neither do I know how could something be a location and a concrete integer at once. What is loc::ConcreteInt and what to do with it? |
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 |
It is a concrete memory address. The null pointer, for example, or maybe a fixed magic pointer in some embedded driver code. Could you post an AST dump for the real (end()-1)on which you are failing? It might be that we end up looking at the other operator-() as in (end() - begin()), while iterators are implemented as pointers; no idea how that could be, but i'm suspecting something like that. |
lib/StaticAnalyzer/Checkers/IteratorPastEndChecker.cpp | ||
---|---|---|
492 | Also, dereferencing a loc::ConcreteInt loc (through getSVal/getRawSVal) would always yield UndefinedVal value. |
Hello,
Because i couldn't reproduce the loc-nonloc problem on my standard library, could you share a preprocessed file with the issue? I'm really curious at what's going on here, and it's the only issue remaining, so maybe we could squash it together.
We usually write these as
Mostly because our custom cast<> has an assertion inside.