This checker checks copy and move assignment operators whether they are protected against self-assignment. Since C++ core guidelines discourages explicit checking for `&rhs==this` in general we take a different approach: in top-frame analysis we branch the exploded graph for two cases, where &rhs==this and &rhs!=this and let existing checkers (e.g. unix.Malloc) do the rest of the work. It is important that we check all copy and move assignment operator in top frame even if we checked them already since self-assignments may happen undetected even in the same translation unit (e.g. using random indices for an array what may or may not be the same).
Details
- Reviewers
dcoughlin zaks.anna - Commits
- rGf57f90dfd1c3: [analyzer] Add checker modeling potential C++ self-assignment
rGeea0737a3467: [analyzer] Add checker modeling potential C++ self-assignment
rC276365: [analyzer] Add checker modeling potential C++ self-assignment
rC275820: [analyzer] Add checker modeling potential C++ self-assignment
rL276365: [analyzer] Add checker modeling potential C++ self-assignment
rL275820: [analyzer] Add checker modeling potential C++ self-assignment
Diff Detail
Event Timeline
Initial comments added to the checker and tests are converted from <CR><LF> (DOS) to <LF> (Unix) format.
Thanks for the patch! This looks like it will catch some nasty bugs.
My comments (inline) are mostly nits. But two are more substantial: (1) you should add a path note where the self-assignment assumption is made explaining the assumption to the user and (2) the potential cost of the eager case split could be high -- can it be mitigated by changing the inlining mode?
lib/StaticAnalyzer/Checkers/SelfAssignmentChecker.cpp | ||
---|---|---|
13 | An interesting part of this checker is that it doesn't actually perform any checks itself. Instead, it helps other checkers finds bugs by causing the analyzer to explore copy and move assignment operators twice: once for when 'this' aliases with the parameter and once for when it may not. I think it is worth mentioning this explicitly in the comment because it unusual. Most checkers don't work this way. | |
26 | What do you think about renaming this class to something like "CXXSelfAssignmentChecker"? I think name is a bit confusing because in Objective-C there is a notion of assigning to self (which is the ObjC equivalent of this). | |
lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp | ||
435 | It would be good to add a comment explaining why we re-analyze these methods at the top level. | |
436 | The LLVM style is to put spaces after if and around ||. http://llvm.org/docs/CodingStandards.html#spaces-before-parentheses | |
450 | One thing I am concerned about is that this will introduce an eager case split, which can be quite expensive. We could potentially mitigate this cost by changing getInliningModeForFunction() to return ExprEngine::Inline_Minimal for the copy and move assignment operators when analyzing them at the top level after they have already been inlined elsewhere. Inline_Minimal would prevent most inlining in these methods and thus reduce cost, but would also reduce coverage (leading to false negatives). Do you have a sense of how often self-assignment issues arise in inlined calls vs. in the operators themselves? (Also the spacing around && doesn't match LLVM style here.) | |
test/Analysis/self-assign-unused.cpp | ||
21 | I think it is important to add a path note saying something like "Assuming '*this' is equal to 'rhs'". To do this, you can add a new BugReporterVisitor subclass to BugReporterVisitors.h that walks a error path backwards to add the note. There is an example of how to do this in NonLocalizedStringBRVisitor in LocalizationChecker.cpp. Unlike the localization checker, however, the path note for the self assignment checker will be added to errors generated by *other* checkers, so you should register the new bug report visitor for all reports in GRBugReporter::generatePathDiagnostic() (similar to NilReceiverBRVisitor). | |
22 | I think it would be good to add; clang_analyzer_eval(*this == rhs); // expected-warning{{TRUE}} expected-warning{{UNKNOWN}} here to explicitly test for the case split. You'll need to add a prototype void clang_analyzer_eval(int); to use this. | |
26 | It would be good to also test the move assignment operator. | |
test/Analysis/self-assign-used.cpp | ||
1 | Can you combine these into one test file? This way someone updating the tests will know there is only one place to look. |
lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp | ||
---|---|---|
450 | I made a quick measurement on our build server using clang as target code. Real/user/system times were 75/575/18 for minimal and 76/587/18 for regular. This does not seem too much. However, since the semantics of a copy assignment operator is often composed of the semantics of a destructor and the semantics of a copy constructor implementations often put these two functionalities into separate functions and call them from the destructor, copy constructor and copy assignment operator. Such implementations requires regular inlining to be checked. |
I added tests for move assignment operators, but I could not find out any other simple test case than memory leak. However memory leaks are currently only detected by Unix.malloc for malloc. So I tried to replace strdup with malloc, strlen and strcpy, but even in this case leak bug is only reported for string used but not for string unused. That would require a more complex leak checker which we currently do not have.
Other than adding expected notes for the path notes to the test, this looks good to me. Thanks Ádám!
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp | ||
---|---|---|
1698 ↗ | (On Diff #63207) | This is great. |
lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp | ||
450 | Ok, makes sense to me. | |
test/Analysis/self-assign.cpp | ||
1 ↗ | (On Diff #63207) | I think it would be good to add -analyzer-output=text to this test line and check the expected output of that path notes. |
Re-opening. Reverted in r275880. It was causing a failure on the bots: http://bb.pgr.jp/builders/cmake-clang-x86_64-linux/builds/51780
Do I use a non-portable way to concatenate strings? "Assuming rhs == *this" becomes "0*this" for some strange reason. I tested it again with the latest master branch and all tests are passing like earlier.
Here is the failing test output in case the bot log goes away:
- TEST 'Clang :: Analysis/self-assign.cpp' FAILED ****
Script:
/home/bb/cmake-clang-x86_64-linux/build/./bin/clang -cc1 -internal-isystem /home/bb/cmake-clang-x86_64-linux/build/bin/../lib/clang/3.9.0/include -nostdsysteminc -std=c++11 -analyze -analyzer-checker=core,cplusplus,unix.Malloc,debug.ExprInspection /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp -verify -analyzer-output=text
Exit Code: 1
Command Output (stderr):
error: 'note' diagnostics expected but not seen:
File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 31: Assuming rhs == *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 31: Assuming rhs == *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 31: Assuming rhs != *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 38: Assuming rhs == *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 38: Assuming rhs != *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 65: Assuming rhs == *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 65: Assuming rhs == *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 65: Assuming rhs != *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 72: Assuming rhs == *this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 72: Assuming rhs != *this
error: 'note' diagnostics seen but not expected:
File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 31: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 31: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 31: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 38: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 38: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 65: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 65: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 65: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 72: 0*this File /home/bb/cmake-clang-x86_64-linux/llvm-project/clang/test/Analysis/self-assign.cpp Line 72: 0*this
20 errors generated.
Thx, I checked the output, but I do not understand why a simple string concatenation fails in your test environment. It works on our build server (Linux) with the latest master trunk.
cfe/trunk/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp | ||
---|---|---|
1738 ↗ | (On Diff #64346) | getName will return a StringRef here. Contatenating const char * and StringRef will give you a Twine. So Msg will be a twine which refers to temporary objects. This will result in a use after free. You shoud convert the result of the concatenation (the Twine) to a std::string, to copy the data and avoid use after free. |
An interesting part of this checker is that it doesn't actually perform any checks itself. Instead, it helps other checkers finds bugs by causing the analyzer to explore copy and move assignment operators twice: once for when 'this' aliases with the parameter and once for when it may not.
I think it is worth mentioning this explicitly in the comment because it unusual. Most checkers don't work this way.