Container operations such as push_back(), pop_front() etc. increment and decrement the abstract begin and end symbols of containers. This patch introduces note tags to ContainerModeling to track these changes. This helps the user to better identify the source of errors related to containers and iterators.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | You'll need to check whether the container is actually of interest to the bug report. We don't want notes to be added about changes to irrelevant containers. You can use a combination of "Report BR was emitted by one of the iterator checkers" and "The memory region of the container is marked as interesting" (while also actually marking it as interesting in the checker). Ideally we should instead make a new generic storage inside the BugReport object, in order to pass down the interesting information from the call site of emitReport ("Hi, i'm an iterator checker who emitted this report and i'm interested in changes made to the size of this container"). |
Also, yay, thanks for using the new API!
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
33–34 | While we're at it, let's not pass SVals by reference? They're small value-types by design. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | Are you sure in this? I already wondered how it works so I added a test that checks one container and changes another one and there were no note tags displayed for the one we did not check but change. See the last test. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | That's because you didn't do V2.cbegin(); V2.cend(); in the beginning. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
59 | While we're at it [part 2], can we make this = default? :) | |
69–70 | Prefer using. | |
715 | A similar conversation sparked up recently in between @boga95, @steakhal and me regarding reporting taintedness. Bug reports are fine up to the point where (in reverse) the first propagation happens, but finding out which value tainted the one that caused the report isn't handled at the moment. One idea was to mark the initial (again, in reverse) value as interesting, create a NoteTag at the point of propagation, where we should know which value was the cause of the spread, mark that interesting as well, etc. If NoteTags only emit a message when the concerning value is interesting, this should theoretically solve that problem. I guess you could say that we're propagating interestingness in reverse. I'm not immediately sure if this idea was ever mentioned or implemented here. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | Yes, that's the intended solution to such problems. trackExpressionValue works similarly, just with assignments instead of taint propagations. And in both cases note tags are a much more straightforward solution to the problem. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | Yes, you are right. My problem now is that how to mark interesting when debugging? I I filter for interesting containers only, I lose my ability to debug. Should I create a debug function just for marking a container as interesting. Or is there such function already? |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | I'm not immediately sure how interetingness ties into debugging, what specific scenario are you thinking about? |
Only track the right container. Furthermore, minor updates according to the comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | In the test of the modeling checker we use debug checkers. They should be able to mark the container interesting to be able to test the not tags. I managed to solve problem, even in a somewhat unorthodox way. |
First real checker IteratorRange now marks container as interesting to benefit from container note tags. The other two iterator checkers will be updated after a second phase of container note tags which also cares for reassignment (upon move) and invalidation of iterators.
Do we have a test where 2 containers are present but only one of them should be marked as interesting?
void deref_end_after_pop_back(std::vector<int> &V, std::vector<int> &V2) { const auto i = --V.end(); const auto i2 = --V2.end(); V.pop_back(); // expected-note{{Container 'V' shrinked from the right by 1 position}} V2.pop_back(); // no-note *i; // expected-warning{{Past-the-end iterator dereferenced}} // expected-note@-1{{Past-the-end iterator dereferenced}} }
clang/lib/StaticAnalyzer/Checkers/DebugContainerModeling.cpp | ||
---|---|---|
95–103 ↗ | (On Diff #243796) | Aha, makes sense, when calling clang_analyzer_container_end(V), we want to make the analyzer emit more information about V so its obviously interesting. |
clang/test/Analysis/container-modeling.cpp | ||
35 | I hate to be that guy, but this is quite ugly :). How about the handsome //===--------------------------===// // Container assignment tests. //===--------------------------===// But I don't insist, especially within the scope of this patch. |
Yes, of course we have, that was the starting point of the discussion. However, I try to make the tests orthogonal, so this test is in container-modeling.cpp where we do not dereference it, but print its begin or end.
LGTM, I like everything here, you worded the notes very nicely and the test cases seems to cover everything I could find! Please wait for @NoQ's approval, since he's the ranking member of among the NoteTag users.
Yea, right, silly me.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
484 | Past tense is "shrank", not "shrinked". |
In case of multiple container-related bugs only mark the container interesting on the correct bug path. Also a typo fixed.
Uh-oh, I'm not sure why this would ever be an issue? Interestingness is a property of a given BugReport, not the BugReporter class. How can this interfere with one another?
Okay I see what the issue is. DebugContainerModeling normally doesn't emit a report, only adds notes on interesting containers. Though it still makes me wonder whether this is the right approach.
First, I think changes to the DebugContainerModeling seems to spawn a different discussion, and separating it to a different patch might make for a satisfying splitting point. The rest of the patch seems to be ready to land in my opinion. Also, this functionality seems to be duplicated in DebugIteratorModeling in D74541, is this intended?
Second, that issue may be more appropriately solved by introducing a new debug interestingness kind (D65723), and just make the parameters of clang_analyzer_express interesting in the debug sense. If we did that, DebugContainerModeling could ask whether the symbol is debug-interesting instead of the SourceRange hackery. WDYT?
Apologies -- I'd definitely prefer to address the debug related changes in a separate pack, similarly to D74541.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
715 | The core issue with NoteTag it does not know about interestingness and nor about MemRegion. I believe everything based on MemRegions already and when you emit the report you know exactly which MemRegion raised an error. So I think first we need to solve that the NoteTags only report on given MemRegions and those regions of course mega-interesting: we do not need to keep around the interestingness then. |
The scope of the patch is small and is well tested. I suggest to move the discussion about debug messages and NoteTags to the next revision. LGTM!
I wonder whether I should change "to/from the left" to "towards/from the front" and "to/from the right" to "towards/from the right"?
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 | Hmm, i think you should instead look at ContReg, i.e. whether it's a non-anonymous VarRegion or a FieldRegion or something like that (in other patches as well). It would work more often and it'll transparently handle references. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 | Unfortunately it is a SymRegion so it does not work :-( (Even using getMostDerivedRegion() does not help.) |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 | You mean the first checking form SymbolicRegion, then get its symbol, check for SymbolRegionValue, then get its TypedValueRegion, check for DeclRegion and use its Decl? This sound waaay more complicated and less readable. I am not sure which are the side cases: is it always SymbolicRegion? Is the Symbol of SymbolicRegion always a SymbolRegionValue? Is ithe TypedValueRegion (the return value of its getRegion()) always a DeclRegion? |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 |
Emm, that's rarely the case. Only if it's a reference passed into a top-level function as a parameter. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 | (or to another unknown location) (please learn what SymbolicRegion is, it is very important for your work) I guess you should do both then, because when the analyzer is able to resolve the reference it's better in this case to point out what is the actual container that's being modified. |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 | Is it OK now? |
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp | ||
---|---|---|
713 | Looks good, can we also have tests for this case? I.e., anything where the container isn't passed by reference to an unknown location. |
While we're at it, let's not pass SVals by reference? They're small value-types by design.