This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Use note tags to track container begin and and changes
ClosedPublic

Authored by baloghadamsoftware on Jan 30 2020, 9:13 AM.

Details

Summary

Container operations such as push_back(), pop_front() etc. increment and decrement the abstract begin and end symbols of containers. This patch introduces note tags to ContainerModeling to track these changes. This helps the user to better identify the source of errors related to containers and iterators.

Diff Detail

Event Timeline

NoQ added inline comments.Feb 3 2020, 8:24 AM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

You'll need to check whether the container is actually of interest to the bug report. We don't want notes to be added about changes to irrelevant containers.

You can use a combination of "Report BR was emitted by one of the iterator checkers" and "The memory region of the container is marked as interesting" (while also actually marking it as interesting in the checker).

Ideally we should instead make a new generic storage inside the BugReport object, in order to pass down the interesting information from the call site of emitReport ("Hi, i'm an iterator checker who emitted this report and i'm interested in changes made to the size of this container").

NoQ added a comment.Feb 3 2020, 8:26 AM

Also, yay, thanks for using the new API!

clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
33–34

While we're at it, let's not pass SVals by reference? They're small value-types by design.

const SVal & -> SVal

baloghadamsoftware marked an inline comment as done.Feb 4 2020, 6:42 AM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

Are you sure in this? I already wondered how it works so I added a test that checks one container and changes another one and there were no note tags displayed for the one we did not check but change. See the last test.

NoQ added inline comments.Feb 4 2020, 10:09 AM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

That's because you didn't do

V2.cbegin();
V2.cend();

in the beginning.

Szelethus added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
59

While we're at it [part 2], can we make this = default? :)

69–70

Prefer using.

715

A similar conversation sparked up recently in between @boga95, @steakhal and me regarding reporting taintedness. Bug reports are fine up to the point where (in reverse) the first propagation happens, but finding out which value tainted the one that caused the report isn't handled at the moment. One idea was to mark the initial (again, in reverse) value as interesting, create a NoteTag at the point of propagation, where we should know which value was the cause of the spread, mark that interesting as well, etc.

If NoteTags only emit a message when the concerning value is interesting, this should theoretically solve that problem. I guess you could say that we're propagating interestingness in reverse.

I'm not immediately sure if this idea was ever mentioned or implemented here.

NoQ added inline comments.Feb 5 2020, 7:12 AM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

Yes, that's the intended solution to such problems. trackExpressionValue works similarly, just with assignments instead of taint propagations. And in both cases note tags are a much more straightforward solution to the problem.

baloghadamsoftware marked an inline comment as done.Feb 5 2020, 10:45 AM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

Yes, you are right. My problem now is that how to mark interesting when debugging? I I filter for interesting containers only, I lose my ability to debug. Should I create a debug function just for marking a container as interesting. Or is there such function already?

Szelethus added inline comments.Feb 7 2020, 4:42 AM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

I'm not immediately sure how interetingness ties into debugging, what specific scenario are you thinking about?

Only track the right container. Furthermore, minor updates according to the comments.

baloghadamsoftware marked 4 inline comments as done.Feb 10 2020, 8:23 AM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

In the test of the modeling checker we use debug checkers. They should be able to mark the container interesting to be able to test the not tags. I managed to solve problem, even in a somewhat unorthodox way.

baloghadamsoftware updated this revision to Diff 243796.EditedFeb 11 2020, 3:39 AM

First real checker IteratorRange now marks container as interesting to benefit from container note tags. The other two iterator checkers will be updated after a second phase of container note tags which also cares for reassignment (upon move) and invalidation of iterators.

Do we have a test where 2 containers are present but only one of them should be marked as interesting?

void deref_end_after_pop_back(std::vector<int> &V, std::vector<int> &V2) {
  const auto i = --V.end();
  const auto i2 = --V2.end();

  V.pop_back();  // expected-note{{Container 'V' shrinked from the right by 1 position}}
  V2.pop_back(); // no-note

  *i; // expected-warning{{Past-the-end iterator dereferenced}}
      // expected-note@-1{{Past-the-end iterator dereferenced}}
}
clang/lib/StaticAnalyzer/Checkers/DebugContainerModeling.cpp
95–104

Aha, makes sense, when calling clang_analyzer_container_end(V), we want to make the analyzer emit more information about V so its obviously interesting.

clang/test/Analysis/container-modeling.cpp
35

I hate to be that guy, but this is quite ugly :). How about the handsome

//===--------------------------===//
// Container assignment tests.
//===--------------------------===//

But I don't insist, especially within the scope of this patch.

Do we have a test where 2 containers are present but only one of them should be marked as interesting?

Yes, of course we have, that was the starting point of the discussion. However, I try to make the tests orthogonal, so this test is in container-modeling.cpp where we do not dereference it, but print its begin or end.

Szelethus accepted this revision.Feb 12 2020, 6:12 AM

LGTM, I like everything here, you worded the notes very nicely and the test cases seems to cover everything I could find! Please wait for @NoQ's approval, since he's the ranking member of among the NoteTag users.

Do we have a test where 2 containers are present but only one of them should be marked as interesting?

Yes, of course we have, that was the starting point of the discussion. However, I try to make the tests orthogonal, so this test is in container-modeling.cpp where we do not dereference it, but print its begin or end.

Yea, right, silly me.

This revision is now accepted and ready to land.Feb 12 2020, 6:12 AM
nicolas17 added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
484

Past tense is "shrank", not "shrinked".

In case of multiple container-related bugs only mark the container interesting on the correct bug path. Also a typo fixed.

baloghadamsoftware marked an inline comment as done.Feb 13 2020, 3:42 AM

Minor update: ignore parentheses and casts when taking the name of the expression.

In case of multiple container-related bugs only mark the container interesting on the correct bug path. Also a typo fixed.

Uh-oh, I'm not sure why this would ever be an issue? Interestingness is a property of a given BugReport, not the BugReporter class. How can this interfere with one another?

Okay I see what the issue is. DebugContainerModeling normally doesn't emit a report, only adds notes on interesting containers. Though it still makes me wonder whether this is the right approach.

First, I think changes to the DebugContainerModeling seems to spawn a different discussion, and separating it to a different patch might make for a satisfying splitting point. The rest of the patch seems to be ready to land in my opinion. Also, this functionality seems to be duplicated in DebugIteratorModeling in D74541, is this intended?

Second, that issue may be more appropriately solved by introducing a new debug interestingness kind (D65723), and just make the parameters of clang_analyzer_express interesting in the debug sense. If we did that, DebugContainerModeling could ask whether the symbol is debug-interesting instead of the SourceRange hackery. WDYT?

Szelethus requested changes to this revision.Mar 2 2020, 10:34 AM

Apologies -- I'd definitely prefer to address the debug related changes in a separate pack, similarly to D74541.

This revision now requires changes to proceed.Mar 2 2020, 10:36 AM

Back to the version where we do not check for interestingness.

Charusso added inline comments.Mar 3 2020, 4:32 AM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
715

The core issue with NoteTag it does not know about interestingness and nor about MemRegion. I believe everything based on MemRegions already and when you emit the report you know exactly which MemRegion raised an error. So I think first we need to solve that the NoteTags only report on given MemRegions and those regions of course mega-interesting: we do not need to keep around the interestingness then.

Szelethus accepted this revision.Mar 4 2020, 5:57 AM

The scope of the patch is small and is well tested. I suggest to move the discussion about debug messages and NoteTags to the next revision. LGTM!

This revision is now accepted and ready to land.Mar 4 2020, 5:57 AM

Minor style change.

I wonder whether I should change "to/from the left" to "towards/from the front" and "to/from the right" to "towards/from the right"?

NoQ added inline comments.Mar 15 2020, 10:06 PM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

Hmm, i think you should instead look at ContReg, i.e. whether it's a non-anonymous VarRegion or a FieldRegion or something like that (in other patches as well). It would work more often and it'll transparently handle references.

baloghadamsoftware marked an inline comment as done.Mar 16 2020, 12:20 PM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

Unfortunately it is a SymRegion so it does not work :-( (Even using getMostDerivedRegion() does not help.)

baloghadamsoftware marked an inline comment as done.Mar 16 2020, 12:38 PM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

You mean the first checking form SymbolicRegion, then get its symbol, check for SymbolRegionValue, then get its TypedValueRegion, check for DeclRegion and use its Decl? This sound waaay more complicated and less readable. I am not sure which are the side cases: is it always SymbolicRegion? Is the Symbol of SymbolicRegion always a SymbolRegionValue? Is ithe TypedValueRegion (the return value of its getRegion()) always a DeclRegion?

NoQ added inline comments.Mar 23 2020, 7:03 PM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

Unfortunately it is a SymRegion

Emm, that's rarely the case. Only if it's a reference passed into a top-level function as a parameter.

NoQ added inline comments.Mar 23 2020, 7:05 PM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

(or to another unknown location) (please learn what SymbolicRegion is, it is very important for your work)

I guess you should do both then, because when the analyzer is able to resolve the reference it's better in this case to point out what is the actual container that's being modified.

Updated according to a comment.

baloghadamsoftware marked an inline comment as done.Mar 24 2020, 10:14 AM
baloghadamsoftware added inline comments.
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

Is it OK now?

NoQ added inline comments.Mar 25 2020, 12:18 AM
clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
713

Looks good, can we also have tests for this case? I.e., anything where the container isn't passed by reference to an unknown location.

baloghadamsoftware marked an inline comment as done.Mar 25 2020, 2:20 AM
NoQ accepted this revision.Mar 25 2020, 10:51 AM

Lovely, thank you!

This revision was automatically updated to reflect the committed changes.