The checker applies constraints in a sequence and adds new nodes for these states. If a constraint violation is found this sequence should be stopped with a sink (error) node. Instead the generateErrorNode did add a new error node as a new branch that is parallel to the other node sequence, the other branch was not stopped and analysis was continuing on that invalid branch. To add an error node after any previous node a new version of generateErrorNode is needed, this function is added here and used by StdLibraryFunctionsChecker. The added test executes a situation where the checker adds a number of constraints before it finds a constraint violation.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
954 | Let me know if I got this right. The reason behind generateErrorNode not behaving like it usually does for other checkers is because of the explicitly supplied NewState parameter -- in its absence, the current path of execution is sunk. With this parameter, a new parallel node is. Correct? |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
954 | The NewState only sets the state of the new error node, if it is nullptr the current state is used. A new node is always added. The other new node functions (addTransition, generateNonFatalErrorNode, generateSink and addSink) have a version that can take a predecessor node, only generateErrorNode did not have this (and I can not find out why part of these is called "generate" and other part "add" instead of using only "generate" or "add"). The new function is used when a node sequence CurrentNode->A->B->ErrorNode is needed. Without the new function it is only possible to make a CurrentNode->ErrorNode transition, and the following incorrect graph is created: CurrentNode->A->B |->ErrorNode The code here does exactly this (before the fix), in NewNode a sequence of nodes is appended (like A and B above), and if then an error node is created it is added to the CurrentNode. Not this is needed here, the error node should come after B. Otherwise analysis can continue after node B (that path is invalid because a constraint violation was found). |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
954 | I've been wondering that, especially looking at the test case. Seems like this loop runs only once, how come that new nodes are added on top of CurrentNode (which, in this case refers to C.getPredecessor(), right?)? I checked the checker's code, and I can't really see why A and B would ever appear. Isn't that a bug? |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
954 | My thinking was that each checker, unless it does state splits, should really only create a single node per callback, right? The new state, however many changes it contains, should be added all at once in the single callback, no? |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
954 | The problem is that multiple NoteTags are added. It is only possible to add a single NoteTag in a single transition. This is why in line 969 (in the currently shown code at time of this comment) addTransition is used for every new SuccessSt (otherwise NewState could be used without NewNode). Or is there a possibility for multiple NoteTags at one transition, or can such a feature be added? (But if the other state add functions all have a version that accepts a predecessor node, why is generateErrorNode exception?) (This state apply loop was changed in the recent time at least once.) |
Folks, I'm glad you caught it! This is a classic mistake to make with addTransition() APIs.
I wish we had better APIs that don't have this problem, but instead make it very clear how many execution paths does the checker callback *intend* to create. Eg.,
- C.addStateUpdate(State) would "chain" nodes by default if called multiple times, like you did in this patch;
- C.addStateSplit(State1, Tag1, State2, Tag2, ..., StateN, TagN) would add at most N nodes (some may merge) and can be called only once per checker callback, otherwise it traps with assertion failure;
- Similarly, mixing C.addStateUpdate() and C.addStateSplit() in the same checker callback will trap with assertion failure ("Make up your mind, are you trying to split the path or not?!");
- Then we can have a Swiss-Army-knife function C.addArbitraryTransition(Pred, State, Tag) for all other use cases which *requires* you to specify the predecessor manually even if it's just C.getPredecessor().
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
954 | I think you're right, even though technically it's always possible to make all updates in a single transition, in practice it often leads to annoying architectural problems. It's nice to have separation of concerns between different parts of checker code, and "chaining" nodes together is a neat way to achieve that. |
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h | ||
---|---|---|
219 | What I'm missing here is some guidance. Why would I pick this overload instead of the 2-parameter one? Especially for beginners, this is very confusing. |
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h | ||
---|---|---|
219 | The whole documentation contains not enough information, at least not for beginners. This overload works the same way as at generateNonFatalErrorNode (and this documentation is made similar as at that function). addTransition works similar too, there is a bit more documentation. Probably somebody who understands addTransition and the Pred parameter can understand generateErrorNode too. If the name would be addErrorNode the similarity would be even stronger. |
What I'm missing here is some guidance. Why would I pick this overload instead of the 2-parameter one? Especially for beginners, this is very confusing.