This patch is an implementation of the ideas discussed on the mailing list. It didn't cause any crashes on the already existing test (I checked it by adding the flag to them and run lit), but then again, I wasn't able to check it on larger code bases just yet, and LLVM tends to have a couple tricks up it's sleeve, so be beware of that.
Details
- Reviewers
george.karpenkov NoQ xazax.hun rnkovacs - Commits
- rGffe93a167039: [analyzer][UninitializedObjectChecker] New flag to ignore guarded uninitialized…
rC352959: [analyzer][UninitializedObjectChecker] New flag to ignore guarded uninitialized…
rL352959: [analyzer][UninitializedObjectChecker] New flag to ignore guarded uninitialized…
Diff Detail
- Repository
- rL LLVM
Event Timeline
@george.karpenkov @xazax.hun, is this how you imagined checking whether an access is guarded?
I played around with it, and this seemed a bit cleaner. Since I would only match against the definition of the record (and the definition of its methods), the matcher would need to look like this: stmt(hasDescendant(blahBlah(...))), and I'd only be able to retrieve (with my limited knowledge) the entire method definition, not the actual FieldDecl in the case of a match.
I didn'd do a whole lot of AST based analysis, so feel free to correct me on this one ^-^
You can retrieve any node from the ASTMatcher by .bind()ing sub-matchers to string keys and later retrieving them from the MatchResult dictionary by those keys. It's like a regex capturing groups (and, well, ASTMatchers also support backreferences to such groups, i.e. equalsBoundNode()).
Reimplemented with AST matchers.
@george.karpenkov I know you wanted to test this feature for a while, but sadly I've been busy with the macro expansion related projects, I hope it's still relevant.
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
501 ↗ | (On Diff #168592) | I would consider any access that is guarded by if, ternary operator, switch as a guard. For example I would consider field guarded in the following method: int *method() { return cond ? field : nullptr; } But unguarded in this: int *method() { return field ? field2 : nullptr; } Also, knowing that a variable is guarded or not might be a useful heuristic for other checks too. So if you do plan to cover all those cases. I would recommend to implement it outside of this check so other checks might also leverage this functionality. |
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
519 ↗ | (On Diff #168592) | I am not sure if this is a reliable way to check if the access is before the guard. Consider: switch(x): { case 2: guard; access; break; case 1: access break; } Here, we have no particular ordering between the access in case 1 and the guard in case 2 at runtime. But relying on the source locations we might come to the false conclusion that there is. Loops, gotos can cause similar problems. |
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
519 ↗ | (On Diff #168592) |
I'm 100% sure it isn't. Using the CFG instead of matchers sounds like a great and difficult to implement (at least to me, as I never touched them) idea. It should get rid of the false negatives, at least in part.
There are some :) |
Checking for asserts makes sense, but as a rough estimate will suppress roughly zero false positives I have seen.
In general, LLVM codebase is exceptional in its use of asserts, and most projects, unfortunately, don't really know how to use them.
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
519 ↗ | (On Diff #168592) |
That would also require building a dataflow framework, which we do not have (yet) |
Added new guards as recommended by @xazax.hun
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
519 ↗ | (On Diff #168592) | Too bad :/ Makes me understand though why it would be valuable to implement it. |
Code looks good to me, but I'd wait for @george.karpenkov because it was his idea.
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
502 ↗ | (On Diff #168885) | In a lot of standard libraries assert() is implemented as a macro. You might want to catch the corresponding builtin or look at macro names (not sure if we have a matcher for the latter but it should be easy to add). |
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
515 ↗ | (On Diff #168885) | I feel that it's a good practice to assert(FirstAccess) here (same for FirstGuard and generally after every getNodeAs). The assertion will hold because "access" is always bound to on every possible match and is always a member-expression, and it'd be great to check that after the matcher is changed. |
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
502 ↗ | (On Diff #168885) | Unfortunately, macros are not a part of the AST, so I'm struggling a lot with this patch, but I'm making progress. |
@george.karpenkov Matching macros is a very non-trivial job, how would you feel if we shipped this patch as-is, and maybe leave a TODO about adding macro asserts down the line?
@george.karpenkov Matching macros is a very non-trivial job, how would you feel if we shipped this patch as-is, and maybe leave a TODO about adding macro asserts down the line?
The only solution I saw in clang-tidy was to match binary expressions as a heuristic, and check whether they are inside a macro named assert. That is hacky at best, any objection against the current state of this patch?
I'm planning to evaluate the checker in a variety of projects after this one, see how things are looking, and push it out of alpha.
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
502 ↗ | (On Diff #168885) | Maybe [[noreturn]] or just catch specific builtins such as __assert_fail - see what NoReturnFunctionChecker reacts to. |
- Rebased to master.
- Adding functions calls with the noreturn attribute to the guards.
- Shamelessly stole all sorts of assert-like function names from NoReturnFunctionChecker.
- Added new test cases accordingly.
This should be it! Sorry for being so slow with this -- I genuinely struggled a lot with macros before ultimately giving up.
Ping^3. @NoQ @george.karpenkov Could you please take a look? @xazax.hun Any other objections?
I have no objections. George, this was your idea, does it look good to you?
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp | ||
---|---|---|
577–578 ↗ | (On Diff #177080) | That's quite unreliable, but in a good way. Like, in if (x) foo(); else bar();, in terms of source locations bar() goes after foo(), but in practice bar() never gets called after foo() (unless the whole thing is also in a loop). But when we're trying to heuristically suppress the specific false positive pattern in which statements definitely go in that order, it's not the end of the world when the heuristic suppresses a few other patterns. So i think it's worth pointing out to the reader that this approach is a bit wonky, but not necessarily worth improving upon. That said, it should be possible to do this sort of stuff with a CFG-based analysis, probably even with one of the existing analyses in lib/Analysis. |
If you don't mind, I'd prefer to finally get over this patch. I'll commit on the 31st, but will wait for any potential feedback 'til then.