Page MenuHomePhabricator

[analyzer][UninitializedObjectChecker] New flag to ignore guarded uninitialized fields
ClosedPublic

Authored by Szelethus on Sep 10 2018, 9:25 AM.

Details

Summary

This patch is an implementation of the ideas discussed on the mailing list. It didn't cause any crashes on the already existing test (I checked it by adding the flag to them and run lit), but then again, I wasn't able to check it on larger code bases just yet, and LLVM tends to have a couple tricks up it's sleeve, so be beware of that.

Diff Detail

Repository
rL LLVM

Event Timeline

Szelethus created this revision.Sep 10 2018, 9:25 AM
Szelethus edited the summary of this revision. (Show Details)Sep 10 2018, 9:25 AM

@george.karpenkov @xazax.hun, is this how you imagined checking whether an access is guarded?

Thanks! The usual question: would it be easier to implement using AST matchers?

I played around with it, and this seemed a bit cleaner. Since I would only match against the definition of the record (and the definition of its methods), the matcher would need to look like this: stmt(hasDescendant(blahBlah(...))), and I'd only be able to retrieve (with my limited knowledge) the entire method definition, not the actual FieldDecl in the case of a match.
I didn'd do a whole lot of AST based analysis, so feel free to correct me on this one ^-^

NoQ added a comment.Sep 20 2018, 2:01 PM

You can retrieve any node from the ASTMatcher by .bind()ing sub-matchers to string keys and later retrieving them from the MatchResult dictionary by those keys. It's like a regex capturing groups (and, well, ASTMatchers also support backreferences to such groups, i.e. equalsBoundNode()).

Szelethus updated this revision to Diff 168592.Oct 7 2018, 9:32 AM

Reimplemented with AST matchers.
@george.karpenkov I know you wanted to test this feature for a while, but sadly I've been busy with the macro expansion related projects, I hope it's still relevant.

Szelethus retitled this revision from [analyzer][UninitializedObjectChecker][WIP] New flag to ignore guarded uninitialized fields to [analyzer][UninitializedObjectChecker] New flag to ignore guarded uninitialized fields.Oct 8 2018, 12:19 AM
xazax.hun added inline comments.Oct 8 2018, 2:35 AM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
501 ↗(On Diff #168592)

I would consider any access that is guarded by if, ternary operator, switch as a guard.

For example I would consider field guarded in the following method:

int *method() {
  return cond ? field : nullptr;
}

But unguarded in this:

int *method() {
  return field ? field2 : nullptr;
}

Also, knowing that a variable is guarded or not might be a useful heuristic for other checks too. So if you do plan to cover all those cases. I would recommend to implement it outside of this check so other checks might also leverage this functionality.

xazax.hun added inline comments.Oct 8 2018, 2:41 AM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
519 ↗(On Diff #168592)

I am not sure if this is a reliable way to check if the access is before the guard.

Consider:

switch(x): {
   case 2: guard; access; break;
   case 1: access break;
}

Here, we have no particular ordering between the access in case 1 and the guard in case 2 at runtime. But relying on the source locations we might come to the false conclusion that there is. Loops, gotos can cause similar problems.
I do understand that this might not be too easy to solve without traversing the cfg and we might not want to do that but I think we should at least add a test/todo.

Szelethus added inline comments.Oct 8 2018, 9:30 AM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
519 ↗(On Diff #168592)

I am not sure if this is a reliable way to check if the access is before the guard.

I'm 100% sure it isn't. Using the CFG instead of matchers sounds like a great and difficult to implement (at least to me, as I never touched them) idea. It should get rid of the false negatives, at least in part.

[...] I think we should at least add a test/todo.

There are some :)

Checking for asserts makes sense, but as a rough estimate will suppress roughly zero false positives I have seen.
In general, LLVM codebase is exceptional in its use of asserts, and most projects, unfortunately, don't really know how to use them.

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
519 ↗(On Diff #168592)

Using the CFG instead of matchers sounds like a great and difficult to implement

That would also require building a dataflow framework, which we do not have (yet)

Szelethus updated this revision to Diff 168885.Oct 9 2018, 3:21 PM

Added new guards as recommended by @xazax.hun

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
519 ↗(On Diff #168592)

Too bad :/ Makes me understand though why it would be valuable to implement it.

NoQ accepted this revision.Oct 19 2018, 1:59 PM

Code looks good to me, but I'd wait for @george.karpenkov because it was his idea.

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
502 ↗(On Diff #168885)

In a lot of standard libraries assert() is implemented as a macro. You might want to catch the corresponding builtin or look at macro names (not sure if we have a matcher for the latter but it should be easy to add).

This revision is now accepted and ready to land.Oct 19 2018, 1:59 PM
NoQ added inline comments.Oct 19 2018, 2:02 PM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
515 ↗(On Diff #168885)

I feel that it's a good practice to assert(FirstAccess) here (same for FirstGuard and generally after every getNodeAs). The assertion will hold because "access" is always bound to on every possible match and is always a member-expression, and it'd be great to check that after the matcher is changed.

Szelethus planned changes to this revision.Oct 25 2018, 9:29 AM

Hmm, I'll investigate the assert issue @NoQ mentioned before moving forward.

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
502 ↗(On Diff #168885)

Hmm, maybe go for [[noreturn]] functions?

Szelethus added inline comments.Oct 31 2018, 10:53 AM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
502 ↗(On Diff #168885)

Unfortunately, macros are not a part of the AST, so I'm struggling a lot with this patch, but I'm making progress.

@george.karpenkov Matching macros is a very non-trivial job, how would you feel if we shipped this patch as-is, and maybe leave a TODO about adding macro asserts down the line?

@george.karpenkov Matching macros is a very non-trivial job, how would you feel if we shipped this patch as-is, and maybe leave a TODO about adding macro asserts down the line?

The only solution I saw in clang-tidy was to match binary expressions as a heuristic, and check whether they are inside a macro named assert. That is hacky at best, any objection against the current state of this patch?

I'm planning to evaluate the checker in a variety of projects after this one, see how things are looking, and push it out of alpha.

Szelethus requested review of this revision.Nov 28 2018, 10:58 AM

Changes are no longer planned for the reasons explained in my earlier comments.

This revision is now accepted and ready to land.Nov 28 2018, 10:58 AM
NoQ added inline comments.Nov 29 2018, 4:16 PM
lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
502 ↗(On Diff #168885)

Maybe [[noreturn]] or just catch specific builtins such as __assert_fail - see what NoReturnFunctionChecker reacts to.

Szelethus planned changes to this revision.Dec 6 2018, 5:12 AM

Let's try that again then! :)

Szelethus updated this revision to Diff 177080.Dec 6 2018, 4:36 PM
  • Rebased to master.
  • Adding functions calls with the noreturn attribute to the guards.
  • Shamelessly stole all sorts of assert-like function names from NoReturnFunctionChecker.
  • Added new test cases accordingly.

This should be it! Sorry for being so slow with this -- I genuinely struggled a lot with macros before ultimately giving up.

This revision is now accepted and ready to land.Dec 6 2018, 4:36 PM
Szelethus requested review of this revision.Dec 7 2018, 5:37 AM

Marking this patch as non-accepted.

Ping^3. @NoQ @george.karpenkov Could you please take a look? @xazax.hun Any other objections?

NoQ accepted this revision.Jan 20 2019, 1:20 PM

I have no objections. George, this was your idea, does it look good to you?

lib/StaticAnalyzer/Checkers/UninitializedObject/UninitializedObjectChecker.cpp
577–578 ↗(On Diff #177080)

That's quite unreliable, but in a good way. Like, in if (x) foo(); else bar();, in terms of source locations bar() goes after foo(), but in practice bar() never gets called after foo() (unless the whole thing is also in a loop). But when we're trying to heuristically suppress the specific false positive pattern in which statements definitely go in that order, it's not the end of the world when the heuristic suppresses a few other patterns.

So i think it's worth pointing out to the reader that this approach is a bit wonky, but not necessarily worth improving upon. That said, it should be possible to do this sort of stuff with a CFG-based analysis, probably even with one of the existing analyses in lib/Analysis.

This revision is now accepted and ready to land.Jan 20 2019, 1:20 PM

If you don't mind, I'd prefer to finally get over this patch. I'll commit on the 31st, but will wait for any potential feedback 'til then.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2019, 6:49 AM