This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][CallAndMessage] Add checker options for each bug category
ClosedPublic

Authored by Szelethus on Apr 10 2020, 4:25 AM.

Details

Summary

As listed in the summary D77846, we have 5 different categories of bugs we're checking for in CallAndMessage. I think the documentation placed in the code explains my thought process behind my decisions quite well.

A non-obvious change I had here is removing the entry for CallAndMessageUnInitRefArg. In fact, I removed the CheckerNameRef typed field back in D77845 (it was dead code), so that checker didn't really exist in any meaningful way anyways.

Diff Detail

Event Timeline

Szelethus created this revision.Apr 10 2020, 4:25 AM
martong accepted this revision.Apr 15 2020, 7:21 AM

Looks good to me!

clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
194

I like joke :D But there are people who may not, so, the code would be more pro without it.

clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
64

So, we are not going to have subcheckers, rather options, okay. Perhaps the comment would be more appropriate to be in Checkers.td? (If we can put comments there.)

This revision is now accepted and ready to land.Apr 15 2020, 7:21 AM

Looks good to me!

Just take care of the failures you have with harbourmaster, there seem to be CSA failures with the bot!

Looks good to me!

Just take care of the failures you have with harbourmaster, there seem to be CSA failures with the bot!

That should go away after fixing D75430#inline-708353. Sharp eyes though, thanks!

balazske added inline comments.Apr 15 2020, 9:00 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
194
if (!State)
  State = getState();

is better?

Szelethus marked 4 inline comments as done.May 21 2020, 6:46 AM
Szelethus added inline comments.
clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
64

I prefer it here, Checkers.td is a rarely visited file, and the comments in the test cases all point here where the majority of the work has to be done. Which still isn't much, but this is the bit we should not screw up.

Btw we can totally put comments there, as seen here: D78120

This revision was automatically updated to reflect the committed changes.
Szelethus marked an inline comment as done.
NoQ added a comment.Jun 2 2020, 3:52 AM

We (me, Valeriy, Devin)'ve just been talking about this and mostly of agreed that core.CallAndMessage should ideally be removed (at least as a Checkers.td entity) and individual checks moved to the relevant package (null dereference, use of uninitialized value, etc.). It won't hurt backwards compatibility too much because so far there was no reasonable outcome that one might expect from disabling core.CallAndMessage, given that it's a weird mix of completely unrelated checks. The benefit is that disabling, say, null dereference checker will actually disable all null dereference warnings (as opposed to having to additionally disable the ones that come from CallAndMessage).

Also a regular reminder that we wouldn't have had this conversation if we had hashtags instead of packages (#CallAndMessage #NullDereference #C++ #AllPlatforms #UndefinedBehavior #SecurityCritical #PathSensitive #Stable #2012).

In D77866#2068394, @NoQ wrote:

We (me, Valeriy, Devin)'ve just been talking about this and mostly of agreed that core.CallAndMessage should ideally be removed (at least as a Checkers.td entity) and individual checks moved to the relevant package (null dereference, use of uninitialized value, etc.). It won't hurt backwards compatibility too much because so far there was no reasonable outcome that one might expect from disabling core.CallAndMessage, given that it's a weird mix of completely unrelated checks. The benefit is that disabling, say, null dereference checker will actually disable all null dereference warnings (as opposed to having to additionally disable the ones that come from CallAndMessage).

Luckily this should be super easy to implement after this patch, and it is totally what I, and most folks on Ericsson's end want this checker to look like. I trust that you've glanced over the comments that justify the tiptoeing creating smaller checkers, and it is a result of a conversation we had that the price on the user would have to pay the way we implement our database is too great.

I'll initialize a new discussion on this on our end, but speaking from my perspective, we should just deal the cost of this and be done with it. However, since sooooo many reports originate from this checker, I wonder whether we should go all the way and make sense out of the hash generation we currently have.

Also a regular reminder that we wouldn't have had this conversation if we had hashtags instead of packages (#CallAndMessage #NullDereference #C++ #AllPlatforms #UndefinedBehavior #SecurityCritical #PathSensitive #Stable #2012).

Could you detail this a bit?

NoQ added a comment.Jun 2 2020, 10:22 AM

Also a regular reminder that we wouldn't have had this conversation if we had hashtags instead of packages (#CallAndMessage #NullDereference #C++ #AllPlatforms #UndefinedBehavior #SecurityCritical #PathSensitive #Stable #2012).

Could you detail this a bit?

Wait, did i never mention this before?

Like, i mean, the tree of packages that we currently have is a wrong abstraction. We shouldn't be asking the user questions like "Is this checker alpha or unix?" or "Are you looking for a checker for C++ or for finding portability issues?". These are different, independent classifications but our tree-like package structure implicitly assumes that these "or"s are exclusive. We partially mitigate this by introducing conventions for package naming (like "alpha package has the same layout as the root package") but we could achieve the same naturally by simply introducing multiple classifications. The most ad-hoc approach that's better than the current situation would be twitter-like hashtags. A better - stricter but still sufficiently flexible - approach would be to have some actual different classifications: "language:C++ platform:unix stability:alpha severity:undefinedbehavior" etc. Like, these would let the user to make easy single-word -enable-checker/-disable-checker queries like "please disable all null dereference checks (regardless of which individual indivisible checker implements them)" or "please disable all path-sensitive checkers because i want a super quick analysis" or "please only warn me when you've found an actual crash in the program, i don't care about coding convention violations or portability issues".

In D77866#2069144, @NoQ wrote:

Like, i mean, the tree of packages that we currently have is a wrong abstraction.

I totally agree with this.

The most ad-hoc approach that's better than the current situation would be twitter-like hashtags.

While this sounds really good for basic use cases I think it quickly becomes unmanageable for power users. For example, if the user wants to enable undefined behavior related checks but disable null checks how should this affect a check that is tagged with both categories? I am not sure whether using the enable/disable order is user-friendly enough. Moreover, we still need to be able to suppress each check individually (I meant checks as in one checker could implement multiple checks or diagnostics and the checker might not be the right granularity for suppression). I really like the idea of this approach but I wonder how to make this intuitive.

A better - stricter but still sufficiently flexible - approach would be to have some actual different classifications: "language:C++ platform:unix stability:alpha severity:undefinedbehavior" etc.

I do not see a big difference between this method and the tags but I might be missing something.

I really support the idea of having a more intuitive way to handle checks and tags sound promising. But there are a lot of details we need to think about. Nevertheless, I'd love to see a more detailed RFC about the topic if someone has the time/up to it. I think this approach is probably something that the clang-tidy checks could also profit from.

NoQ added a comment.Jun 2 2020, 12:28 PM

While this sounds really good for basic use cases I think it quickly becomes unmanageable for power users.

It's strictly better than the current situation because nothing prevents us from keeping the existing checker names as a subset of tags (i.e., #core/#core.CallAndMessage or checker:core/checker:core.CallAndMessage). We don't even need to break backwards compatibility.

I am not sure whether using the enable/disable order is user-friendly enough.

I guess it's either the order or allowing users to write boolean formulas :|

Also if they're so smart why don't they introduce their own categories so that to avoid writing complicated formulas? (checkmate atheists!) (at least that's what i expect IDE developers to do on a regular basis; complicated formulas don't need to be autogenerated from high-level ordinary-user-facing checkboxes).

I'd love to see a more detailed RFC about the topic if someone has the time/up to it.

Yup, this would absolutely require a discussion.

clang/test/Analysis/uninit-const.c