This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Move out tracking retain count for OSObjects into a separate checker
ClosedPublic

Authored by george.karpenkov on Dec 6 2018, 4:30 PM.

Details

Summary

Allow enabling and disabling tracking of ObjC/CF objects
separately from tracking of OS objects.

ObjC and CF are still grouped together, as we can't rename existing checkers.

Diff Detail

Event Timeline

NoQ added inline comments.Dec 6 2018, 4:36 PM
clang/test/Analysis/test-separate-retaincount.cpp
22–49

I think you can merge these two and #ifdef out the warning in each section.

george.karpenkov updated this revision to Diff 177271.
NoQ accepted this revision.Dec 7 2018, 12:21 PM
NoQ added inline comments.
clang/lib/StaticAnalyzer/Core/RetainSummaryManager.cpp
32–33

This comment needs to go up a bit.

clang/test/Analysis/test-separate-retaincount.cpp
4–19

Why did we never make a header simulator for those?

This revision is now accepted and ready to land.Dec 7 2018, 12:21 PM
This revision was automatically updated to reflect the committed changes.
Szelethus added a comment.EditedJan 20 2019, 3:19 PM

I'm working on rebasing my dependency branches on top of trunk, and I'm somewhat stuck on this patch, could you lend a hand please?

So, I'm seeing in essence 3 checkers: the class RetainCountChecker, that does the modeling, and two checkers that are exposed to the users, osx.cocoa.RetainCount and osx.OSObjectRetainCount. While generally the solution would be to register checker called RetainCountBase, that is a common base of both of the already registered checkers, it raises a serious problem: all of the plist files that say "this error originates from checker osx.cocoa.RetainCount" will now instead refer to osx.cocoa.RetainCountBase. Unfortunately, I can't just make both checkers have the same name, as how would the user disable one and not the other?

There were other checkers that suffered from this, but those affected very few test files, and this one hits around 9, and I know that very serious work is being put into it from your part. I'd like to propose 3 solutions, and I'm very much open to new ones:

  • Deal with the consequences of this, and just correct all plist files to now refer to the new base checker.
  • Each time a report is emitted from these checkers, create a ProgramPointTag, and "manually" make sure the emitted checker name doesn't change.
  • Rename osx.cocoa.RetainCount to something else. This one is clearly nonsensical, but let's put it here for the sake of completeness.

Deal with the consequences of this, and just correct all plist files to now refer to the new base checker.

What does it mean?

Each time a report is emitted from these checkers, create a ProgramPointTag, and "manually" make sure the emitted checker name doesn't change.

I'm not sure what do you propose exactly, but sounds pretty bad.

Rename osx.cocoa.RetainCount to something else. This one is clearly nonsensical, but let's put it here for the sake of completeness.

I don't think we can rename the old checker, as older Xcode versions have "osx.cocoa.RetainCount" hardcoded in them.

TBH I'm not really sold on the way we "bundle" multiple checkers (from the tablegen) into a single class.
For one, that means options don't work at all, and essentially the checker name is defined by the class which was registered by the tablegen first (which should not even be visible, and should be an internal implementation detail).
Do you have better proposals on how, conceptually, grouped checkers should be organized?

Essentially, we have a single checker class with two checks, which should be toggle-able orthogonally from each other.
For legacy reasons, we can't quite get rid of osx.cocoa.RetainCount (but even if we could, what then?)

Szelethus added a comment.EditedJan 22 2019, 12:41 PM

Poor wording on my end, sorry about that. Let me clarify.

Deal with the consequences of this, and just correct all plist files to now refer to the new base checker.

What does it mean?

When a report is emitted in the plist output, it'll also that which checker is "responsible" for that warning. I vaguely remember reading that this isn't used that much in your application, but we use this information extensively here.

Now, what I'd propose, is register a new, osx.RetainCountBase checker, and make both of the already existing checkers depend on that. Long story short, this will make the name of the checker that is listed in that plist entry osx.RetainCountBase, as opposed to osx.cocoa.RetainCount, but this doesn't sound that illogical, considering that both of those checkers emit the same checker name anyways (meaning, that if one report originates from osx.OSObjectRetainCount and one from osx.cocoa.RetainCount, both reports will be listed as they originated from osx.cocoa.RetainCount).

Each time a report is emitted from these checkers, create a ProgramPointTag, and "manually" make sure the emitted checker name doesn't change.

I'm not sure what do you propose exactly, but sounds pretty bad.

It does. I think IvarInvalidation uses something like that, and it's both unsustainable and very ugly.

Rename osx.cocoa.RetainCount to something else. This one is clearly nonsensical, but let's put it here for the sake of completeness.

I don't think we can rename the old checker, as older Xcode versions have "osx.cocoa.RetainCount" hardcoded in them.

Yea, this one makes little sense. Should've just left this one out really.

[..]essentially the checker name is defined by the class which was registered by the tablegen first (which should not even be visible, and should be an internal implementation detail).

Do I understand correctly that you refer that the way checker names are generated is very poor?

TBH I'm not really sold on the way we "bundle" multiple checkers (from the tablegen) into a single class.
Do you have better proposals on how, conceptually, grouped checkers should be organized?

I think the current system would be fine as long as there would be a way to hide these implementation detail checkers, which could also be defined in tblgen. Having a dependency tree would be neat, and also somewhat unavoidable in order to move forward with some very important changes I'm planning, like listing all checker options.

For one, that means options don't work at all, and [...]

Hmmm, does this mess with options that bad? Could you please clarify?

Hmmm, does this mess with options that bad? Could you please clarify?

registerChecker gets-or-creates a checker object. A checker name (used for getting the options) is set the first time it's created.
The checker which was created first "wins" and gets to name the resulting checker.
In practice it basically means that options and checkers reusing the same class do not work.
Do you have better ideas on how this could be arranged?

I think the current situation is a mess - ideally I would prefer to be able to access the options in the constructor, but we can't even do that,
since registerChecker sets the checker name and is called after the object has been constructed.
It seems that it would only make sense if the checker name is known at the time the checker is constructed: probably the function registerXChecker should get it as an argument.

Hmmm, does this mess with options that bad? Could you please clarify?

registerChecker gets-or-creates a checker object. A checker name (used for getting the options) is set the first time it's created.
The checker which was created first "wins" and gets to name the resulting checker.
In practice it basically means that options and checkers reusing the same class do not work.
Do you have better ideas on how this could be arranged?

Sure, in fact its already implemented and got accepted by @NoQ, I just didnt have the time to commit and fine tune :)

The solution is essentially to create CheckerManager::getChecker alongside registerChecker, and making them in order assert if the checker is unregistered, and assert when the checked is already registered. In an earlier patch, I implement handling of dependencies on a higher level, essentially making CheckerRegistry take care of this, instead of making checker registry functions register multiple checkers, or, like in this case, register a common base. Before the registry function for either RetainCound or OSObjectRetainCount is called (which will now invoke getChecker), the registry function for RetainCountBase will be called, that will register the actual checker object.

You can take a look at how this would look like in D54438.

Would love to hear your thoughts on this! :)

I think the current situation is a mess - ideally I would prefer to be able to access the options in the constructor, but we can't even do that,
since registerChecker sets the checker name and is called after the object has been constructed.
It seems that it would only make sense if the checker name is known at the time the checker is constructed: probably the function registerXChecker should get it as an argument.

The proposed solution solves this issue as well.