This is an archive of the discontinued LLVM Phabricator instance.

[Static Analyzer] Merge the Objective-C Generics Checker into Dynamic Type Propagation checker.
ClosedPublic

Authored by xazax.hun on Aug 26 2015, 2:36 PM.

Details

Summary

This patch merged the functionality from ObjCGenericsChecker into DynamicTypePropagation checker.
Note that the Generics Checker can still be turned on or off individually but this only affects whether diagnostics are generated or not.
There is no intended functional change in this patch.

Rationale

This is a refactoring that makes some collaboration between the ObjCGenericsChecker and DynamicTypePropagation checker possible.
The final goal (which will be achieved by some followup patches) is to use the reasoning of ObjCGenericsChecker about generics to infer dynamic type for objects.

Lets consider the following scenario:
id object = arrayOfStrings.firstObject;

Here the DynamicTypePropagation checker can not infer any dynamic type information because the static type of the arrayOfStrings.firstObject expression is id when arrayOfStrings is not annotated. However the generics checker might be able to infer an upper bound (NSString *) for the same expression from the usage of the symbol.

In a follow up patch when the DynamicTypePropagation checker fails to infer the dynamic type for an object, we would fall back to the inference of the generics checker, because it may have additional information.

Impact

When an exact type is inferred as a dynamic type (this happens when the allocation was visible by the analyzer), method calls on that object will be "devirtualized" (inlined).
When the allocation is not visible but an upper bound can be inferred, there will be a path split on method calls. On one path the method will be inlined (when a body is available) on the other, there will be no inlining.
There are some heuristic cases, where an upper bound is treated as an exact type.

The expected effect of the follow up patch is that, upper bound can be inferred more frequently. Due to cross translation unit limits, this might not change the inlining behavior significantly. However there are other advantages. Utilizing this dynamic type information, a generic type checker could be implemented trivially which could catch errors like:

id object = myNSNumber;
NSString *myString = object;

Why not keep the Generics checker and the dynamic type propagation as completely separate checks?

One of the problem is that, right now both checkers infer type information from casts. In order to be able to fallback to the type inference of Generics checker when the type inference of dynamic type propagation failed, the Generics checker should assume that the dynamic type propagation is done already by the time its callback is invoked. Since there is no way to specify ordering between the checkers right now, the only way to do it correctly is to merge two checks into one checker.

What do you think?

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun updated this revision to Diff 33248.Aug 26 2015, 2:36 PM
xazax.hun retitled this revision from to [Static Analyzer] Merge the Objective-C Generics Checker into Dynamic Type Propagation checker..
xazax.hun updated this object.
xazax.hun updated this object.
xazax.hun added a subscriber: cfe-commits.
zaks.anna edited edge metadata.Aug 27 2015, 6:17 PM

Partial review in-line.

lib/StaticAnalyzer/Checkers/Checkers.td
456 ↗(On Diff #33248)

We need a better description here this one is too vague. Maybe something like this one:
"Check for type errors when using Objective-C generics."

lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
17 ↗(On Diff #33248)

Mention ObjC here.

45 ↗(On Diff #33248)

nit: Let's go back to the formatting with a single line per callback.

48 ↗(On Diff #33248)

Better name please, we might add more bug types.

76 ↗(On Diff #33248)

Please, move the definitions outside to reduce clutter.

81 ↗(On Diff #33248)

How do we know that the string is big enough?

83 ↗(On Diff #33248)

The error message does not sounds like a proper sentence..

Assigning from 'A' to 'B' sounds more natural.

Are we only warning on assignments? Maybe converting would be more applicable in some contexts..

Something like this might be better: "Conversion from value of type 'Integer' to incompatible type 'String'"

105 ↗(On Diff #33248)

Doxygen comment please.

179 ↗(On Diff #33248)

This line used to be unconditional and now, it's only executed if we are casting between ObjC Types.

407 ↗(On Diff #33248)

Do you know if the info tracked by the DynamicTypeInfo checker gets cleaned up for dead symbols?

448 ↗(On Diff #33248)

Please, use doxygen.
Also, the comment is not clear.. Ex: "most derived class if From"

456 ↗(On Diff #33248)

There s nothing about picking more informative type here.. especially with respect to informative term used in the previous function..

xazax.hun updated this revision to Diff 33709.Sep 1 2015, 10:14 AM
xazax.hun edited edge metadata.
xazax.hun marked 9 inline comments as done.
  • Addressed the comments in the review.
  • Added tests to cover some more corner cases and updated the code accordingly.
  • Updated to latest trunk.
xazax.hun marked an inline comment as not done.Sep 1 2015, 10:16 AM
xazax.hun added inline comments.
lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
81 ↗(On Diff #33248)

When the string is not big enough, there will be an allocation. So this is not a correctness issue. However I checked that, and the error messages tend to be very long. I could either increase the size of this smallstring to something like 150 which should be enough for the common of the cases, or I could just make it a string. Which one do you prefer?

179 ↗(On Diff #33248)

It should not be a problem. The code bellow only executes for bitcasts, and getBetterObjCType only returns a valid value, when the cast was between Obj-C types.

407 ↗(On Diff #33248)

That information is stored in DynamicTypeMap which is populated in lib/StaticAnalyzer/Core/ProgramState.cpp

I could not find any cleanup code. What do you think, what would be the best way to do the cleanup. Exposing a removeDynamicTypeInfo method from the ProgramState does not seem to be elegant, but it would work.

xazax.hun updated this revision to Diff 33754.Sep 1 2015, 4:37 PM

Simplifications to make the code easier to maintain.

xazax.hun marked 2 inline comments as done.Sep 8 2015, 12:22 PM

There are several fallouts from this review, so I decided to split this patch up the following way:

  1. I created a patch to incorporate the result of this review into ObjCGenericsChecker: http://reviews.llvm.org/D12701
  2. I will created a separate patch to purge the dynamic type information from the GDM for dead symbols.
  3. Once the former two patch is accepted I will rebase this patch on the top of those, so this will only contain minimal changes required for the merge.

There are several fallouts from this review, so I decided to split this patch up the following way:

  1. I created a patch to incorporate the result of this review into ObjCGenericsChecker: http://reviews.llvm.org/D12701
  2. I will created a separate patch to purge the dynamic type information from the GDM for dead symbols.

The second patch is available here: http://reviews.llvm.org/D12767

  1. Once the former two patch is accepted I will rebase this patch on the top of those, so this will only contain minimal changes required for the merge.
xazax.hun updated this revision to Diff 34606.Sep 11 2015, 5:09 PM
  • Rebased on the top of latest trunk (which contains patch #1 and patch #2 from my previous comments)
zaks.anna accepted this revision.Sep 12 2015, 10:41 AM
zaks.anna edited edge metadata.

Please, commit after the comments are addressed!

Thank you,
Anna.

lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
39 ↗(On Diff #34606)

I think this needs more explanation. We already have another map from a symbol to it's type. What is different about this one?

Maybe add something along these lines:
The type inflation is tracked by DynamicTypeMap.This is an auxiliary map that tracks more information in some cases where the known specialized type is an ancestor of the most precise type.

Maybe the name of the map could be enhanced as well? Now you have TypeParamMap and DynamicTypeMap..

121 ↗(On Diff #34606)

It's odd that we are using this API.. Are we keeping track of non-symbolic regions? If not, can't we just check if Region->getSymbol() is dead?
(Same in the nullability checker.)

275 ↗(On Diff #34606)

I do not see where you are checking the precondition.

This revision is now accepted and ready to land.Sep 12 2015, 10:41 AM
xazax.hun marked 3 inline comments as done.Sep 13 2015, 4:00 PM
xazax.hun added inline comments.
lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
121 ↗(On Diff #34606)

The DynamicTypeMap might contain non-symbolic regions. (The tests fails when I cast the regions stored in the map to symbolic regions.)

This observation is good however for the nullability checker.

275 ↗(On Diff #34606)

The check is in line 507. It is checked anyways for the generics checker and I did not want to add redundant checks to this function.

This revision was automatically updated to reflect the committed changes.