This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Track runtime types represented by Obj-C Class objects
ClosedPublic

Authored by vsavchenko on Apr 16 2020, 2:47 AM.

Details

Summary

Objective-C Class objects can be used to do a dynamic dispatch on
class methods. The analyzer had a few places where we tried to overcome
the dynamic nature of it and still guess the actual function that
is being called. That was done mostly using some simple heuristics
covering the most widespread cases (e.g. [[self class] classmethod]).
This solution introduces a way to track types represented by Class
objects and work with that instead of direct AST matching.

rdar://problem/50739539

Diff Detail

Event Timeline

vsavchenko created this revision.Apr 16 2020, 2:47 AM

Fix unfinished comment

NoQ added inline comments.Apr 16 2020, 3:42 AM
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
135

Let's explain what true means here, i.e. /*Precise=*/true.

163

I suspect that self, unlike this, can be reassigned in the middle of the method.

164–165

That's a good place for const auto *.

208–211

When exactly does this situation happen? Can we predict it during pattern-matching instead of patching it up after the fact?

227–228

This needs to be filled in for the new trait as well! It's very important :3

334–335

Looks like a missed word somewhere.

clang/lib/StaticAnalyzer/Core/DynamicType.cpp
37

I'd rather word this as "pointed-to type".

vsavchenko marked 6 inline comments as done.Apr 16 2020, 3:53 AM
vsavchenko added inline comments.
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
135

Sure

163

That's true, and we didn't account for that in here. I'll try to make a test on that (and maybe re-order how we handle these cases a bit

164–165

Right, didn't pay much attention to this moved code

227–228

Whoops! I forgot!
Good catch!

334–335

OK, I'll try to re-write to make it more clear

clang/lib/StaticAnalyzer/Core/DynamicType.cpp
37

Sure

vsavchenko marked 8 inline comments as done.

Fix issues pointed by @NoQ.

clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
227–228

But it looks like that info was not removed from the state neither for dynamic types, nor for dynamic casts (see the diff)

NoQ added inline comments.Apr 20 2020, 3:26 AM
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
113

static?

134

These comments will never be parsed by doxygen, i don't think those \code thingies are needed here. Or is it some editor/IDE-specific thingies that i'm not educated about?

193

Why are we discarding CanBeSubClassed here?

198

I believe this is pretty much always the case. At least whenever getInstanceReceiver() is available. Another exception seem to be when ReceiverSVal is an UnknownVal (in this case self is going to be SymbolRegionValue because it's never set in the Store), but that's it. I inferred this by looking at ObjCMethodCall::getInitialStackFrameContents().

I think we should have used getSelfSVal() to begin with.

919

I don't think this line of code does anything.

(time for some LLVM_NODISCARD???)

clang/lib/StaticAnalyzer/Core/CallEvent.cpp
1219–1226

Before i forget: Stuff that's shared across analyses usually lives in AnalysisConsumer. Cf. FunctionSummaries. This is the intended way of avoiding globals in such cases.

1293–1319

Wait, how do we get a decl here that's anyhow relevant if the compiler doesn't even know that it's a class method?

clang/lib/StaticAnalyzer/Core/DynamicType.cpp
85–87

My favorite idiom for this stuff so far:

const DynamicTypeInfo *DTI = State->get<DynamicClassObjectMap>(Sym);
return DTI ? *DTI : {};

(not sure ?: will typecheck correctly in case of {} though)

146

Feel free to rename isLiveRegion instead if it helps :)

vsavchenko marked 7 inline comments as done.Apr 20 2020, 4:04 AM
vsavchenko added inline comments.
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
113

It is all inside of anonymous namespace, so no need in putting static here.

134

No, you are correct. It's more like a matter of habit and consistent look of the comment. I would anyway prefer to somehow tell the reader that this is a snippet. It can be with a more or less familiar way of doing it. (I've seen a good amount of doxygen-style comments for static functions in the project, I guess similar logic applies to those as well)

193

Good point, we should propagate it further

198

I believe this is pretty much always the case.

I didn't quite get what you mean by that

919

Whoops, that's true!

clang/lib/StaticAnalyzer/Core/CallEvent.cpp
1219–1226

We can re-do it in a separate commit I guess.

1293–1319

Compiler knows it, but still marks it as an instance method (because technically self is an object).

NoQ added inline comments.Apr 20 2020, 4:33 AM
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

What i'm trying to say is that C.getSVal(RecE) and Message.getSelfSVal() and Message.getReceiverSVal() are basically the same SVal. It shouldn't be necessary to check both or check whether they're the same; you must have meant to check for something else, probably something purely syntactic.


I inferred this by looking at ObjCMethodCall::getInitialStackFrameContents().

Wait, so it's only true for inlined methods. For non-inlined methods getSelfSVal() will be unknown :/

vsavchenko marked an inline comment as done.Apr 20 2020, 5:25 AM
vsavchenko added inline comments.
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

Yeah, that might be a bit extraneous to do it with SVals, but this code for sure does its job (it is definitely not a redundant check). getSelfSVal() returns receiver of the function containing the call and not the call itself. So, it does check if we the receiver of the message is self.

I changed it to this way of doing things because it is consistent with how the same thing is done in getRuntimeDefinition.

NoQ added inline comments.Apr 20 2020, 7:29 AM
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

getSelfSVal() returns receiver of the function containing the call and not the call itself

😱 looks broken to me.

NoQ added inline comments.Apr 20 2020, 10:05 AM
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

Let's rename getSelfSVal() so that it screamed that it's the callee's self as opposed to the caller's self, and explain in a long comment why do we even care about the caller's self. I.e., that we heuristically assume that if a class method jumps into another class method on the same class object, it's going to be devirtualized to the same class. Which isn't always the case, hence !Precise.

vsavchenko marked 9 inline comments as done.

Fix review remarks

clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

I don't really think that it's a good idea to mix these two worlds:

  • world one - interface function that allows to get an SVal for self of the containing method. It does exactly this, for no specific reason. I'm on board with renaming, but we need to come up with an appropriate name that describes what it gives and not why.
  • world two - use-case of this interface method that tries to figure out the type of the receiver (for devirtualization purposes or not).

So, the comment can be only here. I agree, I can add more explanation about what we are doing in this particular piece of code, but it won't make any sense to add this (or similar) comment for getSelfSVal().

clang/lib/StaticAnalyzer/Core/DynamicType.cpp
85–87

Yep, initializer lists are not allowed in ternary operators.

146

Unfortunately, it is named a bit differently for a reason, - compiler couldn't decide which function to use bool isLive(const MemRegion *region) or bool isLive(const VarRegion *VR, bool includeStoreBindings = false) const for a call isLive(VarRegion *) (even though it's pretty obvious). Splitting isLive for VarRegion into two functions doesn't seem like an easy one. The only option I can see is the introduction of isLiveImpl (not very pretty in a header file) and two functions isLive(const VarRegion*) const and isLiveIncludingStoreBindings(const VarRegion *) const.

NoQ accepted this revision.Apr 23 2020, 7:07 AM

Let's land this, I guess? Fantastic work!

clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

Ideally i'd detach getSelfSVal() from CallEvent entirely. Like, it doesn't even depend on the call site, why would it be part of CallEvent to begin with? This additionally takes care of world one.

so that it screamed that it's the callee's self as opposed to the caller's self

Mmm, the opposite, of course. getCallerSelfSVal()?

clang/lib/StaticAnalyzer/Core/DynamicType.cpp
146

Or rename isLive(const VarRegion *, bool includeStoreBindings)? (I suspect this function should really be private anyway)

This revision is now accepted and ready to land.Apr 23 2020, 7:07 AM
vsavchenko marked 2 inline comments as done.Apr 23 2020, 8:04 AM

Let's land this, I guess? Fantastic work!

Thanks :-)

clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

Where do you think it should be?

My vote goes to location context.

clang/lib/StaticAnalyzer/Core/DynamicType.cpp
146

I don't know about that, it doesn't seem like a good change. I think that all of them should be isLive

NoQ added inline comments.Apr 23 2020, 8:17 AM
clang/lib/StaticAnalyzer/Checkers/DynamicTypePropagation.cpp
198

As of now LocationContext is an entity so low-level that it doesn't even live in libStaticAnalyzer*, so it can't be made aware of SVals.

Given that it's basically a Store lookup, i'd recommend putting it into ProgramState. I.e., State->getSelfSVal(LCtx);.

Move getSelfSVal from CallEvent to ProgramState

NoQ added a comment.Apr 29 2020, 2:44 AM

Perfect! Please get yourself some commit access already -__-

This revision was automatically updated to reflect the committed changes.