This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Find better description for tracked symbolic values
ClosedPublic

Authored by vsavchenko on Apr 22 2021, 3:07 AM.

Details

Summary

When searching for stores and creating corresponding notes, the
analyzer is more specific about the target region of the store
as opposed to the stored value. While this description was tweaked
for constant and undefined values, it lacked in the most general
case of symbolic values.

This patch tries to find a memory region, where this value is stored,
to use it as a better alias for the value.

rdar://76645710

Diff Detail

Event Timeline

vsavchenko created this revision.Apr 22 2021, 3:07 AM
vsavchenko requested review of this revision.Apr 22 2021, 3:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2021, 3:07 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Minor fix in comments

Awesome!

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1529

I'd rather use identity here (and at line 1509) instead of foo, I think that could make this explanation easier to follow.

1532

So here, we have two or three bindings attached to c? foo(b) and a (and b as well) ?
What is the guarantee that FindUniqueBinding will return with the correct one foo(b)? Seems like we iterate through an ImmutableMap (AVL tree) with MemRegion* keys. And the order of the iteration depends on pointer values. What I want to express, is that I don't see how do we find the correct binding, seems like we just find one, which might be the one we look for if we are lucky.

Perhaps we could have a lit test for this example?

vsavchenko added inline comments.Apr 23 2021, 7:47 AM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1529

Good idea!

1532

Good idea, I should a test for that!
FindUniqueBinding does not only have a region, it also carries a flag checking if there are more than one binding. operator bool then checks for both, thus guaranteeing that we won't choose random binding out of more than 1 - we won't choose at all.

martong added inline comments.Apr 23 2021, 8:26 AM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1532

Yeah, I missed that about FindUniqueBinding.

It's just side questions then: Could we handle multiple bindings by finding the 'last' binding and tracking that back?
And in this example, what are the bindings for c?

vsavchenko added inline comments.Apr 23 2021, 8:33 AM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1532

Can you please elaborate on what you mean by 'last', like 'last used'?
As for c, you can see above that I'm looking for a node where 'c' is not bound to that value. So, we start with the node where 'a', 'b', and 'c' all bound to symbolic value... let's say 'x'. I go up the graph to the point where 'c' is not bound to 'x' and there I look for unique binding for 'x'. In that example, we won't find it because it has two: 'a' and 'b'.

NoQ added inline comments.Apr 25 2021, 11:50 PM
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
163–165

This isn't your code but it came to my attention that there's a convenient (but equally confusing) shortcut for this:

return N->getState()->getLValue(VD, N->getLocationContext());
clang/test/Analysis/uninit-const.cpp
78

I suspect that this wording can be improved a lot. The note "'p' initialized to the value of 't'" sounds like an accurate description for int p = t but not so much for int &p = t. Can we detect this case and say something like "'p' refers to 't'" instead?

Add one more testg covering the situation when we can't pick up that one region to associate with the value.

martong accepted this revision.Apr 26 2021, 3:27 AM

For me it looks good now.

The changes, however, seem to be unrelated to retain count checker. Do you think it would make sense to upstream this patch independently from the other 3 patches you have in your stack? That might require some changes in the test files though.

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1532

Thanks for the explanation!

Can you please elaborate on what you mean by 'last', like 'last used'?

Yes, I meant the 'last used', in this case it should be foo(b).

This revision is now accepted and ready to land.Apr 26 2021, 3:27 AM

Refactor the way we get Region for variable declaration

vsavchenko marked 4 inline comments as done.Apr 26 2021, 3:37 AM
vsavchenko added inline comments.
clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
1532

That would be perfect actually, but not foo(b), just b. Reporting foo(b) won't give more information than the code itself already provides.
However, I don't know a simple way to get region b from foo(b) expression (of course, we are talking about more general case than just getting the first argument of foo).

clang/test/Analysis/uninit-const.cpp
78

That's a good point, but it feels like a can miss some cases here and we need to handle this situation separately.

This is an improvement! Good job.
I had no time reviewing this, but I think it's already in a pretty good shape.

vsavchenko marked an inline comment as done.

Limit the number of cases where the new note refinement takes place

NoQ accepted this revision.Apr 26 2021, 8:43 PM

Awesome, thank you!