This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Toning down invalidation a bit
ClosedPublic

Authored by xazax.hun on Jan 25 2019, 4:42 AM.

Details

Summary

This is a patch for the following discussion on the mailing list: http://clang-developers.42468.n3.nabble.com/analyzer-Toning-down-invalidations-td4058816.html

The consensus is, while this approach might temporarily increase the false positive rate a bit systems of mutually-canceling bugs are worth untangling.
Most of the extra results are not false positive due to the less invalidation but other reasons, so we could focus on those problems instead of them being hidden.

Another consideration is that we actually introduce a new class of false positives, when a function is doing an offset trickery with fields. I think such functions should have a special annotation to suppress such false positives.

There are a few questions regarding that:

  • What should be the spelling of such an annotation?
  • How to handle indirect calls? Even if we were in an ideal world where all the callees are annotated, we might not know who the actual callee is (function pointers, virtual calls etc). So what should we do? Less or more invalidation for all indirect calls or have a separate mechanism to let the user define at the call site how to handle a specific call?

Some other ideas from Artem on the mailing list:

  • Relaxing the C++ container inlining heuristic, i.e. replacing it with visitor-based suppressions, so that to still enjoy the benefits of inlining. This will also likely to result in less invalidation, but it could have severe effect on how and where we spend our budget (given the complexity of STL implementations for performance tuning).
  • It shouldn't be all that hard to model extents of bindings within RegionStore, so that bindings to sub-structures didn't overwrite bindings to super-structures simply because they have the same base region and the same offset. The only problem here is to model extents of *integers* because we don't represent casts as part of SymbolRefs. All other sorts of SVals have well-defined extents (including, say, lazy compound values).

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun created this revision.Jan 25 2019, 4:42 AM
xazax.hun edited the summary of this revision. (Show Details)Jan 25 2019, 4:44 AM
Szelethus accepted this revision.Jan 25 2019, 5:53 AM
Szelethus edited reviewers, added: NoQ; removed: dergachev.a.
Szelethus added a subscriber: NoQ.

Let's also have a link to your cfe-dev mail in this patch: http://lists.llvm.org/pipermail/cfe-dev/2019-January/060968.html

Overall, I like this quite a bit, as I personally experienced the consequence of this invalidation technique while developing UninitializedObjectChecker. I'd be interested to see how many more reports will it emit with this patch, I have a suspicion that it'll be very significant.

Let's wait for what @NoQ thinks of this patch.

What should be the spelling of such an annotation?

How about these: uses_offsetof, may_use_offsetof?

How to handle indirect calls? Even if we were in an ideal world where all the callees are annotated, we might not know who the actual callee is (function pointers, virtual calls etc). So what should we do? Less or more invalidation for all indirect calls or have a separate mechanism to let the user define at the call site how to handle a specific call?

Hmm, I suspect that such functions are few and far in between, so maybe the inconvenience of annotating calls to a function through a function pointer that may use offsetof can be justified.
For virtual methods, I guess we can't expect the user to always have the ability of adding the annotation to first base where the virtual method is declared, and not even CTU can ensure that we can scan all methods that implement it. Maybe for these very exceptionally rare cases, annotating the actual calls to the functions would also be justified.

Sadly, I can't say anything meaningful about Artem's ideas on top of my head. :)

test/Analysis/cxx-uninitialized-object.cpp
371 ↗(On Diff #183516)

I bet that the current invalidation technique crippled much of this checker's capabilities, so I'm happy to see it change. Hooray!

This revision is now accepted and ready to land.Jan 25 2019, 5:53 AM
NoQ added a comment.Jan 25 2019, 6:10 PM

Thanks! A surprise, to be sure :) I'll try to test this on my set of projects as well :)

Most of the extra results are not false positive due to the less invalidation but other reasons, so we could focus on those problems instead of them being hidden.

Could you share reproducible examples for these, probably in the form of FIXME tests? Given that they are "regressions", they are easy to creduce down to a small repro by using the test "there is still a change in behavior on this file".

lib/StaticAnalyzer/Core/CallEvent.cpp
320–321 ↗(On Diff #183516)

I suspect that the trait for non-base MR would never be read. The only place where this trait is accessed is in RegionStore.cpp where it asks whether the trait is applied to a cluster base, which is always a base region.

test/Analysis/call-invalidation.cpp
146 ↗(On Diff #183516)

Let's leave at least one positive check around, eg. demonstrate that invalidation does happen for s1.y here.

xazax.hun marked 2 inline comments as done.Jan 26 2019, 7:43 AM
In D57230#1372275, @NoQ wrote:

Could you share reproducible examples for these, probably in the form of FIXME tests? Given that they are "regressions", they are easy to creduce down to a small repro by using the test "there is still a change in behavior on this file".

I think the most common cause of false positives is infeasible paths. Do you have success reducing false positives using creduce? My problem usually is that we cannot tell if a reduction rendered a false positive into a true positive.

lib/StaticAnalyzer/Core/CallEvent.cpp
320–321 ↗(On Diff #183516)

I see some test failures when I always used the base region. I suspect the reason is that InvalidateRegionsWorker::AddToWorkList will add the region itself instead of the base region when TK_DoNotInvalidateSuperRegion is set. So if we only set the TK_PreserveContents trait for the base region InvalidateRegionsWorker::VisitCluster will not see the TK_PreserveContents trait.

In fact, the naming of regions in the those functions are very confusing. Even though the formal paramter is called baseR, my suspicion is that, we might visit non-base regions (due to the TK_DoNotInvalidateSuperRegion trait).

xazax.hun updated this revision to Diff 183702.Jan 26 2019, 7:44 AM
  • Added some tests
NoQ added a comment.Jan 26 2019, 9:27 AM
In D57230#1372275, @NoQ wrote:

Do you have success reducing false positives using creduce? My problem usually is that we cannot tell if a reduction rendered a false positive into a true positive.

False positives - no. Improvements and regressions - totally! Just run two different clangs in the creduce test and check that there's a difference in results.

In D57230#1372523, @NoQ wrote:
In D57230#1372275, @NoQ wrote:

Do you have success reducing false positives using creduce? My problem usually is that we cannot tell if a reduction rendered a false positive into a true positive.

False positives - no. Improvements and regressions - totally! Just run two different clangs in the creduce test and check that there's a difference in results.

Oh, I see. Great idea, I never did this. Will look into it.

I tried to creduce one file where the result differed and this is the result:

typedef struct {
  int a;
  int b
} c;
d;
e(c *f) {
  d < f->a;
  c g;
  h(&g.b);
  e(&g);
}

I think this the core idea is quite straightforward but this example is a bit convoluted due to the recursion. I do not see any value of adding this to the regression tests as this case is already covered there. Do you think I should try to reduce additional files?

I'm in favor of this change, I never understood how invalidating a field invalidates entire structure.

NoQ accepted this revision.Jan 28 2019, 2:36 PM

Do you think I should try to reduce additional files?

Aha, ok, it reduced an interesting positive into a non-interesting positive. So i guess my method only works when you're catching changes that are more unexpected than this one :) Ok, nvm then, thank you for trying this out! I didn't have time to evaluate this change yet, but that definitely shouldn't block you from committing.

I think this the core idea is quite straightforward but this example is a bit convoluted due to the recursion.

Hint: You can almost always "unroll" recursions or loops in reduced tests by looking at the Exploded Graph and figuring out how many times were they actually executed before the bug was found and then copy-paste-ing the code that many times.

lib/StaticAnalyzer/Core/CallEvent.cpp
320–321 ↗(On Diff #183516)

Aha, yeah, you're right! The word "Cluster" is also confusing because it is usually used for ClusterBindings :)

This revision was automatically updated to reflect the committed changes.

Thanks for all the reviews. Do you have any preference about the spelling of the annotation mentioned in the description?

There were two ideas so far: uses_offsetof, may_use_offsetof

While I like those, I wonder if it is a good idea to have offsetof in the name. One might use other methods, e.g. cast the address of the first field to a pointer to struct to access other members.

NoQ added a comment.Feb 5 2019, 11:47 AM

Hmm. writes_to_superobject?

Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2019, 11:47 AM
NoQ added a comment.Feb 6 2019, 2:33 PM

There seem to be a few regressions - weird memory leaks of inner objects in C++ destructors. Trying to investigate/reproduce.

xazax.hun added a comment.EditedFeb 7 2019, 1:48 AM
In D57230#1387834, @NoQ wrote:

There seem to be a few regressions - weird memory leaks of inner objects in C++ destructors. Trying to investigate/reproduce.

Oh, that is unfortunate. Feel free to share a repro as soon as you have one and I will also try to look into it. The last time I saw strange leaks were due to the fact we did not invalidate symbols after unmodelled casts. But if these warnings were introduced by this change it is probably something else.

NoQ added a comment.Feb 7 2019, 3:09 PM

That's the one:

typedef __typeof(sizeof(int)) size_t;
void *malloc(size_t);

void escape(int **);

struct S {
  int *ptr;
};

void foo() {
  struct S s1;
  s1.ptr = malloc(sizeof(int));
  escape(&s1.ptr);
}

After the patch the allocated symbol no longer escapes. It didn't end up having much to do with destructors. I'll also think about it a bit more.

I think I might have a theory, but I would like to discuss it as I am not familiar with the internals bindings.

My theory is the following: when we store the bindings, we store them in a map where the key is a base region.
So when we try to look the bindings up with a non-base region, we will not get any bindings.

So in our current case, we end up having a non-base region in the worklist of InvalidateRegionsWorker.
ClusterAnalysis::RunWorkList will look up the cluster for the non-base region.
Without a cluster found we will not visit the bindings. With not visiting the bindings, we will not invalidate the symbols.
With no symbols to invalidate, the checkers will not get notified.

I think the whole ClusterAnalysis is flawed at this point. Most of the code expects to only see base regions, but some code paths might end up adding non-base regions.

So the question is, what should be the proper way to handle the TK_DoNotInvalidateSuperRegion trait?
Maybe we should always look up the bindings using the base region. But if we do, should we actually visit all of the bindings?

I did not have time yet to play with the possible solutions and will come back to this problem soon, just wanted to write down what I got so far.

Experimental patch is up in https://reviews.llvm.org/D58121
Unfortunately, it is not perfect yet.

NoQ added a comment.Mar 18 2019, 5:32 PM

Hmm, here's another one:

struct ListInfo {
  struct ListInfo *next;
};

struct X {
  struct ListInfo li;
  int i;
};

void list_add(struct ListInfo *list, struct ListInfo *item);

void foo(struct ListInfo *list) {
  struct X *x = malloc(sizeof(struct X));
  list_add(list, &x->li); // will free 'x'.
}

People are C-style-inheriting from a list item base, and are then happy to release the memory through a pointer to a field. Now we're reporting a memory leak on such code.

It looks as if we should have somehow disabled invalidation but not pointer escape for the base region.