This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][WIP] Attempt to fix traversing bindings of non-base regions in ClusterAnalysis
AcceptedPublic

Authored by xazax.hun on Feb 12 2019, 7:26 AM.

Details

Reviewers
NoQ
Summary

Follow up for a problem described in https://reviews.llvm.org/D57230

I think the ClusterAnalysis is fundamentally flawed at this point and it does not support non-base regions properly. The reason we could get away with this so far is that it is very rare to have non-base regions there.

I made an attempt to fix this and I successfully fixes the false positive described by Artem in https://reviews.llvm.org/D57230.
While I am not completely sure yet whether this is the most desirable solution, I wanted to share the work in progress solution to get some feedback about the directions.

Diff Detail

Event Timeline

xazax.hun created this revision.Feb 12 2019, 7:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 12 2019, 7:26 AM
xazax.hun updated this revision to Diff 186489.Feb 12 2019, 8:52 AM
xazax.hun marked an inline comment as done.
xazax.hun edited the summary of this revision. (Show Details)
  • Fix test failures.
NoQ added a comment.Feb 15 2019, 5:50 PM

After staring at this for an hour or two, i didn't manage to force myself to understand how our cluster analysis works here, but i totally agree that it's most likely broken; i guess, we should eventually move away from the idea that everything works through base regions, but for now this idea seems to be hardcoded in a lot of places.

That said, the patch does fix the test case that i provided, but it doesn't fix my original regression :( so i'll run the reduce again to see if i have more to say. And i'll also try harder to understand what's going on.

lib/StaticAnalyzer/Core/RegionStore.cpp
1039

The patch introduces a whitespace error here :p

NoQ accepted this revision.Mar 6 2019, 5:44 PM

Ugh, it takes forever to debug, because it gets reduced to a true positive now, and the original unprunned report is ~1300 pieces long (the normal report being completely worthless) and the exploded graph is too slow to view. While i find myself parsing dot files with regular expressions, i'll be totally happy if you simply add the test and commit, but i'll definitely keep trying to understand what's going on, because it's still a problem down there.

This revision is now accepted and ready to land.Mar 6 2019, 5:44 PM
NoQ added a comment.Mar 20 2019, 8:38 PM

Ok, got it! Yeah, this sounds like a valid way of supporting non-base-region-based worklist items, i'm seeing no problems with it and i don't immediately see a solution that'd be better than gradually supporting non-base-regions in more and more places.

Here's the slight modification of the original test case (D57230#1389766) that my problem finally boiled down to:

typedef __typeof(sizeof(int)) size_t;
void *malloc(size_t);

struct S {
  int x; // Make sure ptr has a non-zero offset within baseR.
  int *ptr;
};

struct T {
  struct S s; // This is baseR.
};

void escape(struct S *);

void foo() {
  struct T t; // This is realBaseR.
  t.s.ptr = malloc(sizeof(int));
  escape(&t.s);
}

This happens because in this patch you're pin-pointing the value by trying to look it up via an exact binding key (up to default/direct), however it may also reside at a non-zero offset within baseR. In my case the offset was also symbolic - which kept me confused for a while because i was thinking that symbolic offsets are the root of all evil. I think you may be able to generalize the patch to cover this scenario by replacing the exact offset lookup with a collectSubRegionBindings() (it might be cute to make some sort of iterSubRegionBindings() that takes a lambda, and/or combine it with iterBindings()).

I'm still worried about D57230#1434161 though. It kinda sounds like a good idea to suppress store invalidation but not escaping, but it's a very annoying thing to implement because our invalidation traits are very inflexible. In order to express complex requirements like "please don't invalidate this region, except this sub-region", the whole data structure has to be changed. This is slightly similar to this problem we have in:

void escape(int *x, const int *y);
// ...
escape(&a, &a);

where the value of a would not be invalidated because y is a const pointer, even though it is passed through a non-const pointer x. @xazax.hun, do you have enough time/inspiration to dig that deeply into this stuff?

First of all, sorry for the inactivity regarding this patch.

In D58121#1437433, @NoQ wrote:

Ok, got it! Yeah, this sounds like a valid way of supporting non-base-region-based worklist items, i'm seeing no problems with it and i don't immediately see a solution that'd be better than gradually supporting non-base-regions in more and more places.

Here's the slight modification of the original test case (D57230#1389766) that my problem finally boiled down to:

typedef __typeof(sizeof(int)) size_t;
void *malloc(size_t);

struct S {
  int x; // Make sure ptr has a non-zero offset within baseR.
  int *ptr;
};

struct T {
  struct S s; // This is baseR.
};

void escape(struct S *);

void foo() {
  struct T t; // This is realBaseR.
  t.s.ptr = malloc(sizeof(int));
  escape(&t.s);
}

This happens because in this patch you're pin-pointing the value by trying to look it up via an exact binding key (up to default/direct), however it may also reside at a non-zero offset within baseR. In my case the offset was also symbolic - which kept me confused for a while because i was thinking that symbolic offsets are the root of all evil. I think you may be able to generalize the patch to cover this scenario by replacing the exact offset lookup with a collectSubRegionBindings() (it might be cute to make some sort of iterSubRegionBindings() that takes a lambda, and/or combine it with iterBindings()).

Oh, I see. Thank you for tracking this down! Your proposed solution sounds reasonable to me. I hope there will be no significant performance hit.

I'm still worried about D57230#1434161 though. It kinda sounds like a good idea to suppress store invalidation but not escaping, but it's a very annoying thing to implement because our invalidation traits are very inflexible. In order to express complex requirements like "please don't invalidate this region, except this sub-region", the whole data structure has to be changed. This is slightly similar to this problem we have in:

void escape(int *x, const int *y);
// ...
escape(&a, &a);

where the value of a would not be invalidated because y is a const pointer, even though it is passed through a non-const pointer x. @xazax.hun, do you have enough time/inspiration to dig that deeply into this stuff?

I totally agree that invalidation and escaping are orthogonal. I really glad you found this example, I think you are right, the current data structure is not flexible enough. Unfortunately, I have some unrelated deadlines coming up, so it is very likely that I can only start working on this problem around early May. If you have some time to work on these problems feel free to commandeer this patch and I will definitely try to find time to review your patches.

NoQ added a comment.EditedApr 2 2019, 7:06 PM

Mmm. I'm also pretty pinned down (it's seasonal), so i'm thinking of temporarily reverting :( until one of us gets to fixing the accidental effect on escaping, 'cause it just keeps biting us. Like, i wholeheartedly appreciate the work and i have already noticed a few times how it makes things better and loved it, it just seems to accidentally have something missing that nobody could predict, and i'll be super eager to get it back in.

In D58121#1452483, @NoQ wrote:

Mmm. I'm also pretty pinned down (it's seasonal), so i'm thinking of temporarily reverting :( until one of us gets to fixing the accidental effect on escaping, 'cause it just keeps biting us. Like, i wholeheartedly appreciate the work and i have already noticed a few times how it makes things better and loved it, it just seems to accidentally have something missing that nobody could predict, and i'll be super eager to get it back in.

Sounds perfectly reasonable. I was also thinking about suggesting a temporary revert.

NoQ added a comment.EditedApr 3 2019, 11:27 AM

Reverted D57230 in rC357620.
(also added that offending test case with lists)