This is an archive of the discontinued LLVM Phabricator instance.

[CFLAA] Cleaned up StratifiedAttrs handling
ClosedPublic

Authored by grievejia on Jun 4 2016, 9:49 PM.

Details

Summary

This revision patched up three different things regarding StratifiedAttrs.

  1. There are two overloaded version of StratifiedSets::noteAttribute(): one version accepts an unsigned index as its second argument, and the other one accepts a bitset. The two overloads perform slightly different operations. For some reason, when StratifiedSets::noteAttribute() is invoked with an unsigned integer (the intention was to invoke the first overloaded version), my toolchain (clang-3.8 + libstdc++-5.3) always construct a bitset out of the given integer and invoke the second overloaded version. The wrongly resolved overload function call leads to subtle error in StratifiedAttrs computation, and the error is not caught by any of the existing regression tests.

To eliminate the potential ambiguity in overload resolution, the patch gets rid of the first overloaded version and make every caller of noteAttribute() to call the second overloaded version instead.

  1. Previously, AttrUnknown and AttrAll were treated in the same way. There was no practical differences between the two. This patch removes AttrAll competely and relies on AttrUnknown to label values that are not understood by CFLAA.
  1. A new StratifiedAttr, called "AttrEscaped", is added. The motivation for this attribute is to make the analysis more precise by distinguish between "values that escape the current function" and "values that do not escape the current function". The major difference between the two classes of values is that the former class does not alias global/argument/unknown values the the latter class may alias them. Consider this example:

    define void @foo() { %a = alloca i32 %b = alloca i32 %i = ptrtoint %a ... }

Here we cast %a to an integer and thus let it escape from the world that CFLAA can understand. CFLAA was conservative on cases like this: as long as a value escapes, the value will be marked as unknown. Therefore the analysis concludes that %a and %b may alias each other (but it is obvious from the function body that they don't).

We can be more aggressive about it because an escaped value isn't completely opaque to CFLAA: the analysis still keeps track of the value. A value escapes only means that it is possible for other opaque values to alias it.

This patch uses AttrEscaped for ptrtoint only. Our ultimate goal is to improve upon interprocedural case like this:

; Assume @external_func is opaque in the current translation unit
declare @external_func(i32*, i32*)

define void @foo() {
  %a = alloca i32
  %b = alloca i32
  call void @external_func(i32* %a, i32* %b)
  ret void
}

Here %a and %b escaped @foo yet they should not be aliases. CFLAA currently cannot prove this. The idea is to mark %a and %b as AttrEscaped, but we need additional mechanisms to guarantee soundness -- they will be added in subsequent patches.

Diff Detail

Repository
rL LLVM

Event Timeline

grievejia updated this revision to Diff 59657.Jun 4 2016, 9:49 PM
grievejia retitled this revision from to [CFLAA] Cleaned up StratifiedAttrs handling.
grievejia updated this object.
grievejia updated this object.
grievejia added a subscriber: llvm-commits.
grievejia updated this object.Jun 4 2016, 9:53 PM

Thanks for the patch!

A few high-level things:

  • This needs tests. Ideally at least one for each logical change that's being made (though I'm not sure how testable the overload-related issue would be). :)
  • We generally request that people try to make reviews as targeted as possible. So, if you could please split patches like these out into N different reviews in the future, that would be great!

There are two overloaded version of StratifiedSets::noteAttribute

One is noteAttribute, the other is noteAttributes, so I don't think overload resolution should be kicking in at all here. :)

lib/Analysis/CFLAliasAnalysis.cpp
1100 ↗(On Diff #59657)

This looks a bit subtle. I think we can make it less so if we move the index number check above this. If we do that, it seems that we can simplify this entire if/else chain to something like

if (a.none() || b.none() || (a == Escaped && b == Escaped))
  return NoAlias;
return MayAlias;

Right?

1102 ↗(On Diff #59657)

Style nit:

if (foo)
  return bar;
if (baz)
  ...

Is preferred over

if (foo)
  return bar;
else if (baz)
  ...
1105 ↗(On Diff #59657)

AttrsA.test(AttrEscapedIndex) && AttrsB.test(AttrEscapedIndex)?

Either way, this looks like it'll hand us NoAlias if I have e.g. two sets marked with every StratifiedAttr except AttrUnknown.

lib/Analysis/StratifiedSets.h
436 ↗(On Diff #59657)

Does Link.setAttr have any uses aside from this one? If not, we should remove it, too. :)

grievejia added inline comments.Jun 6 2016, 2:29 PM
lib/Analysis/CFLAliasAnalysis.cpp
1100 ↗(On Diff #59657)

If both a.none() and b.none() are true, we still need to check SetA.Index == SetB.Index, right?

1105 ↗(On Diff #59657)

If we go with Attrs.test(AttrEscapedIndex), the problem is that we may miss the check for global/argument attrs. They are essentially the same as AttrUnknown except that we are aware why they are unknown.

Maybe I'll restructure the checks a little bit here and try to merge the check of global/argument/AttrUnknown together, to make it more clear.

lib/Analysis/CFLAliasAnalysis.cpp
1100 ↗(On Diff #59657)

Yup! Which is why I said we should move the index check above this. :)

grievejia updated this revision to Diff 59791.Jun 6 2016, 3:03 PM
grievejia edited edge metadata.
grievejia marked 3 inline comments as done.

Updated aliasing logic.

Test case added.

  • This needs tests. Ideally at least one for each logical change that's being made

This is my mistake. I have no excuse for it :(

  • We generally request that people try to make reviews as targeted as possible. So, if you could please split patches like these out into N different reviews in the future, that would be great!

OK I'll try to do that in the future.

There were two reasons why I did this giant patch: (1) I thought they are all related to StratifiedAttrs and share a common theme. (2) I haven't figure out how to work on patches that depends on each other, so I tend to use giant patch to minimize inter-patch dependency. Say I break this patch into two pieces P0 and P1, where P1 depends on P0. If for some reasons P0 get changed before it is committed in-tree, is there a clean way I can propagate the changes to P1 as well?

One is noteAttribute, the other is noteAttributes, so I don't think overload resolution should be kicking in at all here. :)

No wonder!

Then the problem could have been solved in a far less intrusive manner. Well, I'll just leave the changes as-is because I'm too lazy to manually revert all those changes...

grievejia added inline comments.Jun 6 2016, 3:21 PM
lib/Analysis/CFLAliasAnalysis.cpp
1104–1111 ↗(On Diff #59791)

Aha, I see what you are saying. But I'm not sure moving the index check will simplify the logic here.

In general we have four types of sets: (1) AttrNone (2) AttrUnknown is set (3) only AttrEscaped is set (4) Any global/argument attr is set.

Let's perform a case-by-case analysis:

  • (1) x (1): index check
  • (1) x (2): NoAlias
  • (1) x (3): NoAlias
  • (1) x (4): NoAlias
  • (2) x (2): MayAlias
  • (2) x (3): MayAlias
  • (2) x (4): MayAlias
  • (3) x (3): index check
  • (3) x (4): NoAlias
  • (4) x (4): MayAlias

The puzzle here is to figure out the clearest way to express the above logic. I agree with you that I may not have the best answer, but simply putting the index check on top is not a panacea either. (Or I misunderstood what you're saying...)

grievejia added inline comments.Jun 6 2016, 3:24 PM
lib/Analysis/CFLAliasAnalysis.cpp
1104–1111 ↗(On Diff #59791)

Oops, I made a mistake: (1) x (3) should be "index check" rather than "NoAlias". I'll update the codes accordingly.

grievejia updated this revision to Diff 59795.Jun 6 2016, 3:51 PM
grievejia marked an inline comment as done.

Updated aliasing logic

Looks good after a few more comments.

This is my mistake. I have no excuse for it :(

If no one made mistakes, code reviews wouldn't exist. Just do your best. :)

Say I break this patch into two pieces P0 and P1, where P1 depends on P0. If for some reasons P0 get changed before it is committed in-tree, is there a clean way I can propagate the changes to P1 as well?

I use git branches. Whenever I change branch P0, I swap to P1 and rebase. It's not the prettiest solution, and sometimes requires effort, but it's the best thing I know of. If you use svn, I can't help you; my svn directory literally only exists so I can throw a patch on it and commit said patch.

Well, I'll just leave the changes as-is because I'm too lazy to manually revert all those changes

Yeah, the refactored code seems less error-prone anyway, so I'm okay with that.

lib/Analysis/CFLAliasAnalysis.cpp
116 ↗(On Diff #59795)

Style nits: attr -> Attr, and we generally try to keep "static" functions static and out of unnamed namespaces.

117 ↗(On Diff #59795)

Should be return attr.none() || attr == AttrEscaped;?

1095–1114 ↗(On Diff #59795)

The reason I'm fond of moving the index check up is that it obviates the need for "index check" in your table. Those all get to turn into NoAliases, and it's one less thing to worry about when reading this attribute logic. :)

Either way, I didn't think the (3) x (4) case was NoAlias when I originally commented (...though I suspect we'll need to make it MayAlias in the near future, but we'll get there when we get there), so this is a bit more complex than I originally thought. What you have now seems correct to me, so I'm happy with it.

1113 ↗(On Diff #59795)

Nit: No else. :)

grievejia updated this revision to Diff 59884.Jun 7 2016, 7:17 AM
grievejia marked 5 inline comments as done.

Stylish correction.

... And yet another aliasing logic update.

grievejia added inline comments.Jun 7 2016, 7:22 AM
lib/Analysis/CFLAliasAnalysis.cpp
1099–1117 ↗(On Diff #59884)

Can you explain your suspicion that "we'll need to make it (3 x 4) MayAlias in the near future"? I thought the definition of AttrEscaped has already eliminate this possibility, since (3) only includes locally identifiable memory objects (allocas and mallocs) and those objects shouldn't alias any global/argument.

george.burgess.iv edited edge metadata.

LGTM -- will commit.

lib/Analysis/CFLAliasAnalysis.cpp
1099–1117 ↗(On Diff #59884)

It's my understanding that our plan is to eventually mark call args as AttrEscaped instead of AttrUnknown. With everything in its current state, it seems that doing so would break in a case like:

int *G;
// assume this is external to CFLAA
void f(uintptr_t A) {
  G = (int *)A;
}

void callF() {
  int S;
  int *P = &S;
  f((uintptr)P);
  int *PAlias = G;
}

If we're not doing this, then (3 x 4) can stay NoAlias. :)

This revision is now accepted and ready to land.Jun 7 2016, 11:32 AM
This revision was automatically updated to reflect the committed changes.
grievejia added inline comments.Jun 7 2016, 1:32 PM
lib/Analysis/CFLAliasAnalysis.cpp
1099–1117 ↗(On Diff #59884)

I see the problem here. Thanks for the example. It's very instructive.