This is an archive of the discontinued LLVM Phabricator instance.

[CFLAA] Add yet another StratifiedAttr
ClosedPublic

Authored by grievejia on Jun 7 2016, 5:14 PM.

Details

Summary

First of all, this patch fixed the issue with AttrEscaped. Now AttrEscaped always may-alias AttrGlobal/AttrArgXXX.

To compensate for the precision loss introduced by the fix, we add yet another StratifiedAttr called "AttrExternal". This is an attribute designed specifically for global values and argument values, to distinguish the values themselves from the memory they point to (which are still tagged with "AttrGlobal"/"AttrArgXXX" respectively). The biggest difference between AttrExternal and AttrGlobal/AttrArgXXX/AttrUnknown is that the former does not alias AttrEscaped value, while the latter may.

Here's an example:

declare @external(i32*)
global @g = i32 0

define void @foo(i32* %x) {
  %a = alloca i32
  call @external(i32* %a)
  ret void
}

Let's assume for now that we tag %a as AttrEscaped (this hasn't happened yet but will happen in the next patch). Since %a is an identifiable stack allocation, neither %x nor @g can alias it.

  • %x cannot alias %a because %a is allocated after %x therefore there's no way %x can be assigned the value of %a without pointer-guessing, which is UB.
  • @g cannot alias %a because @g is also an identifiable allocation. It is also a global, therefore a constant, whose value cannot be changed to the value of %a

Of course, if we try to load from @g or %x, the result may alias %a. However, if that happens, the analysis will tag the loaded value as AttrUnknown, hence it is still sound.

Diff Detail

Event Timeline

grievejia updated this revision to Diff 59973.Jun 7 2016, 5:14 PM
grievejia retitled this revision from to [CFLAA] Add yet another StratifiedAttr.
grievejia updated this object.
grievejia added a subscriber: llvm-commits.

Tests, please. :)

Also, can you please go over how this doesn't make arg/global attributes basically redundant? Any set tagged with External attributes will have all sets below it tagged with External. So, it looks like this change would make it more difficult to handle interprocedural queries where we know the source of arguments.

lib/Analysis/CFLAliasAnalysis.cpp
658–665

Nit: Remove 'says ', 'comes' -> 'come'

662

'comes' -> 'come'

663

'points' -> 'point'

1014

ISTM we would end up with different attributes for the StratifiedSets in cases like:

void foo(int **a) {
  int **aalias = a;
  int *b = *a;
}

and

void foo(int **a) {
  int **aalias = a;
  int *b = *aalias;
}

...Since, when analyzing *a, MaybeCurAttr is marked as External (and Arg #0), whereas when looking at *aalias, we won't mark any attributes. After set finalization/etc, the former code sample will end with two sets, each with External and Arg #0 attrs, but the latter will produce two sets with only the External attr set.

Is this intentional?

1037–1038

Is there a reason that we don't honor the noalias attribute anymore here?

1120–1122

"like external memory or an integer cast"

Tests, please. :)

Sorry I should've mentioned it in the description. The test I uploaded with the last patch (attr-escaped.ll) should serve the purpose. There's a CHECK on an AttrEscaped value and an AttrExternal value in that test.

Also, can you please go over how this doesn't make arg/global attributes basically redundant?

Arg/Global attributes are practically identical to AttrUnknown at this moment, except that we know a little bit more on their origins. Arg/Global/Unknown all represent "values that we don't know where they come from". They may alias each other in arbitrary way. Basically, everything we load through AttrExternal values should be put in this class.

AttrExternal, on the other hand, is only used to tag global values/arguments themselves. say we have this function:

define @foo(i32** %x) {
  %y = load i32*, i32** %x
  %z = alloca i32
  ...
}

Then %x will be marked AttrExternal, and %y will be marked AttrArg0 (which if functionally equivalent to AttrUnknown). The motivation for separating AttrArg and AttrExternal is that if we don't, CFLAA will say that if %z escapes then %z may alias %x, which could be made less conservative.

Any set tagged with External attributes will have all sets below it tagged with External.

Yes. But AttrGlobal/AttrArgXXX takes precedence over it. So I don't see AttrExternal breaks soundness here.

So, it looks like this change would make it more difficult to handle interprocedural queries where we know the source of arguments.

We didn't lose the information for the memory objects that get pointed to by globals/arguments. We only lose the information on the globals/arguments themselves. It is also possible to preserve the information at the value level by further expanding AttrExternal into AttrExternalGlobal, AttrExternalArg0.

As I put more thoughts on it I also start to grow suspicious about the idea that separating AttrGlobals/Args from AttrUnknown is somewhat useful. The intention, as you mentioned in our earlier discussion, is that if we can prove from the caller that certain globals/args don't alias then we may be able to improve upon precision. However, since we are doing a bottom-up style summary-based analysis here, the callees are always going to be analyzed before the callers (assume the absense of callgraph cycles). At the time when the callee is analyzed, we can't use any information from the caller because they may not be available yet. At the time when the caller is analyzed, we have the callee's info but it is not clear to me how AttrGlobal/AttrArgs can be utilized there. I was thinking maybe we should get rid of them at some point. But I'm also not 100% confident that they are completely useless...

grievejia added inline comments.Jun 8 2016, 1:49 PM
lib/Analysis/CFLAliasAnalysis.cpp
1014

But a and aalias will be merged together into one node later, right? Whatever attribute that a gets will be transferred to aalias.

1037–1038

Sorry. That was my negligence.

grievejia updated this revision to Diff 60109.Jun 8 2016, 3:16 PM
grievejia edited edge metadata.
grievejia marked 5 inline comments as done.

Updated according to George's review

I think we're arguing for more or less the same thing here.

The whole point of AttrArgs/AttrGlobals was to say "a value in this set
came (directly, through a load, ...)" from this specific arg/global. If we
take this patch, that becomes "a value in this set came through a load of"
a specific arg/global, which is presumably not more useful than
AttrUnknown. So, if this goes in as-is, we should probably kill
AttrArg/AttrGlobal as well, because they have no purpose.

A more interesting approach may be basically flipping this patch upside
down: only use AttrArg/AttrGlobal when we have an assign edge to
args/globals, and put AttrUnknown on the sets under the ones marked with
AttrArg/AttrGlobal. Obviously, that would only be useful if we think
differentiating between args/globals would be useful. If we don't think
this, then I'm happy to accept this patch with the removal of
Attr{Arg,Global}.

it is not clear to me how AttrGlobal/AttrArgs can be utilized there. I

was thinking maybe we should get rid of them at some point

When making arg attrs/global attrs, my thoughts were "this might be useful.
If not, it'll take a few minutes to refactor this down to a single bit.
Let's be flexible." I don't have a concrete plan for how they'll be useful.
I'd assumed it would be useful when used with interprocedural magicks, but
if that's not possible, that's fine. :)

That said, I'm reluctant to do that few minutes of refactoring now rather
than later (unless it's a maintenance burden) if it's reasonably possible
that we would find this information useful in the relatively near future.

lib/Analysis/CFLAliasAnalysis.cpp
1014

But a and aalias will be merged together into one node later, right? Whatever attribute that a gets will be transferred to aalias

Sure, but I'm talking about b's set. :)

In the former case, b's set will end up with a different set of attributes than the latter case, since valueToAttr(aalias) won't return argument attributes, whereas valueToAttr(a) does.

So, we would have differing attributes on b's set in both cases, no?

grievejia added inline comments.Jun 8 2016, 4:29 PM
lib/Analysis/CFLAliasAnalysis.cpp
1014

Ah I see what you are saying!

Yeah this is definitely not intentionally and is a big problem... It's also not clear to me how it can be solved cleanly. I think I may have to bring in the "fake a node below this argument" trick here...

A more interesting approach may be basically flipping this patch upside
down: only use AttrArg/AttrGlobal when we have an assign edge to
args/globals, and put AttrUnknown on the sets under the ones marked with
AttrArg/AttrGlobal. Obviously, that would only be useful if we think
differentiating between args/globals would be useful. If we don't think
this, then I'm happy to accept this patch with the removal of
Attr{Arg,Global}.

I (mistakenly) thought AttrGlobal/Arg was mainly motivated by the loads from globals/args, not the values. Now that you've explained your thoughts and I have a better understanding, I think your plan is definitely superior :)

I (mistakenly) thought AttrGlobal/Arg was mainly motivated by the loads from globals/args, not the values. Now that you've explained your thoughts and I have a better understanding, I think your plan is definitely superior :)

Ehh, it was initially motivated by the values, but turned into a very convenient (albeit unclear) way to handle the loads, as well. Looking back, I could've done a much better job of splitting it all out properly, rather than going with the easiest solution. :P

grievejia updated this revision to Diff 60203.Jun 9 2016, 10:58 AM
grievejia marked an inline comment as done.

Removed AttrExternal and enforced a revised meaning of AttrGlobal/Arg instead

Looks good with a few nits/comments. Thanks for the patch!

lib/Analysis/CFLAliasAnalysis.cpp
1041

Style nit: It's generally encouraged to do auto * if the value is a pointer. (Though, I'm not sure that the rest of the file follows this perfectly)

1042

If there's no set below Val when calling addAttributesBelow, do we need to create one just to tag it with attributes? If not, maybe we could add a forceCreation flag (that defaults to true) or something?

lib/Analysis/StratifiedSets.h
393 ↗(On Diff #60203)

breif -> brief

test/Analysis/CFLAliasAnalysis/attr-escape.ll
6 ↗(On Diff #60203)

Nit that I should've caught earlier: CHECK-LABEL should be used for functions. This lets FileCheck check the remainder of functions in the file if there's a problem with this one (for whatever reason).

18 ↗(On Diff #60203)

CHECK-LABEL

grievejia marked 3 inline comments as done.Jun 9 2016, 1:59 PM
grievejia added inline comments.
lib/Analysis/CFLAliasAnalysis.cpp
1042

If we don't create it, and if another node is put there (either through insertion or through merge) some time later, how can we make sure that the newly created node is tagged AttrUnkown?

test/Analysis/CFLAliasAnalysis/attr-escape.ll
6 ↗(On Diff #60203)

All existing tests under test/Analysis/CFLAliasAnalysis/ use "CHECK". That's why I followed this "convention"...

grievejia updated this revision to Diff 60233.Jun 9 2016, 2:02 PM
grievejia marked 2 inline comments as done.

Nits fixes

george.burgess.iv accepted this revision.Jun 9 2016, 4:07 PM
george.burgess.iv edited edge metadata.

LGTM. Will commit.

lib/Analysis/CFLAliasAnalysis.cpp
1042

Insertion shouldn't be a problem, since the only things we insert after the declaration of this are isolated nodes. Also, I think merging should be fine, since we never actually merge attributes, but I could be wrong.

Either way, I'm happy to leave this as-is and revisit it if it becomes a problem (since the worst case seems to be that we end up with a handful of "useless" bytes in a StratifiedSets instance)

test/Analysis/CFLAliasAnalysis/attr-escape.ll
6 ↗(On Diff #60233)

I'll fix them to be less bad, then. Thanks for the heads-up :)

This revision is now accepted and ready to land.Jun 9 2016, 4:07 PM
grievejia added inline comments.Jun 9 2016, 4:21 PM
lib/Analysis/CFLAliasAnalysis.cpp
1042

OK I see your point (how many times did I say this... I should really work on my comprehension skills).

The thing is, addAttributeBelow() is written with the assumption that useful nodes will be inserted below some time in the future. I guess you are right that right now this assumption does not hold. However, I'm going to mark some actual arguments of CallInst AttrEscape in the next patch, and depending on the way this is implemented we may end up calling addBelow() after addAttributeBelow() at some point.

This revision was automatically updated to reflect the committed changes.