Download Raw Diff

Tokens

"Like" token, awarded by TijmenW.

Details

Reviewers

chandlerc
• dberlin
rsilvera
hfinkel

Summary

This is the first in a series of patches to extend the scoped-noalias metadata implementation by providing an intrinsic for implicit alias scope assignment (in contrast to the alias.scope metadata which provides explicit scope assignment).

The general design for this was discussed at the end of last year (http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078786.html), and while there will need to be further extensions (to support dynamic predicates, at least), this should address two core motivations:

Provide a way to retain dominance information which is necessary to answer loop-sensitive queries
Make it practical to use scoped-noalias metadata to model block-level restrict-qualified pointers from Clang.

To summarize from the llvmdev e-mail, the problem underlying (1) is that, when using the existing metadata scheme, we cannot differentiate between these two after inlining:

void foo(float * restrict a, float * restrict b) {
  for (int i = 0; i < 1600; ++i)
    a[i] = b[i] + 1;
}

and:

void inner(float * restrict a, float * restrict b) {
  *a = *b + 1;
}
void foo(float * a, float * b) {
  for (int i = 0; i < 1600; ++i)
    inner(a+i, b+i);
}

and we currently cheat and assume the former to enable vectorization. This is not correct, however, and the underlying issue is that the current scheme cannot differentiate directly between:

Does access A alias with access B, both in loop L, within a loop iteration
Does access A alias with the memory accessed by B is any loop iteration

the existing metadata really only can answer (a) but the vectorizer needs (b). The existing noalias attribute (like the restrict-qualifier on which it is based) does provide the necessary information (because it applies to all derived pointers, and thus those used in every loop iteration), but we loose it when converting to the existing metadata scheme.

Regarding problem (2), while in theory block-level restrict-qualified pointers can be modeled using the existing metadata scheme, it suffers from a lack of dominance information (from problem (1)), and is also difficult to apply in practice. The core issue is that to determine whether or not a variable is derived from some particular restrict-qualified pointer, you need to determine which accesses might be derived from that pointer via some capture. This is much easier to do after optimization (mem2reg, funcattrs to add nocapture attributes, etc.) than in the frontend directly. In practice, we really want to delay this determination until later in the pipeline (and not try to do it in the frontend).

To address these concerns, we'll add a new intrinsic for implicit scope assignment. The scope, in the sense of modeling restrict-qualified pointers, represents the set of restrict-qualified pointers from which some access might derive. This can be explicitly represented with alias.scope metadata. Then the noalias metadata specified with which scopes the tagged access does not alias. To use the new intrinsic, llvm.noalias, you tag all potentially relevant accesses with noalias metadata and derive all pointers in a particular scope from the return value of the llvm.noalias intrinsic with that scope as an argument. The reason that all relevant accesses need to have noalias metadata, including those using pointers derived from the noalias intrinsic, is a compile-time optimization: we don't need to search for noalias intrinsics, a potentially expensive operation, if there is no noalias metadata tag, and thus, there should be no significant cost to the feature if you're not using it.

One issue (directly relevant in a later patch providing the implementation), is that the scope list provided as the metadata argument can have only one item. This is because we need the ability to search for other intrinsics with the same scope argument, and because metadata no longer have use lists, we can do this only by searching the use list of the MetadataAsValue wrapper (and, as a result, can only find complete matches, so we need to have trivial lists). This may be something we wish to address is some different way, because as it stands, we cannot combine adjacent noalias intrinsics.

To bring up a naming issue: I named the intrinsic llvm.noalias because it is like the noalias function argument. However, conceptually, is serves the same purpose as the alias.scope metadata (and works along with the noalias metadata, which is still needed), so maybe it should be named llvm.alias.scope?

Also, on the data dependence, while poses obvious difficulties on the optimizer (and may of the patches are to address this), a strict control dependence is not sufficient (so we can't make it more like the lifetime intrinsics which take a pointer argument). Specifically, we need to be able to represent this kind of construct:

void foo(T * restrict x, T * restrict y) {
  for (int i = 0; i < 1600; ++i)
    x[2*i+1] = y[2*i] + 1;
}

when inlined from the call foo(q, q).

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 24708.Apr 30 2015, 8:02 AM

hfinkel retitled this revision from to An llvm.noalias intrinsic.

hfinkel updated this object.

hfinkel edited the test plan for this revision. (Show Details)

hfinkel added reviewers: • dberlin, rsilvera, chandlerc, reames.

hfinkel added a subscriber: Unknown Object (MLST).

hfinkel mentioned this in D9376: llvm.noalias - handling of dead intrinsics.Apr 30 2015, 8:07 AM

hfinkel mentioned this in D9377: llvm.noalias - don't block EarlyCSE.

hfinkel mentioned this in D9378: llvm.noalias - Add IRBuilder support.Apr 30 2015, 8:12 AM

hfinkel mentioned this in D9379: llvm.noalias - don't interfere with llvm.assume.Apr 30 2015, 8:17 AM

hfinkel mentioned this in D9380: llvm.noalias - CodeGen support.Apr 30 2015, 8:20 AM

hfinkel mentioned this in D9381: llvm.noalias - SCEV can look through it.Apr 30 2015, 8:24 AM

hfinkel mentioned this in D9382: llvm.noalias - don't prevent loop vectorization.Apr 30 2015, 8:27 AM

hfinkel mentioned this in D9383: returned - BasicAA should look through functions with returned arguments.Apr 30 2015, 8:31 AM

hfinkel mentioned this in D9384: llvm.noalias - isPointerDereferenceable should look through them.Apr 30 2015, 8:35 AM

hfinkel mentioned this in D9385: llvm.noalias - Look through noalias for shouldAlignPointerArgs.Apr 30 2015, 8:38 AM

hfinkel mentioned this in D9386: llvm.noalias - CaptureTracking needs to look through them.Apr 30 2015, 8:41 AM

hfinkel mentioned this in D9387: llvm.noalias - Pointer comparison folding should look through them.Apr 30 2015, 8:45 AM

hfinkel mentioned this in D9388: llvm.noalias - isSafeToLoadUnconditionally can look through them.

hfinkel mentioned this in D9389: llvm.noalias - ObjectSizeOffsetVisitor should look through them.Apr 30 2015, 8:49 AM

hfinkel mentioned this in D9390: llvm.noalias - ObjectSizeOffsetVisitor should look through them.Apr 30 2015, 8:53 AM

hfinkel mentioned this in D9391: llvm.noalias - ObjectSizeOffsetVisitor should look through them.Apr 30 2015, 9:02 AM

hfinkel mentioned this in D9392: llvm.noalias - GetStringLength should look through them too.Apr 30 2015, 9:05 AM

hfinkel mentioned this in D9393: llvm.noalias - InstCombiner::OptimizePointerDifference should look through them.

hfinkel mentioned this in D9394: llvm.noalias - InstCombiner::FoldGEPICmp should look through them.Apr 30 2015, 9:08 AM

hfinkel mentioned this in D9395: llvm.noalias - isObjectSizeLessThanOrEq in InstCombine should look through them.

hfinkel mentioned this in D9397: llvm.noalias - computeKnownBits should look through them.Apr 30 2015, 9:15 AM

hfinkel mentioned this in D9398: llvm.noalias - GetUnderlyingObjects to optionally collect noalias calls.Apr 30 2015, 9:17 AM

hfinkel mentioned this in D9400: llvm.noalias - Use noalias intrinsics when inlining (and update them when cloning metadata).Apr 30 2015, 9:22 AM

hfinkel mentioned this in D9401: llvm.noalias - The AA implementaton.Apr 30 2015, 9:28 AM

hfinkel mentioned this in D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers.Apr 30 2015, 9:42 AM

This looks 100% reasonable to me otherwise ;)

docs/LangRef.rst
12293	This wording is a little odd. I realize you are trying to say something about other pointers, but this intrinsic is really saying something about the accesses with the right metadata, not about their pointers. (unless there's something i missed, and the fact that a pointer has the right noalias data in one access also implies something about that pointer when used in an access not in the right scope) So i would say "llvm.noalias allows the assume that accesses using pointers derived from the return value don't alias with those accesses tagged with associated noalias scopes" or something You also need to very carefully define "derived". Derived could mean a lot of things (IE if i ptrtoint it, add 6, and intoptr it, is it still derived?)
12313	Similar wording issue You probably mean: `llvm.noalias` allows the optimizer to assume that memory accesses using pointers derived from the return value don't alias with memory accesses tagged with `noalias` metadata with compatible scopes.

After thinking about this a bit, I think this is a reasonable approach. I don't claim to be an expert on the semantics of restrict, but am willing to trust that Hal is. :)

I'd mildly prefer we generalize the intrinsic so that's not specific to it's current use case, but I'm completely willing to see that happen at a later date.

I've started going through the "simple" reviews in the following sequence. I'm not going to have lots of review bandwidth to spare, but I'll help where I can.

Resigning as a reviewer to get a very stale review off my list of blocking tasks in phabricator. Please reopen when desired.

Wording changes (and rebased)

Herald added a subscriber: mcrosier. · View Herald TranscriptJul 7 2016, 1:21 AM

Rebase to the current ArgMemOnly spelling.

nhaehnle added a subscriber: nhaehnle.Jul 7 2016, 5:51 AM

nhaehnle added inline comments.

docs/LangRef.rst
12301	Typo: The first argument is the pointer (I haven't really looked at anything, my email client just happened to open here and I started reading out of curiosity...)

Fixed typo (pointed out by @nhaehnle)

hfinkel mentioned this in D22201: Add a 'Returned' intrinsic property corresponding to the 'returned' argument attribute..Jul 10 2016, 11:31 AM

Added the 'Returned' property (from D22201).

Seems fine to me but an alias analysis expert needs to take a closer look.

So, the summary says:

for (int i = 0; i < 1600; ++i)
  x[2*i+1] = y[2*i] + 1;

Is this summary still correct?
If so, i want to understand this. I don't want to make you do the work, i just want to understand what the issue is.

The inlined version will still have noalias calls that will not be able to be hoisted past the pointer arguments they take, no?

Why won't that just work?

Is there any way you could just provide me a bunch of examples that show "what broken thing will happen"?

(I'd say we should whiteboard it, but that seems hard :P)

(I'm also wondering, offhand, if memoryssa's memory versioning does not already automatically provide the thing we want - something that says "this thing goes with this version of memory state, and you can't hoist it above that version of memory state". You'd take the memory version as the argument, and they would stay where they should - below that memory version)

In particular, note that recent issues with llvm.assume are precisely because it is defined as a construct with control flow scope, but is not itself part of control flow.

If instead, llvm.assume's semantics were like noalias, where it was "this intrinsic takes a value as a first argument, a condition as a second argument, and returns a value. Any use of the value it returns may assume the condition is true", it would not matter whether llvm.assume calls got hoisted or not, even all the way to right after the argument, the values being used would still have correct semantics.

(and the only way something could validly replace a non-assume value with an assume value is if it can prove the condition holds).

Of course, if the *goal* is to be able to have some scope in llvm that something holds, you need real control flow, and you just can't avoid this, and we shouldn't avoid it :P

The only solution i can see for that, and to actually have them try to define "scopes", is to either take the control flow as inputs to the lifetime markers, or attach it to existing control flow (IE lifetime start markers, assume, etc, can be properly thought of as attributes of either the top/bottom of a basic block, or an edge, depending how granular you want to be).

(These types of control flow intrinsics are much like having an analysis pass where you cheat because you don't want everything to have to do the work to preserve what you meant. lifetime start/end are precisely like having dominator tree markers in the IR)

In D9375#481339, @dberlin wrote:
So, the summary says:

Also, on the data dependence, while poses obvious difficulties on the optimizer (and may of the patches are to address this), a strict control dependence is not sufficient (so we can't make it more like the lifetime intrinsics which take a pointer argument). Specifically, we need to be able to represent this kind of construct:
"void foo(T * restrict x, T * restrict y) {
for (int i = 0; i < 1600; ++i)
  x[2*i+1] = y[2*i] + 1;
}"

Is this summary still correct?
If so, i want to understand this. I don't want to make you do the work, i just want to understand what the issue is.

Certainly. The summary says:

Also, on the data dependence, while poses obvious difficulties on the optimizer (and may of the patches are to address this), a strict control dependence is not sufficient (so we can't make it more like the lifetime intrinsics which take a pointer argument). > Specifically, we need to be able to represent this kind of construct:

void foo(T * restrict x, T * restrict y) {
for (int i = 0; i < 1600; ++i)
  x[2*i+1] = y[2*i] + 1;
}
when inlined from the call foo(q, q).

and you didn't quote that last line, so I wonder if you missed it. So the problem here is that, if we have foo(q,q), and I had some intrinsic that only had a control dependence, like our current @llvm.noalias but without the return value, then we'd end up with:

call @llvm.noalias(%q, !scope)
call @llvm.noalias(%q, !scope)
%p1 = gep %q, ... (2*i)
%p2 = gep %q, ... (2*i+1)
%v = load %p1, !scope
store %v, %p2, !scope

In this example the indexing is simple and we'd get the (lack of) aliasing anyway, but in the general case the problem is that we lose which pointer values are derived from one restrict pointer and which are derived from the other once we inline. That's why I have @llvm.noalias return a value.

In this example the indexing is simple and we'd get the (lack of) aliasing anyway, but in the general case the problem is that we lose which pointer values are derived from one restrict pointer and which are derived from the other once we inline. That's why I have @llvm.noalias return a value.

Sorry, i misunderstood.
I thought you were saying that llvm.noalias *still* has this problem, in the current implementation, and, as i think i pointed out, it should not, *because* it returns a value. It seems like we are in violent agreement.

But that leads me to the other question:
What control dependence does it have that requires it be marked as argmemonly and not "readnone"?
I don't see any tests attached where it would change the semantics if they were hoisted to wherever, and i don't see how they could be DSE'd/DCE'd if the pointers themselves are not dead :)

So what am i missing?

In D9375#516105, @dberlin wrote:

In this example the indexing is simple and we'd get the (lack of) aliasing anyway, but in the general case the problem is that we lose which pointer values are derived from one restrict pointer and which are derived from the other once we inline. That's why I have @llvm.noalias return a value.

Sorry, i misunderstood.
I thought you were saying that llvm.noalias *still* has this problem, in the current implementation, and, as i think i pointed out, it should not, *because* it returns a value. It seems like we are in violent agreement.

Looks that way.

But that leads me to the other question:
What control dependence does it have that requires it be marked as argmemonly and not "readnone"?
I don't see any tests attached where it would change the semantics if they were hoisted to wherever, and i don't see how they could be DSE'd/DCE'd if the pointers themselves are not dead :)

So what am i missing?

There are no tests because I can't write them yet ;) -- But this certainly a good point, and definitely a good thing to discuss. This goes back to an earlier part of the description (which I've edited a bit to make the point more explicit):

To summarize from the llvmdev e-mail, the problem underlying (1) is that, when using the existing metadata scheme, we cannot differentiate between these two after inlining:
void foo(float * restrict a, float * restrict b) {
  for (int i = 0; i < 1600; ++i)
    a[i] = b[i] + 1;
}
and:
void inner(float * restrict a, float * restrict b, int i) {
  a[i] = b[i] + 1;
}
void foo(float * a, float * b) {
  for (int i = 0; i < 1600; ++i)
    inner(a, b);
}
and we currently cheat and assume the former to enable vectorization. This is not correct, however, and the underlying issue is that the current scheme cannot differentiate directly between:

Does access A alias with access B, both in loop L, within a loop iteration

Does access A alias with the memory accessed by B is any loop iteration

After inlining, the only difference here is that, the two noalias intrinsics:

%... = @llvm.noalias(%a, !scope)
%... = @llvm.noalias(%b, !scope)

will, in the first case, be outside the loop and, in the second case, be inside the loop. There's no other difference, but I need to be able to distinguish them in order to correctly answer the queries the vectorizer wants to make. The vectorizer needs to be able to do alias(a[i], b[i], L), where L is the loop, and the associated MemLocs have sizes that are infinite (or corresponding somehow to the entire range of indices), instead of just referring to the individual accesses. If the intrinsics can be hosted out of loops, then this won't work.

I can't write a test for this yet because we have no such interface yet. Proposing that is what I plan for the next step. Maybe we'll think of a better way, and we can make the intrinsics readnone.

Rebased

Marking this as accepted, based on okay by @dberlin (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382593.html), under the condition that the intrinsic be named llvm.noalias.scope (I'll rebase everything for the rename).

This revision is now accepted and ready to land.Oct 12 2016, 2:45 PM

jeroen.dobbelaere added a subscriber: jeroen.dobbelaere.Oct 17 2016, 1:29 AM

Rebased.

nhaehnle added inline comments.Sep 29 2017, 8:01 AM

docs/LangRef.rst
12315	-from
12326	-from

marksl added a subscriber: marksl.Jul 31 2018, 11:11 AM

Herald added a subscriber: bollu. · View Herald TranscriptJul 31 2018, 11:11 AM

troyj added a subscriber: troyj.Aug 27 2018, 8:52 AM

Hi Hal,

I have a question about this: as an 'llvm.noalias' intrinsic is introduced for every 'assignment to a restrict pointer', it is not possible to see a difference between 'foo' and 'bar':

void foo(int* pA, int* pC, int n) {
  int* __restrict rp;

  for (int i=0; i<n; ++i) {
    rp=pA+i;
    *rp=42;
    pC[i]=99;
  }
}

void bar(int* pA, int* pC, int n) {
  for (int i=0; i<n; ++i) {
    int* __restrict rp;
 
    rp=pA+i;
    *rp=42;
    pC[i]=99;
  }
}

The problem occurs when unrolling the loop: for 'bar', the metadata must be cloned, as the restrict scope starts and ends with the loop body. For 'foo', the restrict scope resides outside the loop body and the metadata must not be cloned when unrolling.
In the more complex case, the metadata must be partially cloned (= the part that covers restrict scopes introduced inside the loop body). Any idea on an approach for differentiating between those two cases ?

lebedev.ri added a subscriber: lebedev.ri.Dec 19 2018, 2:33 AM

lebedev.ri added a child revision: D9376: llvm.noalias - handling of dead intrinsics.Dec 19 2018, 2:53 AM

rpjohnst added a subscriber: rpjohnst.Jan 4 2019, 12:51 PM

lzutao added a subscriber: lzutao.Mar 26 2019, 10:04 AM

Herald added a subscriber: jdoerfert. · View Herald TranscriptMar 26 2019, 10:04 AM

mati865 added a subscriber: mati865.Jul 31 2019, 2:26 AM

TijmenW awarded a token.Aug 20 2019, 6:25 AM

TijmenW added a subscriber: TijmenW.

Ping @hfinkel @jdoerfert

In D9375#1679424, @xbolva00 wrote:

Ping @hfinkel @jdoerfert

@jeroen.dobbelaere has been working on an update to this patch series, addressing bugs and additional features compared to my patches, and I understand from a recent off-list email with him, that he expects to post something to the list soon.

jeroen.dobbelaere mentioned this in D68485: [PATCH 02/38] [noalias] D9375: An llvm.noalias intrinsic.Oct 4 2019, 2:09 PM

chriselrod added a subscriber: chriselrod.Dec 3 2019, 2:48 AM

This is an archive of the discontinued LLVM Phabricator instance.

An llvm.noalias intrinsic
AcceptedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 63432

docs/LangRef.rst

include/llvm/IR/Intrinsics.td

This is an archive of the discontinued LLVM Phabricator instance.

An llvm.noalias intrinsicAcceptedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 63432

docs/LangRef.rst

include/llvm/IR/Intrinsics.td

An llvm.noalias intrinsic
AcceptedPublic